How did Facebook adapt the Hive storage format to handle a data warehouse that stores some 300 petabytes and takes in about 600 terabytes per day? RCFile (record-columnar file format) wasn’t enough, so enter ORCFile.
When the average person thinks about 10,000 Blu-ray discs, they likely imagine an impressive movie collection, but when Facebook Vice President of Infrastructure Engineering Jay Parikh and Director of Infrastructure Jason Taylor thought about 10,000 Blu-ray discs, data storage came to mind.
Frank Frankovsky, vice president of hardware design and supply chain operations at Facebook and chairman and president of the Open Compute Project, spoke with Arik Hesseldahl and Mike Isaac of AllThingsD about how the social network configures its hardware to deal with the massive amounts of data it handles.
Facebook devotes substantial resources just to storing the 240 billion-plus photos on the social network. Now the company is using cold storage at its Prineville, Ore., data center to ensure that older photos can be accessed as easily as the ones users uploaded five minutes ago.