How did Facebook rework the Hive storage format to handle a data warehouse that stores some 300 petabytes and takes in about 600 terabytes per day? RCFile (record-columnar file format) wasn’t enough, so enter ORCFile.
How is Facebook able to quickly process the sort of queries about users and their friends generated by features such as Graph Search, despite the fact that the relevant data may be stored on several different servers? Software Engineers Alessandro Presta and Alon Shalita offered an example of how the social network uses graph-processing system Apache Giraph to handle those tasks in a post on its engineering blog.
Facebook announced last November that it would open-source interactive query system Presto, and now cloud big data platform provider Qubole is taking the next step, announcing the release of its Presto-as-a-Service Alpha Program via Amazon Web Services.
“Twitter owns social TV. Facebook is trying to get there.” Joseph Pigato, managing director of customer-engagement firm Sparked, spoke those words Wednesday during “What We Can Learn from TV’s Top Social Campaigns,” a panel at mediabistro’s Inside Social Marketing conference in New York. He was joined by Sesame Workshop Director of New Media Communication Daniel N. Lewis and moderator Natan Edelsburg, vice president of Sawhorse Media and writer for mediabistro’s Lost Remote blog.
Facebook announced that it is open-sourcing its RocksDB embeddable, persistent key-value store, which enables fast storage and global, real-time data fetching of the social network’s massive cache of user data.
Frank Frankovsky, vice president of hardware design and supply chain operations at Facebook and chairman and president of the Open Compute Project, touted the progress made thus far by the Facebook-launched data-storage initiative on the networking-hardware front in a post on the Open Compute Project blog.
Facebook offered some insight into how it handles the more than 300 petabytes of data it stores for its 1.19 billion monthly active users, providing some details on Presto, an interactive query system it created and is open-sourcing, in a note on the Facebook Engineering page.
Facebook wants to know even more about its users than it already does, and an eight-employee group referred to within the company as the AI (artificial intelligence) team is quietly working on incorporating deep learning technology, which uses simulated networks of brain cells to process data, according to MIT Technology Review.