Facebook Built Most Of Timeline In Six Months

Facebook’s new timeline profile took about six months to build, from the initial infrastructure-review meeting to successfully turning on the 100 percent back-end load test.

Infrastructure Engineer Ryan Mack explained how those six months went in a post on the social network. Here are some of the highlights:

  • Timeline was built on four core Facebook technologies:
    • MySQL and InnoDB storage and replication;
    • Multifeed, which already powers the news feed, for ranking;
    • Thrift communications; and
    • Memcached caching.
  • In order to support Facebook’s ranking model for timeline, the social network would have had to store its entire data set in cache, including low-value data that is never displayed, so “a massive denormalization process was required” (the first sketch after this list illustrates the idea).
  • The timeline aggregator began as a modified version of the Multifeed aggregator, but it now runs locally on each database, allowing Facebook to use its disks to maximum capacity without sending data that will never be displayed over the network.
  • The results of large queries, such as ranking a user’s activity for the year 2010, are small and can be cached for long periods without invalidations (see the second sketch after this list).
  • Recent activity changes frequently, but it can also be regenerated quickly.
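
To make the denormalization point concrete, here is a minimal sketch in Python. The class and field names are assumptions for illustration only, not Facebook’s actual schema or code; the point is simply that only the fields a story needs to render are copied into a compact, pre-joined record, so the cache never has to hold low-value data that is never shown.

```python
# Illustrative sketch only: the class and field names below are assumptions,
# not Facebook's real schema. The idea is to pre-join and trim each activity
# into a display-ready record so the cache holds only data that will be shown.

from dataclasses import dataclass


@dataclass
class Activity:
    """Normalized activity row, as it might live in MySQL/InnoDB."""
    id: int
    actor_id: int
    verb: str          # e.g. "photo.upload" or "status.update"
    object_id: int
    created_at: int    # Unix timestamp
    raw_payload: dict  # large blob, mostly irrelevant for display


@dataclass
class TimelineStory:
    """Denormalized, display-ready record that is cheap to cache."""
    actor_id: int
    verb: str
    object_id: int
    created_at: int
    summary: str       # short, pre-rendered text


def denormalize(activity: Activity) -> TimelineStory:
    # Copy only what a timeline story needs to render; drop the low-value
    # payload so the cached working set stays small.
    return TimelineStory(
        actor_id=activity.actor_id,
        verb=activity.verb,
        object_id=activity.object_id,
        created_at=activity.created_at,
        summary=f"{activity.verb} on object {activity.object_id}",
    )
```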
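The second sketch shows the caching split described in the last two bullets, again as an assumption-laden illustration rather than Facebook’s implementation: a plain dictionary stands in for memcached, ranked results for a closed period get a long lifetime because they rarely change, and recent activity is simply regenerated when its short-lived entry expires.

```python
# Illustrative sketch only: a plain dict stands in for memcached, and the
# helper names are assumptions. Ranked results for a closed period (say,
# the year 2010) are small and stable, so they get a long TTL; recent
# activity changes often but is cheap to rebuild, so it gets a short one.

import time

cache: dict[str, tuple[float, object]] = {}  # key -> (expiry time, value)

YEAR_TTL = 30 * 24 * 3600   # ranked past years: keep for roughly a month
RECENT_TTL = 60             # recent activity: regenerate every minute


def cached(key: str, ttl: float, compute):
    """Return the cached value for key, recomputing it once ttl expires."""
    now = time.time()
    entry = cache.get(key)
    if entry and entry[0] > now:
        return entry[1]
    value = compute()
    cache[key] = (now + ttl, value)
    return value


def rank_year(user_id: int, year: int) -> list[str]:
    # Stand-in for the expensive ranking query over a whole year.
    return [f"top story for user {user_id} in {year}"]


def fetch_recent(user_id: int) -> list[str]:
    # Stand-in for the cheap query over the latest activity.
    return [f"latest story for user {user_id}"]


def ranked_year(user_id: int, year: int) -> list[str]:
    # Small and essentially immutable once the year is over: cache it for
    # a long period with no invalidation logic.
    return cached(f"timeline:{user_id}:{year}", YEAR_TTL,
                  lambda: rank_year(user_id, year))


def recent_activity(user_id: int) -> list[str]:
    # Changes frequently, but can be regenerated quickly on a cache miss.
    return cached(f"timeline:{user_id}:recent", RECENT_TTL,
                  lambda: fetch_recent(user_id))
```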

Mack also said:

Timeline started as a Hackathon project in late 2010 with two full-time engineers, an engineering intern, and a designer building a working demo in a single night.

The full team ramped up in early 2011, and the development team was split into design, front-end engineering, infrastructure engineering, and data migrations. By doing staged and layered prototyping, we achieved an amazing amount of development parallelism, and rarely was any part of the team blocked by another. Early on in the project we were simultaneously:

  1. Designing user interface prototypes with our pre-existing but non-scalable back end
  2. Building production front-end code on a simulation of the scalable back end
  3. Building the scalable back end using samples of denormalized data from a prototype of denormalization migration
  4. Building the framework to run the full-scale denormalization process
  5. Collecting and copying the data necessary for the denormalization
  6. Performing simulated load testing to validate our capacity planning estimates

In retrospect, that’s pretty crazy. We had to move a lot of mountains to go from the initial infrastructure-review meeting to successfully turning on the 100 percent back-end load test in just six months. Done another way, this project could have taken twice as long — and that’s being generous.
