Data warehousing and analytics infrastructure at Facebook

Scalable analysis of large data sets has been core to the functions of a number of teams at Facebook, both engineering and non-engineering. Apart from ad hoc analysis of data and creation of business intelligence dashboards by analysts across the company, a number of Facebook's site features are also based on analyzing large data sets. These features range from simple reporting applications, such as Insights for Facebook advertisers, to more advanced ones, such as friend recommendations. To support this diversity of use cases on an ever-increasing amount of data, a flexible infrastructure that scales up cost-effectively is critical. We have leveraged, authored, and contributed to a number of open source technologies to address these requirements at Facebook. These include Scribe, Hadoop, and Hive, which together form the cornerstones of the log collection, storage, and analytics infrastructure at Facebook. In this paper we will present how these systems ...

Business Intelligence Dashboards | Database | Facebook | Large Data | SIGMOD 2010

Post Info

Added	21 May 2011
Updated	21 May 2011
Type	Journal
Year	2010
Where	SIGMOD
Authors	Ashish Thusoo, Zheng Shao, Suresh Anthony, Dhruba Borthakur, Namit Jain, Joydeep Sen Sarma, Raghotham Murthy, Hao Liu
