November 8, 2016

Performance or Capacity

CMG imPACt, Conference by the Computer Measurement Group

We explore the gap between the measurement and aggregation approaches used in performance monitoring, which are not always useful for capacity planning, and the approaches used in capacity planning, which are often meaningless for performance analysis, and we discuss ways to reconcile the two tasks.

Alexander Gilgur, Steve Politis
November 2, 2016

DQBarge: Improving Data-Quality Tradeoffs in Large-Scale Internet Services

OSDI 2016

DQBarge is a system that enables better data-quality tradeoffs by propagating critical information along the causal path of request processing.

Jason Flinn, Kaushik Veeraraghavan, Michael Cafarella, Michael Chow
November 2, 2016

Kraken: Leveraging Live Traffic Tests to Identify and Resolve Resource Utilization Bottlenecks in Large Scale Web Services

OSDI 2016

Kraken is a new system that runs load tests by continually shifting live user traffic to one or more data centers.

Kaushik Veeraraghavan, Justin Meza, David Chou, Wonho Kim, Sonia Margulis, Scott Michelson, Rajesh Nishtala, Daniel Obenshain, Dmitri Perelman, Yee Jiun Song
September 5, 2016

Cubrick: Indexing Millions of Records per Second for Interactive Analytics

VLDB 2016

This paper describes the architecture and design of Cubrick, a distributed multidimensional in-memory DBMS suited for interactive analytics over highly dynamic datasets.

Pedro Eugenio Rocha Pedreira, Chris Croswhite, Luis Bona
September 1, 2016

Desugaring Haskell’s Do-Notation into Applicative Operations

ACM SIGPLAN Haskell Symposium

In this paper we show how to re-use the very same do-notation to work for Applicatives as well, providing efficiency benefits for some types that are both Monad and Applicative, and syntactic convenience for those that are merely Applicative.

Andrey Mokhov, Edward Kmett, Simon Marlow, Simon Peyton Jones
August 23, 2016

Robotron: Top-down Network Management at Facebook Scale


In this paper, we present Robotron, a system for managing a massive production network in a top-down fashion.

Yu-Wei Eric Sung, Xiaozheng Tie, Starsky H.Y. Wong, James Hongyi Zeng
August 13, 2016

Compressing Graphs and Indexes with Recursive Graph Bisection


Graph reordering is a powerful technique to increase the locality of the representations of graphs, which can be helpful in several applications. We study how the technique can be used to improve compression of graphs and inverted indexes.

Laxman Dhulipala, Igor Kabiljo, Brian Karrer, Giuseppe Ottaviano, Sergey Pupyrev, Alon Shalita
June 25, 2016

Realtime Data Processing at Facebook


Realtime data processing powers many use cases at Facebook, including realtime reporting of the aggregated, anonymized voice of Facebook users, analytics for mobile applications, and insights for Facebook page administrators.

Guoqiang Jerry Chen, Janet Wiener, Shridhar Iyer, Anshul Jaiswal, Ran Lei, Nikhil Simha, Wei Wang, Kevin Wilfong, Tim Williamson, Serhat Yilmaz
June 18, 2016

Dynamo: Facebook’s Data Center-Wide Power Management System

ISCA 2016

In this paper, we describe Dynamo – a data center-wide power management system that monitors the entire power hierarchy and makes coordinated control decisions to safely and efficiently use provisioned data center power.

Qiang Wu, Qingyuan Deng, Lakshmi Ganesh, Chang-Hong Raymond Hsu, Yun Jin, Sanjeev Kumar, Bin Li, Justin Meza, Yee Jiun Song
June 11, 2016

Treadmill: Attributing the Source of Tail Latency through Precise Load Testing and Statistical Inference

International Symposium on Computer Architecture

Managing tail latency of requests has become one of the primary challenges for large-scale Internet services. In this paper, we develop a methodology for statistically rigorous performance evaluation for server workloads.

Yunqi Zhang, David Meisner, Jason Mars, Lingjia Tang
May 22, 2016

The Social Ties of Immigrant Communities in the United States


In this paper we study the composition of Facebook social networks of people who have moved from one country to another.

Amaç Herdağdelen, Bogdan State, Lada Adamic, Winter Mason
May 21, 2016

Continuous Deployment at Facebook and OANDA

ICSE 2016: 38th International Conference on Software Engineering

This paper describes the continuous deployment practices at two very different firms: Facebook and OANDA, and shows that continuous deployment does not inhibit productivity or quality even in the face of substantial engineering team and code size growth.

Tony Savor, Mitchell Douglas, Michael Gentili, Laurie Williams, Kent Beck, Michael Stumm
May 14, 2016

The Shortest Path is not Always a Straight Line. Leveraging Semi-Metricity in Graph Analysis.

VLDB 2016

This paper leverages the concept of the metric backbone to improve the efficiency of large-scale graph analytics.

Dionysios Logothetis
March 15, 2016

Social Hash: an Assignment Framework for Optimizing Distributed Systems Operations on Social Networks

USENIX Symposium on Networked Systems Design and Implementation (NSDI 2016)

We describe the social hash framework, which uses graph partitioning techniques to improve the performance of systems within Facebook. We highlight two applications: (1) how routing similar users to the same web cluster improves our cache performance, and (2) how co-locating socially similar data on the same host improves the performance of data serving systems.

Alon Shalita, Brian Karrer, Igor Kabiljo, Arun Sharma, Aaron Adcock, Alessandro Presta, Herald Kllapi, Michael Stumm
October 4, 2015

Holistic Configuration Management at Facebook

The 25th ACM Symposium on Operating Systems Principles

This paper gives a comprehensive description of the use cases, design, implementation, and usage statistics of a suite of tools that manage Facebook’s configuration end-to-end, including the frontend products, backend systems, and mobile apps.

Chunqiang (CQ) Tang, Thawan Kooburat, Pradeep Venkat, Akshay Chander, Zhe Wen, Aravind Narayanan, Patrick Dowell, Robert Karl
October 4, 2015

Existential Consistency: Measuring and Understanding Consistency at Facebook

The 25th ACM Symposium on Operating Systems Principles

Replicated storage for large Web services faces a trade-off between stronger forms of consistency and higher performance properties. Stronger consistency prevents anomalies, i.e., unexpected behavior visible to users, and reduces programming complexity.

Haonan Lu, Kaushik Veeraraghavan, Philippe Ajoux, Jim Hunt, Yee Jiun Song, Wendy Tobagus, Sanjeev Kumar, Wyatt Lloyd
August 31, 2015

Cubrick: A Scalable Distributed MOLAP Database for Fast Analytics

The 41st International Conference on Very Large Data Bases (Ph.D. Workshop)

This paper describes the architecture and design of Cubrick, a distributed multidimensional in-memory database that enables real-time data analysis of large dynamic datasets. Cubrick has a strictly multidimensional data model composed of dimensions, dimensional hierarchies and metrics, supporting sub-second MOLAP operations such as slice and dice, roll-up and drill-down over terabytes of data.

Pedro Eugenio Rocha Pedreira, Luis Erpen de Bona, Chris Croswhite
August 31, 2015

One Trillion Edges: Graph Processing at Facebook-Scale

The 41st International Conference on Very Large Data Bases

Analyzing large graphs provides valuable insights for social networking and web companies in content ranking and recommendations. While numerous graph processing systems have been developed and evaluated on available benchmark graphs of up to 6.6B edges, they often face significant difficulties in scaling to much larger graphs. Industry graphs can be two orders of magnitude larger: hundreds of billions or up to one trillion edges.

Avery Ching, Sergey Edunov, Maja Kabiljo, Dionysios Logothetis, Sambavi Muthukrishnan
August 17, 2015

Inside the Social Network’s (Datacenter) Network


Large cloud service providers have invested in increasingly larger datacenters to house the computing infrastructure required to support their services.

Arjun Roy, James Hongyi Zeng, Jasmeet Bagga, George Porter, Alex C. Snoeren
August 13, 2015

From Categorical Logic to Facebook Engineering

30th Annual ACM/IEEE Symposium on Logic in Computer Science

I chart a line of development from category-theoretic models of programs and logics to automatic program verification/analysis techniques that are in deployment at Facebook. Our journey takes in a number of concepts from the computer science logician’s toolkit – including categorical logic and model theory, denotational semantics, the Curry-Howard isomorphism, substructural logic, Hoare Logic and Separation Logic, abstract interpretation, compositional program analysis, the frame problem, and abductive inference.

Peter O'Hearn