Explore the latest research from Facebook

All Publications

April 12, 2021 Anubhavnidhi Abhashkumar, Kausik Subramanian, Alexey Andreyev, Hyojeong Kim, Nanda Kishore Salem, Jingyi Yang, Petr Lapukhov, Aditya Akella, James Hongyi Zeng
Paper

Running BGP in Data Centers at Scale

In this paper, we present Facebook’s BGP-based data center routing design and how it marries the data center’s stringent requirements with BGP’s functionality. We present the design’s significant artifacts, including the BGP Autonomous System Number (ASN) allocation, route summarization, and our sophisticated BGP policy set.
October 27, 2020 Timm Böttger, Ghida Ibrahim, Ben Vallis
Paper

How the Internet reacted to Covid-19 – A perspective from Facebook’s Edge Network

The Covid-19 pandemic has led to unprecedented changes in the way people interact with each other, which as a consequence has increased pressure on the Internet. In this paper we provide a perspective of the scale of Internet traffic growth and how well the Internet coped with the increased demand as seen from Facebook’s edge network.
October 26, 2020 Usama Naseer, Luca Niccolini, Udip Pant, Alan Frindell, Ranjeeth Dasineni, Theophilus A. Benson
Paper

Zero Downtime Release: Disruption-free Load Balancing of a Multi-Billion User Website

In this paper, we leverage different components of the end-to-end networking infrastructure to prevent or mask any disruptions in the face of releases. Zero Downtime Release is a collection of mechanisms used at Facebook to shield end users from disruptions and to preserve the cluster capacity and robustness of the infrastructure when updates are released globally.
October 21, 2019 Brandon Schlinker, Italo Cunha, Yi-Ching Chiu, Srikanth Sundaresan, Ethan Katz-Bassett
Paper

Internet Performance from Facebook’s Edge

In this paper, we execute a large-scale, 10-day measurement study of metrics at the TCP and HTTP layers for production user traffic at all of Facebook’s PoPs worldwide, collecting performance measurements for hundreds of trillions of sampled HTTP sessions.
August 22, 2018 Sean Choi, Boris Burkov, Alex Eckert, Tian Fang, Saman Kazemkhani, Rob Sherwood, Ying Zhang, James Hongyi Zeng
Paper

FBOSS: Building Switch Software at Scale

We present FBOSS, our own data center switch software, which is designed on the basis of our switch-as-a-server and deploy-early-and-iterate principles.
April 9, 2018 Praveen Kumar, Yang Yuan, Chris Yu, Nate Foster, Robert Kleinberg, Petr Lapukhov, Chiun Lin Lim, Robert Soule
Paper

Semi-Oblivious Traffic Engineering: The Road Not Taken

This paper presents a system that uses a set of paths computed using Räcke’s oblivious routing algorithm, as well as a centralized controller to dynamically adapt sending rates. Although oblivious routing and centralized TE have been studied previously in isolation, their combination is novel and powerful.
December 12, 2017 Anubhavnidhi Abhashkumar, Joon-Myung Kang, Sujata Banerjee, Aditya Akella, Ying Zhang, Wenfei Wu
Paper

Supporting Diverse Dynamic Intent-based Policies using Janus

In this paper we propose Janus, a system which makes two major contributions to network policy abstractions. First, we extend the prior policy graph abstraction model to represent complex QoS and dynamic stateful/temporal policies. Second, we convert the policy configuration problem into an optimization problem with the goal of maximizing the number of satisfied and configured policies, and minimizing the number of path changes under dynamic environments.
November 1, 2017 Qiao Zhang, Vincent Liu, James Hongyi Zeng, Arvind Krishnamurthy
Paper

High-Resolution Measurement of Data Center Microbursts

In this study, we explore the fine-grained behaviors of a large production data center using extremely high-resolution measurements (10s to 100s of microseconds) of rack-level traffic. Our results show that characterizing network events like congestion and synchronized behavior in data centers does indeed require the use of such measurements.
September 17, 2017 Arjun Roy, James Hongyi Zeng, Jasmeet Bagga, Alex C. Snoeren
Paper

Passive Realtime Datacenter Fault Detection and Localization

In this article, we present our passive hybrid approach that combines network path information with end-host-based statistics to rapidly detect and pinpoint the location of datacenter network faults inside a production Facebook datacenter.
August 21, 2017 Brandon Schlinker, Hyojeong Kim, Timothy Cui, Ethan Katz-Bassett, Harsha V. Madhyastha, Italo Cunha, James Quinn, Saif Hasan, Petr Lapukhov, James Hongyi Zeng
Paper

Engineering Egress with Edge Fabric: Steering Oceans of Content to the World

This paper presents Edge Fabric, an SDN-based system we built and deployed to tackle the challenges of steering egress traffic for Facebook, which serves over two billion users from dozens of points of presence on six continents.