Publication

A Social Network Under Social Distancing: Risk-Driven Backbone Management During COVID-19 and Beyond

USENIX Symposium on Networked Systems Design and Implementation (NSDI)


Abstract

As the COVID-19 pandemic reshapes our social landscape, its lessons have far-reaching implications for how online service providers manage their infrastructure to mitigate risk. This paper presents Facebook’s risk-driven backbone management strategy for ensuring high service performance throughout the COVID-19 pandemic. We describe the Risk Simulation System (RSS), a production system that identifies possible failures and quantifies their potential severity with a set of network risk metrics. Using a year of risk measurements from RSS, we show that our backbone resiliently withstood the COVID-19 stress test, achieving high service availability and low route dilation while efficiently handling traffic surges. We also share the operational practices we used to mitigate risk throughout the pandemic.

Our findings offer insights for further improving risk-driven network management. We argue for incorporating short-term failure statistics into failure modeling. Common failure prediction models based on long-term modeling achieve stable output, but at the cost of assigning low significance to unique short-term events of extreme importance, such as COVID-19. Furthermore, we advocate augmenting network management techniques with non-networking signals, and we support this by identifying and analyzing the correlation between network traffic and human mobility.
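
To make the point about long-term versus short-term failure statistics concrete, here is a minimal sketch. It is not the model used in RSS or in the paper; the function names, window lengths, blending weight, and failure counts are all hypothetical and chosen only for illustration. It shows how a year-long average can understate a recent burst of failures, and how mixing in a short-window estimate raises its significance.

```python
from typing import Sequence


def failure_rate(daily_failures: Sequence[int], window: int) -> float:
    """Mean failures per day over the most recent `window` days."""
    recent = daily_failures[-window:]
    if not recent:
        raise ValueError("daily_failures must not be empty")
    return sum(recent) / len(recent)


def blended_rate(
    daily_failures: Sequence[int],
    long_window: int = 365,     # hypothetical long-term horizon, in days
    short_window: int = 14,     # hypothetical short-term horizon, in days
    short_weight: float = 0.5,  # hypothetical weight on the short-term estimate
) -> float:
    """Weighted mix of a long-window and a short-window failure rate."""
    long_term = failure_rate(daily_failures, long_window)
    short_term = failure_rate(daily_failures, short_window)
    return (1.0 - short_weight) * long_term + short_weight * short_term


# Made-up history: 351 failure-free days, then 14 days with 2 failures per day.
history = [0] * 351 + [2] * 14

long_only = failure_rate(history, 365)   # diluted by the quiet year
short_only = failure_rate(history, 14)   # reflects only the recent burst
blended = blended_rate(history)

print(f"long-term only: {long_only:.3f} failures/day")
print(f"short-term only: {short_only:.3f} failures/day")
print(f"blended: {blended:.3f} failures/day")
```

In this made-up history the long-window estimate stays below 0.1 failures per day even though the last two weeks averaged 2 per day. The blended estimate sits between the two, trading some of the stability of the long-term figure for sensitivity to the recent event.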

