Deep Learning Inference in Facebook Data Centers: Characterization, Performance Optimizations and Hardware Implications



The application of deep learning techniques has resulted in remarkable improvements in machine learning models. In this paper, we provide a detailed characterization of the deep learning models used in many Facebook social network services. We present the computational characteristics of our models, describe high-performance optimizations targeting existing systems, point out their limitations, and make suggestions for future general-purpose and accelerated inference hardware. We also highlight the need for better co-design of algorithms, numerics, and computing platforms to address the challenges of workloads often run in data centers.

