May 25, 2019

UBIS: Utilization-aware cluster scheduling

International Parallel and Distributed Processing Symposium (IPDPS)

Data center costs are among the major enterprise expenses, and any improvement in data center resource utilization corresponds to significant savings in true dollars. We focus on the problem of scheduling jobs in distributed execution environments to improve resource utilization.

By: Karthik Kambatla, Vamsee Yarlagadda, Íñigo Goiri, Ananth Grama

March 24, 2019

Skyway: Connecting Managed Heaps in Distributed Big Data Systems

International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS)

This paper presents Skyway, a JVM-based technique that can directly connect managed heaps of different (local or remote) JVM processes.

By: Khanh Nguyen, Lu Fang, Christian Navasca, Guoqing Xu, Brian Demsky, Shan Lu

February 16, 2019

Machine Learning at Facebook: Understanding Inference at the Edge

IEEE International Symposium on High-Performance Computer Architecture (HPCA)

This paper takes a data-driven approach to present the opportunities and design challenges faced by Facebook in order to enable machine learning inference locally on smartphones and other edge platforms.

By: Carole-Jean Wu, David Brooks, Kevin Chen, Douglas Chen, Sy Choudhury, Marat Dukhan, Kim Hazelwood, Eldad Isaac, Yangqing Jia, Bill Jia, Tommer Leyvand, Hao Lu, Yang Lu, Lin Qiao, Brandon Reagen, Joe Spisak, Fei Sun, Andrew Tulloch, Peter Vajda, Xiaodong Wang, Yanghan Wang, Bram Wasti, Yiming Wu, Ran Xian, Sungjoo Yoo, Peizhao Zhang

February 13, 2019

SapFix: Automated End-to-End Repair at Scale

International Conference on Software Engineering (ICSE)

We report our experience with SapFix: the first deployment of automated end-to-end fault fixing, from test case design through to deployed repairs in production code.

By: Alexandru Marginean, Johannes Bader, Satish Chandra, Mark Harman, Yue Jia, Ke Mao, Alexander Mols, Andrew Scott

February 1, 2019

Separation Logic

Communications of the ACM (CACM)

In joint work with John Reynolds and others we developed Separation Logic as a formalism for reasoning about programs that mutate data structures. From a special logic for heaps it gradually evolved into a general theory for modular reasoning about concurrent as well as sequential programs.

By: Peter O'Hearn

January 14, 2019

A True Positives Theorem for a Static Race Detector

Principles of Programming Languages (POPL)

RacerD is a static race detector that has been proven to be effective in engineering practice: it has seen thousands of data races fixed by developers before reaching production, and has supported the migration of Facebook’s Android app rendering infrastructure from a single-threaded to a multi-threaded architecture.

By: Nikos Gorogiannis, Peter O'Hearn, Ilya Sergey

December 7, 2018

Rethinking floating point for deep learning

Systems for Machine Learning Workshop at NeurIPS 2018

We improve floating point to be more energy efficient than equivalent bit-width integer hardware on a 28 nm ASIC process while retaining accuracy in 8 bits, using a novel hybrid log multiply/linear add, Kulisch accumulation, and tapered encodings from Gustafson’s posit format.

By: Jeff Johnson

December 3, 2018

Training with Low-precision Embedding Tables

Systems for Machine Learning Workshop at NeurIPS 2018

In this work, we focus on building a system to train continuous embeddings in a low-precision floating-point representation. Specifically, our system performs SGD-style model updates in single-precision arithmetic, casts the updated parameters using stochastic rounding, and stores the parameters in half-precision floating point.

By: Jian Zhang, Jiyan Yang, Hector Yuen

November 24, 2018

Deep Learning Inference in Facebook Data Centers: Characterization, Performance Optimizations and Hardware Implications

The application of deep learning techniques has resulted in remarkable improvements in machine learning models. In this paper, we provide detailed characterizations of the deep learning models used in many Facebook social network services.

By: Jongsoo Park, Maxim Naumov, Protonu Basu, Summer Deng, Aravind Kalaiah, Daya Khudia, James Law, Parth Malani, Andrey Malevich, Satish Nadathur, Juan Pino, Martin Schatz, Alexander Sidorov, Viswanath Sivakumar, Andrew Tulloch, Xiaodong Wang, Yiming Wu, Hector Yuen, Utku Diril, Dmytro Dzhulgakov, Kim Hazelwood, Bill Jia, Yangqing Jia, Lin Qiao, Vijay Rao, Nadav Rotem, Sungjoo Yoo, Mikhail Smelyanskiy

November 3, 2018

RacerD: Compositional Static Race Detection

Conference on Object-Oriented Programming, Systems, Languages & Applications (OOPSLA)

Automatic static detection of data races is one of the most basic problems in reasoning about concurrency. We present RacerD—a static program analysis for detecting data races in Java programs which is fast, can scale to large code, and has proven effective in an industrial software engineering scenario.

By: Sam Blackshear, Nikos Gorogiannis, Peter O'Hearn, Ilya Sergey