Training Deep Learning Recommendation Model with Quantized Collective Communications

Conference on Knowledge Discovery and Data Mining (KDD)


Deep Learning Recommendation Model (DLRM) captures our representative model architectures developed for click-through-rate (CTR) prediction based on high-dimensional sparse categorical data. Collective communications can account for a significant fraction of time in synchronous training of DLRM at scale. In this work, we explore using fine-grain integer quantization to reduce the communication volume of alltoall and allreduce collectives. We emulate quantized alltoall and allreduce, the latter using ring or recursive-doubling and each with optional carried-forward error compensation. We benchmark accuracy loss of quantized alltoall and allreduce with a representative DLRM model and Kaggle 7D dataset. We show that alltoall forward and backward passes, and dense allreduce can be quantized to 4 bits without accuracy loss compared to full-precision training.

Related Publications

All Publications

Federated Learning for User Privacy and Data Confidentiality Workshop At ICML - July 24, 2021

Federated Learning with Buffered Asynchronous Aggregation

John Nguyen, Kshitiz Malik, Hongyuan Zhan, Ashkan Yousefpour, Michael Rabbat, Mani Malek, Dzmitry Huba

UAI - July 28, 2021

A Nonmyopic Approach to Cost-Constrained Bayesian Optimization

Eric Hans Lee, David Eriksson, Valerio Perrone, Matthias Seeger

ACM MM - October 20, 2021

EVRNet: Efficient Video Restoration on Edge Devices

Sachin Mehta, Amit Kumar, Fitsum Reda, Varun Nasery, Vikram Mulukutla, Rakesh Ranjan, Vikas Chandra

ICCV - October 11, 2021

Egocentric Pose Estimation from Human Vision Span

Hao Jiang, Vamsi Krishna Ithapu

To help personalize content, tailor and measure ads, and provide a safer experience, we use cookies. By clicking or navigating the site, you agree to allow our collection of information on and off Facebook through cookies. Learn more, including about available controls: Cookies Policy