I am a production engineer supporting the AI infrastructure at Facebook. My research interests include exploring new computer architectures and designing scalable distributed systems. Before joining Facebook, I earned bachelor’s and master’s degrees in Computer Engineering from the University of Illinois at Urbana-Champaign.
Computer architecture, HW/SW co-design, systems for ML
MLSys - April 8, 2021
CPR: Understanding and Improving Failure Tolerant Training for Deep Learning Recommendation with Partial Recovery
Kiwan Maeng, Shivam Bharuka, Isabel Gao, Mark C. Jeffrey, Vikram Saraph, Bor-Yiing Su, Caroline Trippel, Jiyan Yang, Michael G. Rabbat, Brandon Lucia, Carole-Jean Wu