I joined Facebook in 2009. My work has been in the area of scalable and reliable software infrastructure. My current project focuses on deep learning infrastructure, especially large-scale neural network training on heterogeneous computing platforms such as GPUs. My previous projects at Facebook involved distributed data caching and consistency, reliability and efficiency optimization, capacity and smart load balancing, data-center-scale power management, and data warehousing.

Prior to Facebook, I worked at AMD on GPU compilers and GPGPU. I received my PhD from Princeton University, where I worked on compilers and computer architecture.

Interests

Scalable and distributed software systems in general, large-scale machine learning infrastructure, heterogeneous computing on GPGPU and other platforms, and data center efficiency and reliability

Publications

Blog