Bolei Zhou is a PhD student in the Computer Science and Artificial Intelligence Laboratory (CSAIL) at the Massachusetts Institute of Technology, working with Professor Antonio Torralba. His research interests include computer vision and machine learning. In particular, his current research focuses on developing deep learning models for scene understanding. He received his bachelor's degree from Shanghai Jiao Tong University and his master's degree from the Chinese University of Hong Kong.

Research Summary

Scene understanding is among the central topics of computer vision research. Given an image of a scene, an ideal artificial intelligence model should not only recognize the type of environment, the names of the objects, and the roles of the people in the image, but also reason about the interactions between objects and people as humans do. Recent progress in visual recognition comes from access to labeled image datasets with millions of exemplars and from deep learning models such as convolutional neural networks (CNNs). Whereas performance on object recognition has benefited from large datasets such as ImageNet, no equivalent dataset existed for scene recognition. Thus, Bolei’s research is directed first towards building the Places Database, a large-scale image dataset that contains 10 million labeled images for over 400 scene categories. Trained on the Places Database, CNNs achieve state-of-the-art performance for scene recognition. To understand the superior performance of CNNs, Bolei explored the nature of the representations inside those CNNs by visualizing the internal units and annotating their semantics. This empirical analysis further led to the development of various techniques that allow CNNs to localize informative objects and annotate multi-scale visual concepts in an image for a wide variety of tasks. For detailed information, please visit his webpage at