Mesh R-CNN

International Conference on Computer Vision (ICCV)


Rapid advances in 2D perception have led to systems that accurately detect objects in real-world images. However, these systems make predictions in 2D, ignoring the 3D structure of the world. Concurrently, advances in 3D shape prediction have mostly focused on synthetic benchmarks and isolated objects. We unify advances in these two areas. We propose a system that detects objects in real-world images and produces a triangle mesh giving the full 3D shape of each detected object. Our system, called Mesh R-CNN, augments Mask R-CNN with a mesh prediction branch that outputs meshes with varying topological structure. The branch first predicts coarse voxel representations, which are converted to meshes and then refined with a graph convolution network operating over the mesh’s vertices and edges. We validate our mesh prediction branch on ShapeNet, where we outperform prior work on single-image shape prediction. We then deploy our full Mesh R-CNN system on Pix3D, where we jointly detect objects and predict their 3D shapes.
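
The abstract gives no implementation details, so the following is only a minimal NumPy sketch of the two ideas it names: turning a coarse voxel grid into a triangle mesh, and applying one graph convolution over the mesh's vertices and edges. Everything here is an assumption made for illustration: the function names (`voxels_to_mesh`, `mesh_edges`, `graph_conv`), the 0.5 occupancy threshold, the rule of emitting only cube faces that border empty space, and the sum-over-neighbors form of the graph convolution. It is not the authors' code.

```python
import numpy as np

# Unit-cube corners; corner index = 4*x + 2*y + z.
CORNERS = np.array([[x, y, z] for x in (0, 1) for y in (0, 1) for z in (0, 1)])

# For each neighbor direction, the four corner indices of the cube face
# on that side, listed in order around the quad.
FACES = {
    (1, 0, 0): [4, 6, 7, 5],
    (-1, 0, 0): [0, 1, 3, 2],
    (0, 1, 0): [2, 3, 7, 6],
    (0, -1, 0): [0, 4, 5, 1],
    (0, 0, 1): [1, 5, 7, 3],
    (0, 0, -1): [0, 2, 6, 4],
}


def voxels_to_mesh(occ_prob, threshold=0.5):
    """Assumed conversion: keep cube faces that border empty space."""
    occ = occ_prob > threshold
    vert_index, verts, faces = {}, [], []
    for x, y, z in np.argwhere(occ):
        for d, quad in FACES.items():
            n = (x + d[0], y + d[1], z + d[2])
            inside = all(0 <= n[i] < occ.shape[i] for i in range(3))
            if inside and occ[n]:
                continue  # face is shared with an occupied voxel: skip it
            ids = []
            for c in quad:
                p = tuple(int(v) for v in np.array([x, y, z]) + CORNERS[c])
                if p not in vert_index:  # merge vertices shared by voxels
                    vert_index[p] = len(verts)
                    verts.append(p)
                ids.append(vert_index[p])
            # Split the quad into two triangles.
            faces.append((ids[0], ids[1], ids[2]))
            faces.append((ids[0], ids[2], ids[3]))
    return np.array(verts, dtype=float), np.array(faces, dtype=int)


def mesh_edges(faces):
    """Unique undirected edges of a triangle mesh."""
    e = np.concatenate([faces[:, [0, 1]], faces[:, [1, 2]], faces[:, [2, 0]]])
    return np.unique(np.sort(e, axis=1), axis=0)


def graph_conv(feats, edges, w_self, w_nbr):
    """One assumed graph-conv layer: relu(f_i W_self + sum_j f_j W_nbr)."""
    agg = np.zeros_like(feats)
    np.add.at(agg, edges[:, 0], feats[edges[:, 1]])  # neighbor -> vertex
    np.add.at(agg, edges[:, 1], feats[edges[:, 0]])  # and the reverse
    return np.maximum(feats @ w_self + agg @ w_nbr, 0.0)


# Usage: a 2x2x2 grid with one confident voxel, then one layer on positions.
grid = np.zeros((2, 2, 2))
grid[0, 0, 0] = 0.9
verts, faces = voxels_to_mesh(grid)
edges = mesh_edges(faces)
rng = np.random.default_rng(0)
hidden = graph_conv(verts, edges, rng.normal(size=(3, 8)), rng.normal(size=(3, 8)))
```

Because vertices shared between neighboring voxels are merged and interior faces are dropped, the resulting mesh follows the occupied region's surface, so its topology depends on the predicted voxels rather than on a fixed template. A refinement stage in this spirit would stack several such layers and map the final features to per-vertex updates; that mapping is not shown and is not specified in the abstract.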

Related Publications

CVPR - June 19, 2021

Robust Audio-Visual Instance Discrimination

Pedro Morgado, Ishan Misra, Nuno Vasconcelos

CVPR - June 19, 2021

Audio-Visual Instance Discrimination with Cross-Modal Agreement

Pedro Morgado, Nuno Vasconcelos, Ishan Misra

NeurIPS - December 1, 2019

Hate Speech in Pixels: Detection of Offensive Memes towards Automatic Moderation

Benet Oriol Sabat, Cristian Canton Ferrer, Xavier Giro-i-Nieto

arXiv - June 19, 2021

Fast and Accurate Model Scaling

Piotr Dollár, Mannat Singh, Ross Girshick
