Publication

Neural Volumes: Learning Dynamic Renderable Volumes from Images

SIGGRAPH


Abstract

Modeling and rendering of dynamic scenes is challenging, as natural scenes often contain complex phenomena such as thin structures, evolving topology, translucency, scattering, occlusion, and biological motion. Mesh-based reconstruction and tracking often fail in these cases, and other approaches (e.g., light field video) typically rely on constrained viewing conditions, which limit interactivity. We circumvent these difficulties by presenting a learning-based approach to representing dynamic objects inspired by the integral projection model used in tomographic imaging. The approach is supervised directly from 2D images in a multi-view capture setting and does not require explicit reconstruction or tracking of the object. Our method has two primary components: an encoder-decoder network that transforms input images into a 3D volume representation, and a differentiable ray-marching operation that enables end-to-end training. By virtue of its 3D representation, our construction extrapolates better to novel viewpoints compared to screen-space rendering techniques. The encoder-decoder architecture learns a latent representation of a dynamic scene that enables us to produce novel content sequences not seen during training. To overcome memory limitations of voxel-based representations, we learn a dynamic irregular grid structure implemented with a warp field during ray-marching. This structure greatly improves the apparent resolution and reduces grid-like artifacts and jagged motion. Finally, we demonstrate how to incorporate surface-based representations into our volumetric-learning framework for applications where the highest resolution is required, using facial performance capture as a case in point.

Related Publications

All Publications

CVPR - June 19, 2021

SimPoE: Simulated Character Control for 3D Human Pose Estimation

Ye Yuan, Shih-En Wei, Tomas Simon, Kris Kitani, Jason Saragih

ICLR - May 3, 2021

Support-Set Bottlenecks for Video-Text Representation Learning

Mandela Patrick, Po-Yao Huang, Florian Metze, Andrea Vedaldi, Alexander Hauptmann, Yuki M. Asano, João Henriques

CVPR - June 19, 2021

Pixel Codec Avatars

Shugao Ma, Tomas Simon, Jason Saragih, Dawei Wang, Yuecheng Li, Fernando De la Torre, Yaser Sheikh

CVPR - June 1, 2021

Semi-supervised Synthesis of High-Resolution Editable Textures for 3D Humans

Bindita Chaudhuri, Nikolaos Sarafianos, Linda Shapiro, Tony Tung

To help personalize content, tailor and measure ads, and provide a safer experience, we use cookies. By clicking or navigating the site, you agree to allow our collection of information on and off Facebook through cookies. Learn more, including about available controls: Cookies Policy