Order-Aware Generative Modeling Using the 3D-Craft Dataset

International Conference on Computer Vision (ICCV)


Research on 2D and 3D generative models typically focuses on the final artifact being created, e.g., an image or a 3D structure. Unlike 2D image generation, the generation of 3D objects in the real world is commonly constrained by the process and order in which the object is constructed. For instance, gravity needs to be taken into account when building a block tower.

In this paper, we explore the prediction of ordered actions to construct 3D objects. Instead of predicting actions based on physical constraints, we propose learning through observing human actions. To enable large-scale data collection, we use the Minecraft environment. We introduce 3D-Craft, a new dataset of 2,500 Minecraft houses, each built sequentially from scratch by human players. To learn from these human action sequences, we propose an order-aware 3D generative model called VoxelCNN. In contrast to other 3D generative models, which either have no explicit order (e.g., holistic generation with 3DGAN [35]) or follow a simple heuristic order (e.g., raster-scan), VoxelCNN is trained to imitate human building order with spatial awareness. We also transfer the learned order to other datasets such as ShapeNet [10]. The 3D-Craft dataset, models, and benchmark system will be made publicly available to support future research.
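To make the contrast between a heuristic order and a human order concrete, the following minimal sketch compares a raster-scan ordering with a hand-written placement sequence over the same set of voxels. Everything in it is an illustrative assumption rather than part of 3D-Craft or VoxelCNN: the `(x, y, z)` tuple format, the choice of `y` as the vertical axis, the `human_order` sequence, and the `mean_step_distance` measure are all hypothetical.

```python
from typing import List, Tuple

# Hypothetical voxel format: (x, y, z), with y assumed to be the vertical axis.
Voxel = Tuple[int, int, int]


def raster_scan_order(voxels: List[Voxel]) -> List[Voxel]:
    """Order voxels by a fixed heuristic: height (y) first, then z, then x."""
    return sorted(voxels, key=lambda v: (v[1], v[2], v[0]))


def mean_step_distance(sequence: List[Voxel]) -> float:
    """Average Manhattan distance between consecutive placements."""
    if len(sequence) < 2:
        return 0.0
    steps = [
        sum(abs(a - b) for a, b in zip(prev, curr))
        for prev, curr in zip(sequence, sequence[1:])
    ]
    return sum(steps) / len(steps)


# Made-up example: a builder walks around the perimeter of a 3x3 floor ring,
# always placing the next block adjacent to the previous one.
human_order: List[Voxel] = [
    (0, 0, 0), (1, 0, 0), (2, 0, 0), (2, 0, 1),
    (2, 0, 2), (1, 0, 2), (0, 0, 2), (0, 0, 1),
]

# Same voxels, reordered by the raster-scan heuristic.
scan_order = raster_scan_order(human_order)

print("human order mean step:", mean_step_distance(human_order))
print("raster order mean step:", mean_step_distance(scan_order))
```

In this toy case the raster-scan sequence jumps across the ring at the end of each row, while the hand-written sequence always moves to a neighboring cell. That kind of spatial locality is one plausible property an order-aware model could pick up from human demonstrations; the sketch does not reproduce how VoxelCNN is trained or evaluated.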

