Habitat: A Platform for Embodied AI Research

International Conference on Computer Vision (ICCV)


We present Habitat, a platform for research in embodied artificial intelligence (AI). Habitat enables training embodied agents (virtual robots) in highly efficient photorealistic 3D simulation. Specifically, Habitat consists of: (i) Habitat-Sim: a flexible, high-performance 3D simulator with configurable agents, sensors, and generic 3D dataset handling. Habitat-Sim is fast – when rendering a scene from Matterport3D, it achieves several thousand frames per second (fps) running single-threaded, and can reach over 10,000 fps multi-process on a single GPU. (ii) Habitat-API: a modular high-level library for end-to-end development of embodied AI algorithms – defining tasks (e.g. navigation, instruction following, question answering), configuring, training, and benchmarking embodied agents.

These large-scale engineering contributions enable us to answer scientific questions requiring experiments that were till now impracticable or ‘merely’ impractical. Specifically, in the context of point-goal navigation: (1) we revisit the comparison between learning and SLAM approaches from two recent works and find evidence for the opposite conclusion – that learning outperforms SLAM if scaled to an order of magnitude more experience than previous investigations, and (2) we conduct the first cross-dataset generalization experiments {train, test} × {Matterport3D, Gibson} for multiple sensors {blind, RGB, RGBD, D} and find that only agents with depth (D) sensors generalize across datasets. We hope that our open-source platform and these findings will advance research in embodied AI.

Supplemental Materials

Related Publications

All Publications

Plan2vec: Unsupervised Representation Learning by Latent Plans

Ge Yang, Amy Zhang, Ari Morcos, Joelle Pineau, Pieter Abbeel, Roberto Calandra

Learning for Dynamics & Control (L4DC) - June 10, 2020

Objective Mismatch in Model-based Reinforcement Learning

Nathan Lambert, Brandon Amos, Omry Yadan, Roberto Calandra

Learning for Dynamics & Control (L4DC) - June 10, 2020

EGO-TOPO: Environment Affordances from Egocentric Video

Tushar Nagarajan, Yanghao Li, Christoph Feichtenhofer, Kristen Grauman

CVPR - June 14, 2020

Listen to Look: Action Recognition by Previewing Audio

Ruohan Gao, Tae-Hyun Oh, Kristen Grauman, Lorenzo Torresani

CVPR - June 14, 2020

To help personalize content, tailor and measure ads, and provide a safer experience, we use cookies. By clicking or navigating the site, you agree to allow our collection of information on and off Facebook through cookies. Learn more, including about available controls: Cookies Policy