ECCV 2020
SoundSpaces: Audio-Visual Navigation in 3D Environments
Moving around in the world is naturally a multisensory experience, but today’s embodied agents are deaf, restricted solely to their visual perception of the environment. We introduce audio-visual navigation for complex, acoustically and visually realistic 3D environments. By both seeing and hearing, the agent must learn to navigate to a sounding object. We propose a multi-modal deep reinforcement learning approach to train navigation policies end-to-end from a stream of egocentric audio-visual observations, allowing the agent to (1) discover elements of the geometry of the physical space indicated by the reverberating audio and (2) detect and follow sound-emitting targets.
All The Latest

October 14, 2020
Innovation comes through exploration and challenging our assumptions

August 31, 2020
Are Labels Necessary for Neural Architecture Search?

August 24, 2020