Mixed Source Sound Field Translation for Virtual Binaural Application with Perceptual Validation

IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP)


Non-interactive, linear experiences such as cinema offer high-quality surround sound to enhance immersion; however, the perspective is usually fixed to the recording microphone position. With the rise of virtual reality, there is a demand for recording and recreating real-world experiences that allow users to move throughout the reproduction. Sound field translation achieves this by building an equivalent environment of virtual sources to recreate the recording spatially. However, the technique still restricts the maximum distance a user can translate away from the recording microphone’s perspective, because the discrete sampling of commercial higher-order microphones can capture only an acoustic sweet spot. In this paper, we propose a method for binaurally reproducing a microphone recording in a virtual application that allows the user to freely translate farther beyond the recording position. The method incorporates a mixture of near-field and far-field sources in a sparsely expanded virtual environment to maintain a perceptually accurate reproduction. We perceptually validate the method through a Multiple Stimulus with Hidden Reference and Anchor (MUSHRA) experiment. Compared to the plane-wave benchmark, the proposed method offers both improved source localizability and robustness to spectral distortions at translated listening positions. A cross-examination with numerical simulations demonstrated that the sparse expansion relaxes the inherent sweet-spot constraint, leading to improved localizability for sparse environments. Additionally, the proposed method is seen to better reproduce the intensity and binaural room impulse response spectra of near-field environments, further supporting the perceptual results.

Related Publications

ICASSP - June 7, 2021

Applied Methods for Sparse Sampling of Head-related Transfer Functions

Lior Arbel, Zamir Ben-Hur, David Lou Alon, Boaz Rafaely

ICASSP - June 6, 2021

On the Predictability of HRTFs from Ear Shapes Using Deep Networks

Yaxuan Zhou, Hao Jiang, Vamsi Krishna Ithapu
