The importance of time-frequency averaging for binaural speaker localization in reverberant environments



A common approach to overcoming the effect of reverberation in speaker localization is to identify the time-frequency (TF) bins in which the direct path is dominant, and then to use only these bins for estimation. Various direct-path dominance (DPD) tests have been proposed for identifying the direct-path bins. However, for a two-microphone binaural array, tests that do not employ averaging over TF bins appear to fail. In this paper, this anomaly is studied by comparing two DPD tests, only one of which is designed to employ averaging over TF bins. An analysis of these tests shows that, in the binaural case, a TF bin that is dominated by multiple reflections may be similar to a bin with a single source. This insight can explain the high false alarm rate encountered with tests that do not employ averaging. It is also shown that incorporating averaging over TF bins can reduce the false alarm rate. A simulation study is presented that verifies the importance of TF averaging for a reliable selection of direct-path bins in the binaural case.
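The role of averaging can be illustrated with a small numerical sketch. This is not the pair of tests analyzed in the paper; it is a generic rank-based check, assumed here for illustration only. The steering vectors and source amplitudes are random placeholders rather than binaural transfer functions, and the function names are hypothetical. The sketch relies on one certain fact: the correlation matrix formed from a single two-microphone snapshot is an outer product and therefore always has rank one, however many paths contribute to that bin.

```python
import numpy as np


def singular_value_ratio(snapshots):
    """Ratio of the two singular values of the averaged 2x2 correlation matrix.

    snapshots: complex array of shape (n_bins, 2), one row per TF bin.
    A large ratio means the averaged matrix is close to rank one.
    """
    # Average of the outer products x x^H over the supplied bins.
    corr = np.einsum("bi,bj->ij", snapshots, snapshots.conj()) / snapshots.shape[0]
    s = np.linalg.svd(corr, compute_uv=False)
    return s[0] / max(s[1], 1e-12)  # floor avoids division by zero


def simulate_bins(n_paths, n_bins, rng):
    """Hypothetical two-microphone snapshots (assumption, not the paper's model).

    Each path has a fixed random steering vector; its complex amplitude
    changes from bin to bin.
    """
    steering = rng.standard_normal((2, n_paths)) + 1j * rng.standard_normal((2, n_paths))
    amps = rng.standard_normal((n_paths, n_bins)) + 1j * rng.standard_normal((n_paths, n_bins))
    return (steering @ amps).T  # shape (n_bins, 2)


rng = np.random.default_rng(0)
direct = simulate_bins(n_paths=1, n_bins=8, rng=rng)  # single dominant path
reverb = simulate_bins(n_paths=3, n_bins=8, rng=rng)  # several reflections

# Without averaging: one bin at a time, both cases look like rank one.
single_direct = singular_value_ratio(direct[:1])
single_reverb = singular_value_ratio(reverb[:1])

# With averaging over 8 bins: only the single-path case stays rank one.
avg_direct = singular_value_ratio(direct)
avg_reverb = singular_value_ratio(reverb)

print(single_direct, single_reverb, avg_direct, avg_reverb)
```

In this toy setting the single-bin ratio is very large for both the single-path and the multi-path data, so a rank-based decision made on one bin cannot separate them. After averaging over several bins, the multi-path data yields a full-rank matrix and a much smaller ratio, while the single-path data remains rank one.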

