Direction of Arrival Estimation in Highly Reverberant Environments Using Soft Time-Frequency Mask

IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA)

By: Vladimir Tourbabin, Jacob Donley, Boaz Rafaely, Ravish Mehra

Abstract

A recent approach to improving the robustness of sound localization in reverberant environments is based on pre-selection of time-frequency pixels that are dominated by direct sound. This approach is equivalent to applying a binary time-frequency mask prior to the localization stage. Although the binary mask approach has been shown to be effective, it may not fully exploit the information available in the captured signal. To overcome this limitation, this paper proposes employing a soft mask instead of the binary mask. The proposed weighting scheme is based directly on a metric of the direct-to-reverberant sound ratio in each individual time-frequency pixel. Evaluation using simulated reverberant speech recordings indicates substantial improvement in localization performance when using the proposed soft mask weighting.
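
The difference between the two masking schemes can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes that every time-frequency pixel already carries a DOA estimate in degrees and a direct-sound metric scaled to [0, 1]. The function names, the 0.5 threshold, the histogram-based pooling, and the 5-degree bin width are all hypothetical choices made for illustration and are not taken from the paper.

```python
import numpy as np


def binary_mask(metric, threshold=0.5):
    """Hard selection: weight 1 for pixels at or above the threshold, else 0.

    The threshold value is an illustrative assumption.
    """
    return (np.asarray(metric, dtype=float) >= threshold).astype(float)


def soft_mask(metric):
    """Soft weighting: use the metric itself (clipped to [0, 1]) as the weight.

    Assumes the metric is already scaled so that larger values mean a
    stronger direct-sound contribution.
    """
    return np.clip(np.asarray(metric, dtype=float), 0.0, 1.0)


def estimate_doa(doa_deg, weights, n_bins=72):
    """Pool per-pixel azimuth estimates with a weighted histogram.

    Returns the center (in degrees) of the histogram bin with the largest
    total weight. Histogram pooling is an assumption of this sketch.
    """
    angles = np.asarray(doa_deg, dtype=float) % 360.0
    hist, edges = np.histogram(
        angles, bins=n_bins, range=(0.0, 360.0), weights=weights
    )
    centers = 0.5 * (edges[:-1] + edges[1:])
    return centers[np.argmax(hist)]
```

In this sketch, `estimate_doa(doa_deg, binary_mask(metric))` discards every pixel below the threshold, whereas `estimate_doa(doa_deg, soft_mask(metric))` keeps all pixels and lets each contribute in proportion to its metric, so weakly direct pixels still carry some information into the final estimate.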