Do Sound Event Representations Generalize To Other Audio Tasks? A Case Study In Audio Transfer Learning



Transfer learning is critical for efficient information transfer across multiple related learning problems. A simple yet effective transfer learning approach utilizes deep neural networks trained on a large-scale task for feature extraction; the extracted representations are then used to learn related downstream tasks. In this paper, we investigate the transfer learning capacity of audio representations obtained from neural networks trained on a large-scale sound event detection dataset. We evaluate these representations across a wide range of other audio tasks by training simple linear classifiers for them, and show that such a simple mechanism of transfer learning is already powerful enough to achieve high performance on the downstream tasks. We also provide insights into the properties of the learned sound event representations that enable such efficient information transfer.
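The evaluation protocol described above (freeze the pretrained network, extract representations, train a simple linear classifier per downstream task) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the synthetic vectors from the hypothetical `make_embedding` helper stand in for representations that a frozen sound event network would produce, and the binary logistic-regression probe in `train_linear_probe` is one plausible choice of linear classifier, not necessarily the one used in the paper.

```python
import math
import random


def train_linear_probe(embeddings, labels, epochs=200, lr=0.1):
    """Fit a binary logistic-regression probe on fixed (frozen) embeddings.

    Only the probe weights w and bias b are updated; the embeddings are
    treated as constant inputs, as in feature-extraction transfer learning.
    """
    dim = len(embeddings[0])
    n = len(embeddings)
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * dim
        grad_b = 0.0
        for x, y in zip(embeddings, labels):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            p = 1.0 / (1.0 + math.exp(-z))  # sigmoid probability of class 1
            err = p - y                     # gradient of the log loss w.r.t. z
            for i in range(dim):
                grad_w[i] += err * x[i]
            grad_b += err
        # Full-batch gradient descent step on the averaged gradient.
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, x):
    """Return the predicted class (0 or 1) for one embedding."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0


# Hypothetical stand-in for embeddings from a frozen pretrained network:
# each class is a noisy cluster around a different center.
rng = random.Random(0)


def make_embedding(label, dim=8):
    center = 1.0 if label == 1 else -1.0
    return [center + rng.gauss(0.0, 0.5) for _ in range(dim)]


labels = [i % 2 for i in range(200)]
embeddings = [make_embedding(y) for y in labels]

# Hold out the last 50 examples to evaluate the probe.
train_x, train_y = embeddings[:150], labels[:150]
test_x, test_y = embeddings[150:], labels[150:]

w, b = train_linear_probe(train_x, train_y)
accuracy = sum(predict(w, b, x) == y for x, y in zip(test_x, test_y)) / len(test_y)
print(f"held-out accuracy: {accuracy:.2f}")
```

Because the probe is linear, its held-out accuracy mainly reflects how linearly separable the task is in the embedding space, which is why this kind of probe is a common way to judge the quality of frozen representations.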

