Statistical Testing on ASR Performance via Blockwise Bootstrap



A common question in automatic speech recognition (ASR) evaluation is how reliable an observed word error rate (WER) improvement between two ASR systems is. Statistical hypothesis testing and confidence intervals (CIs) can be used to tell whether the improvement is real or merely due to random chance. The bootstrap resampling method has been popular for such significance analysis because it is intuitive and easy to use. However, this method fails on dependent data, which is prevalent in speech: for example, ASR performance on utterances from the same speaker could be correlated. In this paper we present a blockwise bootstrap approach: by dividing evaluation utterances into nonoverlapping blocks, this method resamples these blocks instead of the original data. We show that the resulting variance estimator of the absolute WER difference between two ASR systems is consistent under mild conditions. We also demonstrate the validity of the blockwise bootstrap method on both synthetic and real-world speech data.
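The block resampling idea can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function name `blockwise_bootstrap_wer_diff`, its parameters, the choice of one block per speaker, the percentile interval, and the toy numbers are all assumptions made for illustration. It takes per-utterance error counts for two systems, groups them into nonoverlapping blocks, resamples whole blocks with replacement, and recomputes the absolute WER difference on each replicate.

```python
import random
from collections import defaultdict


def blockwise_bootstrap_wer_diff(errors_a, errors_b, num_words, block_ids,
                                 n_boot=1000, alpha=0.05, seed=0):
    """Return (observed WER_A - WER_B, bootstrap std. error, percentile CI).

    Hypothetical helper: each utterance contributes its word-error counts
    for systems A and B and its reference word count; block_ids assigns each
    utterance to one nonoverlapping block (here assumed to be a speaker).
    """
    # Aggregate per-block totals: [errors of A, errors of B, reference words].
    totals = defaultdict(lambda: [0, 0, 0])
    for ea, eb, n, b in zip(errors_a, errors_b, num_words, block_ids):
        totals[b][0] += ea
        totals[b][1] += eb
        totals[b][2] += n
    blocks = list(totals.values())

    def wer_diff(sample):
        # Absolute WER difference computed over the pooled sample of blocks.
        words = sum(t[2] for t in sample)
        return (sum(t[0] for t in sample) - sum(t[1] for t in sample)) / words

    rng = random.Random(seed)
    observed = wer_diff(blocks)

    # Resample whole blocks (not utterances) with replacement, keeping the
    # number of blocks fixed, and recompute the statistic each time.
    replicates = sorted(
        wer_diff(rng.choices(blocks, k=len(blocks))) for _ in range(n_boot)
    )

    mean = sum(replicates) / n_boot
    std_err = (sum((r - mean) ** 2 for r in replicates) / (n_boot - 1)) ** 0.5

    # Percentile interval (an assumed choice; other CI constructions exist).
    lo = replicates[int((alpha / 2) * n_boot)]
    hi = replicates[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return observed, std_err, (lo, hi)


# Toy data (made up): six utterances from three speakers.
errors_a = [2, 1, 0, 3, 4, 0]
errors_b = [1, 1, 0, 2, 1, 1]
num_words = [10, 12, 8, 10, 15, 5]
speakers = ["s1", "s1", "s2", "s2", "s3", "s3"]

observed, std_err, (lo, hi) = blockwise_bootstrap_wer_diff(
    errors_a, errors_b, num_words, speakers
)
print(f"WER diff = {observed:.4f}, SE = {std_err:.4f}, 95% CI = [{lo:.4f}, {hi:.4f}]")
```

If the interval excludes zero, the observed WER difference would be judged significant at the chosen level under this sketch's assumptions. Resampling at the block level keeps utterances from the same speaker together, which is what lets the variance estimate reflect within-block correlation.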
