Efficient Wait-k Models for Simultaneous Machine Translation



Simultaneous machine translation consists of starting output generation before the entire input sequence is available. Wait-k decoders offer a simple but efficient approach to this problem. They first read k source tokens, after which they alternate between producing a target token and reading another source token. We investigate the behavior of wait-k decoding in low-resource settings for spoken corpora using IWSLT datasets. We improve the training of these models by using unidirectional encoders and by training across multiple values of k. Experiments with Transformer and 2D-convolutional architectures show that our wait-k models generalize well across a wide range of latency levels. We also show that the 2D-convolutional architecture is competitive with Transformers for simultaneous translation of spoken language.
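
The read/write pattern of a wait-k decoder can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `wait_k_schedule` and the action labels are hypothetical, and the sketch assumes that once the source is exhausted the decoder simply writes the remaining target tokens without further reads.

```python
from typing import List, Tuple


def wait_k_schedule(k: int, src_len: int, tgt_len: int) -> List[Tuple[str, int]]:
    """Return the READ/WRITE actions of a wait-k policy (illustrative only).

    Each action is a pair (kind, index), where index is the 0-based position
    of the source token being read or the target token being written.
    """
    actions: List[Tuple[str, int]] = []
    read = 0      # number of source tokens read so far
    written = 0   # number of target tokens written so far
    while written < tgt_len:
        # Before writing target token `written`, the decoder should have
        # read k + written source tokens, capped at the source length
        # (assumption: no more reads once the source is exhausted).
        needed = min(k + written, src_len)
        while read < needed:
            actions.append(("READ", read))
            read += 1
        actions.append(("WRITE", written))
        written += 1
    return actions


if __name__ == "__main__":
    # With k=2, two source tokens are read first, then writes and reads alternate.
    for kind, index in wait_k_schedule(k=2, src_len=4, tgt_len=4):
        print(kind, index)
```

Under these assumptions, a small k gives low latency because the first target token is emitted after only k source tokens, while a k at least as large as the source length reduces to ordinary offline translation, where the full source is read before any output.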

