November 1, 2019

Simple and Effective Noisy Channel Modeling for Neural Machine Translation

Conference on Empirical Methods in Natural Language Processing (EMNLP)

Previous work on neural noisy channel modeling relied on latent variable models that incrementally process the source and target sentences. This makes decoding decisions based on partial source prefixes even though the full source is available. We pursue an alternative approach based on standard sequence-to-sequence models, which utilize the entire source.

By: Kyra Yee, Yann Dauphin, Michael Auli
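For readers unfamiliar with the term, the textbook noisy channel decomposition behind the entry above can be written as follows. This is the standard Bayes' rule formulation, given here as background only; the symbols x (source sentence) and y (target sentence) are notation assumed for this sketch, and the exact scoring function used in the paper may differ.

```latex
% Bayes' rule applied to translation: x is the source sentence, y a candidate target.
p(y \mid x) = \frac{p(x \mid y)\, p(y)}{p(x)}

% p(x) does not depend on y, so decoding searches for
\hat{y} = \arg\max_{y} \; \bigl[ \log p(x \mid y) + \log p(y) \bigr]
```

Here p(x | y) is the channel model (target to source) and p(y) is a language model over target sentences.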

September 17, 2019

Unsupervised Singing Voice Conversion


We present a deep learning method for singing voice conversion. The proposed network is not conditioned on the text or on the notes, and it directly converts the audio of one singer to the voice of another. Training is performed without any form of supervision: no lyrics or any kind of phonetic features, no notes, and no matching samples between singers.

By: Eliya Nachmani, Lior Wolf

September 16, 2019

Joint Grapheme and Phoneme Embeddings for Contextual End-to-End ASR


End-to-end approaches to automatic speech recognition, such as Listen-Attend-Spell (LAS), blend all components of a traditional speech recognizer into a unified model. Although this simplifies training and decoding pipelines, a unified model is hard to adapt when a mismatch exists between training and test data, especially if this information is dynamically changing.

By: Zhehuai Chen, Mahaveer Jain, Yongqiang Wang, Michael L. Seltzer, Christian Fuegen

September 15, 2019

Sequence-to-Sequence Speech Recognition with Time-Depth Separable Convolutions


We propose a fully convolutional sequence-to-sequence encoder architecture with a simple and efficient decoder. Our model improves WER on LibriSpeech while being an order of magnitude more efficient than a strong RNN baseline.

By: Awni Hannun, Ann Lee, Qiantong Xu, Ronan Collobert

September 15, 2019

Who Needs Words? Lexicon-Free Speech Recognition


Lexicon-free speech recognition naturally deals with the problem of out-of-vocabulary (OOV) words. In this paper, we show that character-based language models (LMs) can perform as well as word-based LMs for speech recognition, in terms of word error rate (WER), even without restricting the decoding to a lexicon.

By: Tatiana Likhomanenko, Gabriel Synnaeve, Ronan Collobert
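The word error rate (WER) mentioned in the entry above is conventionally the word-level edit distance between a reference transcript and a hypothesis, divided by the number of reference words. A minimal sketch of that conventional definition follows; the function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration, not the evaluation code used in the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Tokenization by whitespace is an assumption of this sketch.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words so far
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return prev[-1] / len(ref)
```

For example, comparing the reference "the cat sat on the mat" with the hypothesis "the cat sit on mat" gives one substitution and one deletion over six reference words, so the function returns 2/6.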

September 15, 2019

wav2vec: Unsupervised Pre-training for Speech Recognition


We explore unsupervised pre-training for speech recognition by learning representations of raw audio. wav2vec is trained on large amounts of unlabeled audio data and the resulting representations are then used to improve acoustic model training. We pre-train a simple multi-layer convolutional neural network optimized via a noise contrastive binary classification task.

By: Steffen Schneider, Alexei Baevski, Ronan Collobert, Michael Auli

September 10, 2019

Bridging the Gap Between Relevance Matching and Semantic Matching for Short Text Similarity Modeling

Conference on Empirical Methods in Natural Language Processing (EMNLP)

We propose a novel model, HCAN (Hybrid Co-Attention Network), that comprises (1) a hybrid encoder module that includes ConvNet-based and LSTM-based encoders, (2) a relevance matching module that measures soft term matches with importance weighting at multiple granularities, and (3) a semantic matching module with co-attention mechanisms that capture context-aware semantic relatedness.

By: Jinfeng Rao, Linqing Liu, Yi Tay, Wei Yang, Peng Shi, Jimmy Lin

August 2, 2019

Findings of the WMT 2019 Shared Task on Parallel Corpus Filtering for Low-Resource Conditions

Conference on Machine Translation (WMT) at ACL

Following the WMT 2018 Shared Task on Parallel Corpus Filtering (Koehn et al., 2018), we posed the challenge of assigning sentence-level quality scores for very noisy corpora of sentence pairs crawled from the web, with the goal of sub-selecting 2% and 10% of the highest-quality data to be used to train machine translation systems. This year, the task tackled the low-resource conditions of Nepali–English and Sinhala–English. Eleven participants from companies, national research labs, and universities participated in this task.

By: Philipp Koehn, Francisco (Paco) Guzman, Vishrav Chaudhary, Juan Pino

August 2, 2019

Low-Resource Corpus Filtering using Multilingual Sentence Embeddings

Association for Computational Linguistics (ACL)

In this paper, we describe our submission to the WMT19 low-resource parallel corpus filtering shared task. Our main approach is based on the LASER toolkit (Language-Agnostic SEntence Representations), which uses an encoder-decoder architecture trained on a parallel corpus to obtain multilingual sentence representations.

By: Vishrav Chaudhary, Yuqing Tang, Francisco (Paco) Guzman, Holger Schwenk, Philipp Koehn

August 1, 2019

Constrained Decoding for Neural NLG from Compositional Representations in Task-Oriented Dialogue

Annual Meeting of the Association for Computational Linguistics (ACL)

In this paper, we (1) propose using tree-structured semantic representations, like those used in traditional rule-based NLG systems, for better discourse-level structuring and sentence-level planning; (2) introduce a challenging dataset using this representation for the weather domain; (3) introduce a constrained decoding approach for Seq2Seq models that leverages this representation to improve semantic correctness; and (4) demonstrate promising results on our dataset and the E2E dataset.

By: Anusha Balakrishnan, Jinfeng Rao, Kartikeya Upasani, Michael White, Rajen Subba