
April 3, 2017

Bag of Tricks for Efficient Text Classification

European Chapter of the Association for Computational Linguistics (EACL)

This paper explores a simple and efficient baseline for text classification.

By: Armand Joulin, Edouard Grave, Piotr Bojanowski, Tomas Mikolov
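The baseline described here (released as fastText) represents a sentence as the average of its word embeddings and feeds that to a linear classifier. A minimal sketch of that idea with made-up toy parameters (the vocabulary, dimensions, and weights below are illustrative, not the paper's):

```python
import numpy as np

rng = np.random.default_rng(0)

vocab = {"good": 0, "bad": 1, "movie": 2, "great": 3}
n_classes = 2
dim = 8

# Hypothetical toy parameters: word embedding table and linear classifier.
E = rng.normal(size=(len(vocab), dim))
W = rng.normal(size=(dim, n_classes))

def classify(tokens):
    # Bag-of-words representation: average the token embeddings...
    h = E[[vocab[t] for t in tokens]].mean(axis=0)
    # ...then apply a linear classifier followed by a softmax.
    logits = h @ W
    p = np.exp(logits - logits.max())
    return p / p.sum()

probs = classify(["great", "movie"])
```

Because the model is linear over averaged embeddings, training and inference are both extremely fast, which is the point of the baseline.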
November 1, 2016

Neural Text Generation from Structured Data with Application to the Biography Domain

Empirical Methods in Natural Language Processing (EMNLP)

This paper introduces a neural model for concept-to-text generation that scales to large, rich domains.

By: Remi Lebret, David Grangier, Michael Auli
October 28, 2016

Bilingual Methods for Adaptive Training Data Selection for Machine Translation

Association for Machine Translation in the Americas (AMTA)

We propose a new data selection method which uses semi-supervised convolutional neural networks based on bitokens (Bi-SSCNNs) for training machine translation systems from a large bilingual corpus.

By: Boxing Chen, Roland Kuhn, George Foster, Colin Cherry, Fei Huang
September 8, 2016

Joint Learning of Speaker and Phonetic Similarities with Siamese Networks

Interspeech 2016

We scale joint learning of specialized speaker and phone embedding architectures to the 360 hours of the Librispeech corpus by implementing a sampling method that efficiently selects pairs of words from the dataset and by improving the loss function.

By: Neil Zeghidour, Gabriel Synnaeve, Nicolas Usunier, Emmanuel Dupoux
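Jointly learning the two embedding spaces relies on siamese networks trained on pairs of words. One common pair-based objective for such architectures is a contrastive loss, sketched below; the paper's exact loss may differ, so treat this as an illustration of the general setup:

```python
import numpy as np

def contrastive_loss(a, b, same, margin=1.0):
    # Euclidean distance between the two siamese branch embeddings.
    d = np.linalg.norm(a - b)
    # Same pairs are pulled together; different pairs are pushed
    # apart until they are at least `margin` away.
    return d ** 2 if same else max(0.0, margin - d) ** 2

# Identical embeddings: zero loss for a "same" pair,
# full margin penalty for a "different" pair.
loss_same = contrastive_loss(np.zeros(4), np.zeros(4), same=True)
loss_diff = contrastive_loss(np.zeros(4), np.zeros(4), same=False)
```

With separate "same speaker" and "same word" pair labels, the same mechanism can train the speaker and phonetic embedding branches jointly.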
August 11, 2016

Semi-Supervised Convolutional Networks for Translation Adaptation with Tiny Amount of In-domain Data

Conference on Computational Natural Language Learning (CoNLL)

We propose a method which uses semi-supervised convolutional neural networks (CNNs) to select in-domain training data for statistical machine translation.

By: Boxing Chen, Fei Huang
August 10, 2016

Neural Network-Based Word Alignment through Score Aggregation

Association for Computational Linguistics Conference on Machine Translation

We present a simple neural network for word alignment that builds source and target word window representations to compute alignment scores for sentence pairs.

By: Joel Legrand, Michael Auli, Ronan Collobert
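The model scores every source/target word pair of a sentence pair by comparing window representations of the two words. A minimal sketch of that scoring step, using random vectors in place of the learned window representations and a plain dot product as the (hypothetical) scoring function:

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-ins for learned window representations: one vector per word,
# built from an embedding window centred on that word.
src = rng.normal(size=(5, 16))   # 5 source words
tgt = rng.normal(size=(7, 16))   # 7 target words

# Alignment score for every source/target word pair.
scores = src @ tgt.T             # shape (5, 7)

# A simple decoding: align each target word to its best-scoring source word.
alignment = scores.argmax(axis=0)
```

In the paper these per-pair scores are aggregated over the sentence pair to drive training; the sketch only shows the score matrix they start from.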
June 8, 2016

Key-Value Memory Networks for Directly Reading Documents

EMNLP 2016

This paper introduces a new method, Key-Value Memory Networks, that makes reading documents more viable by utilizing different encodings in the addressing and output stages of the memory read operation.

By: Alexander Miller, Adam Fisch, Jesse Dodge, Amir-Hossein Karimi, Antoine Bordes, Jason Weston
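The core idea is that the memory read uses one encoding (keys) for addressing and a different encoding (values) for producing the output. A minimal sketch of a single key-value memory read with random stand-in encodings (the actual encodings in the paper are learned from document windows):

```python
import numpy as np

def kv_memory_read(query, keys, values):
    # Addressing stage: match the query against the *key* encodings.
    s = keys @ query
    p = np.exp(s - s.max())
    p /= p.sum()
    # Output stage: attention-weighted sum of the *value* encodings.
    return p @ values

rng = np.random.default_rng(2)
q = rng.normal(size=8)            # query encoding
K = rng.normal(size=(10, 8))      # e.g. encodings of document passages
V = rng.normal(size=(10, 8))      # e.g. encodings of candidate answers
o = kv_memory_read(q, K, V)
```

Decoupling the addressing and output encodings is what lets the model read raw documents directly instead of a structured knowledge base.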
April 13, 2016

Abstractive Sentence Summarization with Attentive Recurrent Neural Networks

NAACL 2016

Abstractive sentence summarization generates a shorter version of a given sentence while attempting to preserve its meaning. We introduce a conditional recurrent neural network (RNN) which generates a summary of an input sentence.

By: Sumit Chopra, Michael Auli, Alexander M. Rush
April 1, 2016

The Goldilocks Principle: Reading Children’s Books with Explicit Memory Representations

ICLR 2016

We introduce a new test of how well language models capture meaning in children’s books.

By: Felix Hill, Antoine Bordes, Sumit Chopra, Jason Weston
September 17, 2015

Improved Arabic Dialect Classification with Social Media Data

Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

Arabic dialect classification has been an important and challenging problem for Arabic language processing, especially for social media text analysis and machine translation. In this paper we propose an approach to improving Arabic dialect classification with semi-supervised learning: multiple classifiers are trained with weakly supervised, strongly supervised, and unsupervised data. Their combination yields significant and consistent improvement on two different test sets.

By: Fei Huang