July 13, 2018

A Multi-lingual Multi-task Architecture for Low-resource Sequence Labeling

Association for Computational Linguistics (ACL)

We propose a multi-lingual multi-task architecture to develop supervised models with a minimal amount of labeled data for sequence labeling.

By: Ying Lin, Shengqi Yang, Veselin Stoyanov, Heng Ji
July 11, 2018

Fitting New Speakers Based on a Short Untranscribed Sample

International Conference on Machine Learning (ICML)

We present a method that is designed to capture a new speaker from a short untranscribed audio sample.

By: Eliya Nachmani, Adam Polyak, Yaniv Taigman, Lior Wolf
June 18, 2018

Don’t Just Assume; Look and Answer: Overcoming Priors for Visual Question Answering

Computer Vision and Pattern Recognition (CVPR)

A number of studies have found that today’s Visual Question Answering (VQA) models are heavily driven by superficial correlations in the training data and lack sufficient image grounding. To encourage development of models geared towards the latter, we propose a new setting for VQA where for every question type, train and test sets have different prior distributions of answers.

By: Aishwarya Agrawal, Dhruv Batra, Devi Parikh, Aniruddha Kembhavi
June 1, 2018

Colorless Green Recurrent Networks Dream Hierarchically

North American Chapter of the Association for Computational Linguistics (NAACL)

Recurrent neural networks (RNNs) have achieved impressive results in a variety of linguistic processing tasks, suggesting that they can induce non-trivial properties of language. We investigate here to what extent RNNs learn to track abstract hierarchical syntactic structure. We test whether RNNs trained with a generic language modeling objective in four languages (Italian, English, Hebrew, Russian) can predict long-distance number agreement in various constructions.

By: Kristina Gulordava, Piotr Bojanowski, Edouard Grave, Tal Linzen, Marco Baroni
May 7, 2018

Advances in Pre-Training Distributed Word Representations

Language Resources and Evaluation Conference (LREC)

In this paper, we show how to train high-quality word vector representations by combining known tricks that are, however, rarely used together.

By: Tomas Mikolov, Edouard Grave, Piotr Bojanowski, Christian Puhrsch, Armand Joulin
May 7, 2018

A Corpus for Multilingual Document Classification in Eight Languages

Language Resources and Evaluation Conference (LREC)

In this paper, we propose a new subset of the Reuters corpus with balanced class priors for eight languages. By adding Italian, Russian, Japanese and Chinese, we cover languages that differ widely in syntax, morphology, and other respects. We provide strong baselines for all language transfer directions using multilingual word and sentence embeddings, respectively. Our goal is to offer a freely available framework for evaluating cross-lingual document classification, and we hope by these means to foster research in this important area.

By: Holger Schwenk, Xian Li
April 30, 2018

Mastering the Dungeon: Grounded Language Learning by Mechanical Turker Descent

International Conference on Learning Representations (ICLR)

In this work we propose an interactive learning procedure called Mechanical Turker Descent (MTD) and use it to train agents to execute natural language commands grounded in a fantasy text adventure game. In MTD, Turkers compete to train better agents in the short term, and collaborate by sharing their agents’ skills in the long term.

By: Zhilin Yang, Saizheng Zhang, Jack Urbanek, Will Feng, Alexander H. Miller, Arthur Szlam, Douwe Kiela, Jason Weston
April 15, 2018

Learning Filterbanks from Raw Speech for Phone Recognition

International Conference on Acoustics, Speech and Signal Processing (ICASSP)

We train a bank of complex filters that operates on the raw waveform and is fed into a convolutional neural network for end-to-end phone recognition.

By: Neil Zeghidour, Nicolas Usunier, Iasonas Kokkinos, Thomas Schatz, Gabriel Synnaeve, Emmanuel Dupoux
April 15, 2018

Towards End-to-End Spoken Language Understanding

International Conference on Acoustics, Speech and Signal Processing (ICASSP)

A spoken language understanding system is traditionally designed as a pipeline of several components.

By: Dmitriy Serdyuk, Yongqiang Wang, Christian Fuegen, Anuj Kumar, Baiyang Liu, Yoshua Bengio
March 14, 2018

Learning to Compute Word Embeddings On the Fly


We provide a method for predicting embeddings of rare words on the fly from small amounts of auxiliary data with a network trained end-to-end for the downstream task. We show that this improves results against baselines where embeddings are trained on the end task for reading comprehension, recognizing textual entailment and language modeling.

By: Dzmitry Bahdanau, Tom Bosc, Stanislaw Jastrzebski, Edward Grefenstette, Pascal Vincent, Yoshua Bengio