
June 16, 2019

Towards VQA Models That Can Read

Computer Vision and Pattern Recognition (CVPR)

Studies have shown that a dominant class of questions asked by visually impaired users about images of their surroundings involves reading text in the image. But today’s VQA models cannot read! Our paper takes a first step towards addressing this problem.

By: Amanpreet Singh, Vivek Natarajan, Meet Shah, Yu Jiang, Xinlei Chen, Dhruv Batra, Devi Parikh, Marcus Rohrbach

June 7, 2019

Cycle-Consistency for Robust Visual Question Answering

Computer Vision and Pattern Recognition (CVPR)

Despite significant progress in Visual Question Answering over the years, the robustness of today’s VQA models leaves much to be desired. We introduce a new evaluation protocol and associated dataset (VQA-Rephrasings) and show that state-of-the-art VQA models are notoriously brittle to linguistic variations in questions.

By: Meet Shah, Xinlei Chen, Marcus Rohrbach, Devi Parikh
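The evaluation idea above can be sketched as a simple consistency score: group a model's answers by rephrasings of the same question and count the groups on which the answer never changes. This is an illustrative helper, not the paper's code.

```python
def consistency_score(answer_groups):
    """Fraction of rephrasing groups on which the model gives the same
    answer to every rephrasing of the original question.

    answer_groups: list of lists; each inner list holds the model's
    answers to all rephrasings of one question.
    """
    if not answer_groups:
        return 0.0
    consistent = sum(1 for group in answer_groups if len(set(group)) == 1)
    return consistent / len(answer_groups)

# A brittle model flips its answer when the second question is rephrased:
groups = [["yes", "yes", "yes"], ["2", "3", "2"]]
print(consistency_score(groups))  # 0.5
```

A robust model would score near 1.0; the gap to 1.0 quantifies brittleness to linguistic variation.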

June 3, 2019

Pay Less Attention with Lightweight and Dynamic Convolutions

International Conference on Learning Representations (ICLR)

Self-attention is a useful mechanism for building generative models of language and images. It determines the importance of context elements by comparing each element to the current time step. In this paper, we show that a very lightweight convolution can perform competitively with the best reported self-attention results.

By: Felix Wu, Angela Fan, Alexei Baevski, Yann Dauphin, Michael Auli
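The core operation can be sketched in a few lines: a lightweight convolution is a depthwise 1-D convolution whose kernel weights are softmax-normalized over the kernel window and shared across groups of channels ("heads"). This numpy sketch is illustrative only; the function name and shapes are assumptions, not the paper's implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def lightweight_conv(x, w):
    """Lightweight convolution over a sequence.

    x: (time, channels) input sequence
    w: (heads, kernel) raw kernel weights, shared within channel groups
    """
    T, C = x.shape
    H, K = w.shape
    assert C % H == 0
    w = softmax(w, axis=-1)              # normalize weights over the window
    xp = np.vstack([np.zeros((K - 1, C)), x])  # causal left padding
    out = np.zeros_like(x)
    group = C // H
    for t in range(T):
        window = xp[t:t + K]             # (K, C) local context window
        for h in range(H):
            sl = slice(h * group, (h + 1) * group)
            out[t, sl] = w[h] @ window[:, sl]  # weighted sum over window
    return out

x = np.random.randn(10, 8)
y = lightweight_conv(x, np.random.randn(2, 3))  # 2 heads, kernel size 3
print(y.shape)  # (10, 8)
```

Unlike self-attention, the weights here depend only on the position within a fixed-size window, not on the content of the context, which is what makes the operation so cheap.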

June 2, 2019

The emergence of number and syntax units in LSTM language models

Conference of the North American Chapter of the Association for Computational Linguistics (NAACL)

We present a detailed study of the inner mechanics of number tracking in LSTMs at the single-neuron level. We discover that long-distance number information is largely managed by two “number units”.

By: Yair Lakretz, Germán Kruszewski, Theo Desbordes, Dieuwke Hupkes, Stanislas Dehaene, Marco Baroni

June 2, 2019

Simple Attention-Based Representation Learning for Ranking Short Social Media Posts

North American Chapter of the Association for Computational Linguistics (NAACL)

This paper explores the problem of ranking short social media posts with respect to user queries using neural networks. Instead of starting with a complex architecture, we proceed from the bottom up and examine the effectiveness of a simple, word-level Siamese architecture augmented with attention-based mechanisms for capturing semantic “soft” matches between query and post tokens.

By: Peng Shi, Jinfeng Rao, Jimmy Lin
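The attention-based “soft” matching described above can be sketched as follows: each query token attends over the post's tokens, and the attended vectors feed a similarity score. The function and pooling choice here are illustrative assumptions, not the paper's exact model.

```python
import numpy as np

def soft_match_score(query_emb, post_emb):
    """Attention-based soft matching between query and post tokens.

    For each query token, compute dot-product similarities to all post
    tokens, softmax over the post to get attention weights, and pool
    the attended post vectors into a scalar match score.

    query_emb: (Q, d) query token embeddings
    post_emb:  (P, d) post token embeddings
    """
    scores = query_emb @ post_emb.T                   # (Q, P) token similarities
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)     # softmax over post tokens
    attended = weights @ post_emb                     # (Q, d) soft-matched vectors
    return float((query_emb * attended).sum(axis=1).mean())

q = np.random.randn(4, 16)   # 4 query tokens, 16-dim embeddings
p = np.random.randn(7, 16)   # 7 post tokens
print(soft_match_score(q, p))
```

In a Siamese setup, both sides share the same embedding layer, and such a score (or the attended representations) is what the ranking loss is trained on.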

June 1, 2019

Neural Models of Text Normalization for Speech Applications

Computational Linguistics

Machine learning, including neural network techniques, has been applied to virtually every domain in natural language processing. One problem that has been somewhat resistant to effective machine learning solutions is text normalization for speech applications such as text-to-speech synthesis (TTS).

By: Hao Zhang, Richard Sproat, Axel H. Ng, Felix Stahlberg, Xiaochang Peng, Kyle Gorman, Brian Roark

May 31, 2019

Abusive Language Detection with Graph Convolutional Networks

North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT)

Abuse on the Internet represents a significant societal problem of our time. Previous research on automated abusive language detection in Twitter has shown that community-based profiling of users is a promising technique for this task. However, existing approaches only capture shallow properties of online communities by modeling follower–following relationships.

By: Pushkar Mishra, Marco Del Tredici, Helen Yannakoudakis, Ekaterina Shutova
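Deeper community profiling here builds on graph convolutions. A minimal sketch of one standard GCN layer (Kipf & Welling style, not the paper's exact architecture): each user's features are averaged with those of its neighbors in the follower graph under symmetric normalization, then linearly transformed.

```python
import numpy as np

def gcn_layer(A, X, W):
    """One graph-convolution layer:
    H = ReLU(D^{-1/2} (A + I) D^{-1/2} X W)

    A: (n, n) adjacency matrix of the user graph
    X: (n, f) per-user feature matrix
    W: (f, h) learned weight matrix
    """
    A_hat = A + np.eye(A.shape[0])              # add self-loops
    d_inv_sqrt = np.diag(1.0 / np.sqrt(A_hat.sum(axis=1)))
    A_norm = d_inv_sqrt @ A_hat @ d_inv_sqrt    # symmetric normalization
    return np.maximum(A_norm @ X @ W, 0.0)      # ReLU activation

# Tiny follower graph: 3 users, 4-dim profile features, 2 hidden units
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
X = np.random.randn(3, 4)
W = np.random.randn(4, 2)
print(gcn_layer(A, X, W).shape)  # (3, 2)
```

Stacking such layers lets a user's representation absorb information from multi-hop neighborhoods, going beyond the shallow follower–following counts of earlier profiling approaches.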

May 17, 2019

Unsupervised Hyper-alignment for Multilingual Word Embeddings

International Conference on Learning Representations (ICLR)

We consider the problem of aligning continuous word representations, learned in multiple languages, to a common space. It was recently shown that, in the case of two languages, it is possible to learn such a mapping without supervision. This paper extends this line of work to the problem of aligning multiple languages to a common space.

By: Jean Alaux, Edouard Grave, Marco Cuturi, Armand Joulin
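For context, the classic two-language alignment step that this work generalizes is orthogonal Procrustes: given paired source and target embeddings, the best rotation between them has a closed form via the SVD. A minimal sketch (the supervised bilingual case, not the paper's unsupervised multilingual method):

```python
import numpy as np

def procrustes_align(X, Y):
    """Orthogonal Procrustes: the rotation W minimizing ||X W - Y||_F,
    mapping source embeddings X onto target embeddings Y.

    X, Y: (n, d) matrices whose rows are embeddings of translation pairs.
    """
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt                      # orthogonal mapping matrix

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 8))
R, _ = np.linalg.qr(rng.standard_normal((8, 8)))  # ground-truth rotation
Y = X @ R                                         # "target" embeddings
W = procrustes_align(X, Y)
print(np.allclose(W, R))  # True
```

The multilingual problem replaces the single pairwise mapping with a set of mappings into one shared space, which is what makes joint (rather than pairwise) alignment necessary.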

May 10, 2019

Learning to Detect Dysarthria from Raw Speech

International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Speech classifiers of paralinguistic traits traditionally learn from diverse hand-crafted low-level features, selecting the information relevant to the task at hand. We explore an alternative to this selection by jointly learning the classifier and the feature extraction.

By: Juliette Millet, Neil Zeghidour

May 6, 2019

Multiple-Attribute Text Rewriting

International Conference on Learning Representations (ICLR)

The dominant approach to unsupervised “style transfer” in text is based on the idea of learning a latent representation, which is independent of the attributes specifying its “style”. In this paper, we show that this condition is not necessary and is not always met in practice, even with domain adversarial training that explicitly aims at learning such disentangled representations.

By: Guillaume Lample, Sandeep Subramanian, Eric Michael Smith, Ludovic Denoyer, Marc'Aurelio Ranzato, Y-Lan Boureau