November 4, 2019

Countering Language Drift via Visual Grounding

Conference on Empirical Methods in Natural Language Processing (EMNLP)

Emergent multi-agent communication protocols are very different from natural language and not easily interpretable by humans. We find that agents that were initially pretrained to produce natural language can also experience detrimental language drift: when a nonlinguistic reward is used in a goal-based task, e.g. some scalar success metric, the communication protocol may easily and radically diverge from natural language.

By: Jason Lee, Kyunghyun Cho, Douwe Kiela
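
The drift mechanism described above is easiest to see in the fine-tuning objective itself: once a scalar task reward replaces a linguistic loss, nothing anchors the sampled messages to English. Below is a minimal policy-gradient (REINFORCE) sketch of that step in PyTorch; the speaker head and the reward are toy stand-ins, not the paper's models.

```python
import torch

# Toy REINFORCE fine-tuning step: a "speaker" previously pretrained on
# natural language is now optimized for a scalar task reward alone.
# `speaker` and `reward` are hypothetical stand-ins for illustration.
vocab, hidden, batch = 100, 32, 16
speaker = torch.nn.Linear(hidden, vocab)         # stand-in for a pretrained LM head
opt = torch.optim.Adam(speaker.parameters(), lr=1e-3)

state = torch.randn(batch, hidden)               # game/dialogue states
dist = torch.distributions.Categorical(logits=speaker(state))
tokens = dist.sample()                           # sampled message tokens
reward = torch.rand(batch)                       # nonlinguistic scalar success signal
loss = -(dist.log_prob(tokens) * reward).mean()  # only the reward shapes the policy
opt.zero_grad()
loss.backward()
opt.step()                                       # messages are free to drift from English
```

The paper's title points to the remedy: grounding the messages in vision adds a learning signal that keeps them tied to natural language.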

November 4, 2019

Emergent Linguistic Phenomena in Multi-Agent Communication Games

Conference on Empirical Methods in Natural Language Processing (EMNLP)

We describe a multi-agent communication framework for examining high-level linguistic phenomena at the community level. We demonstrate that complex linguistic behavior observed in natural language can be reproduced in this simple setting: i) the outcome of contact between communities is a function of inter- and intra-group connectivity; ii) linguistic contact either converges to the majority protocol, or in balanced cases leads to novel creole languages of lower complexity; and iii) a linguistic continuum emerges where neighboring languages are more mutually intelligible than farther removed languages.

By: Laura Graesser, Kyunghyun Cho, Douwe Kiela
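
As a rough illustration of what "inter- and intra-group connectivity" means operationally, here is a hypothetical sketch of sampling speaker-listener pairs across two agent communities; `p_intra` is an assumed knob, not a parameter from the paper.

```python
import random

# Hypothetical community-structured pairing for a communication game.
communities = {"A": list(range(0, 10)), "B": list(range(10, 20))}
p_intra = 0.8  # assumed probability of talking within one's own community

def sample_pair():
    speaker_group = random.choice(list(communities))
    if random.random() < p_intra:
        listener_group = speaker_group                  # intra-group contact
    else:
        others = [g for g in communities if g != speaker_group]
        listener_group = random.choice(others)          # inter-group contact
    return (random.choice(communities[speaker_group]),
            random.choice(communities[listener_group]))

print([sample_pair() for _ in range(5)])  # (speaker, listener) agent ids
```

Varying knobs like `p_intra` is the kind of connectivity sweep under which the convergence-versus-creole outcomes above would be observed.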

November 4, 2019

Language Models as Knowledge Bases?

Conference on Empirical Methods in Natural Language Processing (EMNLP)

Recent progress in pretraining language models on large textual corpora led to a surge of improvements for downstream NLP tasks. Whilst learning linguistic knowledge, these models may also be storing relational knowledge present in the training data, and may be able to answer queries structured as “fill-in-the-blank” cloze statements. Language models have many advantages over structured knowledge bases: they require no schema engineering, allow practitioners to query about an open class of relations, are easy to extend to more data, and require no human supervision to train. We present an in-depth analysis of the relational knowledge already present (without fine-tuning) in a wide range of state-of-the-art pretrained language models.

By: Fabio Petroni, Tim Rocktäschel, Patrick Lewis, Anton Bakhtin, Yuxiang Wu, Alexander H. Miller, Sebastian Riedel
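
The “fill-in-the-blank” probing described above can be reproduced with any masked language model. A minimal sketch using the Hugging Face transformers fill-mask pipeline (a generic illustration, not the paper's own probe code):

```python
from transformers import pipeline

# Query a pretrained masked LM with a cloze statement, with no fine-tuning.
unmasker = pipeline("fill-mask", model="bert-base-uncased")
for pred in unmasker("Dante was born in [MASK]."):
    print(f"{pred['token_str']:>10}  {pred['score']:.3f}")
# If the model stores the relational fact, "florence" should rank highly.
```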

November 4, 2019

VizSeq: A Visual Analysis Toolkit for Text Generation Tasks

Conference on Empirical Methods in Natural Language Processing (EMNLP)

Automatic evaluation of text generation tasks (e.g., machine translation, text summarization, image captioning, and video description) usually relies heavily on task-specific metrics, such as BLEU (Papineni et al., 2002) and ROUGE (Lin, 2004). These metrics, however, are abstract numbers that are not perfectly aligned with human assessment, which suggests inspecting detailed examples as a complement in order to identify system error patterns. In this paper, we present VizSeq, a visual analysis toolkit for instance-level and corpus-level system evaluation on a wide variety of text generation tasks.

By: Changhan Wang, Anirudh Jain, Danlu Chen, Jiatao Gu
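
For context, the corpus-level and instance-level numbers such a toolkit visualizes can be computed with sacrebleu; this is a generic metric sketch on toy data, not the VizSeq API itself.

```python
import sacrebleu

hyps = ["the cat sat on the mat", "a dog barked"]
refs = [["the cat is on the mat", "a dog was barking"]]  # one reference stream

# One corpus-level number: the kind of abstract score the abstract cautions about.
print(sacrebleu.corpus_bleu(hyps, refs).score)

# Per-instance scores: the starting point for example-level error inspection.
for hyp, ref in zip(hyps, refs[0]):
    print(f"{sacrebleu.sentence_bleu(hyp, [ref]).score:6.2f}  {hyp}")
```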

November 4, 2019

Insertion-based Decoding with Automatically Inferred Generation Order

Transactions of the Association for Computational Linguistics (TACL)

Conventional neural autoregressive decoding commonly assumes a fixed left-to-right generation order, which may be sub-optimal. In this work, we propose a novel decoding algorithm – InDIGO – which supports flexible sequence generation in arbitrary orders through insertion operations. We extend Transformer, a state-of-the-art sequence generation model, to efficiently implement the proposed approach, enabling it to be trained with either a pre-defined generation order or adaptive orders obtained from beam-search.

By: Jiatao Gu, Qi Lu, Kyunghyun Cho
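
To make “sequence generation in arbitrary orders through insertion operations” concrete, here is a toy decoding loop: each step scores (slot, token) pairs and inserts anywhere in the partial sequence. The `score` function is a hypothetical stand-in for InDIGO's learned insertion distribution, not the paper's Transformer model.

```python
# Conceptual sketch of insertion-based decoding with a toy scorer.
TARGET = ["a", "b", "c", "d"]

def score(seq, token, slot):
    """Hypothetical stand-in for a learned insertion distribution."""
    if token == "<eod>":
        return 0 if seq == TARGET else float("-inf")  # stop only when done
    trial = seq[:slot] + [token] + seq[slot:]
    if len(trial) > len(TARGET):
        return float("-inf")
    return -sum(t != g for t, g in zip(trial, TARGET))

seq = []
while True:
    # Score every (slot, token) insertion, not just appending on the right.
    slot, token = max(
        ((s, t) for s in range(len(seq) + 1) for t in TARGET + ["<eod>"]),
        key=lambda st: score(seq, st[1], st[0]),
    )
    if token == "<eod>":
        break
    seq.insert(slot, token)

print(seq)  # this toy scorer happens to build left-to-right; a learned one adapts the order
```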

November 4, 2019

Learning Programmatic Idioms for Scalable Semantic Parsing

Conference on Empirical Methods in Natural Language Processing (EMNLP)

In this paper, we introduce an iterative method to extract code idioms from large source code corpora by repeatedly collapsing most-frequent depth-2 subtrees of their syntax trees, and train semantic parsers to apply these idioms during decoding.

By: Srinivasan Iyer, Alvin Cheung, Luke Zettlemoyer
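
The counting step of such a procedure can be sketched with Python's own ast module: enumerate depth-2 subtree skeletons (a node type plus its children's types) across a corpus and find the most frequent ones. A simplified illustration, not the paper's implementation.

```python
import ast
from collections import Counter

def depth2_skeletons(tree):
    """Yield (parent_type, child_types): the shape of a depth-2 subtree."""
    for node in ast.walk(tree):
        children = tuple(type(c).__name__ for c in ast.iter_child_nodes(node))
        if children:
            yield (type(node).__name__, children)

corpus = ["x = foo(1, 2)", "y = bar(3)", "z = foo(5, 6)"]
counts = Counter(s for src in corpus for s in depth2_skeletons(ast.parse(src)))

# The most frequent skeleton is the candidate to collapse into a single idiom,
# after which counting repeats on the rewritten trees.
print(counts.most_common(3))
```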

November 3, 2019

Using Local Knowledge Graph Construction to Scale Seq2Seq Models to Multi-Document Inputs

Conference on Empirical Methods in Natural Language Processing (EMNLP)

Query-based open-domain NLP tasks require information synthesis from long and diverse web results. Current approaches extractively select portions of web text as input to Sequence-to-Sequence models using methods such as TF-IDF ranking. We propose constructing a local graph structured knowledge base for each query, which compresses the web search information and reduces redundancy.

By: Angela Fan, Claire Gardent, Chloé Braud, Antoine Bordes
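
The extractive baseline mentioned above (ranking web text against the query by TF-IDF before feeding it to a seq2seq model) can be sketched with scikit-learn; the query and passages are toy stand-ins.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

query = "how do vaccines train the immune system"
passages = [
    "vaccines expose the immune system to a harmless form of a pathogen",
    "the stock market closed higher on tuesday",
    "immune cells remember pathogens after vaccination",
]

# Rank passages by TF-IDF cosine similarity to the query.
vec = TfidfVectorizer().fit(passages + [query])
sims = cosine_similarity(vec.transform([query]), vec.transform(passages))[0]
ranked = sorted(zip(sims, passages), reverse=True)
print(ranked[0])  # top-ranked passage becomes seq2seq input; note the redundancy this keeps
```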

November 3, 2019

Improving Generative Visual Dialog by Answering Diverse Questions

Conference on Empirical Methods in Natural Language Processing (EMNLP)

Prior work on training generative Visual Dialog models with reinforcement learning (Das et al., 2017b) has explored a Q-BOT-A-BOT image-guessing game and shown that this ‘self-talk’ approach can lead to improved performance at the downstream dialog-conditioned image-guessing task. However, this improvement saturates and starts degrading after a few rounds of interaction, and does not lead to a better Visual Dialog model.

By: Vishvak Murahari, Prithvijit Chattopadhyay, Dhruv Batra, Devi Parikh, Abhishek Das

November 3, 2019

Massively Multilingual Sentence Embeddings for Zero-Shot Cross-Lingual Transfer and Beyond

Transactions of the Association for Computational Linguistics (TACL)

We introduce an architecture to learn joint multilingual sentence representations for 93 languages, belonging to more than 30 different families and written in 28 different scripts. Our system uses a single BiLSTM encoder with a shared BPE vocabulary for all languages, which is coupled with an auxiliary decoder and trained on publicly available parallel corpora. This enables us to learn a classifier on top of the resulting embeddings using English annotated data only, and transfer it to any of the 93 languages without any modification.

By: Mikel Artetxe, Holger Schwenk
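
The zero-shot recipe in the abstract is: embed sentences from any of the 93 languages into the shared space, fit a classifier on English vectors only, and apply it unchanged elsewhere. A minimal sketch, with `embed` as a hypothetical placeholder for the LASER-style encoder:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def embed(sentences, lang):
    """Hypothetical stand-in for a shared multilingual encoder (placeholder vectors)."""
    rng = np.random.default_rng(0)
    return rng.normal(size=(len(sentences), 1024))

# Train on English annotated data only.
X_en = embed(["great movie", "terrible movie"], lang="en")
clf = LogisticRegression().fit(X_en, [1, 0])

# Transfer to another language with no modification: same classifier, new embeddings.
X_es = embed(["película estupenda", "película terrible"], lang="es")
print(clf.predict(X_es))
```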

November 3, 2019

Don’t Forget the Long Tail! A Comprehensive Analysis of Morphological Generalization in Bilingual Lexicon Induction

Conference on Empirical Methods in Natural Language Processing (EMNLP)

Human translators routinely have to translate rare inflections of words, a consequence of the Zipfian distribution of words in a language. When translating from Spanish, a good translator would have no problem identifying the proper translation of a statistically rare inflection such as habláramos, even though the lexeme itself, hablar, is relatively common. In this work, we investigate whether state-of-the-art bilingual lexicon inducers are capable of learning this kind of generalization.

By: Paula Czarnowska, Sebastian Ruder, Edouard Grave, Ryan Cotterell, Ann Copestake