236 Results

March 14, 2018

Learning to Compute Word Embeddings On the Fly

arXiv

We provide a method for predicting embeddings of rare words on the fly from small amounts of auxiliary data with a network trained end-to-end for the downstream task. We show that this improves results against baselines where embeddings are trained on the end task for reading comprehension, recognizing textual entailment and language modeling.

By: Dzmitry Bahdanau, Tom Bosc, Stanislaw Jastrzebski, Edward Grefenstette, Pascal Vincent, Yoshua Bengio
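
The core idea of predicting an embedding from auxiliary data can be illustrated with a minimal sketch (not the paper's trained network, which learns this predictor end-to-end; here the predictor is a fixed mean-pool over a dictionary definition, and the names are illustrative):

```python
def embedding_on_the_fly(rare_word, dictionary, embed):
    """Predict a vector for an out-of-vocabulary word by mean-pooling
    the word vectors of its dictionary definition.

    dictionary: maps a word to a list of definition tokens.
    embed: maps known words to fixed-length vectors.
    """
    definition = dictionary.get(rare_word, [])
    vectors = [embed[w] for w in definition if w in embed]
    dim = len(next(iter(embed.values())))
    if not vectors:
        # no usable auxiliary data: fall back to a zero vector
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]
```

In the paper this composition function is trained jointly with the downstream model, so the predicted embeddings are shaped by the end task rather than fixed in advance.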
March 12, 2018

Geometrical Insights for Implicit Generative Modeling

arXiv

Learning algorithms for implicit generative models can optimize a variety of criteria that measure how the data distribution differs from the implicit model distribution, including the Wasserstein distance, the Energy distance, and the Maximum Mean Discrepancy criterion.

By: Leon Bottou, Martin Arjovsky, David Lopez-Paz, Maxime Oquab
February 2, 2018

Efficient K-Shot Learning with Regularized Deep Networks

AAAI Conference on Artificial Intelligence (AAAI)

The problems of sub-optimality and over-fitting are due in part to the large number of parameters (≈ 10^6) used in a typical deep convolutional neural network. To address these problems, we propose a simple yet effective regularization method for fine-tuning pre-trained deep networks for the task of k-shot learning.

By: Donghyun Yoo, Haoqi Fan, Vishnu Naresh Boddeti, Kris M. Kitani
February 2, 2018

StarSpace: Embed All The Things!

AAAI Conference on Artificial Intelligence (AAAI)

We present StarSpace, a general-purpose neural embedding model that can solve a wide variety of problems: labeling tasks such as text classification, ranking tasks such as information retrieval/web search, collaborative filtering-based or content-based recommendation, embedding of multi-relational graphs, and learning word, sentence or document level embeddings.

By: Ledell Wu, Adam Fisch, Sumit Chopra, Keith Adams, Antoine Bordes, Jason Weston
February 2, 2018

Efficient Large-Scale Multi-Modal Classification

AAAI Conference on Artificial Intelligence (AAAI)

We investigate various methods for performing multi-modal fusion and analyze their trade-offs in terms of classification accuracy and computational efficiency.

By: Douwe Kiela, Edouard Grave, Armand Joulin, Tomas Mikolov
December 4, 2017

Unbounded Cache Model for Online Language Modeling with Open Vocabulary

Neural Information Processing Systems (NIPS)

In this paper, we propose an extension of continuous cache models, which can scale to larger contexts. In particular, we use a large scale non-parametric memory component that stores all the hidden activations seen in the past.

By: Edouard Grave, Moustapha Cisse, Armand Joulin
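
A toy sketch of the cache component, under simplifying assumptions (exact dot-product search and a kernel-weighted distribution; the paper scales this with approximate nearest-neighbor search over millions of stored activations):

```python
import math

class UnboundedCache:
    """Non-parametric memory of (hidden state, next word) pairs."""

    def __init__(self, theta=1.0):
        self.keys, self.words = [], []
        self.theta = theta  # kernel sharpness

    def add(self, hidden, word):
        # store every hidden activation seen so far, with the word it preceded
        self.keys.append(hidden)
        self.words.append(word)

    def prob(self, hidden, word):
        # kernel-weighted probability of `word` given the current hidden state
        if not self.keys:
            return 0.0
        dots = [sum(h * k for h, k in zip(hidden, key)) for key in self.keys]
        weights = [math.exp(self.theta * d) for d in dots]
        z = sum(weights)
        return sum(w for w, tok in zip(weights, self.words) if tok == word) / z

def interpolate(p_lm, p_cache, lam=0.3):
    # mix the parametric language model with the cache distribution
    return (1 - lam) * p_lm + lam * p_cache
```

Because the memory is non-parametric, it adapts to distribution shift and open vocabularies at test time without retraining the underlying language model.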
December 4, 2017

Best of Both Worlds: Transferring Knowledge from Discriminative Learning to a Generative Visual Dialog Model

Neural Information Processing Systems (NIPS)

We present a novel training framework for neural sequence models, particularly for grounded dialog generation.

By: Jiasen Lu, Anitha Kannan, Jianwei Yang, Devi Parikh, Dhruv Batra
December 4, 2017

Gradient Episodic Memory for Continual Learning

Neural Information Processing Systems (NIPS)

One major obstacle towards AI is the poor ability of models to solve new problems quickly without forgetting previously acquired knowledge. To better understand this issue, we study the problem of continual learning, where the model observes, once and one by one, examples concerning a sequence of tasks.

By: David Lopez-Paz, Marc'Aurelio Ranzato
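
The mechanism behind GEM can be sketched for the single-past-task case (a hedged simplification: with several past tasks the paper solves a small quadratic program over all memory constraints):

```python
def gem_project(grad, mem_grad):
    """Project the current-task gradient so the update does not increase
    loss on the episodic memory of one past task.

    If <grad, mem_grad> >= 0 the update already does not harm the old task
    and is left unchanged; otherwise grad is projected onto the half-space
    {g : <g, mem_grad> >= 0}.
    """
    dot = sum(g * m for g, m in zip(grad, mem_grad))
    if dot >= 0:
        return list(grad)
    mem_sq = sum(m * m for m in mem_grad)
    scale = dot / mem_sq
    return [g - scale * m for g, m in zip(grad, mem_grad)]
```

The projected gradient is the closest vector to the proposed update that provably does not increase the loss measured on the stored examples, which is how GEM limits catastrophic forgetting.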
December 4, 2017

One-Sided Unsupervised Domain Mapping

Neural Information Processing Systems (NIPS)

In this work, we present a method of learning G_AB without learning G_BA. This is done by learning a mapping that maintains the distance between a pair of samples.

By: Sagie Benaim, Lior Wolf
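
The distance-preservation constraint can be sketched as a batch loss, under illustrative assumptions (an L1 distance on flat vectors standing in for the per-pair distance; the actual model compares image pairs through a learned GAN generator):

```python
from itertools import combinations

def dist(a, b):
    # L1 distance between two flat sample vectors
    return sum(abs(ai - bi) for ai, bi in zip(a, b))

def distance_preservation_loss(xs, gxs):
    """Penalize |d(x_i, x_j) - d(G(x_i), G(x_j))| over all pairs in a batch.

    xs are source-domain samples; gxs are their mapped versions G_AB(x).
    Minimizing this trains G_AB alone: no inverse mapping G_BA is needed,
    because the constraint is stated within each domain separately.
    """
    pairs = list(combinations(range(len(xs)), 2))
    total = sum(abs(dist(xs[i], xs[j]) - dist(gxs[i], gxs[j]))
                for i, j in pairs)
    return total / len(pairs)
```

A mapping that preserves pairwise distances exactly drives this loss to zero, which is the one-sided alternative to the cycle-consistency constraint used in two-sided methods.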
December 4, 2017

Poincaré Embeddings for Learning Hierarchical Representations

Neural Information Processing Systems (NIPS)

In this work, we introduce a new approach for learning hierarchical representations of symbolic data by embedding them into hyperbolic space – or more precisely into an n-dimensional Poincaré ball.

By: Maximilian Nickel, Douwe Kiela
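
The geometry the abstract refers to can be made concrete with the standard distance on the Poincaré ball, shown here as a small self-contained sketch (the paper additionally derives Riemannian gradients for learning the embeddings):

```python
import math

def poincare_distance(u, v):
    """Distance between two points inside the unit Poincare ball:

    d(u, v) = arcosh(1 + 2*||u - v||^2 / ((1 - ||u||^2) * (1 - ||v||^2)))
    """
    norm_sq = lambda x: sum(xi * xi for xi in x)
    diff_sq = sum((ui - vi) ** 2 for ui, vi in zip(u, v))
    denom = (1.0 - norm_sq(u)) * (1.0 - norm_sq(v))
    return math.acosh(1.0 + 2.0 * diff_sq / denom)
```

Distances blow up near the boundary of the ball, so the space has exponentially growing room away from the origin; this is what lets hierarchies embed with low distortion, with general concepts near the center and specific ones near the boundary.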