February 4, 2018

StarSpace: Embed All The Things!

Conference on Artificial Intelligence (AAAI)

We present StarSpace, a general-purpose neural embedding model that can solve a wide variety of problems: labeling tasks such as text classification, ranking tasks such as information retrieval/web search, collaborative filtering-based or content-based recommendation, embedding of multi-relational graphs, and learning word-, sentence-, or document-level embeddings.
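
The abstract does not spell out the training objective, so the following is only a generic sketch of how a bag-of-features embedding model with a margin ranking loss can cover such varied tasks; the loss, sampling scheme, and hyperparameters below are assumptions for illustration, not taken from the paper.

```python
# Illustrative only: a generic bag-of-features embedding trainer with a
# margin ranking (hinge) loss, in the spirit of general-purpose entity
# embedding models. Details are assumptions, not the paper's method.
import numpy as np

rng = np.random.default_rng(0)
VOCAB, DIM, MARGIN, LR = 1000, 50, 0.2, 0.05
emb = rng.normal(scale=0.1, size=(VOCAB, DIM))  # one vector per discrete feature

def embed(features):
    """An entity (document, label, user, ...) is a bag of feature ids."""
    return emb[features].mean(axis=0)

def train_step(lhs_feats, pos_feats, neg_feats):
    """Push sim(lhs, positive) above sim(lhs, negative) by a margin."""
    a, p, n = embed(lhs_feats), embed(pos_feats), embed(neg_feats)
    loss = max(0.0, MARGIN - a @ p + a @ n)
    if loss > 0:  # sub-gradient update, spread over each bag of features
        emb[lhs_feats] += LR * (p - n) / len(lhs_feats)
        emb[pos_feats] += LR * a / len(pos_feats)
        emb[neg_feats] -= LR * a / len(neg_feats)
    return loss
```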

Ledell Wu, Adam Fisch, Sumit Chopra, Keith Adams, Antoine Bordes, Jason Weston
February 2, 2018

Efficient Large-Scale Multi-Modal Classification

Conference on Artificial Intelligence (AAAI)

We investigate various methods for performing multi-modal fusion and analyze their trade-offs in terms of classification accuracy and computational efficiency.

Douwe Kiela, Edouard Grave, Armand Joulin, Tomas Mikolov
December 4, 2017

VAIN: Attentional Multi-agent Predictive Modeling

Neural Information Processing Systems (NIPS)

In this paper we introduce VAIN, a novel attentional architecture for multi-agent predictive modeling that scales linearly with the number of agents. Multi-agent predictive modeling is an essential step for understanding physical, social and team-play systems.

Yedid Hoshen
December 4, 2017

Gradient Episodic Memory for Continual Learning

Neural Information Processing Systems (NIPS)

One major obstacle towards AI is the poor ability of models to solve new problems quickly and without forgetting previously acquired knowledge. To better understand this issue, we study the problem of continual learning, in which the model observes examples from a sequence of tasks one at a time, each example only once.

David Lopez-Paz, Marc'Aurelio Ranzato
December 4, 2017

Poincaré Embeddings for Learning Hierarchical Representations

Neural Information Processing Systems (NIPS)

In this work, we introduce a new approach for learning hierarchical representations of symbolic data by embedding them into hyperbolic space – or more precisely into an n-dimensional Poincaré ball.
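
For readers unfamiliar with the geometry, the sketch below shows the standard distance function on the Poincaré ball, the quantity such embeddings are measured against; the optimization procedure itself is described in the paper.

```python
# The standard distance function on the Poincaré ball (points of norm < 1),
# shown only to illustrate the geometry the abstract refers to.
import numpy as np

def poincare_distance(u, v):
    """Hyperbolic distance between two points inside the unit ball."""
    duv = np.dot(u - v, u - v)
    denom = (1.0 - np.dot(u, u)) * (1.0 - np.dot(v, v))
    return np.arccosh(1.0 + 2.0 * duv / denom)

# Distances blow up near the boundary, which gives hierarchies room:
# parents can sit near the origin, children closer to the boundary.
print(poincare_distance(np.array([0.0, 0.0]), np.array([0.9, 0.0])))
```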

Maximilian Nickel, Douwe Kiela
December 4, 2017

Houdini: Fooling Deep Structured Visual and Speech Recognition Models with Adversarial Examples

Neural Information Processing Systems (NIPS)

We introduce Houdini, a novel, flexible approach for generating adversarial examples tailored specifically to the final performance measure of the task considered, even when that measure is combinatorial and non-decomposable.

Moustapha Cisse, Yossi Adi, Natalia Neverova, Joseph Keshet
December 4, 2017

Attentive Explanations: Justifying Decisions and Pointing to the Evidence

Interpretable Machine Learning Symposium at NIPS

In this work, we emphasize the importance of model explanation in various forms such as visual pointing and textual justification.

Dong Huk Park, Lisa Anne Hendricks, Zeynep Akata, Anna Rohrbach, Bernt Schiele, Trevor Darrell, Marcus Rohrbach
December 4, 2017

On the Optimization Landscape of Tensor Decompositions

Neural Information Processing Systems (NIPS)

In this paper, we analyze the optimization landscape of the random over-complete tensor decomposition problem, which has many applications in unsupervised learning, especially in learning latent variable models. In practice, it can be efficiently solved by gradient ascent on a non-convex objective.
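
As a hedged illustration of the kind of objective meant here (the paper's exact formulation may differ): for a fourth-order tensor with unit-norm components a_1, ..., a_n, one can search for a component by maximizing f(x) = Σ_i ⟨a_i, x⟩⁴ over the unit sphere with projected gradient ascent.

```python
# Hedged illustration, not the paper's exact setup: projected gradient
# ascent on f(x) = sum_i <a_i, x>^4 over the unit sphere, in the
# over-complete regime (more components than dimensions).
import numpy as np

rng = np.random.default_rng(0)
d, n = 50, 200                         # n > d: over-complete
A = rng.normal(size=(n, d))
A /= np.linalg.norm(A, axis=1, keepdims=True)

x = rng.normal(size=d)
x /= np.linalg.norm(x)
for _ in range(300):
    grad = 4.0 * (A.T @ (A @ x) ** 3)  # gradient of sum_i <a_i, x>^4
    x = x + 0.5 * grad
    x /= np.linalg.norm(x)             # project back onto the unit sphere

# If this value is close to 1, x has (approximately) recovered one
# component of A, up to sign.
print(np.max(np.abs(A @ x)))
```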

Rong Ge, Tengyu Ma
December 4, 2017

Best of Both Worlds: Transferring Knowledge from Discriminative Learning to a Generative Visual Dialog Model

Neural Information Processing Systems (NIPS)

We present a novel training framework for neural sequence models, particularly for grounded dialog generation.

Jiasen Lu, Anitha Kannan, Jianwei Yang, Devi Parikh, Dhruv Batra
December 4, 2017

Fader Networks: Manipulating Images by Sliding Attributes

Neural Information Processing Systems (NIPS)

This paper introduces a new encoder-decoder architecture that is trained to reconstruct images by disentangling the salient information of the image and the values of attributes directly in the latent space.
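
A compact sketch of the disentanglement idea described: the decoder reconstructs the image from the latent code plus the attribute values, while a discriminator is trained to recover the attributes from the latent code and the encoder is trained to defeat it. The modules, shapes, and the exact adversarial term below are illustrative choices, not the paper's architecture.

```python
# Illustrative sketch of adversarial disentanglement in the latent space;
# the real model uses convolutional encoders/decoders and a different
# adversarial objective for the attribute discriminator.
import torch
import torch.nn as nn
import torch.nn.functional as F

LATENT, N_ATTR = 128, 2
enc = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, LATENT), nn.ReLU())
dec = nn.Linear(LATENT + N_ATTR, 3 * 64 * 64)
dis = nn.Linear(LATENT, N_ATTR)          # predicts the attribute from the latent

def fader_losses(x, y_onehot):
    z = enc(x)
    recon = dec(torch.cat([z, y_onehot], dim=1))
    rec_loss = F.mse_loss(recon, x.flatten(1))
    dis_loss = F.cross_entropy(dis(z.detach()), y_onehot.argmax(1))   # train dis
    fool_loss = -F.cross_entropy(dis(z), y_onehot.argmax(1))          # train enc
    return rec_loss, dis_loss, fool_loss  # encoder/decoder minimize rec + λ·fool
```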

Guillaume Lample, Neil Zeghidour, Nicolas Usunier, Antoine Bordes, Ludovic Denoyer, Marc'Aurelio Ranzato
December 4, 2017

One-Sided Unsupervised Domain Mapping

Neural Information Processing Systems (NIPS)

In this work, we present a method for learning the mapping G_AB from domain A to domain B without learning the reverse mapping G_BA. This is done by learning a mapping that preserves the distance between any pair of samples.
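
A minimal sketch of that distance-preservation constraint: within a batch from domain A, the pairwise distances should match the pairwise distances of the mapped samples. The batch formulation and normalization here are assumptions of this sketch, not the paper's exact loss.

```python
# Sketch of a pairwise distance-preservation penalty for a one-sided
# mapping G_AB; normalization is assumed, to make the two domains' scales
# comparable.
import torch

def distance_preservation_loss(x, gx):
    """x: a batch from domain A, gx = G_AB(x): the mapped batch."""
    def standardized_pairwise(t):
        d = torch.cdist(t.flatten(1), t.flatten(1))   # all pairwise L2 distances
        return (d - d.mean()) / (d.std() + 1e-8)
    return (standardized_pairwise(x) - standardized_pairwise(gx)).abs().mean()
```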

Sagie Benaim, Lior Wolf
December 4, 2017

ELF: An Extensive, Lightweight and Flexible Research Platform for Real-time Strategy Games

Neural Information Processing Systems (NIPS)

In this paper, we propose ELF, an Extensive, Lightweight and Flexible platform for fundamental reinforcement learning research.

Yuandong Tian, Qucheng Gong, Wenling Shang, Yuxin Wu, Larry Zitnick
December 4, 2017

Unbounded Cache Model for Online Language Modeling with Open Vocabulary

Neural Information Processing Systems (NIPS)

In this paper, we propose an extension of continuous cache models, which can scale to larger contexts. In particular, we use a large-scale non-parametric memory component that stores all the hidden activations seen in the past.
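
A hedged sketch of the cache idea: store every (hidden state, next word) pair seen so far and, at prediction time, interpolate the parametric language model with a non-parametric distribution built from the stored states most similar to the current one. Retrieval below is brute force; the paper's contribution is making this scale (e.g. with approximate nearest-neighbor search). The class name and hyperparameters are illustrative.

```python
# Illustrative unbounded cache: brute-force retrieval over all stored
# hidden states, mixed with the parametric LM distribution.
import numpy as np

class UnboundedCache:
    def __init__(self, vocab_size, k=100, theta=0.5, lam=0.3):
        self.states, self.words = [], []
        self.V, self.k, self.theta, self.lam = vocab_size, k, theta, lam

    def add(self, hidden, word_id):
        self.states.append(hidden)
        self.words.append(word_id)

    def mix(self, hidden, p_lm):
        """Interpolate the LM distribution p_lm with the cache distribution."""
        if not self.states:
            return p_lm
        sims = np.stack(self.states) @ hidden
        idx = np.argsort(sims)[-self.k:]              # most similar stored states
        weights = np.exp(self.theta * sims[idx])
        p_cache = np.zeros(self.V)
        np.add.at(p_cache, np.array(self.words)[idx], weights)
        p_cache /= p_cache.sum()
        return (1.0 - self.lam) * p_lm + self.lam * p_cache
```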

Edouard Grave, Moustapha Cisse, Armand Joulin
October 22, 2017

Inferring and Executing Programs for Visual Reasoning

International Conference on Computer Vision (ICCV)

Inspired by module networks, this paper proposes a model for visual reasoning that consists of a program generator that constructs an explicit representation of the reasoning process to be performed, and an execution engine that executes the resulting program to produce an answer.

Justin Johnson, Bharath Hariharan, Laurens van der Maaten, Judy Hoffman, Li Fei-Fei, Larry Zitnick, Ross Girshick
October 22, 2017

Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization

International Conference on Computer Vision (ICCV)

We propose a technique for producing ‘visual explanations’ for decisions from a large class of Convolutional Neural Network (CNN)-based models, making them more transparent and explainable.
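
The core Grad-CAM computation is compact enough to write out: weight each convolutional feature map by the global-average-pooled gradient of the target class score with respect to it, sum the weighted maps, and keep only the positive part. How the activations and gradients are obtained (e.g. framework hooks) is left out of this sketch.

```python
# Core Grad-CAM step, given the feature maps of a convolutional layer and
# the gradients of the target class score with respect to them.
import numpy as np

def grad_cam(feature_maps, gradients):
    """feature_maps, gradients: arrays of shape (K, H, W) for one image."""
    alphas = gradients.mean(axis=(1, 2))               # one weight per channel
    cam = np.tensordot(alphas, feature_maps, axes=1)   # weighted sum over channels
    cam = np.maximum(cam, 0.0)                         # ReLU: keep positive evidence
    return cam / (cam.max() + 1e-8)                    # normalize to [0, 1] for display
```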

Ramprasaath R. Selvaraju, Michael Cogswell, Abhishek Das, Ramakrishna Vedantam, Devi Parikh, Dhruv Batra
October 22, 2017

Learning Visual N-Grams from Web Data

International Conference on Computer Vision (ICCV)

This paper explores the training of image-recognition systems on large numbers of images and associated user comments, without using manually labeled images.

Ang Li, Allan Jabri, Armand Joulin, Laurens van der Maaten
October 22, 2017

Learning Cooperative Visual Dialog Agents with Deep Reinforcement Learning

International Conference on Computer Vision (ICCV)

We introduce the first goal-driven training for visual question answering and dialog agents. Specifically, we pose a cooperative ‘image guessing’ game between two agents – Q-BOT and A-BOT – who communicate in natural language dialog so that Q-BOT can select an unseen image from a lineup of images.

Abhishek Das, Satwik Kottur, José M.F. Moura, Stefan Lee, Dhruv Batra
October 22, 2017

Transitive Invariance for Self-supervised Visual Representation Learning

International Conference on Computer Vision (ICCV)

In this paper, we propose to exploit different self-supervised approaches to learn representations invariant to (i) inter-instance variations (two objects in the same class should have similar features) and (ii) intra-instance variations (viewpoint, pose, deformations, illumination, etc.).

Xiaolong Wang, Kaiming He, Abhinav Gupta
October 22, 2017

Predicting Deeper into the Future of Semantic Segmentation

International Conference on Computer Vision (ICCV)

The ability to predict and therefore to anticipate the future is an important attribute of intelligence. We introduce the novel task of predicting semantic segmentations of future frames. Given a sequence of video frames, our goal is to predict segmentation maps of not yet observed video frames that lie up to a second or further in the future.

Pauline Luc, Natalia Neverova, Camille Couprie, Jakob Verbeek, Yann LeCun
October 22, 2017

Mask R-CNN

International Conference on Computer Vision (ICCV)

We present a conceptually simple, flexible, and general framework for object instance segmentation. Our approach efficiently detects objects in an image while simultaneously generating a high-quality segmentation mask for each instance. The method, called Mask R-CNN, extends Faster R-CNN by adding a branch for predicting an object mask in parallel with the existing branch for bounding box recognition.
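
Not from the paper itself, but a quick way to see the parallel box and mask outputs it describes is the Mask R-CNN implementation that later shipped in torchvision; the snippet below uses that library's API with random weights and a random stand-in image, so only the output structure is meaningful.

```python
# torchvision's Mask R-CNN (post-dates the paper); weights are random here,
# so detections are meaningless. The point is the parallel box/mask outputs.
import torch
import torchvision

model = torchvision.models.detection.maskrcnn_resnet50_fpn()
model.eval()

image = torch.rand(3, 480, 640)      # stand-in for an RGB image scaled to [0, 1]
with torch.no_grad():
    (output,) = model([image])

# Each detected instance gets a box, a label, a score, and a per-pixel mask.
print(output["boxes"].shape, output["masks"].shape)
```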

Kaiming He, Georgia Gkioxari, Piotr Dollar, Ross Girshick