December 9, 2019

ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks

Neural Information Processing Systems (NeurIPS)

We present ViLBERT (short for Vision-and-Language BERT), a model for learning task-agnostic joint representations of image content and natural language. We extend the popular BERT architecture to a multi-modal two-stream model, processing both visual and textual inputs in separate streams that interact through co-attentional transformer layers.

By: Jiasen Lu, Dhruv Batra, Devi Parikh, Stefan Lee

December 9, 2019

Gossip-based Actor-Learner Architectures for Deep Reinforcement Learning

Neural Information Processing Systems (NeurIPS)

Multi-simulator training has contributed to the recent success of Deep Reinforcement Learning by stabilizing learning and allowing for higher training throughputs. We propose Gossip-based Actor-Learner Architectures (GALA) where several actor-learners (such as A2C agents) are organized in a peer-to-peer communication topology, and exchange information through asynchronous gossip in order to take advantage of a large number of distributed simulators.

By: Mahmoud Assran, Joshua Romoff, Nicolas Ballas, Joelle Pineau, Michael Rabbat

December 9, 2019

Cold Case: the Lost MNIST Digits

Neural Information Processing Systems (NeurIPS)

Although the popular MNIST dataset [LeCun et al., 1994] is derived from the NIST database [Grother and Hanaoka, 1995], the precise processing steps for this derivation have been lost to time. We propose a reconstruction that is accurate enough to serve as a replacement for the MNIST dataset, with insignificant changes in accuracy. We trace each MNIST digit to its NIST source and its rich metadata, such as the writer identifier and partition identifier.

By: Chhavi Yadav, Leon Bottou

December 9, 2019

Fixing the train-test resolution discrepancy

Neural Information Processing Systems (NeurIPS)

Data-augmentation is key to the training of neural networks for image classification. This paper first shows that existing augmentations induce a significant discrepancy between the size of the objects seen by the classifier at train and test time: in fact, a lower train resolution improves the classification at test time!

By: Hugo Touvron, Andrea Vedaldi, Matthijs Douze, Hervé Jégou

December 8, 2019

Learning to Perform Local Rewriting for Combinatorial Optimization

Neural Information Processing Systems (NeurIPS)

Search-based methods for hard combinatorial optimization are often guided by heuristics. Tuning heuristics in various conditions and situations is often time-consuming. In this paper, we propose NeuRewriter that learns a policy to pick heuristics and rewrite the local components of the current solution to iteratively improve it until convergence.

By: Xinyun Chen, Yuandong Tian

December 8, 2019

Differentiable Convex Optimization Layers

Neural Information Processing Systems (NeurIPS)

Recent work has shown how to embed differentiable optimization problems (that is, problems whose solutions can be backpropagated through) as layers within deep learning architectures. This method provides a useful inductive bias for certain problems, but existing software for differentiable optimization layers is rigid and difficult to apply to new settings. In this paper, we propose an approach to differentiating through disciplined convex programs, a subclass of convex optimization problems used by domain-specific languages (DSLs) for convex optimization.

By: Akshay Agrawal, Brandon Amos, Shane Barratt, Stephen Boyd, Steven Diamond, J. Zico Kolter

December 8, 2019

Levenshtein Transformer

Neural Information Processing Systems (NeurIPS)

Modern neural sequence generation models are built to either generate tokens step-by-step from scratch or (iteratively) modify a sequence of tokens bounded by a fixed length. In this work, we develop Levenshtein Transformer, a new partially autoregressive model devised for more flexible and amenable sequence generation. Unlike previous approaches, the basic operations of our model are insertion and deletion. The combination of them facilitates not only generation but also sequence refinement allowing dynamic length changes.

By: Jiatao Gu, Changhan Wang, Jake (Junbo) Zhao

December 8, 2019

RUBi: Reducing Unimodal Biases for Visual Question Answering

Neural Information Processing Systems (NeurIPS)

Visual Question Answering (VQA) is the task of answering questions about an image. VQA models often exploit unimodal biases to provide the correct answer without using the image information. As a result, they suffer from a huge drop in performance when evaluated on data outside their training set distribution. This critical issue makes them unsuitable for real-world settings. We propose RUBi, a new learning strategy to reduce biases in any VQA model. It reduces the importance of the most biased examples, i.e., examples that can be correctly classified without looking at the image.

By: Remi Cadene, Corentin Dancette, Hedi Ben-younes, Matthieu Cord, Devi Parikh

December 8, 2019

Cross-channel Communication Networks

Neural Information Processing Systems (NeurIPS)

Convolutional neural networks process input data by sending channel-wise feature response maps to subsequent layers. While a lot of progress has been made by making networks deeper, information from each channel can only be propagated from lower levels to higher levels in a hierarchical feed-forward manner. When viewing each filter in the convolutional layer as a neuron, those neurons are not communicating explicitly within each layer in CNNs. We introduce a novel network unit called Cross-channel Communication (C3) block, a simple yet effective module to encourage the neuron communication within the same layer.

By: Jianwei Yang, Zhile Ren, Hongyuan Zhu, Ji Lin, Chuang Gan, Devi Parikh

December 8, 2019

Hyperbolic Graph Neural Networks

Neural Information Processing Systems (NeurIPS)

We develop a scalable algorithm for modeling the structural properties of graphs, comparing Euclidean and hyperbolic geometry. In our experiments, we show that hyperbolic GNNs can lead to substantial improvements on various benchmark datasets.

By: Qi Liu, Maximilian Nickel, Douwe Kiela