February 7, 2017

Exploring Normalization in Deep Residual Networks with Concatenated Rectified Linear Units


This paper analyzes the role of Batch Normalization (BatchNorm) layers in ResNets, with the aim of improving the current architecture and better incorporating other normalization techniques, such as Normalization Propagation (NormProp), into ResNets.

Wenling Shang, Justin Chiu, Kihyuk Sohn
December 6, 2016

Feedback Neural Network for Weakly Supervised Geo-Semantic Segmentation


We propose a novel neural network architecture to perform weakly supervised learning by suppressing irrelevant neuron activations. When applied to the practical challenge of transforming satellite images into a map of settlements and individual buildings, it delivers superior performance and efficiency.

Xianming Liu, Amy Zhang, Tobias Tiecke, Andreas Gros, Thomas S. Huang
December 6, 2016

Population Density Estimation with Deconvolutional Neural Networks

Workshop on Large Scale Computer Vision at NIPS 2016

This work is part of the Internet.org initiative to provide connectivity all over the world. Population density data is helpful in driving a variety of technology decisions, but a microscopic population dataset doesn’t currently exist. Current state-of-the-art population density datasets are at ~1000 km² resolution. To create a better dataset, we have obtained 1 PB of satellite imagery at 50 cm/pixel resolution to feed through our building classification pipeline.

Amy Zhang, Andreas Gros, Tobias Tiecke, Xianming Liu
November 30, 2016

Semantic Segmentation using Adversarial Networks

Workshop on Adversarial Training at NIPS 2016

Adversarial training has been shown to produce state-of-the-art results for generative image modeling. In this paper, we propose an adversarial training approach to train semantic segmentation models.

Pauline Luc, Camille Couprie, Soumith Chintala, Jakob Verbeek
November 1, 2016

Neural Text Generation from Structured Data with Application to the Biography Domain

EMNLP 2016: Conference on Empirical Methods in Natural Language Processing

This paper introduces a neural model for concept-to-text generation that scales to large, rich domains.

Remi Lebret, David Grangier, Michael Auli
October 10, 2016

Learning to Refine Object Segments

European Conference on Computer Vision

In this work, we propose to augment feedforward nets for object segmentation with a novel top-down refinement approach.

Pedro O. Pinheiro, Tsung-Yi Lin, Ronan Collobert, Piotr Dollar
October 10, 2016

Revisiting Visual Question Answering Baselines

European Conference on Computer Vision 2016

This paper questions the value of common practices and develops a simple alternative model based on binary classification.

Allan Jabri, Armand Joulin, Laurens van der Maaten
October 10, 2016

Polysemous Codes

European Conference on Computer Vision 2016 (ECCV)

This paper considers the problem of approximate nearest neighbor search in the compressed domain.

Matthijs Douze, Hervé Jégou, Florent Perronnin
October 8, 2016

Learning Visual Features from Large Weakly Supervised Data

European Conference on Computer Vision

In this paper, we explore the potential of leveraging massive, weakly labeled image collections for learning good visual features.

Armand Joulin, Laurens van der Maaten, Allan Jabri, Nicolas Vasilache
September 18, 2016

A MultiPath Network for Object Detection


We test three modifications to the standard Fast R-CNN object detector to determine whether they can overcome the object detection challenges in the COCO object detection dataset.

Sergey Zagoruyko, Adam Lerer, Tsung-Yi Lin, Pedro O. Pinheiro, Sam Gross, Soumith Chintala, Piotr Dollar
September 15, 2016

Efficient Softmax Approximation for GPUs


We propose an approximate strategy to efficiently train neural-network-based language models over very large vocabularies.

Edouard Grave, Armand Joulin, Moustapha Cisse, David Grangier, Hervé Jégou
September 8, 2016

Joint Learning of Speaker and Phonetic Similarities with Siamese Networks

Interspeech 2016

We scale up the joint learning of specialized speaker and phone embedding architectures to the 360 hours of the Librispeech corpus by implementing a sampling method that efficiently selects pairs of words from the dataset and by improving the loss function.

Neil Zeghidour, Gabriel Synnaeve, Nicolas Usunier, Emmanuel Dupoux
August 16, 2016

Synergy of Monotonic Rules


This article describes a method for constructing a special rule (which we call the synergy rule) that takes as its input the outputs (scores) of several monotonic rules that solve the same pattern recognition problem.

Vladimir Vapnik, Rauf Izmailov
August 10, 2016

Neural Network-based Word Alignment through Score Aggregation

Association for Computational Linguistics Conference on Machine Translation

We present a simple neural network for word alignment that builds source and target word window representations to compute alignment scores for sentence pairs.

Michael Auli, Ronan Collobert, Joel Legrand
July 15, 2016

Enriching Word Vectors with Subword Information

arXiv preprint

In this paper, we propose a new approach based on the skip-gram model, where each word is represented as a bag of character n-grams.

Armand Joulin, Edouard Grave, Piotr Bojanowski, Tomas Mikolov
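
The bag-of-character-n-grams representation mentioned in the abstract above can be illustrated with a minimal sketch. The function name `char_ngrams`, the `<` and `>` boundary markers, and the default n-gram range of 3 to 6 are assumptions made for illustration; the listing itself does not specify them, and this is not the authors' implementation.

```python
def char_ngrams(word, min_n=3, max_n=6):
    """Return the character n-grams of `word`, with boundary markers.

    Assumption: the word is wrapped in '<' and '>' so that prefixes and
    suffixes yield n-grams distinct from word-internal ones.
    """
    marked = f"<{word}>"
    grams = []
    for n in range(min_n, max_n + 1):
        # Slide a window of width n across the marked word.
        for i in range(len(marked) - n + 1):
            grams.append(marked[i:i + n])
    return grams


# With n fixed to 3, "where" is split into five trigrams.
print(char_ngrams("where", min_n=3, max_n=3))
# prints ['<wh', 'whe', 'her', 'ere', 're>']
```

In a model of this kind, a word's vector would typically be built from the vectors of its n-grams (for example, by summing them), which lets rare or unseen words share parameters with words that have similar spellings.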
June 27, 2016

Unsupervised Learning of Edges


Data-driven approaches for edge detection have proven effective and achieve top results on modern benchmarks. However, all current data-driven edge detectors require manual supervision for training in the form of hand-labeled region segments or object boundaries.

Yin Li, Manohar Paluri, James M. Rehg, Piotr Dollar
June 20, 2016

Combining Two and Three-Way Embeddings Models for Link Prediction in Knowledge Bases

Journal of Artificial Intelligence Research, JAIR.org

This paper tackles the problem of endogenous link prediction for Knowledge Base completion.

Antoine Bordes, Nicolas Usunier, Alberto Garcia-Duran, Yves Grandvalet
June 19, 2016

Recurrent Orthogonal Networks and Long-Memory Tasks

International Conference on Machine Learning

This paper analyzes two synthetic datasets, originally outlined in (Hochreiter and Schmidhuber, 1997), that are used to evaluate the ability of RNNs to store information over many time steps, and explicitly constructs RNN solutions to these problems.

Mikael Henaff, Arthur Szlam, Yann LeCun
June 18, 2016

Learning Physical Intuition of Block Towers by Example

International Conference on Machine Learning

Wooden blocks are a common toy for infants, allowing them to develop motor skills and gain intuition about the physical behavior of the world. In this paper, we explore the ability of deep feed-forward models to learn such intuitive physics.

Adam Lerer, Sam Gross, Rob Fergus