February 22, 2017

Automatic Alt-text: Computer-generated Image Descriptions for Blind Users on a Social Network Service

CSCW

This paper covers the design and deployment of automatic alt-text (AAT), a system that applies computer vision technology to identify faces, objects, and themes in photos and generate photo alt-text for screen reader users on Facebook.

By: Shaomei Wu, Jeffrey Wieland, Omid Farivar, Julie Schiller

February 7, 2017

Exploring Normalization in Deep Residual Networks with Concatenated Rectified Linear Units

AAAI-17

This paper analyzes the role of Batch Normalization (BatchNorm) layers in ResNets, with the aim of improving the current architecture and better incorporating other normalization techniques, such as Normalization Propagation (NormProp), into ResNets.

By: Wenling Shang, Justin Chiu, Kihyuk Sohn

December 6, 2016

Population Density Estimation with Deconvolutional Neural Networks

Workshop on Large Scale Computer Vision at NIPS 2016

This work is part of the Internet.org initiative to provide connectivity all over the world. Population density data is helpful in driving a variety of technology decisions, but a fine-grained population dataset does not currently exist. Current state-of-the-art population density datasets are at ~1000 km² resolution. To create a better dataset, we have obtained 1 PB of satellite imagery at 50 cm/pixel resolution to feed through our building classification pipeline.

By: Amy Zhang, Andreas Gros, Tobias Tiecke, Xianming Liu

November 30, 2016

Semantic Segmentation using Adversarial Networks

Workshop on Adversarial Training at NIPS 2016

Adversarial training has been shown to produce state-of-the-art results for generative image modeling. In this paper, we propose an adversarial training approach for semantic segmentation models.

By: Pauline Luc, Camille Couprie, Soumith Chintala, Jakob Verbeek

October 10, 2016

Learning to Refine Object Segments

European Conference on Computer Vision

In this work, we propose augmenting feedforward nets for object segmentation with a novel top-down refinement approach.

By: Pedro O. Pinheiro, Tsung-Yi Lin, Ronan Collobert, Piotr Dollar

October 10, 2016

Polysemous Codes

European Conference on Computer Vision 2016 (ECCV)

This paper considers the problem of approximate nearest neighbor search in the compressed domain.

By: Matthijs Douze, Hervé Jégou, Florent Perronnin

October 10, 2016

Revisiting Visual Question Answering Baselines

European Conference on Computer Vision 2016

This paper questions the value of common practices in visual question answering and develops a simple alternative model based on binary classification.

By: Allan Jabri, Armand Joulin, Laurens van der Maaten

October 8, 2016

Learning Visual Features from Large Weakly Supervised Data

European Conference on Computer Vision

In this paper, we explore the potential of leveraging massive, weakly-labeled image collections for learning good visual features.

By: Allan Jabri, Armand Joulin, Laurens van der Maaten, Nicolas Vasilache

September 18, 2016

A MultiPath Network for Object Detection

BMVC

We test three modifications to the standard Fast R-CNN object detector to determine whether they can overcome the object detection challenges posed by the COCO object detection dataset.

By: Sergey Zagoruyko, Adam Lerer, Tsung-Yi Lin, Pedro O. Pinheiro, Sam Gross, Soumith Chintala, Piotr Dollar

September 8, 2016

Joint Learning of Speaker and Phonetic Similarities with Siamese Networks

Interspeech 2016

We scale up the joint learning of specialized speaker and phone embedding architectures to the 360 hours of the Librispeech corpus by implementing a sampling method that efficiently selects pairs of words from the dataset and by improving the loss function.

By: Neil Zeghidour, Gabriel Synnaeve, Nicolas Usunier, Emmanuel Dupoux