July 21, 2017

Semantic Amodal Segmentation

CVPR 2017

Common visual recognition tasks such as classification, object detection, and semantic segmentation are rapidly reaching maturity, and given the recent rate of progress, it is not unreasonable to conjecture that techniques for many of these problems will approach human levels of performance in the next few years. In this paper we look to the future: what is the next frontier in visual recognition?

By: Yan Zhu, Yuandong Tian, Dimitris Metaxas, Piotr Dollar

July 21, 2017

Aggregated Residual Transformations for Deep Neural Networks

CVPR 2017

We present a simple, highly modularized network architecture for image classification.

By: Saining Xie, Ross Girshick, Piotr Dollar, Zhuowen Tu, Kaiming He

July 21, 2017

Feature Pyramid Networks for Object Detection

CVPR 2017

In this paper, we exploit the inherent multi-scale, pyramidal hierarchy of deep convolutional networks to construct feature pyramids with marginal extra cost.

By: Tsung-Yi Lin, Piotr Dollar, Ross Girshick, Kaiming He, Bharath Hariharan, Serge Belongie

July 21, 2017

Learning Features by Watching Objects Move

CVPR 2017

This paper presents a novel yet intuitive approach to unsupervised feature learning. Inspired by the human visual system, we explore whether low-level motion-based grouping cues can be used to learn an effective visual representation.

By: Deepak Pathak, Ross Girshick, Piotr Dollar, Trevor Darrell, Bharath Hariharan

July 21, 2017

Link the head to the “beak”: Zero Shot Learning from Noisy Text Description at Part Precision

CVPR 2017

In this paper, we study learning visual classifiers from unstructured text descriptions at part precision with no training images. We propose a learning framework that is able to connect text terms to their relevant parts and suppress connections to non-visual text terms without any part-text annotations.

By: Mohamed Elhoseiny, Yizhe Zhu, Han Zhang, Ahmed Elgammal

July 21, 2017

Relationship Proposal Networks

CVPR 2017

In this paper we address the challenges of image scene object recognition by using pairs of related regions in images to train a relationship proposer that at test time produces a manageable number of related regions.

By: Ji Zhang, Mohamed Elhoseiny, Scott Cohen, Walter Chang, Ahmed Elgammal

June 8, 2017

Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour

Data @ Scale

In this paper, we empirically show that on the ImageNet dataset large minibatches cause optimization difficulties, but when these are addressed the trained networks exhibit good generalization.

By: Priya Goyal, Piotr Dollar, Ross Girshick, Pieter Noordhuis, Lukasz Wesolowski, Aapo Kyrola, Andrew Tulloch, Yangqing Jia, Kaiming He

May 16, 2017

Cultural Diffusion and Trends in Facebook Photographs

The International AAAI Conference on Web and Social Media (ICWSM)

Online social media is a social vehicle in which people share various moments of their lives with their friends, such as playing sports, cooking dinner or just taking a selfie for fun, via visual means, i.e., photographs. Our study takes a closer look at the popular visual concepts illustrating various cultural lifestyles from aggregated, de-identified photographs.

By: Quanzeng You, Dario Garcia, Manohar Paluri, Jiebo Luo, Jungseock Joo

February 22, 2017

Automatic Alt-text: Computer-generated Image Descriptions for Blind Users on a Social Network Service

CSCW

This paper covers the design and deployment of automatic alt-text (AAT), a system that applies computer vision technology to identify faces, objects, and themes in photos and generate photo alt-text for screen reader users on Facebook.

By: Shaomei Wu, Jeffrey Wieland, Omid Farivar, Julie Schiller

December 6, 2016

Population Density Estimation with Deconvolutional Neural Networks

Workshop on Large Scale Computer Vision at NIPS 2016

This work is part of the Internet.org initiative to provide connectivity all over the world. Population density data is helpful in driving a variety of technology decisions, but currently, a microscopic dataset of population doesn’t exist. Current state-of-the-art population density datasets are at ~1000 km² resolution. To create a better dataset, we have obtained 1 PB of satellite imagery at 50 cm/pixel resolution to feed through our building classification pipeline.

By: Amy Zhang, Xianming Liu, Tobias Tiecke, Andreas Gros