All Research Areas
Research Areas
Year Published

391 Results

October 22, 2017

Unsupervised Creation of Parameterized Avatars

International Conference on Computer Vision (ICCV)

We study the problem of mapping an input image to a tied pair consisting of a vector of parameters and an image that is created using a graphical engine from the vector of parameters. The mapping’s objective is to have the output image as similar as possible to the input image.

By: Lior Wolf, Yaniv Taigman, Adam Polyak
October 22, 2017

Deltille Grids for Geometric Camera Calibration

International Conference on Computer Vision (ICCV)

The recent proliferation of high resolution cameras presents an opportunity to achieve unprecedented levels of precision in visual 3D reconstruction. Yet the camera calibration pipeline, developed decades ago using checkerboards, has remained the de facto standard. In this paper, we ask the question: are checkerboards the optimal pattern for high precision calibration?

By: Hyowon Ha, Michal Perdoch, Hatem Alismail, In So Kweon, Yaser Sheikh
October 22, 2017

Learning to Reason: End-to-End Module Networks for Visual Question Answering

International Conference on Computer Vision (ICCV)

In this paper, we propose End-to-End Module Networks (N2NMNs), which learn to reason by directly predicting instance-specific network layouts without the aid of a parser.

By: Ronghang Hu, Jacob Andreas, Marcus Rohrbach, Trevor Darrell, Kate Saenko
October 22, 2017

Segmentation-Aware Convolutional Networks using Local Attention Masks

International Conference on Computer Vision (ICCV)

We introduce an approach to integrate segmentation information within a convolutional neural network (CNN). This counter-acts the tendency of CNNs to smooth information across regions and increases their spatial precision.

By: Adam W. Harley, Konstantinos G. Derpanis, Iasonas Kokkinos
October 22, 2017

Focal Loss for Dense Object Detection

International Conference on Computer Vision (ICCV)

In this paper, we investigate why one-stage detectors that are applied over a regular, dense sampling of possible object locations have the potential to be faster and simpler, but have trailed the accuracy of two-stage detectors thus far. We design and train a simple dense detector we call RetinaNet. Our results show that when trained with the focal loss, RetinaNet is able to match the speed of previous one-stage detectors while surpassing the accuracy of all existing state-of-the-art two-stage detectors.

By: Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He, Piotr Dollar
October 22, 2017

Low-shot Visual Recognition by Shrinking and Hallucinating Features

International Conference on Computer Vision (ICCV)

Low-shot visual learning—the ability to recognize novel object categories from very few examples—is a hallmark of human visual intelligence. Existing machine learning approaches fail to generalize in the same way. We present a lowshot learning benchmark on complex images that mimics challenges faced by recognition systems in the wild.

By: Bharath Hariharan, Ross Girshick
October 22, 2017

Dense and Low-Rank Gaussian CRFs using Deep Embeddings

International Conference on Computer Vision (ICCV)

In this work we introduce a structured prediction model that endows the Deep Gaussian Conditional Random Field (G-CRF) with a densely connected graph structure.

By: Siddhartha Chandra, Nicolas Usunier, Iasonas Kokkinos
October 22, 2017

Inferring and Executing Programs for Visual Reasoning

International Conference on Computer Vision (ICCV)

Inspired by module networks, this paper proposes a model for visual reasoning that consists of a program generator that constructs an explicit representation of the reasoning process to be performed, and an execution engine that executes the resulting program to produce an answer.

By: Justin Johnson, Bharath Hariharan, Laurens van der Maaten, Judy Hoffman, Li Fei-Fei, Larry Zitnick, Ross Girshick
October 22, 2017

Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization

International Conference on Computer Vision (ICCV)

We propose a technique for producing ‘visual explanations’ for decisions from a large class of Convolutional Neural Network (CNN)-based models, making them more transparent and explainable.

By: Ramprasaath R. Selvaraju, Michael Cogswell, Abhishek Das, Ramakrishna Vedantam, Devi Parikh, Dhruv Batra
October 22, 2017

Learning Visual N-Grams from Web Data

International Conference on Computer Vision (ICCV)

This paper explores the training of image-recognition systems on large numbers of images and associated user comments, without using manually labeled images.

By: Ang Li, Allan Jabri, Armand Joulin, Laurens van der Maaten