Research Area
Year Published

147 Results

December 4, 2017

One-Sided Unsupervised Domain Mapping

Neural Information Processing Systems (NIPS)

In this work, we present a method of learning GAB without learning GBA. This is done by learning a mapping that maintains the distance between a pair of samples.

By: Sagie Benaim, Lior Wolf

November 27, 2017

Casual 3D Photography

SIGGRAPH ASIA

We present an algorithm that enables casual 3D photography. Given a set of input photos captured with a hand-held cell phone or DSLR camera, our algorithm reconstructs a 3D photo, a central panoramic, textured, normal mapped, multi-layered geometric mesh representation. Our geometric representation also allows interacting with the scene using 3D geometry-aware effects, such as adding new objects to the scene and artistic lighting effects.

By: Peter Hedman, Suhib Alsisan, Richard Szeliski, Johannes Kopf

November 27, 2017

Bringing Portraits to Life

SIGGRAPH ASIA

We present a technique to automatically animate a still portrait, making it possible for the subject in the photo to come to life and express various emotions. We use a driving video (of a different subject) and develop means to transfer the expressiveness of the subject in the driving video to the target portrait. In contrast to previous work that requires an input video of the target face to reenact a facial performance, our technique uses only a single target image.

By: Hadar Averbuch-Elor, Daniel Cohen-Or, Johannes Kopf, Michael Cohen

October 22, 2017

Dense and Low-Rank Gaussian CRFs using Deep Embeddings

International Conference on Computer Vision (ICCV)

In this work we introduce a structured prediction model that endows the Deep Gaussian Conditional Random Field (G-CRF) with a densely connected graph structure.

By: Siddhartha Chandra, Nicolas Usunier, Iasonas Kokkinos

October 22, 2017

Predicting Deeper into the Future of Semantic Segmentation

International Conference on Computer Vision (ICCV)

The ability to predict and therefore to anticipate the future is an important attribute of intelligence. We introduce the novel task of predicting semantic segmentations of future frames. Given a sequence of video frames, our goal is to predict segmentation maps of not yet observed video frames that lie up to a second or further in the future.

By: Pauline Luc, Natalia Neverova, Camille Couprie, Jakob Verbeek, Yann LeCun

October 22, 2017

Inferring and Executing Programs for Visual Reasoning

International Conference on Computer Vision (ICCV)

Inspired by module networks, this paper proposes a model for visual reasoning that consists of a program generator that constructs an explicit representation of the reasoning process to be performed, and an execution engine that executes the resulting program to produce an answer.

By: Justin Johnson, Bharath Hariharan, Laurens van der Maaten, Judy Hoffman, Li Fei-Fei, Larry Zitnick, Ross Girshick

October 22, 2017

Low-shot Visual Recognition by Shrinking and Hallucinating Features

International Conference on Computer Vision (ICCV)

Low-shot visual learning—the ability to recognize novel object categories from very few examples—is a hallmark of human visual intelligence. Existing machine learning approaches fail to generalize in the same way. We present a lowshot learning benchmark on complex images that mimics challenges faced by recognition systems in the wild.

By: Bharath Hariharan, Ross Girshick

October 22, 2017

Focal Loss for Dense Object Detection

International Conference on Computer Vision (ICCV)

In this paper, we investigate why one-stage detectors that are applied over a regular, dense sampling of possible object locations have the potential to be faster and simpler, but have trailed the accuracy of two-stage detectors thus far. We design and train a simple dense detector we call RetinaNet. Our results show that when trained with the focal loss, RetinaNet is able to match the speed of previous one-stage detectors while surpassing the accuracy of all existing state-of-the-art two-stage detectors.

By: Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He, Piotr Dollar

October 22, 2017

Deltille Grids for Geometric Camera Calibration

International Conference on Computer Vision (ICCV)

The recent proliferation of high resolution cameras presents an opportunity to achieve unprecedented levels of precision in visual 3D reconstruction. Yet the camera calibration pipeline, developed decades ago using checkerboards, has remained the de facto standard. In this paper, we ask the question: are checkerboards the optimal pattern for high precision calibration?

By: Hyowon Ha, Michal Perdoch, Hatem Alismail, In So Kweon, Yaser Sheikh

October 22, 2017

Mask R-CNN

International Conference on Computer Vision (ICCV)

We present a conceptually simple, flexible, and general framework for object instance segmentation. Our approach efficiently detects objects in an image while simultaneously generating a high-quality segmentation mask for each instance. The method, called Mask R-CNN, extends Faster R-CNN by adding a branch for predicting an object mask in parallel with the existing branch for bounding box recognition.

By: Kaiming He, Georgia Gkioxari, Piotr Dollar, Ross Girshick