June 18, 2018

DeepMVS: Learning Multi-view Stereopsis

Computer Vision and Pattern Recognition (CVPR)

We present DeepMVS, a deep convolutional neural network (ConvNet) for multi-view stereo reconstruction.

By: Po-Han Huang, Kevin Matzen, Johannes Kopf, Narendra Ahuja, Jia-Bin Huang

June 18, 2018

Supervision-by-Registration: An Unsupervised Approach to Improve the Precision of Facial Landmark Detectors

Computer Vision and Pattern Recognition (CVPR)

In this paper, we present supervision-by-registration, an unsupervised approach to improve the precision of facial landmark detectors on both images and video. Our key observation is that the detections of the same landmark in adjacent frames should be coherent with registration, i.e., optical flow.

By: Xuanyi Dong, Shoou-I Yu, Xinshuo Weng, Shih-En Wei, Yi Yang, Yaser Sheikh

June 18, 2018

A Closer Look at Spatiotemporal Convolutions for Action Recognition

Computer Vision and Pattern Recognition (CVPR)

In this paper we discuss several forms of spatiotemporal convolutions for video analysis and study their effects on action recognition. Our motivation stems from the observation that 2D CNNs applied to individual frames of the video have remained solid performers in action recognition.

By: Du Tran, Heng Wang, Lorenzo Torresani, Jamie Ray, Yann LeCun, Manohar Paluri

June 18, 2018

Learning Patch Reconstructability for Accelerating Multi-View Stereo

Computer Vision and Pattern Recognition (CVPR)

We present an approach to accelerate multi-view stereo (MVS) by prioritizing computation on image patches that are likely to produce accurate 3D surface reconstructions. Our key insight is that the accuracy of the surface reconstruction from a given image patch can be predicted significantly faster than performing the actual stereo matching.

By: Alex Poms, Chenglei Wu, Shoou-I Yu, Yaser Sheikh

June 18, 2018

Multimodal Explanations: Justifying Decisions and Pointing to the Evidence

Computer Vision and Pattern Recognition (CVPR)

Deep models that are both effective and explainable are desirable in many settings; prior explainable models have been unimodal, offering either image-based visualization of attention weights or text-based generation of post-hoc justifications. We propose a multimodal approach to explanation, and argue that the two modalities provide complementary explanatory strengths.

By: Dong Huk Park, Lisa Anne Hendricks, Zeynep Akata, Anna Rohrbach, Bernt Schiele, Trevor Darrell, Marcus Rohrbach

June 18, 2018

Embodied Question Answering

Computer Vision and Pattern Recognition (CVPR)

We present a new AI task – Embodied Question Answering (EmbodiedQA) – where an agent is spawned at a random location in a 3D environment and asked a question (‘What color is the car?’). In order to answer, the agent must first intelligently navigate to explore the environment, gather necessary visual information through first-person (egocentric) vision, and then answer the question (‘orange’).

By: Abhishek Das, Samyak Datta, Georgia Gkioxari, Stefan Lee, Devi Parikh, Dhruv Batra

June 18, 2018

A Holistic Framework for Addressing the World using Machine Learning

Computer Vision and Pattern Recognition (CVPR)

Millions of people are disconnected from basic services due to lack of adequate addressing. We propose an automatic generative algorithm to create street addresses from satellite imagery.

By: Ilke Demir, Forest Hughes, Aman Raj, Kaunil Dhruv, Suryanarayana Murthy Muddala, Sanyam Garg, Barrett Doo

June 18, 2018

What Makes a Video a Video: Analyzing Temporal Information in Video Understanding Models and Datasets

Computer Vision and Pattern Recognition (CVPR)

While there have been numerous attempts at modeling motion in videos, an explicit analysis of the effect of temporal information for video understanding is still missing. In this work, we aim to bridge this gap and ask the following question: How important is the motion in the video for recognizing the action?

By: De-An Huang, Vignesh Ramanathan, Dhruv Mahajan, Lorenzo Torresani, Manohar Paluri, Li Fei-Fei, Juan Carlos Niebles

June 18, 2018

Improving Landmark Localization with Semi-Supervised Learning

Computer Vision and Pattern Recognition (CVPR)

We present two techniques to improve landmark localization in images from partially annotated datasets. Our primary goal is to leverage the common situation where precise landmark locations are only provided for a small data subset, but where class labels for classification or regression tasks related to the landmarks are more abundantly available.

By: Sina Honari, Pavlo Molchanov, Stephen Tyree, Pascal Vincent, Christopher Pal, Jan Kautz

June 18, 2018

Data Distillation: Towards Omni-Supervised Learning

Computer Vision and Pattern Recognition (CVPR)

We investigate omni-supervised learning, a special regime of semi-supervised learning in which the learner exploits all available labeled data plus internet-scale sources of unlabeled data.

By: Ilija Radosavovic, Piotr Dollar, Ross Girshick, Georgia Gkioxari, Kaiming He