June 18, 2018

What Makes a Video a Video: Analyzing Temporal Information in Video Understanding Models and Datasets

Computer Vision and Pattern Recognition (CVPR)

While there have been numerous attempts at modeling motion in videos, an explicit analysis of the effect of temporal information for video understanding is still missing. In this work, we aim to bridge this gap and ask the following question: How important is the motion in the video for recognizing the action?

By: De-An Huang, Vignesh Ramanathan, Dhruv Mahajan, Lorenzo Torresani, Manohar Paluri, Li Fei-Fei, Juan Carlos Niebles

June 18, 2018

CondenseNet: An Efficient DenseNet using Learned Group Convolutions

Computer Vision and Pattern Recognition (CVPR)

In this paper we develop CondenseNet, a novel network architecture with unprecedented efficiency. It combines dense connectivity with a novel module called learned group convolution. 

By: Gao Huang, Shichen Liu, Laurens van der Maaten, Kilian Q. Weinberger

June 18, 2018

Improving Landmark Localization with Semi-Supervised Learning

Computer Vision and Pattern Recognition (CVPR)

We present two techniques to improve landmark localization in images from partially annotated datasets. Our primary goal is to leverage the common situation where precise landmark locations are only provided for a small data subset, but where class labels for classification or regression tasks related to the landmarks are more abundantly available.

By: Sina Honari, Pavlo Molchanov, Stephen Tyree, Pascal Vincent, Christopher Pal, Jan Kautz

June 18, 2018

A Holistic Framework for Addressing the World using Machine Learning

Computer Vision and Pattern Recognition (CVPR)

Millions of people are disconnected from basic services due to a lack of adequate addressing. We propose an automatic generative algorithm to create street addresses from satellite imagery.

By: Ilke Demir, Forest Hughes, Aman Raj, Kaunil Dhruv, Suryanarayana Murthy Muddala, Sanyam Garg, Barrett Doo

June 18, 2018

Don’t Just Assume; Look and Answer: Overcoming Priors for Visual Question Answering

Computer Vision and Pattern Recognition (CVPR)

A number of studies have found that today’s Visual Question Answering (VQA) models are heavily driven by superficial correlations in the training data and lack sufficient image grounding. To encourage development of models geared towards the latter, we propose a new setting for VQA where for every question type, train and test sets have different prior distributions of answers.

By: Aishwarya Agrawal, Dhruv Batra, Devi Parikh, Aniruddha Kembhavi

June 18, 2018

DeepGlobe 2018: A Challenge to Parse the Earth through Satellite Images

CVPR Workshop - DeepGlobe 2018

Similar to other challenges in the computer vision domain, such as DAVIS [21] and COCO [33], DeepGlobe proposes three datasets and corresponding evaluation methodologies, coherently bundled in three competitions with a dedicated workshop co-located with CVPR 2018.

By: Ilke Demir, Krzysztof Koperski, David Lindenbaum, Guan Pang, Jing Huang, Saikat Basu, Forest Hughes, Devis Tuia, Ramesh Raskar

June 18, 2018

3D Semantic Segmentation with Submanifold Sparse Convolutional Networks

Computer Vision and Pattern Recognition (CVPR)

We introduce new sparse convolutional operations that are designed to process spatially-sparse data more efficiently, and use them to develop spatially-sparse convolutional networks.

By: Benjamin Graham, Laurens van der Maaten, Martin Engelcke

June 18, 2018

Low-Shot Learning from Imaginary Data

Computer Vision and Pattern Recognition (CVPR)

Humans can quickly learn new visual concepts, perhaps because they can easily visualize or imagine what novel objects look like from different views. Incorporating this ability to hallucinate novel instances of new concepts might help machine vision systems perform better low-shot learning, i.e., learning concepts from few examples. We present a novel approach to low-shot learning that uses this idea.

By: Yu-Xiong Wang, Ross Girshick, Martial Hebert, Bharath Hariharan

June 18, 2018

Audio to Body Dynamics

Computer Vision and Pattern Recognition (CVPR)

We present a method that takes audio of violin or piano playing as input and outputs a video of skeleton predictions, which are then used to animate an avatar. The key idea is to animate an avatar that moves its hands the way a pianist or violinist would, from audio alone.

By: Eli Shlizerman, Lucio Dery, Hayden Schoen, Ira Kemelmacher-Shlizerman

June 17, 2018

Neural Baby Talk

Computer Vision and Pattern Recognition (CVPR)

We introduce a novel framework for image captioning that can produce natural language explicitly grounded in entities that object detectors find in the image.

By: Jiasen Lu, Jianwei Yang, Dhruv Batra, Devi Parikh