June 30, 2019

Variational Training for Large-Scale Noisy-OR Bayesian Networks

Conference on Uncertainty in Artificial Intelligence (UAI)

We propose a stochastic variational inference algorithm for training large-scale Bayesian networks, where noisy-OR conditional distributions are used to capture higher-order relationships. One application is to the learning of hierarchical topic models for text data.

By: Geng Ji, Dehua Cheng, Huazhong Ning, Changhe Yuan, Hanning Zhou, Liang Xiong, Erik B. Sudderth
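
The noisy-OR conditional distribution mentioned above has a standard closed form: a binary child stays off only if a leak term and every active parent independently "fail" to activate it. The following is a minimal, illustrative sketch of that parameterization (function and parameter names are hypothetical, not the paper's implementation):

```python
import numpy as np

def noisy_or_prob(parent_states, failure_probs, leak=0.01):
    """P(child = 1 | parents) under a noisy-OR conditional distribution.

    parent_states: 0/1 array of parent activations.
    failure_probs: per-parent probability that an active parent
                   fails to turn the child on.
    leak: probability that the child turns on with no active parents.
    """
    parent_states = np.asarray(parent_states, dtype=float)
    failure_probs = np.asarray(failure_probs, dtype=float)
    # The child stays off only if the leak and every active parent fail.
    p_off = (1.0 - leak) * np.prod(failure_probs ** parent_states)
    return 1.0 - p_off

# Example: two active parents, one inactive parent.
print(noisy_or_prob([1, 1, 0], [0.2, 0.5, 0.9]))  # ~0.901
```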

June 22, 2019

HackPPL: A Universal Probabilistic Programming Language

MAPL at PLDI

This paper provides an overview of the design and implementation choices behind the HackPPL toolchain and presents findings from applying it to a representative problem faced by social media companies.

By: Jessica Ai, Nimar S. Arora, Ning Dong, Beliz Gokkaya, Thomas Jiang, Anitha Kubendran, Arun Kumar, Michael Tingley, Narjes Torabi

June 21, 2019

Randomized Value Functions via Multiplicative Normalizing Flows

Conference on Uncertainty in Artificial Intelligence (UAI)

In this work, we leverage recent advances in variational Bayesian neural networks and combine these with traditional Deep Q-Networks (DQN) and Deep Deterministic Policy Gradient (DDPG) to achieve randomized value functions for high-dimensional domains.

By: Ahmed Touati, Harsh Satija, Joshua Romoff, Joelle Pineau, Pascal Vincent

June 20, 2019

Risk-Sensitive Generative Adversarial Imitation Learning

International Conference on Artificial Intelligence and Statistics (AISTATS)

We study risk-sensitive imitation learning, where the agent’s goal is to perform at least as well as the expert in terms of a risk profile. We first formulate this risk-sensitive imitation learning setting. We then consider the generative adversarial approach to imitation learning (GAIL) and derive an optimization problem for our formulation, which we call risk-sensitive GAIL (RS-GAIL).

By: Jonathan Lacotte, Mohammad Ghavamzadeh, Yinlam Chow, Marco Pavone

June 18, 2019

Embodied Question Answering in Photorealistic Environments with Point Cloud Perception

Conference on Computer Vision and Pattern Recognition (CVPR)

To help bridge the gap between internet vision-style problems and the goal of vision for embodied perception, we instantiate a large-scale navigation task, Embodied Question Answering [1], in photo-realistic environments (Matterport 3D).

By: Erik Wijmans, Samyak Datta, Oleksandr Maksymets, Abhishek Das, Georgia Gkioxari, Stefan Lee, Irfan Essa, Devi Parikh, Dhruv Batra

June 18, 2019

Long-Term Feature Banks for Detailed Video Understanding

Conference on Computer Vision and Pattern Recognition (CVPR)

We propose a long-term feature bank—supportive information extracted over the entire span of a video—to augment state-of-the-art video models that otherwise would only view short clips of 2-5 seconds.

By: Chao-Yuan Wu, Christoph Feichtenhofer, Haoqi Fan, Kaiming He, Philipp Krähenbühl, Ross Girshick
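
As a loose illustration of the idea above (a sketch under assumed shapes and names, not the authors' implementation), features from a short clip can attend over a precomputed bank of features spanning the whole video and be concatenated with the attended long-term context:

```python
import numpy as np

def attend_over_feature_bank(clip_feats, bank_feats):
    """Augment short-clip features with long-term context.

    clip_feats: (Q, D) features from the current 2-5 second clip.
    bank_feats: (N, D) features precomputed over the whole video.
    Returns (Q, 2 * D): clip features concatenated with attended
    long-term context (a simple dot-product attention sketch).
    """
    d = clip_feats.shape[-1]
    scores = clip_feats @ bank_feats.T / np.sqrt(d)        # (Q, N)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)         # softmax
    context = weights @ bank_feats                          # (Q, D)
    return np.concatenate([clip_feats, context], axis=-1)

# Example: 4 clip features attending over a 100-entry bank.
aug = attend_over_feature_bank(np.random.randn(4, 256),
                               np.random.randn(100, 256))
print(aug.shape)  # (4, 512)
```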

June 18, 2019

Grounded Video Description

Conference on Computer Vision and Pattern Recognition (CVPR)

Video description is one of the most challenging problems in vision and language understanding due to the large variability on both the video and the language side. Hence, models typically shortcut the difficulty of recognition and generate plausible sentences that are based on priors but are not necessarily grounded in the video. In this work, we explicitly link the sentence to the evidence in the video by annotating each noun phrase in a sentence with the corresponding bounding box in one of the frames of the video.

By: Luowei Zhou, Yannis Kalantidis, Xinlei Chen, Jason J. Corso, Marcus Rohrbach

June 17, 2019

DMC-Net: Generating Discriminative Motion Cues for Fast Compressed Video Action Recognition

Conference on Computer Vision and Pattern Recognition (CVPR)

Motion has been shown to be useful for video understanding, where it is typically represented by optical flow. However, computing flow from video frames is very time-consuming. Recent works directly leverage the motion vectors and residuals readily available in the compressed video to represent motion at no cost. While this avoids flow computation, it also hurts accuracy, since motion vectors are noisy and have substantially reduced resolution, which makes them a less discriminative motion representation.

By: Zheng Shou, Xudong Lin, Yannis Kalantidis, Laura Sevilla-Lara, Marcus Rohrbach, Shih-Fu Chang, Zhicheng Yan

June 17, 2019

Graph-Based Global Reasoning Networks

Conference on Computer Vision and Pattern Recognition (CVPR)

Globally modeling and reasoning over relations between regions can be beneficial for many computer vision tasks on both images and videos. Convolutional Neural Networks (CNNs) excel at modeling local relations via convolution operations, but they are typically inefficient at capturing global relations between distant regions and require stacking multiple convolution layers. In this work, we propose a new approach for reasoning globally in which a set of features is globally aggregated over the coordinate space and then projected to an interaction space where relational reasoning can be computed efficiently.

By: Yunpeng Chen, Marcus Rohrbach, Zhicheng Yan, Shuicheng Yan, Jiashi Feng, Yannis Kalantidis
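
A rough, hypothetical sketch of the projection idea described above (the paper reasons over interaction-space nodes with a graph convolution; the simplified linear-plus-ReLU step below is only a stand-in, and all names and shapes are illustrative): features at coordinate-space locations are softly assigned to a small set of nodes, related jointly, and projected back.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def global_reasoning(X, W_assign, W_node):
    """Toy global-reasoning step.

    X: (L, C) features at L coordinate-space locations.
    W_assign: (C, N) produces soft assignments of locations to N nodes.
    W_node: (C, C) mixes channels in the interaction space.
    Returns (L, C): input plus globally reasoned features (residual).
    """
    B = softmax(X @ W_assign, axis=0)         # (L, N) soft assignments
    nodes = B.T @ X                           # (N, C) aggregate into nodes
    nodes = np.maximum(nodes @ W_node, 0.0)   # relate nodes (simplified step)
    Y = B @ nodes                             # (L, C) project back
    return X + Y

L, C, N = 64, 32, 8
out = global_reasoning(np.random.randn(L, C),
                       np.random.randn(C, N) * 0.1,
                       np.random.randn(C, C) * 0.1)
print(out.shape)  # (64, 32)
```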

June 16, 2019

Building High Resolution Maps for Humanitarian Aid and Development with Weakly- and Semi-Supervised Learning

Computer Vision for Global Challenges Workshop at CVPR

Detailed maps help governments and NGOs plan infrastructure development and mobilize relief around the world. Mapping is an open-ended task with a seemingly endless number of potentially useful features to be mapped. In this work, we focus on mapping buildings and roads. We do so with techniques that could easily extend to other features such as land use and land classification. We discuss real-world use cases of our maps by NGOs and humanitarian organizations around the world—from sustainable infrastructure planning to disaster relief.

By: Derrick Bonafilia, David Yang, James Gill, Saikat Basu