June 9, 2019

Slim DensePose: Thrifty Learning from Sparse Annotations and Motion Cues

Conference on Computer Vision and Pattern Recognition (CVPR)

DensePose supersedes traditional landmark detectors by densely mapping image pixels to body surface coordinates. This power, however, comes at the cost of greatly increased annotation time, as supervising the model requires manually labeling hundreds of points per pose instance. In this work, we thus seek methods to significantly slim down the DensePose annotations, proposing more efficient data collection strategies.

By: Natalia Neverova, James Thewlis, Riza Alp Guler, Iasonas Kokkinos, Andrea Vedaldi

June 7, 2019

Cycle-Consistency for Robust Visual Question Answering

Conference on Computer Vision and Pattern Recognition (CVPR)

Despite significant progress in Visual Question Answering over the years, the robustness of today’s VQA models leaves much to be desired. We introduce a new evaluation protocol and associated dataset (VQA-Rephrasings) and show that state-of-the-art VQA models are notoriously brittle to linguistic variations in questions.

By: Meet Shah, Xinlei Chen, Marcus Rohrbach, Devi Parikh

June 4, 2019

Engaging Image Captioning via Personality

Conference on Computer Vision and Pattern Recognition (CVPR)

Standard image captioning tasks such as COCO and Flickr30k are factual, neutral in tone and (to a human) state the obvious (e.g., “a man playing a guitar”). While such tasks are useful to verify that a machine understands the content of an image, they are not engaging to humans as captions. With this in mind we define a new task, PERSONALITY-CAPTIONS, where the goal is to be as engaging to humans as possible by incorporating controllable style and personality traits.

By: Kurt Shuster, Samuel Humeau, Hexiang Hu, Antoine Bordes, Jason Weston

May 17, 2019

GLoMo: Unsupervised Learning of Transferable Relational Graphs

Conference on Neural Information Processing Systems (NeurIPS)

Modern deep transfer learning approaches have mainly focused on learning generic feature vectors from one task that are transferable to other tasks, such as word embeddings in language and pretrained convolutional features in vision. However, these approaches usually transfer unary features and largely ignore more structured graphical representations. This work explores the possibility of learning generic latent relational graphs that capture dependencies between pairs of data units (e.g., words or pixels) from large-scale unlabeled data and transferring the graphs to downstream tasks.

By: Zhilin Yang, Jake (Junbo) Zhao, Bhuwan Dhingra, Kaiming He, William W. Cohen, Ruslan Salakhutdinov, Yann LeCun

May 6, 2019

Efficient Lifelong Learning with A-GEM

International Conference on Learning Representations (ICLR)

In lifelong learning, the learner is presented with a sequence of tasks, incrementally building a data-driven prior which may be leveraged to speed up learning of a new task. In this work, we investigate the efficiency of current lifelong approaches in terms of sample complexity, computational cost, and memory cost.

By: Arslan Chaudhry, Marc'Aurelio Ranzato, Marcus Rohrbach, Mohamed Elhoseiny

May 6, 2019

Selfless Sequential Learning

International Conference on Learning Representations (ICLR)

Sequential learning, also called lifelong learning, studies the problem of learning tasks in a sequence with access restricted to only the data of the current task. In this paper we look at a scenario with fixed model capacity, and postulate that the learning process should not be selfish, i.e. it should account for future tasks to be added and thus leave enough capacity for them.

By: Rahaf Aljundi, Marcus Rohrbach, Tinne Tuytelaars

May 4, 2019

Quasi-Hyperbolic Momentum and Adam for Deep Learning

International Conference on Learning Representations (ICLR)

Momentum-based acceleration of stochastic gradient descent (SGD) is widely used in deep learning. We propose the quasi-hyperbolic momentum algorithm (QHM) as an extremely simple alteration of momentum SGD, averaging a plain SGD step with a momentum step. We describe numerous connections to and identities with other algorithms, and we characterize the set of two-state optimization algorithms that QHM can recover.

By: Jerry Ma, Denis Yarats
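
The abstract above describes QHM as averaging a plain SGD step with a momentum step. A minimal sketch of one such update follows, assuming the momentum buffer is an exponential moving average of gradients; the function name `qhm_step`, the list-of-floats parameter layout, and the default hyperparameter values are illustrative assumptions, not the authors' reference implementation.

```python
def qhm_step(theta, grad, buf, lr=0.1, beta=0.9, nu=0.7):
    """Apply one quasi-hyperbolic momentum update (illustrative sketch).

    theta, grad, buf are equal-length lists of floats: parameters,
    current gradient, and momentum buffer. Default values are assumptions.
    """
    # Momentum buffer: exponential moving average of past gradients.
    new_buf = [beta * b + (1.0 - beta) * g for b, g in zip(buf, grad)]
    # Step direction: weighted average of the raw gradient (plain SGD step)
    # and the momentum buffer (momentum step), mixed by nu.
    new_theta = [
        t - lr * ((1.0 - nu) * g + nu * b)
        for t, g, b in zip(theta, grad, new_buf)
    ]
    return new_theta, new_buf


if __name__ == "__main__":
    # Toy problem: minimize f(x) = 0.5 * x**2, whose gradient is x.
    theta, buf = [5.0], [0.0]
    for _ in range(200):
        theta, buf = qhm_step(theta, list(theta), buf)
    print(theta[0])  # should be close to 0
```

In this sketch, setting `nu=0.0` reduces the update to plain SGD, while `nu=1.0` uses only the averaged buffer, which is one common form of momentum SGD.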

May 1, 2019

Large-scale weakly-supervised pre-training for video action recognition

Conference on Computer Vision and Pattern Recognition (CVPR)

Current fully-supervised video datasets consist of only a few hundred thousand videos and fewer than a thousand domain-specific labels. This hinders the progress towards advanced video architectures. This paper presents an in-depth study of using large volumes of web videos for pre-training video models for the task of action recognition.

By: Deepti Ghadiyaram, Matt Feiszli, Du Tran, Xueting Yan, Heng Wang, Dhruv Mahajan

April 28, 2019

Inverse Path Tracing for Joint Material and Lighting Estimation

Conference on Computer Vision and Pattern Recognition (CVPR)

Modern computer vision algorithms have brought significant advancement to 3D geometry reconstruction. However, illumination and material reconstruction remain less studied, with current approaches assuming very simplified models for materials and illumination. We introduce Inverse Path Tracing, a novel approach to jointly estimate the material properties of objects and light sources in indoor scenes by using an invertible light transport simulation.

By: Dejan Azinović, Tzu-Mao Li, Anton S. Kaplanyan, Matthias Nießner

March 12, 2019

Convolutional neural networks for mesh-based parcellation of the cerebral cortex

Medical Imaging with Deep Learning (MIDL)

We show experimentally on the Human Connectome Project dataset that the proposed graph convolutional models outperform the current state of the art and baselines. These results highlight the potential and applicability of these methods for tackling neuroimaging challenges, paving the way towards a better characterization of brain diseases.

By: Guillem Cucurull, Konrad Wagstyl, Arantxa Casanova, Petar Velickovic, Estrid Jakobsen, Michal Drozdzal, Adriana Romero, Alan Evans, Yoshua Bengio