Explore the latest research from Facebook

All Publications

September 7, 2020 Gunjan Aggarwal, Devi Parikh

Neuro-Symbolic Generative Art: A Preliminary Study

As a preliminary study, we train a generative deep neural network on samples from the symbolic approach. We demonstrate through human studies that subjects find the final artifacts and the creation process using our neuro-symbolic approach to be more creative than the symbolic approach 61% and 82% of the time, respectively.
Paper
August 25, 2020 Sayna Ebrahimi, Franziska Meier, Roberto Calandra, Trevor Darrell, Marcus Rohrbach

Adversarial Continual Learning

Continual learning aims to learn new tasks without forgetting previously learned ones. We hypothesize that representations learned to solve each task in a sequence have a shared structure while containing some task-specific properties. We show that shared features are significantly less prone to forgetting and propose a novel hybrid continual learning framework that learns a disjoint representation for task-invariant and task-specific features required to solve a sequence of tasks.
Paper
August 24, 2020 Hang Chu, Shugao Ma, Fernando De la Torre, Sanja Fidler, Yaser Sheikh

Expressive Telepresence via Modular Codec Avatars

VR telepresence consists of interacting with another human in a virtual space represented by an avatar. Today most avatars are cartoon-like, but soon the technology will allow video-realistic ones. This paper is a step in that direction: it presents Modular Codec Avatars (MCA), a method to generate hyper-realistic faces driven by the cameras in the VR headset.
Paper
August 24, 2020 Kevin Musgrave, Serge Belongie, Ser Nam Lim

A Metric Learning Reality Check

Deep metric learning papers from the past four years have consistently claimed great advances in accuracy, often more than doubling the performance of decade-old methods. In this paper, we take a closer look at the field to see if this is actually true.
Paper
August 24, 2020 Chih-Yao Ma, Yannis Kalantidis, Ghassan AlRegib, Peter Vajda, Marcus Rohrbach, Zsolt Kira

Learning to Generate Grounded Visual Captions without Localization Supervision

In this work, we help the model ground its captions via a novel cyclical training regimen that forces the model to localize each word in the image after the sentence decoder generates it, and then reconstruct the sentence from the localized image region(s) to match the ground truth. Our proposed framework only requires learning one extra fully-connected layer (the localizer), a layer that can be removed at test time.
Paper
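The "one extra fully-connected layer" localizer can be pictured as a small attention step: project the generated word's embedding through a single FC layer and score it against each region feature. A minimal sketch, assuming toy shapes and random features — all names (`W_loc`, `regions`, `word`) are illustrative, not the paper's:

```python
import numpy as np

rng = np.random.default_rng(0)
d, R = 8, 5                              # feature dim, number of image regions
W_loc = 0.1 * rng.normal(size=(d, d))    # the single FC layer (the localizer)

regions = rng.normal(size=(R, d))        # stand-in image region features
word = rng.normal(size=d)                # embedding of a just-generated word

scores = regions @ (W_loc @ word)        # FC-projected word scored per region
attn = np.exp(scores) / np.exp(scores).sum()   # softmax localization weights
grounded = attn @ regions                # attended region feature, which the
                                         # decoder would reconstruct the word from
```

Because localization happens through this one layer alone, dropping `W_loc` at test time leaves the captioning model unchanged, which matches the abstract's claim that the localizer can be removed after training.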
August 24, 2020 Zuxuan Wu, Ser Nam Lim, Larry S. Davis, Tom Goldstein

Making an Invisibility Cloak: Real World Adversarial Attacks on Object Detectors

We present a systematic study of the transferability of adversarial attacks on state-of-the-art object detection frameworks. Using standard detection datasets, we train patterns that suppress the objectness scores produced by a range of commonly used detectors, and ensembles of detectors.
Paper
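The patch-training idea above can be sketched as projected gradient descent on an ensemble's objectness scores. This is a toy stand-in, not the paper's method: each "detector" is reduced to a linear objectness head with a sigmoid, so the gradient is analytic; real training backpropagates through full detection frameworks.

```python
import numpy as np

rng = np.random.default_rng(0)
heads = [rng.normal(size=16) for _ in range(3)]   # stand-in detector ensemble

patch = np.zeros(16)    # the adversarial pattern, starting at a neutral value
lr = 0.05
for _ in range(200):
    # Ensemble objectness: mean of sigmoid(w . patch) over the detectors.
    grad = np.zeros_like(patch)
    for w in heads:
        p = 1.0 / (1.0 + np.exp(-w @ patch))      # sigmoid objectness score
        grad += p * (1.0 - p) * w                 # d(score) / d(patch)
    patch -= lr * grad / len(heads)               # descend to suppress scores
    patch = np.clip(patch, -1.0, 1.0)             # keep the pattern in a valid range

score = float(np.mean([1 / (1 + np.exp(-w @ patch)) for w in heads]))
```

Averaging gradients over several heads is the ensemble trick the abstract mentions: a pattern that suppresses all heads at once tends to transfer better than one tuned to a single detector.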
August 23, 2020 Samarth Brahmbhatt, Chengcheng Tang, Christopher D. Twigg, Charles C. Kemp, James Hays

ContactPose: A Dataset of Grasps with Object Contact and Hand Pose

We introduce ContactPose, the first dataset of hand-object contact paired with hand pose, object pose, and RGB-D images. ContactPose has 2306 unique grasps of 25 household objects grasped with 2 functional intents by 50 participants, and more than 2.9M RGB-D grasp images. Analysis of ContactPose data reveals interesting relationships between hand pose and contact.
Paper
August 23, 2020 Oleksii Sidorov, Ronghang Hu, Marcus Rohrbach, Amanpreet Singh

TextCaps: a Dataset for Image Captioning with Reading Comprehension

Image descriptions can help visually impaired people to quickly understand the image content. While significant progress has been made in automatic image description and optical character recognition, current approaches are unable to include written text in their descriptions, even though text is omnipresent in human environments and frequently critical to understanding our surroundings. To study how to comprehend text in the context of an image, we collect a novel dataset, TextCaps, with 145k captions for 28k images.
Paper
August 23, 2020 Gyeongsik Moon, Takaaki Shiratori, Kyoung Mu Lee

DeepHandMesh: A Weakly-Supervised Deep Encoder-Decoder Framework for High-Fidelity Hand Mesh Modeling

Human hands play a central role in interacting with other people and objects. For realistic replication of such hand motions, high-fidelity hand meshes have to be reconstructed. In this study, we propose DeepHandMesh, a weakly-supervised deep encoder-decoder framework for high-fidelity hand mesh modeling. We design our system to be trained in an end-to-end and weakly-supervised manner; therefore, it does not require ground-truth meshes.
Paper
August 23, 2020 Roman Klokov, Edmond Boyer, Jakob Verbeek

Discrete Point Flow Networks for Efficient Point Cloud Generation

Generative models have proven effective at modeling 3D shapes and their statistical variations. In this paper we investigate their application to point clouds, a 3D shape representation widely used in computer vision for which, however, only a few generative models have been proposed so far. We introduce a latent variable model that builds on normalizing flows with affine coupling layers to generate 3D point clouds of an arbitrary size given a latent shape representation.
Paper
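The affine coupling layer mentioned above is the standard building block of such flows: split the coordinates, transform one half with a scale and shift computed from the other half, and the transform stays exactly invertible. A minimal numpy sketch, with one-matrix stand-ins (`W_s`, `W_t`) where a real model would use small conditional networks:

```python
import numpy as np

D = 6  # toy coordinate dimension; the paper's model conditions on a latent shape too
rng = np.random.default_rng(0)
W_s = 0.1 * rng.normal(size=(D // 2, D // 2))   # stand-in scale network
W_t = 0.1 * rng.normal(size=(D // 2, D // 2))   # stand-in translation network

def s(x_a):
    return np.tanh(x_a @ W_s)   # bounded log-scale, as is common practice

def t(x_a):
    return x_a @ W_t

def coupling_forward(x):
    # Transform one half of the coordinates conditioned on the other half.
    x_a, x_b = x[:, :D // 2], x[:, D // 2:]
    y_b = x_b * np.exp(s(x_a)) + t(x_a)
    return np.concatenate([x_a, y_b], axis=1)

def coupling_inverse(y):
    # Exact inverse: undo the shift, then the scale.
    y_a, y_b = y[:, :D // 2], y[:, D // 2:]
    x_b = (y_b - t(y_a)) * np.exp(-s(y_a))
    return np.concatenate([y_a, x_b], axis=1)

points = rng.normal(size=(100, D))   # a toy point set
recon = coupling_inverse(coupling_forward(points))
```

Because each point passes through the coupling independently, sampling a point cloud of arbitrary size amounts to pushing as many latent samples as desired through the same stack of layers, which is what lets the model generate clouds of any cardinality.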