Explore the latest research from Facebook

All Publications

December 14, 2021 Donglai Xiang, Fabián Prada, Timur Bagautdinov, Weipeng Xu, Yuan Dong, He Wen, Jessica Hodgins, Chenglei Wu

Modeling Clothing as a Separate Layer for an Animatable Human Avatar

We train a new two-layer codec avatar with separate modeling of the upper clothing and the inner body layer. To learn the interaction between the body dynamics and clothing states, we use a temporal convolution network to predict the clothing latent code based on a sequence of input skeletal poses. We show photorealistic animation output for three different actors, and demonstrate the advantage of our clothed-body avatars over the single-layer avatars used in previous work.
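
As a concrete illustration of pose-driven clothing prediction, here is a minimal PyTorch sketch of a temporal convolution network that maps a window of skeletal poses to a clothing latent code. The dimensions (a 63-D pose vector, a 128-D latent code) and the two-layer architecture are illustrative assumptions, not the paper's actual model.

    import torch
    import torch.nn as nn

    class PoseToClothingCode(nn.Module):
        # Temporal ConvNet mapping a window of skeletal poses to a clothing
        # latent code; all dimensions here are hypothetical.
        def __init__(self, pose_dim=63, latent_dim=128, hidden=256):
            super().__init__()
            self.tcn = nn.Sequential(
                nn.Conv1d(pose_dim, hidden, kernel_size=3, padding=1),
                nn.ReLU(),
                nn.Conv1d(hidden, hidden, kernel_size=3, padding=2, dilation=2),
                nn.ReLU(),
            )
            self.head = nn.Linear(hidden, latent_dim)

        def forward(self, poses):                 # poses: (batch, frames, pose_dim)
            x = self.tcn(poses.transpose(1, 2))   # -> (batch, hidden, frames)
            return self.head(x[:, :, -1])         # latent code for the last frame

    codes = PoseToClothingCode()(torch.randn(4, 16, 63))   # -> (4, 128)
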
Paper
November 18, 2021 Bjoern Haefner, Simon Green, Alan Oursland, Daniel Andersen, Michael Goesele, Daniel Cremers, Richard Newcombe, Thomas Whelan

Recovering Real-World Reflectance Properties and Shading From HDR Imagery

We propose a method to estimate the bidirectional reflectance distribution function (BRDF) and shading of complete scenes under static illumination given the 3D scene geometry and a corresponding high dynamic range (HDR) video.
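
To make the image-formation idea concrete, the toy sketch below fits a per-point diffuse albedo to observed HDR radiance given known surface normals (from the provided geometry) and a single directional light. The Lambertian stand-in and all dimensions are assumptions for illustration; the paper recovers a full BRDF under static scene illumination.

    import torch
    import torch.nn.functional as F

    # Toy setup: N surface points with known normals and observed HDR radiance;
    # fit a per-point diffuse albedo under one directional light.
    N = 1024
    normals = F.normalize(torch.randn(N, 3), dim=1)
    light_dir = F.normalize(torch.tensor([0.3, 0.8, 0.5]), dim=0)
    observed = torch.rand(N, 3) * 5.0            # HDR radiance can exceed 1.0

    albedo = torch.rand(N, 3, requires_grad=True)
    opt = torch.optim.Adam([albedo], lr=0.05)
    for _ in range(200):
        shading = (normals @ light_dir).clamp(min=0.0).unsqueeze(1)   # (N, 1)
        loss = ((albedo.clamp(0, 1) * shading - observed) ** 2).mean()
        opt.zero_grad(); loss.backward(); opt.step()
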
Paper
October 20, 2021 Sachin Mehta, Amit Kumar, Fitsum Reda, Varun Nasery, Vikram Mulukutla, Rakesh Ranjan, Vikas Chandra

EVRNet: Efficient Video Restoration on Edge Devices

In video transmission applications, video signals are transmitted over lossy channels, resulting in low-quality received signals. To restore videos on recipient edge devices in real time, we introduce an efficient video restoration network, EVRNet.
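
The abstract does not detail the architecture, but a typical edge-friendly building block looks like the depthwise-separable residual sketch below, which predicts a correction to the degraded frame. This is a generic assumption for illustration, not EVRNet's actual design.

    import torch
    import torch.nn as nn

    class LightRestoreBlock(nn.Module):
        # Depthwise-separable residual block: cheap enough for edge devices.
        def __init__(self, ch=16):
            super().__init__()
            self.body = nn.Sequential(
                nn.Conv2d(ch, ch, 3, padding=1, groups=ch),  # depthwise
                nn.Conv2d(ch, ch, 1),                        # pointwise
                nn.ReLU(inplace=True),
            )
        def forward(self, x):
            return x + self.body(x)

    class TinyRestorer(nn.Module):
        def __init__(self, ch=16, blocks=4):
            super().__init__()
            self.head = nn.Conv2d(3, ch, 3, padding=1)
            self.body = nn.Sequential(*[LightRestoreBlock(ch) for _ in range(blocks)])
            self.tail = nn.Conv2d(ch, 3, 3, padding=1)
        def forward(self, frame):             # predict a residual correction
            return frame + self.tail(self.body(self.head(frame)))

    out = TinyRestorer()(torch.randn(1, 3, 120, 160))
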
Paper
October 12, 2021 Pingchuan Ma, Rodrigo Mira, Stavros Petridis, Björn W. Schuller, Maja Pantic

LiRA: Learning Visual Speech Representations from Audio through Self-supervision

In this work, we propose Learning visual speech Representations from Audio via self-supervision (LiRA). Specifically, we train a ResNet+Conformer model to predict acoustic features from unlabelled visual speech.
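
The self-supervision signal can be sketched as a regression from visual features to frame-aligned acoustic features, as below. The tiny MLP encoder, the feature dimensions, and the L1 loss are illustrative assumptions; the paper trains a ResNet+Conformer.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    # A visual encoder regresses frame-aligned acoustic features extracted
    # from the unlabelled audio track; no manual labels are needed anywhere.
    T, ACOUSTIC_DIM = 25, 80
    visual = torch.randn(2, T, 512)              # stand-in visual features
    acoustic_targets = torch.randn(2, T, ACOUSTIC_DIM)

    encoder = nn.Sequential(nn.Linear(512, 256), nn.ReLU(),
                            nn.Linear(256, ACOUSTIC_DIM))
    loss = F.l1_loss(encoder(visual), acoustic_targets)
    loss.backward()                              # trains the visual front-end
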
Paper
October 11, 2021 Hao Jiang, Vamsi Krishna Ithapu

Egocentric Pose Estimation from Human Vision Span

In this paper, we tackle egopose estimation from a more natural human vision span, where the camera wearer can be seen in the peripheral view and, depending on the head pose, may become invisible or only partially visible.
Paper
October 11, 2021 Ronghang Hu, Nikhila Ravi, Alexander C. Berg, Deepak Pathak

Worldsheet: Wrapping the World in a 3D Sheet for View Synthesis from a Single Image

We present Worldsheet, a method for novel view synthesis using just a single RGB image as input. The main insight is that simply shrink-wrapping a planar mesh sheet onto the input image, consistent with the learned intermediate depth, captures underlying geometry sufficient to generate photorealistic unseen views with large viewpoint changes.
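
The geometric core can be sketched as follows: lay a regular vertex grid over the image, push each vertex out to its predicted depth, and reproject the vertices under a novel camera. Texturing and differentiable rendering are omitted, and the pinhole focal length is an assumption.

    import math
    import torch

    # H x W vertex grid on the image plane, unprojected to a learned depth,
    # then reprojected into a novel view.
    H, W, f = 8, 8, 1.0
    ys, xs = torch.meshgrid(torch.linspace(-1, 1, H), torch.linspace(-1, 1, W),
                            indexing="ij")
    depth = 2.0 + 0.1 * torch.rand(H, W)         # stand-in for learned depth
    pts = torch.stack([xs * depth / f, ys * depth / f, depth], -1).reshape(-1, 3)

    c, s = math.cos(0.2), math.sin(0.2)          # small rotation about the y-axis
    R = torch.tensor([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])
    t = torch.tensor([0.1, 0.0, 0.0])
    cam = pts @ R.T + t                          # vertices in the novel camera
    uv = f * cam[:, :2] / cam[:, 2:3]            # their projected 2-D positions
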
Paper
October 11, 2021 Garvita Tiwari, Nikolaos Sarafianos, Tony Tung, Gerard Pons-Moll

Neural-GIF: Neural Generalized Implicit Functions for Animating People in Clothing

We present Neural Generalized Implicit Functions (Neural-GIF) to animate people in clothing as a function of the body pose. Given a sequence of scans of a subject in various poses, we learn to animate the character for new poses.
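
In its simplest form, a pose-conditioned implicit surface is an MLP over a query point concatenated with the pose vector, as sketched below. This omits Neural-GIF's learned mapping of query points to a canonical space; the 72-D pose and network sizes are assumptions.

    import torch
    import torch.nn as nn

    class PoseConditionedSDF(nn.Module):
        # Implicit surface as a function of query point and body pose
        # (a generic sketch, not the paper's full pipeline).
        def __init__(self, pose_dim=72, hidden=256):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(3 + pose_dim, hidden), nn.ReLU(),
                nn.Linear(hidden, hidden), nn.ReLU(),
                nn.Linear(hidden, 1),      # signed distance to the clothed surface
            )
        def forward(self, points, pose):   # points: (B, N, 3), pose: (B, pose_dim)
            pose = pose.unsqueeze(1).expand(-1, points.shape[1], -1)
            return self.net(torch.cat([points, pose], dim=-1)).squeeze(-1)

    sdf = PoseConditionedSDF()(torch.randn(2, 100, 3), torch.randn(2, 72))  # (2, 100)
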
Paper
October 11, 2021 Nathan Inkawhich, Kevin J Liang, Jingyang Zhang, Huanrui Yang, Hai Li, Yiran Chen

Can Targeted Adversarial Examples Transfer When the Source and Target Models Have No Label Space Overlap?

We design blackbox transfer-based targeted adversarial attacks for an environment where the attacker’s source model and the target blackbox model may have disjoint label spaces and training datasets. This scenario significantly differs from the “standard” blackbox setting, and warrants a unique approach to the attacking process. Our methodology begins with the construction of a class correspondence matrix between the whitebox and blackbox label sets.
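
One simple way to build such a matrix, sketched below under assumptions, is to query the blackbox with held-out examples of each whitebox class and row-normalize the counts of returned target labels.

    import numpy as np

    def class_correspondence(blackbox, exemplars, n_source, n_target):
        # Query the blackbox with held-out images of each whitebox (source)
        # class and count which target labels come back; each row then gives
        # that source class's target-label frequencies.
        M = np.zeros((n_source, n_target))
        for src_label, images in exemplars.items():   # {source class: [images]}
            for img in images:
                M[src_label, blackbox(img)] += 1
        return M / np.maximum(M.sum(axis=1, keepdims=True), 1)

    # Usage assumption: blackbox is any callable mapping an image to a target
    # label index; exemplars holds a few held-out images per source class.
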
Paper
October 11, 2021 Yash Kant, Abhinav Moudgil, Dhruv Batra, Devi Parikh, Harsh Agrawal

Contrast and Classify: Training Robust VQA Models

We propose a novel training paradigm (ConClaT) that optimizes both cross-entropy and contrastive losses. The contrastive loss encourages representations to be robust to linguistic variations in questions while the cross-entropy loss preserves the discriminative power of representations for answer prediction.
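
The combined objective can be sketched as cross-entropy on answer logits plus an NT-Xent-style contrastive term over paired question representations (e.g., a question and its rephrasing). The weighting, temperature, and dimensions below are assumptions, not ConClaT's exact formulation.

    import torch
    import torch.nn.functional as F

    def conclat_style_loss(logits, labels, z1, z2, lam=1.0, tau=0.1):
        # Cross-entropy on answer logits plus a contrastive term that pulls
        # together the representations z1, z2 of paired questions.
        ce = F.cross_entropy(logits, labels)
        z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
        sim = z1 @ z2.t() / tau                   # (B, B) cosine similarities
        positives = torch.arange(z1.shape[0])     # matched pairs on the diagonal
        return ce + lam * F.cross_entropy(sim, positives)

    loss = conclat_style_loss(torch.randn(8, 100), torch.randint(0, 100, (8,)),
                              torch.randn(8, 128), torch.randn(8, 128))
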
Paper
October 11, 2021 Bo Xiong, Haoqi Fan, Kristen Grauman, Christoph Feichtenhofer

Multiview Pseudo-Labeling for Semi-supervised Learning from Video

Though our method capitalizes on multiple views, it trains a single model that is shared across appearance and motion input and thus, by design, incurs no additional computational overhead at inference time.
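
A minimal sketch of view-shared pseudo-labeling: a single classifier scores both the appearance and motion views of an unlabeled clip, and confident predictions on one view supervise the other. The feature dimensions and confidence threshold are assumptions; the paper's pipeline is richer.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    # One shared classifier for both views (stand-in clip-level features).
    model = nn.Linear(256, 10)
    rgb, flow = torch.randn(16, 256), torch.randn(16, 256)

    with torch.no_grad():
        probs = F.softmax(model(rgb), dim=1)      # predictions on the RGB view
        conf, pseudo = probs.max(dim=1)
        mask = conf > 0.9                         # keep confident pseudo-labels
    if mask.any():
        loss = F.cross_entropy(model(flow)[mask], pseudo[mask])
        loss.backward()                           # supervise the motion view
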
Paper