Explore the latest research from Facebook

All Publications

January 9, 2022 Xiaohui Zhang, Frank Zhang, Chunxi Liu, Kjell Schubert, Julian Chan, Pradyot Prakash, Jun Liu, Ching-Feng Yeh, Fuchun Peng, Yatharth Saraf, Geoffrey Zweig

Benchmarking LF-MMI, CTC and RNN-T Criteria for Streaming ASR

In this work, to measure the accuracy and efficiency of a latency-controlled streaming automatic speech recognition (ASR) application, we perform comprehensive evaluations on three popular training criteria: LF-MMI, CTC and RNN-T.
Paper
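
As a rough illustration of one of the three criteria compared in this paper, the sketch below evaluates the CTC loss on dummy encoder outputs using PyTorch's built-in torch.nn.CTCLoss; the tensor shapes and vocabulary size are arbitrary assumptions for illustration, not values from the paper.

```python
# Minimal sketch of the CTC criterion on dummy encoder outputs.
# Shapes and vocabulary size are illustrative assumptions only.
import torch
import torch.nn as nn

T, N, C = 50, 4, 32                                   # frames, batch, output tokens (blank = 0)
log_probs = torch.randn(T, N, C, requires_grad=True).log_softmax(dim=-1)
targets = torch.randint(1, C, (N, 10), dtype=torch.long)   # dummy label sequences
input_lengths = torch.full((N,), T, dtype=torch.long)
target_lengths = torch.full((N,), 10, dtype=torch.long)

ctc = nn.CTCLoss(blank=0, zero_infinity=True)
loss = ctc(log_probs, targets, input_lengths, target_lengths)
loss.backward()                                       # gradients would flow into the acoustic encoder
print(loss.item())
```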
December 14, 2021 Donglai Xiang, Fabián Prada, Timur Bagautdinov, Weipeng Xu, Yuan Dong, He Wen, Jessica Hodgins, Chenglei Wu

Modeling Clothing as a Separate Layer for an Animatable Human Avatar

We then train a new two-layer codec avatar with separate modeling of the upper clothing and the inner body layer. To learn the interaction between the body dynamics and clothing states, we use a temporal convolution network to predict the clothing latent code based on a sequence of input skeletal poses. We show photorealistic animation output for three different actors, and demonstrate the advantage of our clothed-body avatars over the single-layer avatars used in previous work.
Paper
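
The abstract above mentions a temporal convolution network that maps a sequence of skeletal poses to a clothing latent code. The hypothetical sketch below shows one minimal way such a mapping could look; the layer sizes, window length, and pose/latent dimensions are illustrative assumptions, not the authors' architecture.

```python
# Hypothetical temporal convolution network mapping a window of skeletal
# poses to a clothing latent code. Dimensions are illustrative only.
import torch
import torch.nn as nn

class PoseToClothingLatent(nn.Module):
    def __init__(self, pose_dim=63, latent_dim=128, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(pose_dim, hidden, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv1d(hidden, hidden, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),          # pool over the temporal axis
        )
        self.head = nn.Linear(hidden, latent_dim)

    def forward(self, poses):                 # poses: (batch, window, pose_dim)
        x = self.net(poses.transpose(1, 2))   # Conv1d expects (batch, channels, time)
        return self.head(x.squeeze(-1))       # (batch, latent_dim)

model = PoseToClothingLatent()
codes = model(torch.randn(2, 16, 63))         # two sequences of 16 poses
print(codes.shape)                            # torch.Size([2, 128])
```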
December 10, 2021 Yangyang Xia, Buye Xu, Anurag Kumar

Incorporating Real-world Noisy Speech in Neural-network-based Speech Enhancement Systems

In this paper, we explore methods that enable supervised speech enhancement systems to train on real-world degraded speech data. Specifically, we propose a semi-supervised approach for speech enhancement in which we first train a modified vector-quantized variational autoencoder that solves a source separation task.
Paper
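
As a hedged illustration of the vector-quantized bottleneck that the semi-supervised approach builds on, the sketch below implements a generic VQ layer with a straight-through gradient; the codebook size and feature dimension are assumptions, and the paper's modified architecture is not reproduced here.

```python
# Generic vector-quantization bottleneck as used in a VQ-VAE: each encoder
# feature is snapped to its nearest codebook entry, with a straight-through
# estimator so gradients reach the encoder. Illustrative sketch only.
import torch
import torch.nn as nn

class VectorQuantizer(nn.Module):
    def __init__(self, num_codes=512, dim=64):
        super().__init__()
        self.codebook = nn.Embedding(num_codes, dim)

    def forward(self, z):                       # z: (batch, time, dim)
        flat = z.reshape(-1, z.shape[-1])       # (batch*time, dim)
        dists = torch.cdist(flat, self.codebook.weight)   # pairwise L2 distances
        idx = dists.argmin(dim=-1)              # nearest code per feature
        q = self.codebook(idx).view_as(z)
        q_st = z + (q - z).detach()             # straight-through estimator
        commit = ((q.detach() - z) ** 2).mean() # commitment term for the encoder
        return q_st, commit

vq = VectorQuantizer()
quantized, commit_loss = vq(torch.randn(2, 100, 64))
print(quantized.shape, commit_loss.item())
```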
December 9, 2021 Benjamin Aubin, Agnieszka Słowik, Martin Arjovsky, Léon Bottou, David Lopez-Paz

Linear unit-tests for invariance discovery

The purpose of this note is to propose six linear low-dimensional problems—“unit tests”—to evaluate out-of-distribution generalization algorithms.
Paper
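
For intuition only, the toy construction below is in the spirit of such a linear "unit test": an invariant feature predicts the label identically across environments while a spurious shortcut's correlation flips at test time, so pooled least squares fails where an invariance-seeking method should not. The six problems in the paper differ; every number here is made up.

```python
# Toy linear "unit test" for invariance discovery: x_inv predicts y the same
# way in every environment, x_spu is an environment-dependent shortcut whose
# correlation flips at test time. Illustrative construction only.
import numpy as np

def make_environment(n, spurious_scale, rng):
    x_inv = rng.normal(size=(n, 1))
    y = x_inv + 0.1 * rng.normal(size=(n, 1))
    x_spu = spurious_scale * y + 0.05 * rng.normal(size=(n, 1))  # shortcut feature
    return np.hstack([x_inv, x_spu]), y

rng = np.random.default_rng(0)
train = [make_environment(1000, s, rng) for s in (1.0, 1.1)]
X = np.vstack([x for x, _ in train])
Y = np.vstack([t for _, t in train])

w, *_ = np.linalg.lstsq(X, Y, rcond=None)            # pooled ERM solution
X_test, Y_test = make_environment(1000, -1.0, rng)   # spurious correlation flips
print("weights:", w.ravel())                         # ERM leans on the shortcut
print("test MSE:", float(np.mean((X_test @ w - Y_test) ** 2)))
```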
December 6, 2021 Tushar Nagarajan, Kristen Grauman

Shaping embodied agent behavior with activity-context priors from egocentric video

We introduce an approach to discover activity-context priors from in-the-wild egocentric video captured with human-worn cameras.
Paper
December 6, 2021 Boris Knyazev, Michal Drozdzal, Graham Taylor, Adriana Romero-Soriano

Parameter Prediction for Unseen Deep Architectures

We study whether deep learning can be used to directly predict these parameters by exploiting knowledge gained from training other networks.
Paper
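
As a loose illustration of the idea of predicting parameters rather than training them, the toy sketch below uses a small MLP "hypernetwork" to emit the weight matrix of a target linear layer from a simple descriptor of its shape; this stand-in is far simpler than the graph-based predictor studied in the paper, and all names and sizes are assumptions.

```python
# Toy hypernetwork: a small MLP predicts the weight matrix of a target linear
# layer from a descriptor of its in/out sizes. Illustrative stand-in only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyHyperNet(nn.Module):
    def __init__(self, max_in=64, max_out=64, hidden=128):
        super().__init__()
        self.max_in, self.max_out = max_in, max_out
        self.mlp = nn.Sequential(
            nn.Linear(2, hidden), nn.ReLU(),
            nn.Linear(hidden, max_in * max_out),
        )

    def forward(self, in_dim, out_dim):
        desc = torch.tensor([[in_dim / self.max_in, out_dim / self.max_out]])
        w = self.mlp(desc).view(self.max_out, self.max_in)
        return w[:out_dim, :in_dim]           # crop to the requested layer shape

hyper = TinyHyperNet()
W = hyper(32, 10)                             # predicted weights for a 32 -> 10 layer
x = torch.randn(5, 32)
print(F.linear(x, W).shape)                   # torch.Size([5, 10])
```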
December 6, 2021 Samuel Daulton, Maximilian Balandat, Eytan Bakshy

Parallel Bayesian Optimization of Multiple Noisy Objectives with Expected Hypervolume Improvement

Empirically, we demonstrate not only that qNEHVI is substantially more robust to observation noise than existing MOBO approaches, but also that it achieves state-of-the-art optimization performance and competitive wall-times in large-batch environments.
Paper
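
qNEHVI scores candidate batches by the expected gain in Pareto hypervolume under observation noise. As a hedged illustration of the underlying quantity only, and not the acquisition function itself, the sketch below computes the dominated hypervolume of a 2-D front relative to a reference point, assuming both objectives are maximized.

```python
# Dominated hypervolume of a 2-D Pareto front (both objectives maximized)
# relative to a reference point -- the quantity whose expected noisy
# improvement qNEHVI targets. Points and reference are illustrative.
import numpy as np

def hypervolume_2d(front, ref):
    """Area dominated by `front` and bounded below/left by `ref`."""
    pts = np.asarray(front, dtype=float)
    pts = pts[np.all(pts > ref, axis=1)]      # discard points outside the reference box
    pts = pts[np.argsort(-pts[:, 0])]         # sort by first objective, descending
    hv, prev_y = 0.0, ref[1]
    for x, y in pts:
        if y > prev_y:                        # each non-dominated point adds a strip
            hv += (x - ref[0]) * (y - prev_y)
            prev_y = y
    return hv

pareto_front = [(0.9, 0.2), (0.7, 0.5), (0.4, 0.8)]
print(hypervolume_2d(pareto_front, ref=(0.0, 0.0)))   # 0.51
```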
December 5, 2021 Roberto Dessì, Eugene Kharitonov, Marco Baroni

Interpretable agent communication from scratch (with a generic visual processor emerging on the side)

Here, we train two deep nets from scratch to perform large-scale referent identification through unsupervised emergent communication. We show that the partially interpretable emergent protocol allows the nets to successfully communicate even about object classes they did not see at training time.
Paper
December 1, 2021 Karthik Abinav Sankararaman, Aleksandrs Slivkins

Bandits with Knapsacks beyond the Worst Case

Bandits with Knapsacks (BwK) is a general model for multi-armed bandits under supply/budget constraints. While worst-case regret bounds for BwK are well-understood, we present three results that go beyond the worst-case perspective.
Paper
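
For readers unfamiliar with the setting, the minimal sketch below simulates Bandits with Knapsacks with a single budgeted resource and a simple UCB-on-reward-per-cost heuristic; the exploration rule and all parameters are illustrative assumptions, not the algorithms analyzed in the paper.

```python
# Minimal Bandits-with-Knapsacks simulation: each pull yields a random reward
# and consumes a random amount of a single budgeted resource; play stops when
# the budget is exhausted. Heuristic and parameters are illustrative only.
import math, random

rewards = [0.3, 0.6, 0.5]      # true mean reward per arm
costs   = [0.2, 0.8, 0.3]      # true mean resource consumption per arm
budget  = 200.0

pulls = [0] * 3
reward_sum = [0.0] * 3
cost_sum = [1e-9] * 3          # avoid division by zero before the first pull
total_reward, t = 0.0, 0

while budget > 1.0:            # keep a margin of one unit of resource
    t += 1
    if t <= 3:
        arm = t - 1            # pull each arm once to initialize estimates
    else:
        def score(i):          # UCB on the reward-per-unit-cost ratio
            bonus = math.sqrt(2 * math.log(t) / pulls[i])
            return (reward_sum[i] / pulls[i] + bonus) / (cost_sum[i] / pulls[i])
        arm = max(range(3), key=score)
    r = random.random() < rewards[arm]              # Bernoulli reward
    c = max(0.0, random.gauss(costs[arm], 0.05))    # stochastic consumption
    pulls[arm] += 1; reward_sum[arm] += r; cost_sum[arm] += c
    total_reward += r; budget -= c

print("total reward:", total_reward, "pulls per arm:", pulls)
```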
December 1, 2021 Pranay Manocha, Buye Xu, Anurag Kumar

NORESQA: A Framework for Speech Quality Assessment using Non-Matching References

In this work, we propose a new direction for speech quality assessment. Inspired by humans' innate ability to compare and assess the quality of speech signals even when they have non-matching content, we propose a novel framework that predicts a subjective relative quality score for a given speech signal with respect to any provided reference, without using any subjective data.
Paper
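
As a hypothetical sketch of the non-matching-reference idea, the model below encodes a test signal and an arbitrary reference separately and predicts a single relative score from the pair of embeddings; the architecture and feature choices are assumptions, not the NORESQA framework itself.

```python
# Hypothetical relative quality scorer: encode test and (non-matching)
# reference mel-spectrograms separately, then predict one relative score
# from the concatenated embeddings. Architecture is assumed for illustration.
import torch
import torch.nn as nn

class RelativeQualityNet(nn.Module):
    def __init__(self, n_mels=80, emb=128):
        super().__init__()
        self.encoder = nn.GRU(n_mels, emb, batch_first=True)
        self.scorer = nn.Sequential(nn.Linear(2 * emb, 64), nn.ReLU(), nn.Linear(64, 1))

    def embed(self, mel):                     # mel: (batch, frames, n_mels)
        _, h = self.encoder(mel)
        return h.squeeze(0)                   # (batch, emb)

    def forward(self, test_mel, ref_mel):     # reference need not match in content
        pair = torch.cat([self.embed(test_mel), self.embed(ref_mel)], dim=-1)
        return self.scorer(pair)              # relative quality of test vs. reference

net = RelativeQualityNet()
score = net(torch.randn(2, 300, 80), torch.randn(2, 250, 80))
print(score.shape)                            # torch.Size([2, 1])
```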