Explore the latest research from Facebook
All Publications
Meta-Learning via Learned Loss
In this paper, we take a first step toward automating the design of loss functions, with the view of producing models that train faster and more robustly. Concretely, we present a meta-learning method for learning parametric loss functions that can generalize across different tasks and model architectures.
Paper
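The abstract does not include implementation details, but the core mechanism of meta-learning a parametric loss can be sketched in a few lines of PyTorch. Everything below (the LearnedLoss network, the inner learning rate, the toy regression task) is an illustrative assumption, not the paper's actual method:

```python
import torch
import torch.nn as nn

# Hypothetical learned loss: a small MLP scoring (prediction, target) pairs.
class LearnedLoss(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(2, 16), nn.ReLU(), nn.Linear(16, 1))

    def forward(self, pred, target):
        return self.net(torch.cat([pred, target], dim=-1)).mean()

model = nn.Linear(1, 1)
loss_fn = LearnedLoss()
meta_opt = torch.optim.Adam(loss_fn.parameters(), lr=1e-3)

for _ in range(200):
    x = torch.randn(32, 1)
    y = 3.0 * x  # toy regression task
    # Inner step: update the model using the *learned* loss.
    inner = loss_fn(model(x), y)
    grads = torch.autograd.grad(inner, list(model.parameters()), create_graph=True)
    w, b = (p - 0.01 * g for p, g in zip(model.parameters(), grads))
    # Outer (meta) step: the true task loss after the inner update
    # backpropagates through the update into the loss network.
    task_loss = ((x @ w.t() + b - y) ** 2).mean()
    meta_opt.zero_grad()
    task_loss.backward()
    meta_opt.step()
```

The key detail is `create_graph=True`: it keeps the inner update differentiable, so the post-update task loss can supply gradients to the loss network's parameters.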
Online Bayesian Persuasion
In Bayesian persuasion, an informed sender has to design a signaling scheme that discloses the right amount of information so as to influence the behavior of a self-interested receiver. This kind of strategic interaction is ubiquitous in real-world economic scenarios. However, the seminal model by Kamenica and Gentzkow makes some stringent assumptions that limit its applicability in practice.
Paper
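For readers unfamiliar with the model being relaxed here, the standard Kamenica-Gentzkow setup, in conventional notation rather than anything taken from this paper, is:

```latex
% Sender commits to a signaling scheme \phi : \Theta \to \Delta(S) before the
% state \theta \sim \mu_0 is drawn; the receiver best-responds to the posterior.
\[
\max_{\phi}\; \sum_{\theta \in \Theta} \mu_0(\theta) \sum_{s \in S}
  \phi(s \mid \theta)\, u_s\bigl(\theta, a^*(s)\bigr),
\qquad
a^*(s) \in \arg\max_{a \in A} \sum_{\theta \in \Theta} \mu_s(\theta)\, u_r(\theta, a),
\]
\[
\text{where the receiver's posterior is } \mu_s(\theta) \propto \mu_0(\theta)\, \phi(s \mid \theta).
\]
```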
Triple descent and the two kinds of overfitting: Where & why do they appear?
A recent line of research has highlighted the existence of a “double descent” phenomenon in deep learning, whereby increasing the number of training examples N causes the generalization error of neural networks to peak when N is of the same order as the number of parameters P. In earlier works, a similar phenomenon was shown to exist in simpler models such as linear regression, where the peak instead occurs when N is equal to the input dimension D. Since both peaks coincide with the interpolation threshold, they are often conflated in the literature. In this paper, we show that despite their apparent similarity, these two scenarios are inherently different.
Paper
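The linear-regression peak at N = D is easy to reproduce numerically. A minimal sketch; the dimension, noise level, and trial count are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(0)
D, sigma = 50, 0.5  # input dimension and label-noise level
w_true = rng.standard_normal(D) / np.sqrt(D)

def test_error(N, trials=50):
    errs = []
    for _ in range(trials):
        X = rng.standard_normal((N, D))
        y = X @ w_true + sigma * rng.standard_normal(N)
        # Minimum-norm least squares: interpolates whenever N <= D.
        w_hat = np.linalg.lstsq(X, y, rcond=None)[0]
        X_test = rng.standard_normal((1000, D))
        errs.append(np.mean((X_test @ w_hat - X_test @ w_true) ** 2))
    return np.mean(errs)

for N in [10, 25, 45, 50, 55, 100, 200]:
    print(f"N={N:4d}  test MSE={test_error(N):.3f}")  # error spikes near N = D
```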
Best Practices for Data-Efficient Modeling in NLG: How to Train Production-Ready Neural Models with Less Data
In this paper, we present approaches that have helped us deploy data-efficient neural solutions for NLG in conversational systems to production. We describe a family of sampling and modeling techniques that attain production quality with lightweight neural network models using only a fraction of the data otherwise required, and provide a thorough comparison of these techniques.
Paper
Instance Selection for GANs
In this work we propose a novel approach to improve sample quality: altering the training dataset via instance selection before model training has taken place. By refining the empirical data distribution before training, we redirect model capacity towards high-density regions, which ultimately improves sample fidelity, lowers model capacity requirements, and significantly reduces training time.
Paper
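To make instance selection concrete, one plausible reading is to score each training point by its density under a simple Gaussian fit in some embedding space and keep only the densest fraction. A hedged sketch; the paper's exact embedding and density model may differ:

```python
import numpy as np

def select_instances(embeddings, keep_frac=0.5):
    """Return indices of the highest-density points under a Gaussian fit."""
    mu = embeddings.mean(axis=0)
    cov = np.cov(embeddings, rowvar=False) + 1e-6 * np.eye(embeddings.shape[1])
    inv = np.linalg.inv(cov)
    diff = embeddings - mu
    # Log-density is monotone in the negative Mahalanobis distance.
    scores = -np.einsum('ij,jk,ik->i', diff, inv, diff)
    k = int(len(embeddings) * keep_frac)
    return np.argsort(scores)[-k:]

# Usage: train the GAN only on the retained high-density subset.
feats = np.random.randn(1000, 64)  # stand-in for image embeddings
keep = select_instances(feats, keep_frac=0.5)
```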
3D Shape Reconstruction from Vision and Touch
When a toddler is presented with a new toy, their instinctual behaviour is to pick it up and inspect it with hand and eyes in tandem, clearly searching over its surface to properly understand what they are playing with. At any instant, touch provides high-fidelity localized information while vision provides complementary global context. In 3D shape reconstruction, however, this complementary fusion of visual and haptic modalities remains largely unexplored. In this paper, we study the problem and present an effective chart-based approach to multi-modal shape understanding that encourages a similar fusion of vision and touch information.
Paper
BOTORCH: A Framework for Efficient Monte-Carlo Bayesian Optimization
We introduce BOTORCH, a modern programming framework for Bayesian optimization that combines Monte-Carlo (MC) acquisition functions, a novel sample average approximation optimization approach, auto-differentiation, and variance reduction techniques.
Paper
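Since BoTorch is a public library, one iteration of the optimization loop looks roughly like this (API names follow recent BoTorch releases and can shift between versions):

```python
import torch
from botorch.models import SingleTaskGP
from botorch.fit import fit_gpytorch_mll
from botorch.acquisition import qExpectedImprovement
from botorch.optim import optimize_acqf
from gpytorch.mlls import ExactMarginalLogLikelihood

train_X = torch.rand(10, 2, dtype=torch.double)
train_Y = -((train_X - 0.5) ** 2).sum(dim=-1, keepdim=True)  # toy objective

gp = SingleTaskGP(train_X, train_Y)  # GP surrogate of the objective
fit_gpytorch_mll(ExactMarginalLogLikelihood(gp.likelihood, gp))

# Monte-Carlo acquisition function, optimized to propose the next query point.
acqf = qExpectedImprovement(model=gp, best_f=train_Y.max())
bounds = torch.tensor([[0.0, 0.0], [1.0, 1.0]], dtype=torch.double)
candidate, value = optimize_acqf(acqf, bounds=bounds, q=1,
                                 num_restarts=5, raw_samples=64)
print(candidate)  # next point to evaluate
```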
Efficient Nonmyopic Bayesian Optimization via One-Shot Multi-Step Trees
In this paper, we provide the first efficient implementation of general multi-step lookahead Bayesian optimization, formulated as a sequence of nested optimization problems within a multi-step scenario tree.
Paper
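In standard lookahead notation (ours, not necessarily the paper's), the nested problems have the recursive form:

```latex
\[
V_k(x \mid \mathcal{D}) =
\mathbb{E}_{y \sim p(y \mid x, \mathcal{D})}
\Bigl[ u(x, y \mid \mathcal{D})
  + \max_{x'} V_{k-1}\bigl(x' \mid \mathcal{D} \cup \{(x, y)\}\bigr) \Bigr],
\qquad V_0 \equiv 0.
\]
```

The "one-shot" formulation replaces the sequential inner maximizations with a single joint optimization over all decision variables in the sampled scenario tree.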
Learning Optimal Representations with the Decodable Information Bottleneck
We propose the Decodable Information Bottleneck (DIB) that considers information retention and compression from the perspective of the desired predictive family. As a result, DIB gives rise to representations that are optimal in terms of expected test performance and can be estimated with guarantees.
Paper
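For background, the classical information bottleneck objective that DIB revisits is typically written as:

```latex
\[
\min_{p(z \mid x)} \; I(X; Z) \;-\; \beta\, I(Z; Y),
\]
```

where I(·;·) denotes mutual information; DIB's departure, per the abstract, is to measure retention and compression relative to the chosen predictive family rather than with family-agnostic mutual information.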
Deep Transformers with Latent Depth
The Transformer model has achieved state-of-the-art performance in many sequence modeling tasks. However, how to leverage model capacity with large or variable depths is still an open challenge. We present a probabilistic framework to automatically learn which layer(s) to use by learning the posterior distributions of layer selection.
Paper
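A minimal sketch of learned layer selection with a relaxed Bernoulli gate; the parameterization is illustrative and not the paper's exact posterior:

```python
import torch
import torch.nn as nn

class LatentDepthLayer(nn.Module):
    """A Transformer layer wrapped with a learned on/off selection gate."""
    def __init__(self, d_model=64, nhead=4):
        super().__init__()
        self.layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.logit = nn.Parameter(torch.zeros(1))  # logit of q(select this layer)

    def forward(self, x, temperature=0.5):
        if self.training:
            # Relaxed Bernoulli (Gumbel-sigmoid) sample keeps the gate differentiable.
            u = torch.rand_like(self.logit).clamp(1e-6, 1 - 1e-6)
            noise = torch.log(u) - torch.log1p(-u)
            gate = torch.sigmoid((self.logit + noise) / temperature)
        else:
            gate = (torch.sigmoid(self.logit) > 0.5).float()  # hard choice at test time
        return x + gate * (self.layer(x) - x)  # gate ~ 0 skips the layer entirely

x = torch.randn(2, 8, 64)  # (batch, sequence, d_model)
print(LatentDepthLayer()(x).shape)
```

Stacking such blocks and regularizing the gate logits toward a prior would complete the probabilistic treatment the abstract describes.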