January 1, 2021

Asynchronous Gradient-Push

IEEE Transactions on Automatic Control

We consider a multi-agent framework for distributed optimization where each agent has access to a local smooth strongly convex function, and the collective goal is to achieve consensus on the parameters that minimize the sum of the agents’ local functions. We propose an algorithm wherein each agent operates asynchronously and independently of the other agents.
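A minimal synchronous NumPy sketch of the gradient-push (push-sum) update that this line of work builds on; the mixing matrix, step size, and toy quadratic objectives are illustrative assumptions, and the paper's asynchronous, delay-tolerant bookkeeping is omitted:

```python
import numpy as np

def gradient_push(grads, A, x0, lr=0.05, iters=500):
    """Synchronous sketch of gradient-push consensus optimization.
    A is a column-stochastic mixing matrix (columns sum to 1); grads[i]
    is the gradient oracle of agent i's local function."""
    n = len(grads)
    x = np.tile(x0, (n, 1)).astype(float)   # one parameter copy per agent
    w = np.ones(n)                           # push-sum weights
    for _ in range(iters):
        x = A @ x                            # mix parameters with neighbours
        w = A @ w                            # mix push-sum weights
        z = x / w[:, None]                   # de-biased local estimates
        for i in range(n):
            x[i] -= lr * grads[i](z[i])      # local gradient step
    return x / w[:, None]                    # rows approach the consensus minimizer

# Toy usage: agent i holds f_i(x) = ||x - c_i||^2 / 2, so the minimizer
# of the sum is the mean of the c_i.
c = np.array([[0.0], [1.0], [2.0]])
grads = [lambda x, ci=c[i]: x - ci for i in range(3)]
A = np.array([[0.5, 0.0, 0.5],
              [0.5, 0.5, 0.0],
              [0.0, 0.5, 0.5]])   # column-stochastic
z = gradient_push(grads, A, x0=np.zeros(1))
```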

By: Mahmoud Assran, Michael Rabbat

May 25, 2020

Differentiable Gaussian Process Motion Planning

International Conference on Robotics and Automation (ICRA)

We propose a method for leveraging past experience to learn how to automatically adapt the parameters of Gaussian Process Motion Planning (GPMP) algorithms. Specifically, we propose a differentiable extension to the GPMP2 algorithm, so that it can be trained end-to-end from data.

By: Mohak Bhardwaj, Byron Boots, Mustafa Mukadam

May 4, 2020

Lookahead converges to stationary points of smooth non-convex functions

International Conference on Acoustics, Speech, and Signal Processing (ICASSP)

The Lookahead optimizer [Zhang et al., 2019] was recently proposed and demonstrated to improve performance of stochastic first-order methods for training deep neural networks. Lookahead can be viewed as a two time-scale algorithm, where the fast dynamics (inner optimizer) determine a search direction and the slow dynamics (outer optimizer) perform updates by moving along this direction. We prove that, with appropriate choice of step-sizes, Lookahead converges to a stationary point of smooth non-convex functions.
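A minimal NumPy sketch of the two time-scale structure described above; the quadratic objective, step sizes, and plain gradient descent as the inner optimizer are illustrative assumptions, not details from the paper:

```python
import numpy as np

def lookahead(x0, grad, inner_lr=0.1, outer_lr=0.5, k=5, outer_steps=100):
    """Two time-scale Lookahead: k fast (inner) steps determine a search
    direction, then the slow (outer) weights move along that direction."""
    slow = x0.copy()
    for _ in range(outer_steps):
        fast = slow.copy()
        for _ in range(k):                 # fast dynamics: inner optimizer
            fast -= inner_lr * grad(fast)
        slow += outer_lr * (fast - slow)   # slow dynamics: outer update
    return slow

# Toy usage: minimize the smooth quadratic f(x) = ||x||^2 / 2.
x_star = lookahead(np.ones(3), grad=lambda x: x)
```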

By: Jianyu Wang, Vinayak Tantia, Nicolas Ballas, Mike Rabbat

May 4, 2020

SeCoST: Sequential Co-Supervision for Weakly Labeled Audio Event Detection

International Conference on Acoustics, Speech, and Signal Processing (ICASSP)

Weakly supervised learning algorithms are critical for scaling audio event detection to several hundreds of sound categories. Such learning models should not only disambiguate sound events efficiently with minimal class-specific annotation but also be robust to label noise, which is more apparent with weak labels than with strong annotations. In this work, we propose a new framework for designing learning models with weak supervision by bridging ideas from sequential learning and knowledge distillation.

By: Anurag Kumar, Vamsi Krishna Ithapu

April 27, 2020

The Early Phase of Neural Network Training

International Conference on Learning Representations (ICLR)

Recent studies have shown that many important aspects of neural network learning take place within the very earliest iterations or epochs of training. For example, sparse, trainable sub-networks emerge (Frankle et al., 2019), gradient descent moves into a small subspace (Gur-Ari et al., 2018), and the network undergoes a critical period (Achille et al., 2019). Here we examine the changes that deep neural networks undergo during this early phase of training.

By: Jonathan Frankle, David J. Schwab, Ari Morcos

April 26, 2020

SlowMo: Improving Communication-Efficient Distributed SGD with Slow Momentum

International Conference on Learning Representations (ICLR)

Inspired by the BMUF method of Chen & Huo (2016), we propose a slow momentum (SLOWMO) framework, where workers periodically synchronize and perform a momentum update, after multiple iterations of a base optimization algorithm.
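A single-process NumPy sketch of the periodic synchronize-then-momentum structure; the worker count, step sizes, use of plain local SGD as the base optimizer, and the exact form of the slow update are assumptions for illustration:

```python
import numpy as np

def slowmo(x0, grad, n_workers=4, base_lr=0.05, slow_lr=1.0,
           slow_momentum=0.7, local_steps=10, rounds=50):
    """Workers run a base optimizer locally; every `local_steps` iterations
    they synchronize (average) and a slow momentum update is applied."""
    x = x0.copy()                     # synchronized (slow) parameters
    u = np.zeros_like(x0)             # slow momentum buffer
    for _ in range(rounds):
        workers = [x.copy() for _ in range(n_workers)]
        for _ in range(local_steps):  # base optimizer: local steps
            workers = [w - base_lr * grad(w) for w in workers]
        x_avg = np.mean(workers, axis=0)               # periodic synchronization
        u = slow_momentum * u + (x - x_avg) / base_lr  # slow momentum update
        x = x - slow_lr * base_lr * u                  # slow (outer) step
    return x

x_star = slowmo(np.ones(3), grad=lambda x: x)
```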

By: Jianyu Wang, Vinayak Tantia, Nicolas Ballas, Mike Rabbat

April 26, 2020

Discovering Motor Programs by Recomposing Demonstrations

International Conference on Learning Representations (ICLR)

In this paper, we present an approach to learn recomposable motor primitives across large-scale and diverse manipulation demonstrations.

By: Tanmay Shankar, Shubham Tulsiani, Lerrel Pinto, Abhinav Gupta

April 25, 2020

And the bit goes down: Revisiting the quantization of neural networks

International Conference on Learning Representations (ICLR)

In this paper, we address the problem of reducing the memory footprint of convolutional network architectures. We introduce a vector quantization method that aims at preserving the quality of the reconstruction of the network outputs rather than its weights.
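A toy NumPy sketch of the core idea that quantization distortion is measured on the layer's outputs rather than on its weights; the column-wise clustering, plain-mean codeword update, and all sizes are simplifications, not the paper's block-wise weighted k-means procedure:

```python
import numpy as np

def output_aware_quantize(W, X, n_codewords=16, iters=10, seed=0):
    """Quantize the columns of a weight matrix W (d_in x d_out) with a small
    codebook, assigning each column to the codeword that best preserves the
    layer's output on sample activations X."""
    rng = np.random.default_rng(seed)
    d_in, d_out = W.shape
    codebook = W[:, rng.choice(d_out, n_codewords, replace=False)].copy()
    for _ in range(iters):
        Y = X @ W               # reference layer outputs   (n, d_out)
        C = X @ codebook        # outputs of each codeword  (n, n_codewords)
        # Assignment step: distortion measured in output space.
        errs = np.linalg.norm(Y[:, :, None] - C[:, None, :], axis=0)
        assign = errs.argmin(axis=1)
        # Update step (simplified): each codeword becomes the mean of its columns.
        for k in range(n_codewords):
            cols = W[:, assign == k]
            if cols.size:
                codebook[:, k] = cols.mean(axis=1)
    return codebook[:, assign], assign, codebook

# Toy usage with random weights and activations.
W = np.random.randn(32, 64)
X = np.random.randn(128, 32)
W_q, assign, codebook = output_aware_quantize(W, X)
```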

By: Pierre Stock, Armand Joulin, Rémi Gribonval, Benjamin Graham, Hervé Jégou

April 25, 2020

Permutation Equivariant Models for Compositional Generalization in Language

International Conference on Learning Representations (ICLR)

Humans understand novel sentences by composing meanings and roles of core language components. In contrast, neural network models for natural language modeling fail when such compositional generalization is required. The main contribution of this paper is to hypothesize that language compositionality is a form of group-equivariance. Based on this hypothesis, we propose a set of tools for constructing equivariant sequence-to-sequence models.

By: Jonathan Gordon, David Lopez-Paz, Marco Baroni, Diane Bouchacourt

April 9, 2020

Environment-aware reconfigurable noise suppression

International Conference on Acoustics, Speech, and Signal Processing (ICASSP)

The paper proposes an efficient, robust, and reconfigurable technique to suppress various types of noise at any sampling rate. Theoretical analyses together with subjective and objective test results show that the proposed noise suppression (NS) solution significantly enhances the speech transmission index (STI), speech intelligibility (SI), signal-to-noise ratio (SNR), and subjective listening experience.

By: Jun Yang, Joshua Bingham