
December 17, 2018

Regression-aware decompositions

SIAM Journal on Matrix Analysis and Applications

Linear least-squares regression with a “design” matrix A approximates a given matrix B via minimization of the spectral- or Frobenius-norm discrepancy ||AX − B|| over all matrices X of conforming size. Also popular is low-rank approximation of B through the “interpolative decomposition,” which traditionally involves no supervision from any auxiliary matrix A.

By: Mark Tygert
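The least-squares minimization the abstract describes can be sketched with NumPy; the matrices below are random placeholders, not data from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((100, 10))   # "design" matrix
B = rng.standard_normal((100, 3))    # matrix to approximate

# X minimizing the Frobenius-norm discrepancy ||AX - B||_F
X, *_ = np.linalg.lstsq(A, B, rcond=None)

# At the minimizer, the residual is orthogonal to the column space of A
residual = B - A @ X
print(np.allclose(A.T @ residual, 0, atol=1e-8))  # True
```

The orthogonality check is the standard normal-equations characterization of the least-squares solution.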

December 14, 2018

PyText: A seamless path from NLP research to production

We introduce PyText – a deep-learning-based NLP modeling framework built on PyTorch. PyText addresses the often-conflicting requirements of enabling rapid experimentation and of serving models at scale.

By: Ahmed Aly, Kushal Lakhotia, Shicong Zhao, Mrinal Mohit, Barlas Oğuz, Abhinav Arora, Sonal Gupta, Christopher Dewan, Stef Nelson-Lindall, Rushin Shah

December 11, 2018

The Costs of Overambitious Seeding of Social Products

International Conference on Complex Networks and their Applications

Product-adoption scenarios are often theoretically modeled as “influence-maximization” (IM) problems, where people influence one another to adopt and the goal is to find a limited set of people to “seed” so as to maximize long-term adoption. In many IM models, if there is no budgetary limit on seeding, the optimal approach involves seeding everybody immediately. Here, we argue that this approach can lead to suboptimal outcomes for “social products” that allow people to communicate with one another.

By: Shankar Iyer, Lada Adamic

December 8, 2018

From Satellite Imagery to Disaster Insights

AI for Social Good Workshop at NeurIPS 2018

The use of satellite imagery has become increasingly popular for disaster monitoring and response. After a disaster, it is important to prioritize rescue operations and disaster response and to coordinate relief efforts. These have to be carried out quickly and efficiently, since resources are often limited in disaster-affected areas and identifying the areas of maximum damage is critical. However, most existing disaster-mapping efforts are manual, which is time-consuming and often leads to erroneous results.

By: Jigar Doshi, Saikat Basu, Guan Pang

December 8, 2018

SGD Implicitly Regularizes Generalization Error

Integration of Deep Learning Theories Workshop at NeurIPS

We derive a simple and model-independent formula for the change in the generalization gap due to a gradient descent update. We then compare the change in the test error for stochastic gradient descent to the change in test error from an equivalent number of gradient descent updates and show explicitly that stochastic gradient descent acts to regularize generalization error by decorrelating nearby updates.

By: Daniel A. Roberts

December 7, 2018

Rethinking floating point for deep learning

Systems for Machine Learning Workshop at NeurIPS 2018

We improve floating point to be more energy efficient than integer hardware of equivalent bit width on a 28 nm ASIC process, while retaining accuracy in 8 bits, by combining a novel hybrid log multiply/linear add, Kulisch accumulation, and tapered encodings from Gustafson’s posit format.

By: Jeff Johnson
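Kulisch accumulation keeps an accumulator wide enough that every product is added exactly, with rounding deferred to a single final step. A minimal sketch of that idea, using Python’s exact rationals in place of the paper’s fixed-point hardware register (`kulisch_dot` is a name chosen here for illustration):

```python
from fractions import Fraction

def kulisch_dot(xs, ys):
    """Dot product with exact (Kulisch-style) accumulation: every
    product is added into an arbitrary-precision accumulator, so the
    running sum is exact; rounding happens only once at the end."""
    acc = Fraction(0)
    for x, y in zip(xs, ys):
        # Fraction(float) is exact, since every float is a dyadic rational
        acc += Fraction(x) * Fraction(y)
    return float(acc)  # single final rounding

# Naive float accumulation can lose low-order bits at every step
xs = [1e16, 1.0, -1e16]
ys = [1.0, 1.0, 1.0]
print(kulisch_dot(xs, ys))                    # 1.0
print(sum(x * y for x, y in zip(xs, ys)))     # 0.0 (the 1.0 is lost to rounding)
```

The contrast shows why exact accumulation preserves accuracy even at low operand precision.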

December 7, 2018

Causality in Physics and Effective Theories of Agency

Causal Learning Workshop at NeurIPS

We propose to combine reinforcement learning and theoretical physics to describe effective theories of agency. This involves understanding the connection between the physics notion of causality and how intelligent agents can arise as a useful effective description within some environments.

By: Daniel A. Roberts, Max Kleiman-Weiner

December 6, 2018

Fast Approximate Natural Gradient Descent in a Kronecker-factored Eigenbasis

Neural Information Processing Systems (NeurIPS)

Optimization algorithms that leverage gradient covariance information, such as variants of natural gradient descent (Amari, 1998), offer the prospect of yielding more effective descent directions. For models with many parameters, the covariance matrix they are based on becomes gigantic, making them inapplicable in their original form.

By: Thomas George, Cesar Laurent, Xavier Bouthillier, Nicolas Ballas, Pascal Vincent
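The Kronecker-factored approximation behind such methods replaces a layer’s gigantic covariance matrix with the Kronecker product of two small factors, which can then be applied without ever forming the full matrix. A sketch of the underlying identity, with small illustrative shapes:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 4, 3                       # layer with an m x n weight matrix
A = rng.standard_normal((n, n))   # input-covariance factor (n x n)
G = rng.standard_normal((m, m))   # gradient-covariance factor (m x m)
V = rng.standard_normal((m, n))   # a gradient, same shape as the weight

# Forming the full mn x mn Kronecker product is what becomes
# infeasible for layers with many parameters...
full = np.kron(A, G) @ V.reshape(-1, order="F")

# ...but the identity (A kron G) vec(V) = vec(G V A^T) applies it
# using only the small factors.
factored = (G @ V @ A.T).reshape(-1, order="F")

print(np.allclose(full, factored))  # True
```

Storage drops from (mn)² entries for the full matrix to m² + n² for the factors.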

December 4, 2018

Non-Adversarial Mapping with VAEs

Neural Information Processing Systems (NeurIPS)

The study of cross-domain mapping without supervision has recently attracted much attention. Much of the recent progress was enabled by the use of adversarial training as well as cycle constraints. The practical difficulty of adversarial training motivates research into non-adversarial methods.

By: Yedid Hoshen

December 4, 2018

A²-Nets: Double Attention Networks

Neural Information Processing Systems (NeurIPS)

Learning to capture long-range relations is fundamental to image/video recognition. Existing CNN models generally rely on increasing depth to model such relations, which is highly inefficient. In this work, we propose the “double attention block”, a novel component that aggregates and propagates informative global features from the entire spatio-temporal space of input images/videos, enabling subsequent convolution layers to access features from the entire space efficiently.

By: Yunpeng Chen, Yannis Kalantidis, Jianshu Li, Shuicheng Yan, Jiashi Feng
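The gather-then-distribute structure of such a block can be sketched in NumPy; the projection names and shapes below are illustrative choices, not the paper’s exact design:

```python
import numpy as np

def softmax(x, axis):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def double_attention(X, Wq, Wk, Wv):
    """Two-step attention sketch. Step 1 gathers a small set of global
    descriptors by attention pooling over all positions; step 2
    distributes them back to each position. X: (positions, channels)."""
    A = softmax(X @ Wk, axis=0)   # gather weights over positions
    G = A.T @ (X @ Wv)            # global descriptors: (num_desc, d)
    B = softmax(X @ Wq, axis=1)   # per-position distribute weights
    return B @ G                  # (positions, d)

rng = np.random.default_rng(0)
X = rng.standard_normal((49, 16))    # e.g. a 7x7 feature map, 16 channels
Wq = rng.standard_normal((16, 4))
Wk = rng.standard_normal((16, 4))
Wv = rng.standard_normal((16, 8))
out = double_attention(X, Wq, Wk, Wv)
print(out.shape)  # (49, 8)
```

Each output position is a convex combination of a handful of global descriptors, which is how the block gives every position access to the entire spatio-temporal extent without added depth.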