
December 4, 2018

Improving Semantic Parsing for Task Oriented Dialog

Conversational AI Workshop at NeurIPS 2018

Semantic parsing using hierarchical representations has recently been proposed for task-oriented dialog, with promising results. In this paper, we present three improvements to the model: contextualized embeddings, ensembling, and pairwise re-ranking based on a language model.

By: Arash Einolghozati, Panupong Pasupat, Sonal Gupta, Rushin Shah, Mrinal Mohit, Mike Lewis, Luke Zettlemoyer
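The re-ranking idea can be sketched as follows. A toy add-one-smoothed unigram model stands in for the real language model, and a conditional swap of the top two hypotheses stands in for the paper's pairwise re-ranker; all names and thresholds here are illustrative, not the paper's implementation:

```python
import math
from collections import Counter

def make_unigram_lm(corpus_tokens):
    # Toy add-one-smoothed unigram LM; a stand-in for the real
    # language model used to score candidate parses.
    counts = Counter(corpus_tokens)
    total = sum(counts.values())
    vocab = len(counts) + 1  # +1 reserves mass for unseen tokens
    def logprob(tokens):
        return sum(math.log((counts[t] + 1) / (total + vocab)) for t in tokens)
    return logprob

def rerank_top2(candidates, lm, margin=0.0):
    # Pairwise re-ranking of an n-best list: keep the parser's top
    # hypothesis unless the runner-up beats it under the LM by more
    # than `margin`.  `candidates` holds (tokens, parser_score) pairs.
    ranked = sorted(candidates, key=lambda c: c[1], reverse=True)
    if len(ranked) >= 2 and lm(ranked[1][0]) > lm(ranked[0][0]) + margin:
        ranked[0], ranked[1] = ranked[1], ranked[0]
    return ranked
```

Restricting the comparison to the top two hypotheses keeps the LM cost constant per utterance, since most n-best errors concentrate there.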

December 4, 2018

Cross-lingual Transfer Learning for Multilingual Task Oriented Dialog

North American Chapter of the Association for Computational Linguistics (NAACL)

In this paper, we investigate the performance of three different methods for cross-lingual transfer learning, namely (1) translating the training data, (2) using cross-lingual pre-trained embeddings, and (3) a novel method of using a multilingual machine translation encoder as contextual word representations.

By: Sebastian Schuster, Sonal Gupta, Rushin Shah, Mike Lewis

December 3, 2018

Fighting Boredom in Recommender Systems with Linear Reinforcement Learning

Neural Information Processing Systems (NeurIPS)

A common assumption in recommender systems (RS) is the existence of a best fixed recommendation strategy. Such a strategy may be simple and work at the item level (e.g., multi-armed bandits assume one best fixed arm/item exists) or correspond to a more sophisticated RS (e.g., the objective of A/B testing is to find the best fixed RS and execute it thereafter). We argue that this assumption is rarely verified in practice, as the recommendation process itself may impact the user’s preferences.

By: Romain Warlop, Alessandro Lazaric, Jérémie Mary
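A tiny simulation illustrates why no fixed arm can be optimal once recommendations affect preferences: when playing an arm erodes its own reward (boredom) and resting restores it, a switching policy beats every fixed arm. The dynamics and constants below are illustrative, not from the paper:

```python
def run(policy, horizon=100, decay=0.5, recover=0.3):
    # Two arms: arm 0's expected reward shrinks each time it is
    # played (boredom) and recovers while it rests; arm 1 pays a
    # constant 0.3.  All numbers are illustrative.
    interest, total = 1.0, 0.0
    for t in range(horizon):
        if policy(t) == 0:
            total += interest
            interest *= 1.0 - decay                  # boredom sets in
        else:
            total += 0.3
            interest = min(1.0, interest + recover)  # appeal recovers
    return total

fixed_0 = run(lambda t: 0)        # always play the "best" item
fixed_1 = run(lambda t: 1)        # always play the safe item
switching = run(lambda t: t % 2)  # alternate between them
```

Under these dynamics the alternating policy collects roughly 45 reward over 100 steps, while the best fixed arm collects 30, so any procedure that commits to a single fixed strategy (as A/B testing does) leaves value on the table.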

December 3, 2018

Training with Low-precision Embedding Tables

Systems for Machine Learning Workshop at NeurIPS 2018

In this work, we focus on building a system to train continuous embeddings in a low-precision floating-point representation. Specifically, our system performs SGD-style model updates in single-precision arithmetic, casts the updated parameters using stochastic rounding, and stores the parameters in half-precision floating point.

By: Jian Zhang, Jiyan Yang, Hector Yuen
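The cast step can be sketched in NumPy. Stochastic rounding maps each value to one of its two neighbouring fp16 values, choosing the upper one with probability proportional to proximity, so the rounding is unbiased in expectation. This is a minimal illustration restricted to positive finite inputs, not the paper's actual system:

```python
import numpy as np

def stochastic_round_fp16(x, rng):
    # Cast positive float32 values to float16 with stochastic
    # rounding: each value goes to the upper of its two fp16
    # neighbours with probability (x - lower) / (upper - lower),
    # so E[result] == x.  Restricted to positive finite inputs
    # to keep the bit manipulation simple.
    x = np.asarray(x, dtype=np.float32)
    h = x.astype(np.float16)           # nearest fp16; may sit above or below x
    bits = h.view(np.uint16)
    # For positive fp16 values, adjacent representables differ by
    # exactly 1 in the raw bit pattern.
    lower_bits = np.where(h.astype(np.float32) > x, bits - 1, bits)
    lower = lower_bits.astype(np.uint16).view(np.float16)
    upper = (lower.view(np.uint16) + np.uint16(1)).view(np.float16)
    lo32, up32 = lower.astype(np.float32), upper.astype(np.float32)
    p_up = (x - lo32) / (up32 - lo32)  # 0 when x is exactly representable
    return np.where(rng.random(size=x.shape) < p_up, upper, lower)
```

Because the rounding error has zero mean, repeated SGD updates do not accumulate the systematic drift that round-to-nearest would introduce at half precision.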

December 3, 2018

Explore-Exploit: A Framework for Interactive and Online Learning

Systems for Machine Learning Workshop at NeurIPS 2018

We present Explore-Exploit: a framework designed to collect and utilize user feedback in an interactive and online setting while minimizing regressions in the end-user experience. This framework provides a suite of online learning operators for various tasks such as personalized ranking, candidate selection, and active learning.

By: Honglei Liu, Anuj Kumar, Wenhai Yang, Benoit Dumoulin
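One such operator might look like the following epsilon-greedy candidate selector. This is a hedged sketch of the general pattern; the function name and signature are illustrative, not the framework's actual API:

```python
import random

def epsilon_greedy_select(candidates, value_estimates, epsilon=0.1, rng=None):
    # Candidate-selection operator: with probability epsilon explore
    # a uniformly random candidate (collecting feedback on items the
    # model is uncertain about), otherwise exploit the current best
    # estimate.  A small epsilon bounds how often end users see an
    # exploratory, possibly worse, result.
    rng = rng or random.Random()
    if rng.random() < epsilon:
        return rng.choice(candidates)
    return max(candidates, key=lambda c: value_estimates.get(c, 0.0))
```

Keeping epsilon small is one simple way to limit regressions in user experience while still gathering the feedback that online learning needs.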

December 3, 2018

Forward Modeling for Partial Observation Strategy Games – A StarCraft Defogger

Neural Information Processing Systems (NeurIPS)

We formulate the problem of defogging as state estimation and future state prediction from previous, partial observations in the context of real-time strategy games. We propose to employ encoder-decoder neural networks for this task, and introduce proxy tasks and baselines for evaluation to assess their ability to capture basic game rules and high-level dynamics.

By: Gabriel Synnaeve, Zeming Lin, Jonas Gehring, Dan Gant, Vegard Mella, Vasil Khalidov, Nicolas Carion, Nicolas Usunier

December 3, 2018

High-Level Strategy Selection under Partial Observability in StarCraft: Brood War

Reinforcement Learning under Partial Observability Workshop at NeurIPS 2018

We consider the problem of high-level strategy selection in the adversarial setting of real-time strategy games from a reinforcement learning perspective, where taking an action corresponds to switching to the respective strategy.

By: Jonas Gehring, Da Ju, Vegard Mella, Daniel Gant, Nicolas Usunier, Gabriel Synnaeve

December 3, 2018

Temporal Regularization in Markov Decision Process

Neural Information Processing Systems (NeurIPS)

This paper explores a class of methods for temporal regularization. We formally characterize the bias induced by this technique using Markov chain concepts. We illustrate the various characteristics of temporal regularization via a sequence of simple discrete and continuous MDPs, and show that the technique provides improvement even in high-dimensional Atari games.

By: Pierre Thodoroff, Audrey Durand, Joelle Pineau, Doina Precup
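The idea can be sketched in a tabular setting: smooth the bootstrap target with the value of the state just left, so estimates are biased toward being consistent along the trajectory, and plain TD(0) is recovered when the regularization weight is zero. This is an illustrative sketch of the idea, not the paper's exact estimator:

```python
import numpy as np

def td0_temporally_regularized(episodes, n_states, alpha=0.5, gamma=0.9, beta=0.0):
    # Tabular TD(0) where the bootstrap value of the next state is
    # mixed with the value of the current state (weight beta).
    # beta = 0 gives plain TD(0); beta > 0 trades some bias for
    # lower variance by enforcing temporal consistency.
    V = np.zeros(n_states)
    for episode in episodes:               # episode: list of (s, r, s_next)
        for s, r, s_next in episode:
            smoothed_next = (1 - beta) * V[s_next] + beta * V[s]
            target = r + gamma * smoothed_next
            V[s] += alpha * (target - V[s])
    return V
```

The bias this mixing induces is exactly what the paper characterizes with Markov chain tools; the payoff is smoother value estimates in noisy, high-dimensional settings.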

December 3, 2018

Near Optimal Exploration-Exploitation in Non-Communicating Markov Decision Processes

Neural Information Processing Systems (NeurIPS)

When designing the state space of an MDP, it is common to include states that are transient or not reachable by any policy (e.g., in mountain car, the product space of speed and position contains configurations that are not physically reachable). This results in weakly-communicating or multi-chain MDPs. In this paper, we introduce TUCRL, the first algorithm able to perform efficient exploration-exploitation in any finite Markov Decision Process (MDP) without requiring any form of prior knowledge.

By: Ronan Fruit, Matteo Pirotta, Alessandro Lazaric

December 2, 2018

One-Shot Unsupervised Cross Domain Translation

Neural Information Processing Systems (NeurIPS)

Given a single image x from domain A and a set of images from domain B, our task is to generate the analog of x in B. We argue that this task could be a key AI capability that underlies the ability of cognitive agents to act in the world, and we present empirical evidence that existing unsupervised domain translation methods fail on this task.

By: Sagie Benaim, Lior Wolf