Publication

High-Dimensional Contextual Policy Search with Unknown Context Rewards using Bayesian Optimization

Conference on Neural Information Processing Systems (NeurIPS)


Abstract

Contextual policies are used in many settings to customize system parameters and actions to the specifics of a particular setting. In some real-world settings, such as randomized controlled trials or A/B tests, it may not be possible to measure policy outcomes at the level of context—we observe only aggregate rewards across a distribution of contexts. This makes policy optimization much more difficult because we must solve a high-dimensional optimization problem over the entire space of contextual policies, for which existing optimization methods are not suitable. We develop effective models that leverage the structure of the search space to enable contextual policy optimization directly from the aggregate rewards using Bayesian optimization. We use a collection of simulation studies to characterize the performance and robustness of the models, and show that our approach of inferring a low-dimensional context embedding performs best. Finally, we show successful contextual policy optimization in a real-world video bitrate policy problem.
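The aggregate-reward setting described in the abstract can be illustrated with a toy sketch. Everything here is an assumption made for illustration and is not taken from the paper: the five contexts, their weights, the quadratic per-context reward, and the names `context_reward`, `aggregate_reward`, and `random_search`. Plain random search stands in for the optimizer only to show the interface; it is not Bayesian optimization and not the authors' method.

```python
import random

# Hypothetical problem: 5 contexts, each with a population weight and a
# hidden optimal parameter value. All numbers are illustrative only.
CONTEXT_WEIGHTS = [0.4, 0.25, 0.2, 0.1, 0.05]
CONTEXT_OPTIMA = [0.2, 0.5, 0.8, 0.3, 0.6]


def context_reward(context, x):
    """Per-context reward. The optimizer never observes this directly."""
    return -(x - CONTEXT_OPTIMA[context]) ** 2


def aggregate_reward(policy):
    """The only observable quantity: the weighted reward across contexts.

    `policy` holds one parameter value per context, so the search space
    has as many dimensions as there are contexts.
    """
    return sum(
        weight * context_reward(context, x)
        for context, (weight, x) in enumerate(zip(CONTEXT_WEIGHTS, policy))
    )


def random_search(n_iter=200, seed=0):
    """Stand-in optimizer (NOT Bayesian optimization): sample policies
    uniformly and keep the one with the best aggregate reward."""
    rng = random.Random(seed)
    best_policy, best_value = None, float("-inf")
    for _ in range(n_iter):
        policy = [rng.random() for _ in CONTEXT_WEIGHTS]
        value = aggregate_reward(policy)
        if value > best_value:
            best_policy, best_value = policy, value
    return best_policy, best_value


if __name__ == "__main__":
    policy, value = random_search()
    print("best policy:", [round(x, 2) for x in policy])
    print("aggregate reward:", round(value, 4))
```

Because only `aggregate_reward` is observable, the optimizer cannot tune each context separately, and the dimensionality of the search grows with the number of contexts. This is the difficulty the abstract attributes to the setting.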

Related Publications

NSDI - April 12, 2021

A Social Network Under Social Distancing: Risk-Driven Backbone Management During COVID-19 and Beyond

Yiting Xia, Ying Zhang, Zhizhen Zhong, Guanqing Yan, Chiun Lin Lim, Satyajeet Singh Ahuja, Soshant Bali, Alexander Nikolaidis, Kimia Ghobadi, Manya Ghobadi

NeurIPS - December 7, 2020

Labelling unlabelled videos from scratch with multi-modal self-supervision

Yuki M. Asano, Mandela Patrick, Christian Rupprecht, Andrea Vedaldi

NeurIPS - December 7, 2020

Adversarial Example Games

Avishek Joey Bose, Gauthier Gidel, Hugo Berard, Andre Cianflone, Pascal Vincent, Simon Lacoste-Julien, William L. Hamilton

NeurIPS - December 7, 2020

Learning Search Space Partition for Black-box Optimization using Monte Carlo Tree Search

Linnan Wang, Rodrigo Fonseca, Yuandong Tian
