Publication

Learning Cooperative Visual Dialog Agents with Deep Reinforcement Learning

International Conference on Computer Vision (ICCV)


Abstract

We introduce the first goal-driven training for visual question answering and dialog agents. Specifically, we pose a cooperative ‘image guessing’ game between two agents – Q-BOT and A-BOT – who communicate in natural language dialog so that Q-BOT can select an unseen image from a lineup of images. We use deep reinforcement learning (RL) to learn the policies of these agents end-to-end – from pixels to multi-agent multi-round dialog to game reward.

We demonstrate two experimental results.

First, as a ‘sanity check’ demonstration of pure RL (from scratch), we show results on a synthetic world, where the agents communicate in ungrounded vocabularies, i.e., symbols with no pre-specified meanings (X, Y, Z). We find that two bots invent their own communication protocol and start using certain symbols to ask/answer about certain visual attributes (shape/color/style). Thus, we demonstrate the emergence of grounded language and communication among ‘visual’ dialog agents with no human supervision.
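To make the synthetic-world idea concrete, the following is a minimal sketch of how two agents might learn a shared meaning for ungrounded symbols from game reward alone. It is not the paper's setup: it assumes a single-round game in which A-BOT sees one attribute value and emits one symbol, Q-BOT guesses the value from that symbol, and both use tabular softmax policies updated with REINFORCE and a running-average baseline. All names (`train`, `greedy_accuracy`, `num_values`, `vocab_size`) and hyperparameters are hypothetical.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample(probs, rng):
    """Draw an index from a categorical distribution."""
    r = rng.random()
    acc = 0.0
    for i, p in enumerate(probs):
        acc += p
        if r < acc:
            return i
    return len(probs) - 1


def train(num_values=3, vocab_size=3, episodes=5000, lr=0.5, seed=0):
    """Toy single-round guessing game (an assumption, not the paper's game)."""
    rng = random.Random(seed)
    # a_logits[v][s]: A-BOT's preference for symbol s when it sees value v.
    a_logits = [[0.0] * vocab_size for _ in range(num_values)]
    # q_logits[s][g]: Q-BOT's preference for guess g when it hears symbol s.
    q_logits = [[0.0] * num_values for _ in range(vocab_size)]
    baseline = 0.0
    for _ in range(episodes):
        value = rng.randrange(num_values)      # hidden attribute value
        a_probs = softmax(a_logits[value])
        symbol = sample(a_probs, rng)          # A-BOT "speaks"
        q_probs = softmax(q_logits[symbol])
        guess = sample(q_probs, rng)           # Q-BOT guesses
        reward = 1.0 if guess == value else 0.0  # shared team reward
        advantage = reward - baseline
        baseline += 0.01 * (reward - baseline)
        # REINFORCE: d/dlogit log softmax = one_hot(action) - probs.
        for s in range(vocab_size):
            grad = (1.0 if s == symbol else 0.0) - a_probs[s]
            a_logits[value][s] += lr * advantage * grad
        for g in range(num_values):
            grad = (1.0 if g == guess else 0.0) - q_probs[g]
            q_logits[symbol][g] += lr * advantage * grad
    return a_logits, q_logits


def greedy_accuracy(a_logits, q_logits):
    """Fraction of values recovered when both agents act greedily."""
    correct = 0
    for value, row in enumerate(a_logits):
        symbol = max(range(len(row)), key=row.__getitem__)
        q_row = q_logits[symbol]
        guess = max(range(len(q_row)), key=q_row.__getitem__)
        correct += guess == value
    return correct / len(a_logits)
```

Calling `greedy_accuracy(*train())` reports how much of a symbol-to-attribute mapping the two policies have agreed on. Whether a fully consistent protocol emerges depends on the seed and hyperparameters, so the sketch makes no claim about the final score.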

Second, we conduct large-scale real-image experiments on the VisDial dataset [5], where we perform supervised pretraining with human-dialog data and show that the RL fine-tuned agents significantly outperform their supervised counterparts. Interestingly, the RL Q-BOT learns to ask questions that A-BOT is good at, ultimately resulting in more informative dialog and a better team. Further, pretraining with human-dialog data (in English) ensures human-interpretability and scope for pairing these agents with humans.

