October 22, 2017

Unsupervised Creation of Parameterized Avatars

International Conference on Computer Vision (ICCV)

We study the problem of mapping an input image to a tied pair consisting of a vector of parameters and an image that is created using a graphical engine from the vector of parameters. The mapping’s objective is to have the output image as similar as possible to the input image.

Lior Wolf, Yaniv Taigman, Adam Polyak
October 22, 2017

Learning to Reason: End-to-End Module Networks for Visual Question Answering

International Conference on Computer Vision (ICCV)

In this paper, we propose End-to-End Module Networks (N2NMNs), which learn to reason by directly predicting instance-specific network layouts without the aid of a parser.

Ronghang Hu, Jacob Andreas, Marcus Rohrbach, Trevor Darrell, Kate Saenko
October 22, 2017

Segmentation-Aware Convolutional Networks using Local Attention Masks

International Conference on Computer Vision (ICCV)

We introduce an approach to integrate segmentation information within a convolutional neural network (CNN). This counter-acts the tendency of CNNs to smooth information across regions and increases their spatial precision.

Adam W. Harley, Konstantinos G. Derpanis, Iasonas Kokkinos
October 22, 2017

Focal Loss for Dense Object Detection

International Conference on Computer Vision (ICCV)

In this paper, we investigate why one-stage detectors that are applied over a regular, dense sampling of possible object locations have the potential to be faster and simpler, but have trailed the accuracy of two-stage detectors thus far. We design and train a simple dense detector we call RetinaNet. Our results show that when trained with the focal loss, RetinaNet is able to match the speed of previous one-stage detectors while surpassing the accuracy of all existing state-of-the-art two-stage detectors.

Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He, Piotr Dollár
October 22, 2017

Low-shot Visual Recognition by Shrinking and Hallucinating Features

International Conference on Computer Vision (ICCV)

Low-shot visual learning, the ability to recognize novel object categories from very few examples, is a hallmark of human visual intelligence. Existing machine learning approaches fail to generalize in the same way. We present a low-shot learning benchmark on complex images that mimics challenges faced by recognition systems in the wild.

Bharath Hariharan, Ross Girshick
October 22, 2017

Dense and Low-Rank Gaussian CRFs using Deep Embeddings

International Conference on Computer Vision (ICCV)

In this work we introduce a structured prediction model that endows the Deep Gaussian Conditional Random Field (G-CRF) with a densely connected graph structure.

Siddhartha Chandra, Nicolas Usunier, Iasonas Kokkinos
October 22, 2017

Inferring and Executing Programs for Visual Reasoning

International Conference on Computer Vision (ICCV)

Inspired by module networks, this paper proposes a model for visual reasoning that consists of a program generator that constructs an explicit representation of the reasoning process to be performed, and an execution engine that executes the resulting program to produce an answer.

Justin Johnson, Bharath Hariharan, Laurens van der Maaten, Judy Hoffman, Li Fei-Fei, Larry Zitnick, Ross Girshick
October 22, 2017

Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization

International Conference on Computer Vision (ICCV)

We propose a technique for producing ‘visual explanations’ for decisions from a large class of Convolutional Neural Network (CNN)-based models, making them more transparent and explainable.

Ramprasaath R. Selvaraju, Michael Cogswell, Abhishek Das, Ramakrishna Vedantam, Devi Parikh, Dhruv Batra
October 22, 2017

Learning Cooperative Visual Dialog Agents with Deep Reinforcement Learning

International Conference on Computer Vision (ICCV)

We introduce the first goal-driven training for visual question answering and dialog agents. Specifically, we pose a cooperative ‘image guessing’ game between two agents, Q-BOT and A-BOT, who communicate in natural language dialog so that Q-BOT can select an unseen image from a lineup of images.

Abhishek Das, Satwik Kottur, José M.F. Moura, Stefan Lee, Dhruv Batra
October 22, 2017

Predicting Deeper into the Future of Semantic Segmentation

International Conference on Computer Vision (ICCV)

The ability to predict and therefore to anticipate the future is an important attribute of intelligence. We introduce the novel task of predicting semantic segmentations of future frames. Given a sequence of video frames, our goal is to predict segmentation maps of not yet observed video frames that lie up to a second or further in the future.

Pauline Luc, Natalia Neverova, Camille Couprie, Jakob Verbeek, Yann LeCun
October 22, 2017

Speaking the Same Language: Matching Machine to Human Captions by Adversarial Training

International Conference on Computer Vision (ICCV)

While strong progress has been made in image captioning recently, machine and human captions are still quite distinct. To address the challenges in this area, we change the training objective of the caption generator from reproducing ground-truth captions to generating a set of captions that is indistinguishable from human written captions.

Rakshith Shetty, Marcus Rohrbach, Lisa Anne Hendricks, Mario Fritz, Bernt Schiele
October 22, 2017

Deltille Grids for Geometric Camera Calibration

International Conference on Computer Vision (ICCV)

The recent proliferation of high resolution cameras presents an opportunity to achieve unprecedented levels of precision in visual 3D reconstruction. Yet the camera calibration pipeline, developed decades ago using checkerboards, has remained the de facto standard. In this paper, we ask the question: are checkerboards the optimal pattern for high precision calibration?

Hyowon Ha, Michal Perdoch, Hatem Alismail, In So Kweon, Yaser Sheikh
October 9, 2017

Labeling And Direction Of Slider Questions

International Journal of Market Research

Using a web survey experiment, this study examines measurement comparability between two radio button questions (fully labelled and endpoint labelled) and slider questions, which are unique to web surveys.

Mingnan Liu
October 8, 2017

Living a Discrete Life in a Continuous World: Reference in Cross-Modal Entity Tracking

Proceedings of IWCS (12th International Conference on Computational Semantics)

This paper (a) introduces a concrete referential task to test both aspects, called cross-modal entity tracking; (b) proposes a neural network architecture that uses external memory to build an entity library inspired by the DRSs of DRT, with a mechanism to dynamically introduce new referents or add information to referents that are already in the library.

Gemma Boleda, Sebastian Padó, Nghia The Pham, Marco Baroni
October 5, 2017

STARDATA: a StarCraft AI Research Dataset

Association for the Advancement of Artificial Intelligence Digital Entertainment Conference

We release a dataset of 65646 StarCraft replays that contains 1535 million frames and 496 million player actions. We provide full game state data along with the original replays that can be viewed in StarCraft.

Zeming Lin, Jonas Gehring, Vasil Khalidov, Gabriel Synnaeve
September 24, 2017

Propagation of Joint Space Quantization Error to Operational Space Coordinates and Their Derivatives

IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)

In this paper, we characterize encoder error in a robotic system. Given encoder specifications, robot kinematics, and discrete transfer functions mapping coordinates to their derivatives, we describe the propagation of quantization error on joint space coordinates to operational space coordinates, joint space coordinate derivatives, and operational space coordinate derivatives.

Nick Colonnese, Allison M. Okamura
September 20, 2017

Characterizing Large-Scale Production Reliability for 100G Optical Interconnect in Facebook Data Centers

Frontiers in Optics / Laser Science (FiO/LS)

Facebook is deploying cost-effective 100G CWDM4 transceivers in data centers. This paper describes the post-production performance monitoring system that is being implemented to identify early failure modes in optical interconnects.

Abhijit Chakravarty, Srinivasan Giridharan, Matt Kelly, Ashwin Poojary, Vincent Zeng
September 20, 2017

100Gb/s CWDM4 Optical Interconnect at Facebook Data Centers for Bandwidth Enhancement

Frontiers in Optics / Laser Science (FiO/LS)

Facebook has developed 100G data centers from the ground up by fine-tuning optical technologies, optimizing link budget, limiting operating temperatures, and ultimately improving manufacturability. 100G-CWDM4 is an effective technology to enable connectivity over duplex single-mode fiber.

Abhijit Chakravarty, Katharine Schmidtke, Vincent Zeng, Srinivasan Giridharan, Cathie Deal, Reza Niazmand
September 17, 2017

Passive Realtime Datacenter Fault Detection and Localization

USENIX ;login:

In this article, we present our passive hybrid approach that combines network path information with end-host-based statistics to rapidly detect and pinpoint the location of datacenter network faults inside a production Facebook datacenter.

Arjun Roy, James Hongyi Zeng, Jasmeet Bagga, Alex C. Snoeren
September 9, 2017

Supervised Learning of Universal Sentence Representations from Natural Language Inference Data

Conference on Empirical Methods in Natural Language Processing (EMNLP)

In this paper, we show how universal sentence representations trained using the supervised data of the Stanford Natural Language Inference datasets can consistently outperform unsupervised methods like SkipThought vectors (Kiros et al., 2015) on a wide range of transfer tasks.

Alexis Conneau, Douwe Kiela, Holger Schwenk, Loïc Barrault, Antoine Bordes