December 4, 2017

Fader Networks: Manipulating Images by Sliding Attributes

Neural Information Processing Systems (NIPS)

This paper introduces a new encoder-decoder architecture that is trained to reconstruct images by disentangling the salient information of the image and the values of attributes directly in the latent space.

Guillaume Lample, Neil Zeghidour, Nicolas Usunier, Antoine Bordes, Ludovic Denoyer, Marc'Aurelio Ranzato
November 27, 2017

Fast Gaze-Contingent Optimal Decompositions for Multifocal Displays


Our goal is to enable interactive optimal decomposition algorithms capable of driving a vergence- and accommodation-tracked multifocal testbed. Ultimately, such a testbed is necessary to establish the requirements for the practical use of multifocal displays, in terms of computational demand and hardware accuracy. To this end, we present an efficient algorithm for optimal decompositions, incorporating insights from vision science.

Oliver Mercier, Yusufu Sulai, Kevin Mackenzie, Marina Zannoli, James Hillis, Derek Nowrouzezahrai, Douglas Lanman
November 27, 2017

Casual 3D Photography


We present an algorithm that enables casual 3D photography. Given a set of input photos captured with a hand-held cell phone or DSLR camera, our algorithm reconstructs a 3D photo: a central panoramic, textured, normal-mapped, multi-layered geometric mesh representation. Our geometric representation also allows interacting with the scene using 3D geometry-aware effects, such as adding new objects to the scene and artistic lighting effects.

Peter Hedman, Suhib Alsisan, Richard Szeliski, Johannes Kopf
November 27, 2017

Bringing Portraits to Life


We present a technique to automatically animate a still portrait, making it possible for the subject in the photo to […]

Hadar Averbuch-Elor, Daniel Cohen-Or, Johannes Kopf, Michael Cohen
November 14, 2017

Fine Pointing Concepts for Optical Intersatellite Links

International Conference on Space Optical Systems (ICSOS)

In this paper we present two fine pointing architectures that support high-accuracy pointing, while also featuring simplification of the small-space optical assembly hardware, reduction in the number and complexity of constituent components, full self-calibration capability, and relaxation of otherwise-strict, cost-driving subassembly requirements.

Eric Miller, Kevin Birnbaum, Chien-Chung Chen, Andrew Grier, Matt Hunwardsen, Dominic Jandrain
November 1, 2017

High-Resolution Measurement of Data Center Microbursts

ACM Internet Measurement Conference

In this study, we explore the fine-grained behaviors of a large production data center using extremely high-resolution measurements (10s to 100s of microseconds) of rack-level traffic. Our results show that characterizing network events like congestion and synchronized behavior in data centers does indeed require the use of such measurements.

Qiao Zhang, Vincent Liu, James Hongyi Zeng, Arvind Krishnamurthy
November 1, 2017

An Empirical Characterization of IFTTT: Ecosystem, Usage, and Performance

ACM Internet Measurement Conference 2017

IFTTT is a popular trigger-action programming platform whose applets can automate more than 400 services of IoT devices and web applications. We conduct an empirical study of IFTTT using a combined approach of analyzing data collected for 6 months and performing controlled experiments using a custom testbed.

Xianghang Mi, Feng Qian, Ying Zhang, XiaoFeng Wang
October 30, 2017

Crowd Intelligence Enhances Automated Mobile Testing

Automated Software Engineering Conference (ASE)

In this paper we show that information extracted from crowd-based testing can enhance automated mobile testing and introduce POLARIZ, which generates replicable test scripts from crowd-based testing, extracting cross-app ‘motif’ events.

Ke Mao, Mark Harman, Yue Jia
October 29, 2017

SVE: Distributed Video Processing at Facebook Scale

Symposium on Operating Systems Principles (SOSP)

This paper describes the evolution from our initial monolithic encoding script (MES) system to our current Streaming Video Engine (SVE) to upload and process videos. SVE has been in production since the fall of 2015, provides lower latency than MES, supports many diverse video applications, and has proven to be reliable despite faults and overload.

Qi Huang, Petchean Ang, Peter Knowles, Tomasz Nykiel, Iaroslav Tverdokhlib, Amit Yajurvedi, Paul Dapolito IV, Xifan Yan, Maxim Bykov, Chuen Liang, Mohit Talwar, Abhishek Mathur, Sachin Kulkarni, Matthew Burke, Wyatt Lloyd
October 28, 2017

Canopy: An End-to-End Performance Tracing and Analysis System

Symposium on Operating Systems Principles (SOSP)

This paper presents Canopy, Facebook’s end-to-end performance tracing infrastructure. Using Canopy, Facebook engineers can query and analyze performance data in real time.

Jonathan Kaldor, Jonathan Mace, Michał Bejda, Edison Gao, Wiktor Kuropatwa, Joe O’Neill, Kian Win Ong, Bill Schaller, Pingjia Shan, Brendan Viscomi, Vinod Venkataraman, Kaushik Veeraraghavan, Yee Jiun Song
October 25, 2017

DodecaPen: Accurate 6DoF Tracking of a Passive Stylus

ACM Symposium on User Interface Software and Technology (UIST)

We propose a system for real-time six degrees of freedom (6DoF) tracking of a passive stylus that achieves sub-millimeter accuracy, which is suitable for writing or drawing in mixed reality applications.

Po-Chen Wu, Robert Wang, Kenrick Kin, Christopher Twigg, Shangchen Han, Ming-Hsuan Yang, Shao-Yi Chien
October 22, 2017

Focal Loss for Dense Object Detection

International Conference on Computer Vision (ICCV)

In this paper, we investigate why one-stage detectors, which are applied over a regular, dense sampling of possible object locations and have the potential to be faster and simpler, have so far trailed two-stage detectors in accuracy. We design and train a simple dense detector we call RetinaNet. Our results show that when trained with the focal loss, RetinaNet is able to match the speed of previous one-stage detectors while surpassing the accuracy of all existing state-of-the-art two-stage detectors.

Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He, Piotr Dollar
October 22, 2017

Low-shot Visual Recognition by Shrinking and Hallucinating Features

International Conference on Computer Vision (ICCV)

Low-shot visual learning—the ability to recognize novel object categories from very few examples—is a hallmark of human visual intelligence. Existing machine learning approaches fail to generalize in the same way. We present a low-shot learning benchmark on complex images that mimics challenges faced by recognition systems in the wild.

Bharath Hariharan, Ross Girshick
October 22, 2017

Dense and Low-Rank Gaussian CRFs using Deep Embeddings

International Conference on Computer Vision (ICCV)

In this work we introduce a structured prediction model that endows the Deep Gaussian Conditional Random Field (G-CRF) with a densely connected graph structure.

Siddhartha Chandra, Nicolas Usunier, Iasonas Kokkinos
October 22, 2017

Inferring and Executing Programs for Visual Reasoning

International Conference on Computer Vision (ICCV)

Inspired by module networks, this paper proposes a model for visual reasoning that consists of a program generator that constructs an explicit representation of the reasoning process to be performed, and an execution engine that executes the resulting program to produce an answer.

Justin Johnson, Bharath Hariharan, Laurens van der Maaten, Judy Hoffman, Li Fei-Fei, Larry Zitnick, Ross Girshick
October 22, 2017

Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization

International Conference on Computer Vision (ICCV)

We propose a technique for producing ‘visual explanations’ for decisions from a large class of Convolutional Neural Network (CNN)-based models, making them more transparent and explainable.

Ramprasaath R. Selvaraju, Michael Cogswell, Abhishek Das, Ramakrishna Vedantam, Devi Parikh, Dhruv Batra
October 22, 2017

Learning Visual N-Grams from Web Data

International Conference on Computer Vision (ICCV)

This paper explores the training of image-recognition systems on large numbers of images and associated user comments, without using manually labeled images.

Ang Li, Allan Jabri, Armand Joulin, Laurens van der Maaten
October 22, 2017

Learning Cooperative Visual Dialog Agents with Deep Reinforcement Learning

International Conference on Computer Vision (ICCV)

We introduce the first goal-driven training for visual question answering and dialog agents. Specifically, we pose a cooperative ‘image guessing’ game between two agents – Q-BOT and A-BOT – who communicate in natural language dialog so that Q-BOT can select an unseen image from a lineup of images.

Abhishek Das, Satwik Kottur, José M.F. Moura, Stefan Lee, Dhruv Batra
October 22, 2017

Transitive Invariance for Self-supervised Visual Representation Learning

International Conference on Computer Vision (ICCV)

In this paper, we propose to exploit different self-supervised approaches to learn representations invariant to (i) inter-instance variations (two objects in the same class should have similar features) and (ii) intra-instance variations (viewpoint, pose, deformations, illumination, etc.).

Xiaolong Wang, Kaiming He, Abhinav Gupta
October 22, 2017

Predicting Deeper into the Future of Semantic Segmentation

International Conference on Computer Vision (ICCV)

The ability to predict and therefore to anticipate the future is an important attribute of intelligence. We introduce the novel task of predicting semantic segmentations of future frames. Given a sequence of video frames, our goal is to predict segmentation maps of not yet observed video frames that lie up to a second or further in the future.

Pauline Luc, Natalia Neverova, Camille Couprie, Jakob Verbeek, Yann LeCun