September 4, 2018

Self-Supervised Feature Learning for Semantic Segmentation of Overhead Imagery

British Machine Vision Conference (BMVC)

In this work, we study various self-supervised feature learning techniques for semantic segmentation of overhead imagery.

By: Suriya Singh, Anil Batra, Guan Pang, Lorenzo Torresani, Saikat Basu, Manohar Paluri, C.V. Jawahar
September 3, 2018

QuaterNet: A Quaternion-based Recurrent Model for Human Motion

British Machine Vision Conference (BMVC)

Deep learning for predicting or generating 3D human pose sequences is an active research area. Previous work regresses either joint rotations or joint positions. The former strategy is prone to error accumulation along the kinematic chain, as well as discontinuities when using Euler angle or exponential map parameterizations. The latter requires re-projection onto skeleton constraints to avoid bone stretching and invalid configurations. This work addresses both limitations.

By: Dario Pavllo, David Grangier, Michael Auli
August 26, 2018

Rosetta: Large Scale System for Text Detection and Recognition in Images

Knowledge Discovery in Databases (KDD)

In this paper we present a deployed, scalable optical character recognition (OCR) system, which we call Rosetta, designed to process images uploaded daily at Facebook scale.

By: Fedor Borisyuk, Albert Gordo, Viswanath Sivakumar
August 21, 2018

SE-Sync: A certifiably correct algorithm for synchronization over the special Euclidean group

International Journal of Robotics Research

Many important geometric estimation problems naturally take the form of synchronization over the special Euclidean group: estimate the values of a set of unknown group elements $x_1, \dots, x_n \in \mathrm{SE}(d)$ given noisy measurements of a subset of their pairwise relative transforms $x_i^{-1} x_j$. Examples of this class include the foundational problems of pose-graph simultaneous localization and mapping (SLAM) (in robotics), camera motion estimation (in computer vision), and sensor network localization (in distributed sensing), among others. This inference problem is typically formulated as a nonconvex maximum-likelihood estimation that is computationally hard to solve in general. Nevertheless, in this paper we present an algorithm that is able to efficiently recover certifiably globally optimal solutions of the special Euclidean synchronization problem in a non-adversarial noise regime.

By: David M. Rosen, Luca Carlone, Afonso S. Bandeira, John J. Leonard
August 14, 2018

Deep Appearance Models for Face Rendering

ACM SIGGRAPH

We introduce a deep appearance model for rendering the human face. Inspired by Active Appearance Models, we develop a data-driven rendering pipeline that learns a joint representation of facial geometry and appearance from a multiview capture setup.

By: Stephen Lombardi, Jason Saragih, Tomas Simon, Yaser Sheikh
August 13, 2018

Unsupervised Generation of Free-Form and Parameterized Avatars

TPAMI: SI: ICCV

We study two problems involving the task of mapping images between different domains. The first problem transfers an image in one domain to an analogous image in another domain. The second problem extends the first by mapping an input image to a tied pair, consisting of a vector of parameters and an image that is created using a graphical engine from this vector of parameters.

By: Adam Polyak, Yaniv Taigman, Lior Wolf
July 29, 2018

Online Optical Marker-based Hand Tracking with Deep Labels

Special Interest Group on Computer Graphics and Interactive Techniques (SIGGRAPH)

We propose a technique that frames the labeling problem as a keypoint regression problem conducive to a solution using convolutional neural networks.

By: Shangchen Han, Beibei Liu, Robert Wang, Yuting Ye, Christopher D. Twigg, Kenrick Kin
July 10, 2018

Optimizing the Latent Space of Generative Networks

International Conference on Machine Learning (ICML)

The goal of this paper is to disentangle the contributions of two factors to the success of GANs: the adversarial training procedure and the deep convolutional parameterization of the generator and discriminator. In particular, we introduce Generative Latent Optimization (GLO), a framework to train deep convolutional generators using simple reconstruction losses.

By: Piotr Bojanowski, Armand Joulin, David Lopez-Paz, Arthur Szlam
July 7, 2018

Reconstructing Scenes with Mirror and Glass Surfaces

ACM SIGGRAPH

We introduce a fully automatic pipeline that allows us to reconstruct the geometry and extent of planar glass and mirror surfaces while being able to distinguish between the two.

By: Thomas Whelan, Michael Goesele, Steven J. Lovegrove, Julian Straub, Simon Green, Richard Szeliski, Steven Butterfield, Shobhit Verma, Richard Newcombe
June 19, 2018

A Generative Adversarial Approach for Zero-Shot Learning from Noisy Texts

Computer Vision and Pattern Recognition (CVPR)

Most existing zero-shot learning methods consider the problem as a visual semantic embedding one. Given the demonstrated capability of Generative Adversarial Networks (GANs) to generate images, we instead leverage GANs to imagine unseen categories from text descriptions and hence recognize novel classes with no examples being seen.

By: Yizhe Zhu, Mohamed Elhoseiny, Bingchen Liu, Xi Peng, Ahmed Elgammal