October 1, 2018

The effects of natural scene statistics on text readability in additive displays

Human Factors and Ergonomics Society

The minimum contrast needed for optimal text readability with additive displays (e.g., AR devices) depends on the spatial structure of the background and text. Natural scenes and text follow similar spectral patterns. Therefore, natural scenes can mask low-contrast text, making it difficult to read. In a set of experiments, we determine the minimum viable contrast for readability on an additive display.

By: Daryn R. Blanc-Goldhammer, Kevin J. MacKenzie
Areas: AR/VR

September 9, 2018

DDRNet: Depth Map Denoising and Refinement for Consumer Depth Cameras Using Cascaded CNNs

European Conference on Computer Vision (ECCV)

Although much progress has been made in reducing noise and boosting geometric detail, the problem is still far from being solved, owing to its inherent ill-posedness and the real-time requirement. We propose a cascaded Depth Denoising and Refinement Network (DDRNet) to tackle this problem by leveraging the multi-frame fused geometry and the accompanying high-quality color image through a joint training strategy.

By: Shi Yan, Chenglei Wu, Lizhen Wang, Feng Xu, Liang An, Kaiwen Guo, Yebin Liu

September 8, 2018

DeepWrinkles: Accurate and Realistic Clothing Modeling

European Conference on Computer Vision (ECCV)

We present a novel method to generate accurate and realistic clothing deformation from real data capture. Previous methods for realistic cloth modeling mainly rely on computationally intensive physics-based simulation (with numerous heuristic parameters), while models reconstructed from visual observations typically suffer from a lack of geometric detail.

By: Zorah Lähner, Daniel Cremers, Tony Tung

September 7, 2018

Recycle-GAN: Unsupervised Video Retargeting

European Conference on Computer Vision (ECCV)

We introduce a data-driven approach for unsupervised video retargeting that translates content from one domain to another while preserving the style native to a domain. For example, if the contents of John Oliver’s speech were transferred to Stephen Colbert, the generated content/speech should be in Stephen Colbert’s style.

By: Aayush Bansal, Shugao Ma, Deva Ramanan, Yaser Sheikh

September 3, 2018

QuaterNet: A Quaternion-based Recurrent Model for Human Motion

British Machine Vision Conference (BMVC)

Deep learning for predicting or generating 3D human pose sequences is an active research area. Previous work regresses either joint rotations or joint positions. The former strategy is prone to error accumulation along the kinematic chain, as well as discontinuities when using Euler angle or exponential map parameterizations. The latter requires re-projection onto skeleton constraints to avoid bone stretching and invalid configurations. This work addresses both limitations.

By: Dario Pavllo, David Grangier, Michael Auli

August 26, 2018

Designing Variable Stiffness Profiles To Optimize The Physical Human Robot Interface Of Hand Exoskeletons

The 7th IEEE RAS/EMBS International Conference on Biomedical Robotics and Biomechatronics

The design of comfortable and effective physical human-robot interaction (pHRI) interfaces for force transfer is a prominent challenge for coupled human-robot systems. Forces applied by the robot at the fingers create reaction forces on the dorsal surface of the hand, often leading to high pressure concentrations that can cause pain and discomfort. In this paper, the interaction between the pHRI interface and the dorsal surface of the hand is systematically characterized, and a new method for the design of comfortable interfaces is presented.

By: Rohit John Varghese, Gaurav Mukherjee, Raymond King, Sean Keller, Ashish D. Deshpande

August 21, 2018

SE-Sync: A certifiably correct algorithm for synchronization over the special Euclidean group

International Journal of Robotics Research

Many important geometric estimation problems naturally take the form of synchronization over the special Euclidean group: estimate the values of a set of unknown group elements x_1, ..., x_n ∈ SE(d) given noisy measurements of a subset of their pairwise relative transforms x_i^{-1} x_j. Examples of this class include the foundational problems of pose-graph simultaneous localization and mapping (SLAM) (in robotics), camera motion estimation (in computer vision), and sensor network localization (in distributed sensing), among others. This inference problem is typically formulated as a nonconvex maximum-likelihood estimation that is computationally hard to solve in general. Nevertheless, in this paper we present an algorithm that is able to efficiently recover certifiably globally optimal solutions of the special Euclidean synchronization problem in a non-adversarial noise regime.

By: David M. Rosen, Luca Carlone, Afonso S. Bandeira, John J. Leonard

August 14, 2018

Deep Appearance Models for Face Rendering

ACM SIGGRAPH

We introduce a deep appearance model for rendering the human face. Inspired by Active Appearance Models, we develop a data-driven rendering pipeline that learns a joint representation of facial geometry and appearance from a multiview capture setup.

By: Stephen Lombardi, Jason Saragih, Tomas Simon, Yaser Sheikh

August 10, 2018

Effects of virtual acoustics on target-word identification performance in multi-talker environments

ACM Symposium on Applied Perception (SAP)

Many virtual reality applications let multiple users communicate in a multi-talker environment, recreating the classic cocktail-party effect. While there is a vast body of research on the perception and intelligibility of human speech in real-world scenarios with cocktail-party effects, there is little work on accurately modeling and evaluating the effect in virtual environments.

By: Atul Rungta, Nicholas Rewkowski, Carl Schissler, Philip Robinson, Ravish Mehra, Dinesh Manocha
Areas: AR/VR

August 10, 2018

Learning to Feel Words: A Comparison of Learning Approaches to Acquire Haptic Words

ACM Symposium on Applied Perception (SAP)

Recent studies have shown that decomposing spoken or written language into phonemes and transcribing each phoneme into a unique vibrotactile pattern enables people to receive lexical messages on the arm. A potential barrier to adopting this new communication system is the time and effort required to learn the association between phonemes and vibrotactile patterns. Therefore, in this study, we compared the learnability and generalizability of different learning approaches, including guided learning, self-directed learning, and a mnemonic device.

By: Jennifer Chen, Robert Turcott, Pablo Castillo, Wahyudinata Setiawan, Frances Lau, Ali Israr
Areas: AR/VR