Next generation virtual assistants are envisioned to handle multimodal inputs (e.g., vision, memories of previous interactions, and the user’s utterances), and perform multimodal actions (e.g., displaying a route while generating the system’s utterance). We introduce Situated Interactive MultiModal Conversations (SIMMC) as a new direction aimed at training agents that take multimodal actions grounded in a co-evolving multimodal input context in addition to the dialog history.
By extending recent advances in vision-based tracking and physically based animation, we present the first algorithm capable of tracking high-fidelity hand deformations through highly self-contacting and self-occluding hand gestures, for both single hands and two hands.
We present a method to capture temporally coherent dynamic clothing deformation from a monocular RGB video input. In contrast to the existing literature, our method does not require a pre-scanned personalized mesh template, and thus can be applied to in-the-wild videos.
Change-based testing is a key component of continuous integration at Facebook. However, the sheer number of tests, coupled with the high rate of changes committed to our monolithic repository, makes it infeasible to run all potentially impacted tests on each change. We propose a new predictive test selection strategy that selects a subset of tests to exercise for each change submitted to the continuous integration system.
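A minimal sketch of the selection step, under simplifying assumptions: suppose a learned model has already assigned each candidate test an estimated probability of failing on the change; tests are then kept, highest probability first, until a target cumulative probability of catching at least one failure is reached. The function name, the independence assumption, and the threshold are illustrative, not the production system.

```python
def select_tests(candidates, target=0.99):
    """candidates: list of (test_name, p_fail), where p_fail is a model's
    estimated probability that the test fails on this change. Returns the
    tests to run, greedily chosen until the estimated probability of
    detecting at least one failure reaches `target`."""
    ranked = sorted(candidates, key=lambda t: t[1], reverse=True)
    selected, p_miss = [], 1.0
    for name, p_fail in ranked:
        if 1.0 - p_miss >= target:
            break
        selected.append(name)
        p_miss *= (1.0 - p_fail)  # assumes test failures are independent
    return selected
```

For example, with candidates whose estimated failure probabilities are 0.95, 0.9, and 0.01, the first two already push the estimated detection probability past 0.99, so the third is skipped.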
We propose an optimization method for grating vector fields that accounts for the unique selectivity properties of holographic optical elements (HOEs). We further show how our pipeline can be applied to two distinct HOE fabrication methods.
In this paper, we investigate the utility of remote tactile feedback for freehand text entry on a mid-air QWERTY keyboard in VR. To that end, we draw on insights from prior work to design a virtual keyboard along with different forms of tactile feedback, both spatial and non-spatial, delivered to the fingers and to the wrists.
In this article, suggestions are made as to why AR platforms may offer ideal affordances to compensate for hearing loss, and how research-focused AR platforms could contribute to a better understanding of the role of hearing in everyday life.
In this paper, a method for estimating the SCM of reverberant speech is proposed, based on the selection of time-frequency bins dominated by reverberation. The method is data-based and estimates the SCM for a specific acoustic scene. It is therefore applicable to realistic reverberant fields.
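The bin-selection idea can be illustrated as follows, assuming SCM here denotes a per-frequency spatial covariance matrix. This sketch takes a precomputed boolean mask of selected (reverberation-dominated) time-frequency bins as input; the paper's actual data-based selection criterion is not reproduced.

```python
import numpy as np

def estimate_scm(X, mask):
    """X: STFT tensor of shape (mics, frames, freqs); mask: boolean
    (frames, freqs) array marking the selected time-frequency bins.
    Returns one covariance matrix per frequency: (freqs, mics, mics)."""
    M, T, F = X.shape
    scm = np.zeros((F, M, M), dtype=complex)
    for f in range(F):
        sel = X[:, mask[:, f], f]               # (mics, n_selected)
        if sel.shape[1]:
            # average of snapshot outer products over selected bins
            scm[f] = sel @ sel.conj().T / sel.shape[1]
    return scm
```

Averaging outer products over only the masked bins is what ties the estimate to the reverberant component of the specific acoustic scene.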
A common approach to overcoming the effect of reverberation in speaker localization is to identify the time-frequency (TF) bins in which the direct path is dominant, and then to use only these bins for estimation. Various direct-path dominance (DPD) tests have been proposed for identifying the direct-path bins. However, for a two-microphone binaural array, tests that do not employ averaging over TF bins seem to fail. In this paper, this anomaly is studied by comparing two DPD tests, only one of which has been designed to employ averaging over TF bins.
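An averaging-based DPD test for a two-microphone array can be sketched as follows: the 2x2 correlation matrix is averaged over a TF neighborhood, and the ratio of its eigenvalues serves as the test statistic, with a large ratio indicating an effectively rank-one (direct-path-dominated) bin. Note that without averaging, the matrix is the outer product of a single snapshot and is always rank one, so the ratio degenerates for every bin, consistent with the failure of non-averaging tests noted above. The neighborhood sizes and numerical floor below are illustrative choices, not any specific published test.

```python
import numpy as np

def dpd_ratio(X, t, f, Jt=2, Jf=2):
    """X: STFT of shape (2, frames, freqs). Returns the eigenvalue ratio
    of the 2x2 correlation matrix averaged over a (2*Jt+1) x (2*Jf+1)
    neighborhood of bin (t, f)."""
    patch = X[:, max(t - Jt, 0):t + Jt + 1,
                 max(f - Jf, 0):f + Jf + 1].reshape(2, -1)
    R = patch @ patch.conj().T / patch.shape[1]   # averaged correlation
    ev = np.linalg.eigvalsh(R)                    # ascending, real values
    return ev[1] / max(ev[0], 1e-12)              # floor avoids div by 0
```

A bin would then pass the test when this ratio exceeds a chosen threshold.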
This paper describes a method for blind estimation of the direct-to-reverberant ratio (DRR) that involves fitting a beta distribution to the magnitude-squared coherence between two binaural audio signals, aggregated over time and frequency.
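The front end of such a method might look like the following sketch: pool magnitude-squared coherence (MSC) values across frequency, here via `scipy.signal.coherence` (which already averages over time segments), and fit a beta distribution with its support fixed to (0, 1). The subsequent mapping from the fitted shape parameters to a DRR value is the paper's contribution and is not reproduced here; the function name and window length are illustrative.

```python
import numpy as np
from scipy.signal import coherence
from scipy.stats import beta

def msc_beta_params(left, right, fs, nperseg=512):
    """Fit a beta distribution to the MSC between two binaural channels.
    Returns the fitted shape parameters (a, b)."""
    _, Cxy = coherence(left, right, fs=fs, nperseg=nperseg)
    vals = np.clip(Cxy, 1e-6, 1 - 1e-6)   # MSC lies in (0, 1); avoid edges
    a, b, _, _ = beta.fit(vals, floc=0, fscale=1)  # fix support to (0, 1)
    return a, b
```

Intuitively, a more reverberant scene yields lower interchannel coherence, shifting the fitted distribution's mass toward zero, which is what makes the shape parameters informative about the DRR.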