
June 1, 2018

The Immersive VR Self: Performance, Embodiment and Presence in Immersive Virtual Reality Environments

Book chapter from A Networked Self and Human Augmentics, Artificial Intelligence, Sentience

Virtual avatars are a common way to present oneself in online social interactions. From cartoonish emoticons to hyper-realistic humanoids, these online representations help us portray a certain image to our respective audiences.

By: Raz Schwartz, William Steptoe
April 21, 2018

Speech Communication through the Skin: Design of Learning Protocols and Initial Findings

Computer Human Interaction (CHI)

This study reports the design and testing of learning protocols with a system that translates English phonemes to haptic stimulation patterns (haptic symbols).
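The core idea of such a system can be sketched as a lookup from phonemes to haptic symbols. The actuator sites, frequencies, and durations below are illustrative placeholders, not the study's actual coding scheme:

```python
# Hypothetical phoneme-to-haptic-symbol table (a sketch of the idea only;
# the sites, frequencies, and durations are invented for illustration and
# are not taken from the study).
PHONEME_TO_SYMBOL = {
    "K":  ("dorsal_wrist", 300, 120),   # (actuator site, frequency_hz, duration_ms)
    "AE": ("volar_wrist", 60, 200),
    "T":  ("forearm", 300, 120),
}

def encode_word(phonemes):
    """Translate a phoneme sequence into the haptic symbols to play in order."""
    return [PHONEME_TO_SYMBOL[p] for p in phonemes]
```

For example, `encode_word(["K", "AE", "T"])` ("cat") yields three stimulation patterns to present in sequence, which is the unit a learning protocol would drill.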

By: Jaehong Jung, Yang Jiao, Frederico M. Severgnini, Hong Z. Tan, Charlotte M. Reed, Ali Israr, Frances Lau, Freddy Abnousi
October 22, 2017

Focal Loss for Dense Object Detection

International Conference on Computer Vision (ICCV)

One-stage detectors that are applied over a regular, dense sampling of possible object locations have the potential to be faster and simpler, but have thus far trailed the accuracy of two-stage detectors. In this paper, we investigate why this is the case. We design and train a simple dense detector we call RetinaNet. Our results show that when trained with the focal loss, RetinaNet is able to match the speed of previous one-stage detectors while surpassing the accuracy of all existing state-of-the-art two-stage detectors.
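The focal loss itself is a small modification of binary cross-entropy: a modulating factor (1 − p_t)^γ down-weights easy, well-classified examples so training can focus on hard ones. A minimal per-example sketch (γ = 2, α = 0.25 are the defaults reported in the paper):

```python
import math

def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Focal loss for one binary prediction.

    p: predicted probability of the positive class; y: label in {0, 1}.
    gamma down-weights well-classified examples; alpha balances classes.
    With gamma=0 and alpha=1 this reduces to plain cross-entropy.
    """
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
```

An easy positive (p = 0.9) contributes orders of magnitude less loss than a hard one (p = 0.1), which is what lets a dense detector tolerate the extreme foreground-background imbalance.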

By: Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He, Piotr Dollar
October 22, 2017

Mask R-CNN

International Conference on Computer Vision (ICCV)

We present a conceptually simple, flexible, and general framework for object instance segmentation. Our approach efficiently detects objects in an image while simultaneously generating a high-quality segmentation mask for each instance. The method, called Mask R-CNN, extends Faster R-CNN by adding a branch for predicting an object mask in parallel with the existing branch for bounding box recognition.
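The added mask branch is trained with a per-pixel sigmoid and binary cross-entropy, where only the mask for the ground-truth class contributes to the loss, so classes do not compete across masks. A simplified sketch (the dict-of-flat-lists stands in for the head's per-class m×m outputs):

```python
import math

def mask_loss(mask_logits, gt_class, gt_mask):
    """Per-RoI mask loss in the style described by the paper (a sketch).

    mask_logits: dict mapping class id -> flat list of per-pixel logits
                 (one predicted mask per class);
    gt_mask:     flat list of 0/1 ground-truth pixels.
    Only the ground-truth class's mask is scored.
    """
    logits = mask_logits[gt_class]
    total = 0.0
    for z, y in zip(logits, gt_mask):
        p = 1.0 / (1.0 + math.exp(-z))  # per-pixel sigmoid, not a softmax over classes
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(logits)
```

Decoupling mask and class prediction this way leaves classification to the existing box branch, one of the paper's key design choices.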

By: Kaiming He, Georgia Gkioxari, Piotr Dollar, Ross Girshick
July 21, 2017

Aggregated Residual Transformations for Deep Neural Networks

Conference on Computer Vision and Pattern Recognition (CVPR) 2017

We present a simple, highly modularized network architecture for image classification.

By: Saining Xie, Ross Girshick, Piotr Dollar, Zhuowen Tu, Kaiming He
July 21, 2017

Semantic Amodal Segmentation

Conference on Computer Vision and Pattern Recognition (CVPR) 2017

Common visual recognition tasks such as classification, object detection, and semantic segmentation are rapidly reaching maturity, and given the recent rate of progress, it is not unreasonable to conjecture that techniques for many of these problems will approach human levels of performance in the next few years. In this paper we look to the future: what is the next frontier in visual recognition?

By: Yan Zhu, Yuandong Tian, Dimitris Metaxas, Piotr Dollar
July 21, 2017

Feature Pyramid Networks for Object Detection

Conference on Computer Vision and Pattern Recognition (CVPR) 2017

In this paper, we exploit the inherent multi-scale, pyramidal hierarchy of deep convolutional networks to construct feature pyramids with marginal extra cost.
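The construction combines a top-down pathway with lateral connections: each level's bottom-up features are added to a 2× upsampled version of the coarser merged map. A minimal 1-D sketch of that merge (not the paper's exact architecture, which also applies 1×1 and 3×3 convolutions):

```python
def top_down_merge(laterals):
    """Top-down pathway with lateral connections (1-D sketch of the idea).

    laterals: one feature vector per pyramid level, finest first, each
    level twice the length of the next (as if strides doubled).
    Returns the merged pyramid, finest first.
    """
    merged = [list(laterals[-1])]  # start from the coarsest level
    for lat in reversed(laterals[:-1]):
        coarse = merged[0]
        # 2x nearest-neighbor upsample of the coarser merged map
        up = [v for v in coarse for _ in range(2)]
        merged.insert(0, [a + b for a, b in zip(lat, up)])
    return merged
```

Every level of the resulting pyramid thus carries semantics propagated down from the coarsest features, at marginal extra cost over the backbone itself.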

By: Tsung-Yi Lin, Piotr Dollar, Ross Girshick, Kaiming He, Bharath Hariharan, Serge Belongie
July 21, 2017

Learning Features by Watching Objects Move

Conference on Computer Vision and Pattern Recognition (CVPR) 2017

This paper presents a novel yet intuitive approach to unsupervised feature learning. Inspired by the human visual system, we explore whether low-level motion-based grouping cues can be used to learn an effective visual representation.
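The pretext-task construction can be sketched as follows, with `motion_segment` a stand-in for any low-level motion-based grouping method: segment moving objects from frame pairs, then train a network to predict that mask from the single later frame alone, using the motion output as free pseudo ground truth.

```python
def build_pretext_dataset(clips, motion_segment):
    """Build (single frame, pseudo-mask) training pairs (a sketch of the idea).

    clips: iterable of (prev_frame, frame) pairs;
    motion_segment: function grouping pixels by motion between two frames.
    """
    dataset = []
    for prev_frame, frame in clips:
        mask = motion_segment(prev_frame, frame)  # free pseudo ground truth
        dataset.append((frame, mask))             # single-frame input, mask target
    return dataset
```

The network never sees the motion signal at test time; it must learn appearance features that predict which pixels would move together, which is what makes the learned representation transferable.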

By: Deepak Pathak, Ross Girshick, Piotr Dollar, Trevor Darrell, Bharath Hariharan
June 8, 2017

Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour

Data @ Scale

In this paper, we empirically show that on the ImageNet dataset large minibatches cause optimization difficulties, but when these are addressed the trained networks exhibit good generalization.
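The remedy described in the paper combines a linear learning-rate scaling rule (scale the rate in proportion to the minibatch size) with a gradual warmup over early training. A minimal sketch of that schedule; `base_lr=0.1` at batch size 256 is the paper's ImageNet/ResNet-50 setting:

```python
def scaled_lr(base_lr, base_batch, batch, step, warmup_steps):
    """Linear scaling rule with gradual warmup (a sketch of the recipe).

    The target rate is base_lr * (batch / base_batch); during the first
    warmup_steps iterations it is ramped up linearly from near zero.
    """
    target = base_lr * batch / base_batch
    if step < warmup_steps:
        return target * (step + 1) / warmup_steps
    return target
```

With a batch of 8192 this yields a target rate of 3.2, which the warmup reaches gradually so that early optimization stays stable.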

By: Priya Goyal, Piotr Dollar, Ross Girshick, Pieter Noordhuis, Lukasz Wesolowski, Aapo Kyrola, Andrew Tulloch, Yangqing Jia, Kaiming He
May 2, 2017

Better Computer Go Player with Neural Network and Long-Term Prediction

International Conference on Learning Representations (ICLR)

Competing with top human players in the ancient game of Go has been a long-term goal of artificial intelligence. Recent works [Maddison et al. (2015); Clark & Storkey (2015)] show that search is not strictly necessary for machine Go players. A pure pattern-matching approach, based on a Deep Convolutional Neural Network (DCNN) that predicts the next move, can perform as well as Monte Carlo Tree Search (MCTS)-based open source Go engines such as Pachi [Baudis & Gailly (2012)] if its search budget is limited. We extend this idea in our bot named darkforest, which relies on a DCNN designed for long-term predictions.
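The pure pattern-matching approach can be sketched as playing the most probable legal move under the network's policy output (a sketch of the idea in the abstract, not darkforest's implementation):

```python
import math

def pick_move(logits, legal):
    """Select a move from per-position policy logits.

    logits: one score per board position; legal: indices of legal moves.
    Applies a numerically stable softmax, then plays the most probable
    legal move -- no search involved.
    """
    zmax = max(logits)
    exps = [math.exp(z - zmax) for z in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(legal, key=lambda i: probs[i])
    return best, probs[best]
```

Restricting the argmax to legal moves matters because the raw network may assign probability to occupied or forbidden points.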

By: Yuandong Tian, Yan Zhu