August 15, 2019

PHYRE: A New Benchmark for Physical Reasoning

Understanding and reasoning about physics is an important ability of intelligent agents. We develop the PHYRE benchmark for physical reasoning that contains a set of simple classical mechanics puzzles in a 2D physical environment.

By: Anton Bakhtin, Laurens van der Maaten, Justin Johnson, Laura Gustafson, Ross Girshick

August 15, 2019

Automated Hot Text and Huge Pages: An Easy-to-adopt Solution Towards High Performing Services

International Conference on Web Services (ICWS)

We reduce CPU usage with a solution that first identifies the hot text of a (software) binary and then places the binary on huge pages (i.e., 2MB+ memory pages). The solution is wrapped in an automated framework, so service owners can adopt it with little effort.

By: Zhenyun Zhuang, Mark Santaniello, Shumin Zhao, Bikash Sharma, Rajit Kambo

August 12, 2019

Efficient Segmentation: Learning Downsampling Near Semantic Boundaries

International Conference on Computer Vision (ICCV)

Many automated processes such as auto-piloting rely on a good semantic segmentation as a critical component. To speed up performance, it is common to downsample the input frame. However, this comes at the cost of missed small objects and reduced accuracy at semantic boundaries. To address this problem, we propose a new content-adaptive downsampling technique that learns to favor sampling locations near semantic boundaries of target classes.

By: Dmitrii Marin, Zijian He, Peter Vajda, Priyam Chatterjee, Sam Tsai, Fei Yang, Yuri Boykov

August 4, 2019

MSURU: Large Scale E-commerce Image Classification With Weakly Supervised Search Data

Conference on Knowledge Discovery and Data Mining (KDD)

In this paper, we present MSURU, a deployed image recognition system used in a large-scale commerce search engine. It is designed to process product images uploaded daily to Facebook Marketplace. Social commerce is a growing area within Facebook, and understanding visual representations of product content is important for search and recommendation applications on Marketplace.

By: Yina Tang, Fedor Borisyuk, Siddarth Malreddy, Yixuan Li, Yiqun Liu, Sergey Kirshner

August 2, 2019

Low-Resource Corpus Filtering using Multilingual Sentence Embeddings

Association for Computational Linguistics (ACL)

In this paper, we describe our submission to the WMT19 low-resource parallel corpus filtering shared task. Our main approach is based on the LASER toolkit (Language-Agnostic SEntence Representations), which uses an encoder-decoder architecture trained on a parallel corpus to obtain multilingual sentence representations.

By: Vishrav Chaudhary, Yuqing Tang, Francisco (Paco) Guzman, Holger Schwenk, Philipp Koehn

August 2, 2019

Findings of the WMT 2019 Shared Task on Parallel Corpus Filtering for Low-Resource Conditions

Conference on Machine Translation (WMT) at ACL

Following the WMT 2018 Shared Task on Parallel Corpus Filtering (Koehn et al., 2018), we posed the challenge of assigning sentence-level quality scores for very noisy corpora of sentence pairs crawled from the web, with the goal of sub-selecting 2% and 10% of the highest-quality data to be used to train machine translation systems. This year, the task tackled the low-resource conditions of Nepali–English and Sinhala–English. Eleven participants from companies, national research labs, and universities took part in this task.

By: Philipp Koehn, Francisco (Paco) Guzman, Vishrav Chaudhary, Juan Pino

August 1, 2019

Constrained Decoding for Neural NLG from Compositional Representations in Task-Oriented Dialogue

Annual Meeting of the Association for Computational Linguistics (ACL)

In this paper, we (1) propose using tree-structured semantic representations, like those used in traditional rule-based NLG systems, for better discourse-level structuring and sentence-level planning; (2) introduce a challenging dataset using this representation for the weather domain; (3) introduce a constrained decoding approach for Seq2Seq models that leverages this representation to improve semantic correctness; and (4) demonstrate promising results on our dataset and the E2E dataset.

By: Anusha Balakrishnan, Jinfeng Rao, Kartikeya Upasani, Michael White, Rajen Subba

August 1, 2019

Exploring Deep Multimodal Fusion of Text and Photo for Hate Speech Classification

Workshop on Abusive Language Online

We present a number of fusion approaches to integrate text and photo signals. We show that augmenting text with image embedding information immediately leads to a boost in performance, while applying additional attention fusion methods brings further improvement.

By: Fan Yang, Xiaochang Peng, Gargi Ghosh, Reshef Shilon, Hao Ma, Eider Moore, Goran Predovic

August 1, 2019

Lightweight and Efficient Neural Natural Language Processing with Quaternion Networks

Annual Meeting of the Association for Computational Linguistics (ACL)

Many state-of-the-art neural models for NLP are heavily parameterized and thus memory-inefficient. This paper proposes a series of lightweight and memory-efficient neural architectures for a potpourri of natural language processing (NLP) tasks.

By: Yi Tay, Aston Zhang, Luu Anh Tuan, Jinfeng Rao, Shuai Zhang, Shuohang Wang, Jie Fu, Siu Cheung Hui

August 1, 2019

Simple and Effective Curriculum Pointer-Generator Networks for Reading Comprehension over Long Narratives

Annual Meeting of the Association for Computational Linguistics (ACL)

This paper tackles the problem of reading comprehension over long narratives, where documents easily span thousands of tokens. We propose a curriculum learning (CL) based Pointer-Generator framework for reading/sampling over large documents, enabling diverse training of the neural model based on the notion of alternating contextual difficulty.

By: Yi Tay, Shuohang Wang, Luu Anh Tuan, Jie Fu, Minh C. Phan, Xingdi Yuan, Jinfeng Rao, Siu Cheung Hui, Aston Zhang