October 28, 2016

Bilingual Methods for Adaptive Training Data Selection for Machine Translation

Association for Machine Translation in the Americas

We propose a new data selection method which uses semi-supervised convolutional neural networks based on bitokens (Bi-SSCNNs) for training machine translation systems from a large bilingual corpus.

Boxing Chen, Roland Kuhn, George Foster, Colin Cherry, Fei Huang
October 10, 2016

Learning to Refine Object Segments

European Conference on Computer Vision

In this work we propose to augment feedforward nets for object segmentation with a novel top-down refinement approach.

Pedro O. Pinheiro, Tsung-Yi Lin, Ronan Collobert, Piotr Dollar
September 18, 2016

A MultiPath Network for Object Detection


We test three modifications to the standard Fast R-CNN object detector to determine whether they can overcome the object detection challenges posed by the COCO dataset.

Sergey Zagoruyko, Adam Lerer, Tsung-Yi Lin, Pedro O. Pinheiro, Sam Gross, Soumith Chintala, Piotr Dollar
August 16, 2016

Synergy of Monotonic Rules


This article describes a method for constructing a special rule (we call it synergy rule) that uses as its input information the outputs (scores) of several monotonic rules which solve the same pattern recognition problem.

Vladimir Vapnik, Rauf Izmailov
August 11, 2016

Semi-supervised Convolutional Networks for Translation Adaptation with Tiny Amount of In-domain Data

Conference on Computational Natural Language Learning

We propose a method which uses semi-supervised convolutional neural networks (CNNs) to select in-domain training data for statistical machine translation.

Boxing Chen, Fei Huang
June 8, 2016

Key-Value Memory Networks for Directly Reading Documents

EMNLP 2016

This paper introduces a new method, Key-Value Memory Networks, that makes reading documents more viable by utilizing different encodings in the addressing and output stages of the memory read operation.

Alexander Miller, Adam Fisch, Jesse Dodge, Amir-Hossein Karimi, Antoine Bordes, Jason Weston
May 2, 2016

Deep multi-scale video prediction beyond mean square error

ICLR 2016

The paper is about predicting future frames in video sequences given the previous frames.

Michael Mathieu, Camille Couprie, Yann LeCun
May 2, 2016

Metric Learning with Adaptive Density Discrimination


Distance metric learning approaches learn a transformation to a representation space in which distance is in correspondence with a predefined notion of similarity.

Oren Rippel, Manohar Paluri, Piotr Dollar, Lubomir Bourdev
September 17, 2015

Improved Arabic Dialect Classification with Social Media Data

Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

Arabic dialect classification has been an important and challenging problem for Arabic language processing, especially for social media text analysis and machine translation. In this paper we propose an approach to improving Arabic dialect classification with semi-supervised learning: multiple classifiers are trained with weakly supervised, strongly supervised, and unsupervised data. Their combination yields significant and consistent improvement on two different test sets.

Fei Huang
June 12, 2015

Web-Scale Training for Face Identification

The IEEE Conference on Computer Vision and Pattern Recognition

We study face recognition and show that three distinct properties have surprising effects on the transferability of deep convolutional networks (CNNs).

Yaniv Taigman, Ming Yang, Marc'Aurelio Ranzato, Lior Wolf
December 19, 2014

Predicting the quality of new contributors to the Facebook crowdsourcing system

Neural Information Processing Systems: Crowdsourcing and Machine Learning Workshop

We are interested in improving the quality and coverage of a knowledge graph through crowdsourcing features built into a social networking service. This work presents an approach to model user trust when prior history is lacking.

Julian Eisenschlos
December 4, 2014

Extracting Translation Pairs from Social Network Content

International Workshop on Spoken Language Translation

We describe two methods to collect translation pairs from public Facebook content. We use the extracted translation pairs as additional training data for machine translation systems and we can show significant improvements.

Matthias Eck, Yury Zemlyanskiy, Joy Zhang, Alex Waibel
September 4, 2014

Question Answering with Subgraph Embeddings

Empirical Methods in Natural Language Processing

This paper presents a system which learns to answer questions on a broad range of topics from a knowledge base using few handcrafted features. Our model learns low-dimensional embeddings of words and knowledge base constituents; these representations are used to score natural language questions against candidate answers.

Antoine Bordes, Jason Weston, Sumit Chopra
September 4, 2014

#TagSpace: Semantic Embeddings from Hashtags

Empirical Methods in Natural Language Processing

We describe a convolutional neural network that learns feature representations for short textual posts using hashtags as a supervised signal. The proposed approach is trained on up to 5.5 billion words predicting 100,000 possible hashtags.

Jason Weston, Sumit Chopra, Keith Adams
September 1, 2014

Optimal Crowd-Powered Rating and Filtering Algorithms

VLDB 2014

We focus on crowd-powered filtering, i.e., filtering a large set of items using humans. Filtering is one of the most commonly used building blocks in crowdsourcing applications and systems. While solu…

Aditya Parameswaran, Stephen Boyd, Hector Garcia-Molina, Ashish Gupta, Neoklis Polyzotis, Jennifer Widom
August 24, 2014

Streamed Approximate Counting of Distinct Elements

ACM Conference on Knowledge Discovery and Data Mining (KDD)

Counting the number of distinct elements in a large dataset is a common task in web applications and databases. This problem is difficult in limited memory settings where storing a large hash table ta…

Daniel Ting
August 24, 2014

Practical Lessons from Predicting Clicks on Ads at Facebook

International Workshop on Data Mining for Online Advertising (ADKDD)

Online advertising allows advertisers to only bid and pay for measurable user responses, such as clicks on ads. As a consequence, click prediction systems are central to most online advertising system…

Xinran He, Junfeng Pan, Ou Jin, Tianbing Xu, Bo Liu, Tao Xu, Yanxin Shi, Antoine Atallah, Ralf Herbrich, Stuart Bowers, Joaquin Quinonero Candela
August 7, 2014

Perceiving, learning, and exploiting object affordances for autonomous pile manipulation

Autonomous Robots

Autonomous manipulation in unstructured environments will enable a large variety of exciting and important applications. Despite its promise, autonomous manipulation remains largely unsolved. Even the…

Dubi Katz, Arun Venkatraman, Moslem Kazemi, J. Andrew Bagnell, Anthony Stentz
June 24, 2014

Collaborative Hashing

Conference on Computer Vision and Pattern Recognition (CVPR)

Hashing technique has become a promising approach for fast similarity search. Most of existing hashing research pursue the binary codes for the same type of entities by preserving their similarities…

Xianglong Liu, Junfeng He, Cheng Deng, Bo Lang
June 24, 2014

DeepFace: Closing the Gap to Human-Level Performance in Face Verification

Conference on Computer Vision and Pattern Recognition (CVPR)

In modern face recognition, the conventional pipeline consists of four stages: detect => align => represent => classify. We revisit both the alignment step and the representation step by employing exp…

Yaniv Taigman, Ming Yang, Marc'Aurelio Ranzato, Lior Wolf