December 19, 2014

Video (language) Modeling: a Baseline for Generative Models of Natural Videos

ArXiv PrePrint

In this work, we investigate models of natural high-resolution video sequences. We show that very simple models borrowed from language modeling applications are surprisingly effective at recovering shor…

By: Marc'Aurelio Ranzato, Arthur Szlam, Joan Bruna, Michael Mathieu, Ronan Collobert, Sumit Chopra
December 19, 2014

Ensemble of Generative and Discriminative Techniques for Sentiment Analysis of Movie Reviews

ICLR workshop 2015

In this work, we ensemble several models and achieve state-of-the-art accuracy in predicting the sentiment of movie reviews in the IMDB dataset.

By: Gregoire Mesnil, Tomas Mikolov, Marc'Aurelio Ranzato, Yoshua Bengio
December 4, 2014

Extracting Translation Pairs from Social Network Content

International Workshop on Spoken Language Translation

We describe two methods to collect translation pairs from public Facebook content. We use the extracted translation pairs as additional training data for machine translation systems and show significant improvements.

By: Matthias Eck, Yury Zemlyanskiy, Joy Zhang, Alex Waibel
October 27, 2014

Characterizing Load Imbalance in Real-World Networked Caches

HotNets 2014: Thirteenth ACM Workshop on Hot Topics in Networks

Modern Web services rely extensively upon a tier of in-memory caches to reduce request latencies and alleviate load on backend servers. Within a given cache, items are typically partitioned across cache servers via consistent hashing, with the goal of balancing the number of items maintained by each cache server. The effects of consistent hashing vary with the associated hashing function and partitioning ratio. Most real-world workloads are also skewed, with some items significantly more popular than others. Inefficiency in addressing both issues can create an imbalance in cache-server loads. We analyze the degree of observed load imbalance, focusing on read-only traffic against Facebook’s graph cache tier in TAO. We investigate the principal causes of load imbalance, including data co-location, non-ideal hashing scenarios, and hot-spot temporal effects. We also employ trace-driven analytics to study the benefits and limitations of current load-balancing methods, suggesting areas for future research.

By: Qi Huang, Helga Gudmundsdottir, Ymir Vigfusson, Daniel A. Freedman, Ken Birman, Robbert van Renesse
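
The two effects named in the abstract above (uneven partitioning by consistent hashing, and skewed item popularity) can be illustrated with a small simulation. This is a minimal sketch and not the paper's methodology: the ring with virtual nodes, the Zipf-like popularity weights, the server and item names, and every parameter value are assumptions chosen for illustration only.

```python
import bisect
import hashlib
import random
from collections import Counter


def stable_hash(key: str) -> int:
    """Map a string to a 64-bit integer via MD5 (stable across runs)."""
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")


class HashRing:
    """Consistent-hash ring; each server owns `vnodes` points on the ring."""

    def __init__(self, servers, vnodes=100):
        points = sorted(
            (stable_hash(f"{server}#{v}"), server)
            for server in servers
            for v in range(vnodes)
        )
        self._hashes = [h for h, _ in points]
        self._owners = [s for _, s in points]

    def lookup(self, key: str) -> str:
        """Return the server owning the first ring point after the key's hash."""
        i = bisect.bisect_right(self._hashes, stable_hash(key)) % len(self._hashes)
        return self._owners[i]


def imbalance(counts, servers):
    """Ratio of the most loaded server's load to the mean load (1.0 = balanced)."""
    loads = [counts.get(s, 0) for s in servers]
    return max(loads) / (sum(loads) / len(loads))


def simulate(num_servers=8, num_items=10_000, num_requests=100_000, skew=1.0, seed=0):
    """Return (item-count imbalance, request-load imbalance) for one synthetic run."""
    rng = random.Random(seed)
    servers = [f"cache-{i}" for i in range(num_servers)]  # hypothetical names
    ring = HashRing(servers)
    items = [f"item-{i}" for i in range(num_items)]
    owner = {item: ring.lookup(item) for item in items}

    # Imbalance in the number of items stored per server.
    item_counts = Counter(owner.values())

    # Zipf-like popularity (assumed): the item at rank r has weight 1 / (r + 1) ** skew.
    weights = [1.0 / (rank + 1) ** skew for rank in range(num_items)]
    requests = rng.choices(items, weights=weights, k=num_requests)
    request_counts = Counter(owner[item] for item in requests)

    return imbalance(item_counts, servers), imbalance(request_counts, servers)


if __name__ == "__main__":
    item_imb, req_imb = simulate()
    print(f"item-count imbalance:   {item_imb:.2f}")
    print(f"request-load imbalance: {req_imb:.2f}")
```

Comparing the two ratios separates the imbalance introduced by hashing alone from the additional imbalance introduced by popular items; varying `vnodes` and `skew` shows how each factor contributes in this toy setting.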
October 24, 2014

The HipHop Virtual Machine

ACM International Conference on Object Oriented Programming Systems, Languages, and Applications

The HipHop Virtual Machine (HHVM) is a JIT compiler and runtime for PHP. While PHP values are dynamically typed, real programs often have latent types that are useful for optimization once discovered…

By: Keith Adams, Jason Evans, Bertrand Maher, Guilherme Ottoni, Drew Paroski, Brett Simmers, Edwin Smith, Owen Yamauchi
October 7, 2014

f4: Facebook’s Warm BLOB Storage System

Operating Systems Design and Implementation

Facebook’s corpus of photos, videos, and other Binary Large OBjects (BLOBs) that need to be reliably stored and quickly accessible is massive and continues to grow.

By: Subramanian Muralidhar, Wyatt Lloyd, Sabyasachi Roy, Cory Hill, Ernest Lin, Weiwen Liu, Satadru Pan, Shiva Shankar, Viswanath Sivakumar, Linpeng Tang, Sanjeev Kumar
October 6, 2014

The Mystery Machine: End-to-end Performance Analysis of Large-scale Internet Services

Operating Systems Design and Implementation

Current debugging and optimization methods scale poorly to deal with the complexity of modern Internet services, in which a single request triggers parallel execution of numerous heterogeneous softwar…

By: Mike Chow, David Meisner, Jason Flinn, Daniel Peek, Thomas Wenisch
September 18, 2014

Hierarchical Cascade of Classifiers for Efficient Poselet Evaluation

British Machine Vision Conference

Poselets have been used in a variety of computer vision tasks, such as detection, segmentation, action classification, pose estimation and action recognition, often achieving state-of-the-art performa…

By: David Bo Chen, Pietro Perona, Lubomir Bourdev
September 18, 2014

Mining Energy Traces to Aid in Software Development: An Empirical Case Study

ACM / IEEE International Symposium on Empirical Software Engineering and Measurement

With the advent of increased computing on mobile devices such as phones and tablets, it has become crucial to pay attention to the energy consumption of mobile applications.

By: Ashish Gupta, Thomas Zimmermann, Christian Bird, Nachiappan Nagappan, Thirumalesh Bhat, Syed Emran
September 4, 2014

Question Answering with Subgraph Embeddings

Empirical Methods in Natural Language Processing

This paper presents a system which learns to answer questions on a broad range of topics from a knowledge base using few handcrafted features. Our model learns low-dimensional embeddings of words and knowledge base constituents; these representations are used to score natural language questions against candidate answers.

By: Antoine Bordes, Jason Weston, Sumit Chopra