December 1, 2015

Internet Use and Psychological Well-Being: Effects of Activity and Audience

Communications of the ACM

Two lines of research fifteen years apart demonstrate that talking with close friends online is associated with improvements in social support, depression, and other measures of well-being. Talking with strangers and reading about acquaintances are not.

Robert Kraut, Moira Burke
November 25, 2015

A Roadmap towards Machine Intelligence

ArXiv PrePrint

We describe one possible roadmap for developing intelligent machines with communication skills that can perform useful tasks for us.

Tomas Mikolov, Armand Joulin, Marco Baroni
November 23, 2015

MazeBase: A Sandbox for Learning from Games

ArXiv PrePrint

Environment for simple 2D maze games, designed as a sandbox for machine learning approaches to reasoning and planning

Sainbayar Sukhbaatar, Arthur Szlam, Gabriel Synnaeve, Soumith Chintala, Rob Fergus
November 23, 2015

Learning Simple Algorithms from Examples

ArXiv PrePrint

We present an approach for learning simple algorithms such as addition or multiplication. Our method works as a hard attention model on both input and output, and it is learned with reinforcement learning.

Wojciech Zaremba, Armand Joulin, Tomas Mikolov, Rob Fergus
November 20, 2015

Sequence Level Training with Recurrent Neural Networks

ICLR 2016

This work aims at improving text generation for applications such as summarization, machine translation and image captioning. The key idea is to learn to predict not just the next word but a whole sequence of words, and to train end-to-end at the sequence level.

Marc'Aurelio Ranzato, Sumit Chopra, Michael Auli, Wojciech Zaremba
November 19, 2015

Alternative structures for character-level RNNs

ArXiv PrePrint

We present two alternative structures to character-level recurrent networks. Character-level RNNs are known to be very inefficient while achieving lower performance than word-level RNNs.

Piotr Bojanowski, Tomas Mikolov, Armand Joulin
October 4, 2015

Holistic Configuration Management at Facebook

The 25th ACM Symposium on Operating Systems Principles

This paper gives a comprehensive description of the use cases, design, implementation, and usage statistics of a suite of tools that manage Facebook’s configuration end-to-end, including the frontend products, backend systems, and mobile apps.

Chunqiang (CQ) Tang, Thawan Kooburat, Pradeep Venkat, Akshay Chander, Zhe Wen, Aravind Narayanan, Patrick Dowell, Robert Karl
October 4, 2015

Existential Consistency: Measuring and Understanding Consistency at Facebook

The 25th ACM Symposium on Operating Systems Principles

Replicated storage for large Web services faces a trade-off between stronger forms of consistency and higher performance properties. Stronger consistency prevents anomalies, i.e., unexpected behavior visible to users, and reduces programming complexity.

Haonan Lu, Kaushik Veeraraghavan, Philippe Ajoux, Jim Hunt, Yee Jiun Song, Wendy Tobagus, Sanjeev Kumar, Wyatt Lloyd
September 17, 2015

Improved Arabic Dialect Classification with Social Media Data

Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

Arabic dialect classification has been an important and challenging problem for Arabic language processing, especially for social media text analysis and machine translation. In this paper we propose an approach to improving Arabic dialect classification with semi-supervised learning: multiple classifiers are trained with weakly supervised, strongly supervised, and unsupervised data. Their combination yields significant and consistent improvement on two different test sets.

Fei Huang
August 31, 2015

Cubrick: A Scalable Distributed MOLAP Database for Fast Analytics

41st International Conference on Very Large Data Bases (Ph.D. Workshop)

This paper describes the architecture and design of Cubrick, a distributed multidimensional in-memory database that enables real-time data analysis of large dynamic datasets. Cubrick has a strictly multidimensional data model composed of dimensions, dimensional hierarchies and metrics, supporting sub-second MOLAP operations such as slice and dice, roll-up and drill-down over terabytes of data.

Pedro Eugenio Rocha Pedreira, Luis Erpen de Bona, Chris Croswhite
August 31, 2015

One Trillion Edges: Graph Processing at Facebook-Scale

The 41st International Conference on Very Large Data Bases

Analyzing large graphs provides valuable insights for social networking and web companies in content ranking and recommendations. While numerous graph processing systems have been developed and evaluated on available benchmark graphs of up to 6.6B edges, they often face significant difficulties in scaling to much larger graphs. Industry graphs can be two orders of magnitude larger: hundreds of billions or up to one trillion edges.

Avery Ching, Sergey Edunov, Maja Kabiljo, Dionysios Logothetis, Sambavi Muthukrishnan
August 17, 2015

Inside the Social Network’s (Datacenter) Network


Large cloud service providers have invested in increasingly larger datacenters to house the computing infrastructure required to support their services.

Arjun Roy, James Hongyi Zeng, Jasmeet Bagga, George Porter, Alex C. Snoeren
August 13, 2015

From Categorical Logic to Facebook Engineering

30th Annual ACM/IEEE Symposium on Logic in Computer Science

I chart a line of development from category-theoretic models of programs and logics to automatic program verification/analysis techniques that are in deployment at Facebook. Our journey takes in a number of concepts from the computer science logician’s toolkit – including categorical logic and model theory, denotational semantics, the Curry-Howard isomorphism, substructural logic, Hoare Logic and Separation Logic, abstract interpretation, compositional program analysis, the frame problem, and abductive inference.

Peter O'Hearn
August 12, 2015

No Regret Bound for Extreme Bandits

ArXiv PrePrint

A sensible notion of regret in the extreme bandit setting

Robert Nishihara, David Lopez-Paz, Leon Bottou
June 25, 2015

Scale-invariant learning and convolutional networks

ArXiv PrePrint

The conventional classification schemes — notably multinomial logistic regression — used in conjunction with convolutional networks (convnets) are classical in statistics, designed without consideration for the usual coupling with convnets, stochastic gradient descent, and backpropagation. In the specific application to supervised learning for convnets, a simple scale-invariant classification stage turns out to be more robust than multinomial logistic regression, appears to result in slightly lower errors on several standard test sets, has similar computational costs, and features precise control over the actual rate of learning.

Soumith Chintala, Marc'Aurelio Ranzato, Arthur Szlam, Yuandong Tian, Mark Tygert, Wojciech Zaremba
June 22, 2015

End-To-End Memory Networks

NIPS 2015

End-to-end training of Memory Networks on question answering and language modeling

Sainbayar Sukhbaatar, Arthur Szlam, Jason Weston, Rob Fergus
June 22, 2015

Learning Longer Memory in Recurrent Neural Networks

Workshop at ICLR 2015

We describe a simple extension of recurrent networks that allows them to learn longer-term memory. Its usefulness is demonstrated by improved performance on standard language modeling tasks.

Tomas Mikolov, Armand Joulin, Sumit Chopra, Michael Mathieu, Marc'Aurelio Ranzato
June 22, 2015

Large-scale Simple Question Answering with Memory Networks

ArXiv PrePrint

This paper studies the impact of multitask and transfer learning on simple question answering, a setting in which the reasoning required to answer is quite easy as long as one can retrieve the correct evidence given a question, which can be difficult in large-scale conditions.

Antoine Bordes, Nicolas Usunier, Sumit Chopra, Jason Weston
June 22, 2015

Revisiting Memory Errors in Large-Scale Production Data Centers: Analysis and Modeling of New Trends from the Field

IEEE/IFIP International Conference on Dependable Systems and Networks

In this paper, we analyze the memory errors in the entire fleet of servers at Facebook over the course of fourteen months, representing billions of device days.

Justin Meza, Qiang Wu, Sanjeev Kumar, Onur Mutlu
June 22, 2015

An implementation of a randomized algorithm for principal component analysis

ArXiv PrePrint

This paper carefully implements newly popular randomized algorithms for principal component analysis and benchmarks them against the classics.

Arthur Szlam, Mark Tygert, Yuval Kluger