February 27, 2016

Once More with Feeling: Supportive Responses to Social Sharing on Facebook

Computer-Supported Cooperative Work and Social Computing

Using millions of de-identified Facebook status updates with poster-annotated feelings (e.g., feeling thankful or feeling worried), we examine the magnitude and circumstances in which people share positive or negative feelings and characterize the nature of the responses they receive.

Moira Burke, Mike Develin
February 27, 2016

What’s in a Like? Attitudes and Behaviors Around Receiving Likes on Facebook

Computer-Supported Cooperative Work and Social Computing

What social value do Likes on Facebook hold? This research examines people’s attitudes and behaviors related to receiving one-click feedback in social media.

Lauren Scissors, Moira Burke, Steve Wengrovitz
February 22, 2016

Information Evolution in Social Networks

Proc. WSDM'16

Social networks readily transmit information, albeit with less than perfect fidelity. We present a large-scale measurement of this imperfect information copying mechanism by examining the dissemination and evolution of thousands of memes, collectively replicated hundreds of millions of times in the online social network Facebook.

Lada Adamic, Thomas Lento, Eytan Adar, Pauline C. Ng
January 13, 2016

Social Networks and Labor Markets: How Strong Ties Relate to Job Transmission On Facebook’s Social Network

Journal of Labor Economics

This is an observational study of the social networks of 1.4 million US Facebook users who list an employer in their profile.

Moira Burke
January 7, 2016

Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks

2016 International Conference on Learning Representations

We stabilize Generative Adversarial Networks with some architectural constraints and visualize the internals of the networks.

Alec Radford, Luke Metz, Soumith Chintala
December 15, 2015

Learning to Segment Object Candidates


In this paper, we propose a new way to generate object proposals, introducing an approach based on a discriminative convolutional network. Our model obtains substantially higher object recall using fewer proposals. We also show that our model is able to generalize to categories it has not seen during training.

Pedro Oliveira, Ronan Collobert, Piotr Dollar
December 7, 2015

Simple Bag-of-Words Baseline for Visual Question Answering

ArXiv PrePrint

We describe a very simple bag-of-words baseline for visual question answering.

Arthur Szlam, Bolei Zhou, Rob Fergus, Sainbayar Sukhbaatar, Yuandong Tian
December 1, 2015

Internet Use and Psychological Well-Being: Effects of Activity and Audience

Communications of the ACM

Two lines of research fifteen years apart demonstrate that talking with close friends online is associated with improvements in social support, depression, and other measures of well-being. Talking with strangers and reading about acquaintances are not.

Robert Kraut, Moira Burke
November 25, 2015

A Roadmap Towards Machine Intelligence

ArXiv PrePrint

We describe one possible roadmap for developing intelligent machines with communication skills that can perform useful tasks for us.

Armand Joulin, Marco Baroni, Tomas Mikolov
November 23, 2015

MazeBase: A Sandbox for Learning from Games

ArXiv PrePrint

An environment for simple 2D maze games, designed as a sandbox for machine learning approaches to reasoning and planning.

Sainbayar Sukhbaatar, Arthur Szlam, Gabriel Synnaeve, Soumith Chintala, Rob Fergus
November 23, 2015

Learning Simple Algorithms from Examples

ArXiv PrePrint

We present an approach for learning simple algorithms such as addition or multiplication. Our method works as a hard attention model on both input and output, and it is learned with reinforcement learning.

Wojciech Zaremba, Armand Joulin, Tomas Mikolov, Rob Fergus
November 20, 2015

Sequence Level Training with Recurrent Neural Networks

ICLR 2016

This work aims at improving text generation for applications such as summarization, machine translation and image captioning. The key idea is to learn to predict not just the next word but a whole sequence of words, and to train end-to-end at the sequence level.

Marc'Aurelio Ranzato, Sumit Chopra, Michael Auli, Wojciech Zaremba
November 19, 2015

Alternative Structures for Character-Level RNNs

ArXiv PrePrint

We present two alternative structures to character-level recurrent networks. Character-level RNNs are known to be very inefficient while achieving lower performance than word-level RNNs.

Piotr Bojanowski, Tomas Mikolov, Armand Joulin
October 4, 2015

Holistic Configuration Management at Facebook

The 25th ACM Symposium on Operating Systems Principles

This paper gives a comprehensive description of the use cases, design, implementation, and usage statistics of a suite of tools that manage Facebook’s configuration end-to-end, including the frontend products, backend systems, and mobile apps.

Chunqiang (CQ) Tang, Thawan Kooburat, Pradeep Venkat, Akshay Chander, Zhe Wen, Aravind Narayanan, Patrick Dowell, Robert Karl
October 4, 2015

Existential Consistency: Measuring and Understanding Consistency at Facebook

The 25th ACM Symposium on Operating Systems Principles

Replicated storage for large Web services faces a trade-off between stronger forms of consistency and higher performance properties. Stronger consistency prevents anomalies, i.e., unexpected behavior visible to users, and reduces programming complexity.

Haonan Lu, Kaushik Veeraraghavan, Philippe Ajoux, Jim Hunt, Yee Jiun Song, Wendy Tobagus, Sanjeev Kumar, Wyatt Lloyd
September 17, 2015

Improved Arabic Dialect Classification with Social Media Data

Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

Arabic dialect classification has been an important and challenging problem for Arabic language processing, especially for social media text analysis and machine translation. In this paper we propose an approach to improving Arabic dialect classification with semi-supervised learning: multiple classifiers are trained with weakly supervised, strongly supervised, and unsupervised data. Their combination yields significant and consistent improvement on two different test sets.

Fei Huang
August 31, 2015

Cubrick: A Scalable Distributed MOLAP Database for Fast Analytics

41st International Conference on Very Large Data Bases (Ph.D. Workshop)

This paper describes the architecture and design of Cubrick, a distributed multidimensional in-memory database that enables real-time data analysis of large dynamic datasets. Cubrick has a strictly multidimensional data model composed of dimensions, dimensional hierarchies and metrics, supporting sub-second MOLAP operations such as slice and dice, roll-up and drill-down over terabytes of data.

Pedro Eugenio Rocha Pedreira, Luis Erpen de Bona, Chris Croswhite
August 31, 2015

One Trillion Edges: Graph Processing at Facebook-Scale

The 41st International Conference on Very Large Data Bases

Analyzing large graphs provides valuable insights for social networking and web companies in content ranking and recommendations. While numerous graph processing systems have been developed and evaluated on available benchmark graphs of up to 6.6B edges, they often face significant difficulties in scaling to much larger graphs. Industry graphs can be two orders of magnitude larger: hundreds of billions or up to one trillion edges.

Avery Ching, Sergey Edunov, Maja Kabiljo, Dionysios Logothetis, Sambavi Muthukrishnan
August 17, 2015

Inside the Social Network’s (Datacenter) Network


Large cloud service providers have invested in increasingly larger datacenters to house the computing infrastructure required to support their services.

Arjun Roy, James Hongyi Zeng, Jasmeet Bagga, George Porter, Alex C. Snoeren
August 13, 2015

From Categorical Logic to Facebook Engineering

30th Annual ACM/IEEE Symposium on Logic in Computer Science

I chart a line of development from category-theoretic models of programs and logics to automatic program verification/analysis techniques that are in deployment at Facebook. Our journey takes in a number of concepts from the computer science logician’s toolkit – including categorical logic and model theory, denotational semantics, the Curry-Howard isomorphism, substructural logic, Hoare Logic and Separation Logic, abstract interpretation, compositional program analysis, the frame problem, and abductive inference.

Peter O'Hearn