June 26, 2013

TAO: Facebook’s Distributed Data Store for the Social Graph

USENIX Annual Technical Conference (ATC)

We introduce a simple data model and API tailored for serving the social graph, and TAO, an implementation of this model. TAO is a geographically distributed data store that provides efficient and timely access to the social graph for Facebook’s demanding workload using a fixed set of queries. It is deployed at Facebook, replacing memcache for many data types that fit its model.

By: Nathan Bronson, Zachary Amsden, George Cabrera III, Prasad Chakka, Peter Dimov, Hui Ding, Jack Ferris, Anthony Giardullo, Sachin Kulkarni, Harry Li, Mark Marchukov, Dmitri Petrov, Lovro Puzar, Yee Jiun Song, Venkat Venkataramani

June 26, 2013

A Solution to the Network Challenges of Data Recovery in Erasure-coded Distributed Storage Systems: A Study on the Facebook Warehouse Cluster

USENIX Workshop on Hot Topics in Storage and File Systems (HotStorage)

Erasure codes, such as Reed-Solomon (RS) codes, are being increasingly employed in data centers to combat the cost of reliably storing large amounts of data. Although these codes provide optimal stora…

By: K.V. Rashmi, Nihar B. Shah, Dikang Gu, Hairong Kuang, Dhruba Borthakur, Kannan Ramchandran

June 23, 2013

Thin Servers with Smart Pipes: Designing SoC Accelerators for Memcached

ACM/IEEE International Symposium on Computer Architecture (ISCA)

Distributed in-memory key-value stores, such as memcached, are central to the scalability of modern internet services. Current deployments use commodity servers with high-end processors. However, give…

By: Kevin Lim, David Meisner, Ali Saidi, Parthasarathy Ranganathan, Thomas Wenisch

June 22, 2013

LinkBench: a Database Benchmark based on the Facebook Social Graph

ACM Special Interest Group on Management of Data (SIGMOD/PODS)

Database benchmarks are an important tool for database researchers and practitioners that ease the process of making informed comparisons between different database hardware, software and configuratio…

By: Timothy Armstrong, Nagavamsi Ponnekanti, Dhruba Borthakur, Mark Callaghan

June 17, 2013

MILEAGE: Multiple Instance LEArning with Global Embedding

International Conference on Machine Learning (ICML)

Multiple Instance Learning (MIL) generally represents each example as a collection of instances such that the features for local objects can be better captured, whereas traditional methods typically extract a global feature vector for each example as an integral part. However, there is limited research work on investigating which of the two learning scenarios performs better.

By: Dan Zhang, Jingrui He, Luo Si, Richard Lawrence

May 20, 2013

Machine Learning Paradigms for Speech Recognition: An Overview

IEEE/ACM Transactions on Audio, Speech, and Language Processing

Automatic Speech Recognition (ASR) has historically been a driving force behind many machine learning (ML) techniques, including the ubiquitously used hidden Markov model, discriminative learning, Bay…

By: Li Deng, Xiao Li

May 13, 2013

Latent Credibility Analysis

International World Wide Web Conference (WWW)

A frequent problem when dealing with data gathered from multiple sources on the web (ranging from booksellers to Wikipedia pages to stock analyst predictions) is that these sources disagree, and we must decide which of their (often mutually exclusive) claims we should accept. Current state-of-the-art information credibility algorithms known as ‘fact-finders’ are transitive voting systems with rules specifying how votes iteratively flow from sources to claims and then back to sources.

By: Jeff Pasternack, Dan Roth

May 13, 2013

Subgraph Frequencies: Mapping the Empirical and Extremal Geography of Large Graph Collections

International World Wide Web Conference (WWW)

A growing set of on-line applications are generating data that can be viewed as very large collections of small, dense social graphs – these range from sets of social groups, events, or collaboration projects to the vast collection of graph neighborhoods in large social networks.

By: Johan Ugander, Lars Backstrom, Jon Kleinberg

May 13, 2013

CopyCatch: Stopping Group Attacks by Spotting Lockstep Behavior in Social Networks

International World Wide Web Conference (WWW)

In this paper we focus on the social network Facebook and the problem of discerning ill-gotten Page Likes, made by spammers hoping to turn a profit, from legitimate Page Likes. Our method, which we refer to as CopyCatch, detects lockstep Page Like patterns on Facebook by analyzing only the social graph between users and Pages and the times at which the edges in the graph (the Likes) were created.

By: Alex Beutel, Tom Wanhong Xu, Venkatesan Guruswami, Christopher Palow, Christos Faloutsos

May 7, 2013

Facebook’s Data Center Network Architecture

IEEE Optical Interconnects Conference (OI)

We review Facebook’s current data center network architecture and explore some alternative architectures.

By: Nathan Farrington, Alexey Andreyev