August 11, 2013

Speeding up Large-Scale Learning with a Social Prior

ACM Conference on Knowledge Discovery and Data Mining (KDD)

Slow convergence and poor initial accuracy are two problems that plague efforts to use very large feature sets in online learning. This is especially true when only a few features are ‘active’ in any…

By: Deepayan Chakrabarti, Ralf Herbrich

July 23, 2013

Semantic Hashing using Tags and Topic Modeling

ACM Special Interest Group on Information Retrieval Conference (SIGIR)

Designing efficient and effective solutions for large-scale similarity search is an important research problem. One popular strategy is to represent data examples as compact binary codes through semantic hashing, which has produced promising results with fast search speed and low storage cost. Many existing semantic hashing methods generate binary codes for documents by modeling document relationships based on similarity in a keyword feature space. Existing methods have two major limitations: (1) tag information is often associated with documents in many real-world applications but has not yet been fully exploited; (2) similarity in the keyword feature space does not fully reflect semantic relationships that go beyond keyword matching.

By: Qifan Wang, Dan Zhang, Luo Si

July 14, 2013

TAO: Facebook’s Distributed Data Store for the Social Graph

USENIX Annual Technical Conference 2013

We introduce a simple data model and API tailored for serving the social graph, and TAO, an implementation of this model.

By: Nathan Bronson, Zach Amsden, George Cabrera, Prasad Chakka, Peter Dimov, Hui Ding, Jack Ferris, Anthony Giardullo, Sachin Kulkarni, Harry Li, Mark Marchukov, Dmitri Petrov, Lovro Puzar, Yee Jiun Song, Venkat Venkataramani

July 1, 2013

Development and Deployment at Facebook

IEEE Internet Computing 17(4)

More than one billion users log in to Facebook at least once a month to connect and share content with each other. Among other activities, these users upload over 2.5 billion content items every day.

By: Dror Feitelson, Eitan Frachtenberg, Kent Beck

June 26, 2013

A Solution to the Network Challenges of Data Recovery in Erasure-coded Distributed Storage Systems: A Study on the Facebook Warehouse Cluster

USENIX Workshop on Hot Topics in Storage and File Systems (HotStorage)

Erasure codes, such as Reed-Solomon (RS) codes, are being increasingly employed in data centers to combat the cost of reliably storing large amounts of data. Although these codes provide optimal stora…

By: K.V. Rashmi, Nihar B. Shah, Dikang Gu, Hairong Kuang, Dhruba Borthakur, Kannan Ramchandran

June 26, 2013

TAO: Facebook’s Distributed Data Store for the Social Graph

USENIX Annual Technical Conference (ATC)

We introduce a simple data model and API tailored for serving the social graph, and TAO, an implementation of this model. TAO is a geographically distributed data store that provides efficient and timely access to the social graph for Facebook’s demanding workload using a fixed set of queries. It is deployed at Facebook, replacing memcache for many data types that fit its model.

By: Nathan Bronson, Zachary Amsden, George Cabrera III, Prasad Chakka, Peter Dimov, Hui Ding, Jack Ferris, Anthony Giardullo, Sachin Kulkarni, Harry Li, Mark Marchukov, Dmitri Petrov, Lovro Puzar, Yee Jiun Song, Venkat Venkataramani

June 23, 2013

Thin Servers with Smart Pipes: Designing SoC Accelerators for Memcached

ACM/IEEE International Symposium on Computer Architecture (ISCA)

Distributed in-memory key-value stores, such as memcached, are central to the scalability of modern internet services. Current deployments use commodity servers with high-end processors. However, give…

By: Kevin Lim, David Meisner, Ali Saidi, Parthasarathy Ranganathan, Thomas Wenisch

June 22, 2013

LinkBench: a Database Benchmark based on the Facebook Social Graph

ACM Special Interest Group on Management of Data (SIGMOD/PODS)

Database benchmarks are an important tool for database researchers and practitioners that ease the process of making informed comparisons between different database hardware, software and configuratio…

By: Timothy Armstrong, Nagavamsi Ponnekanti, Dhruba Borthakur, Mark Callaghan

May 7, 2013

Facebook’s Data Center Network Architecture

IEEE Optical Interconnects Conference (OI)

We review Facebook’s current data center network architecture and explore some alternative architectures.

By: Nathan Farrington, Alexey Andreyev

May 1, 2013

Hash Bit Selection: a Unified Solution for Selection Problems in Hashing

Conference on Computer Vision and Pattern Recognition (CVPR)

Hashing-based methods have recently shown promise for large-scale nearest neighbor search. However, good designs involve difficult decisions about many unknowns: data features, hashing algorithms, parameter settings, kernels, etc.

By: Xianglong Liu, Junfeng He, Bo Lang, Shih-Fu Chang