Workload Analysis of a Large-Scale Key-Value Store

ACM International Conference on Measurement and Modeling of Computer Systems (SIGMETRICS)


Key-value stores are a vital component in many scale-out enterprises, including social networks, online retail, and risk analysis. Accordingly, they are receiving increased attention from the research community in an effort to improve their performance, scalability, reliability, cost, and power consumption. To be effective, such efforts require a detailed understanding of realistic key-value workloads. And yet little is known about these workloads outside of the companies that operate them. This paper aims to address this gap.

To this end, we have collected detailed traces from Facebook’s Memcached deployment, arguably the world’s largest. The traces capture over 284 billion requests from five different Memcached use cases over several days. We analyze the workloads from multiple angles, including: request composition, size, and rate; cache efficacy; temporal patterns; and application use cases. We also propose a simple model of the most representative trace to enable the generation of more realistic synthetic workloads by the community.

Our analysis details many characteristics of the caching workload. It also reveals a number of surprises: a GET/SET ratio of 30:1 that is higher than assumed in the literature; some applications of Memcached behave more like persistent storage than a cache; strong locality metrics, such as keys accessed many millions of times a day, do not always suffice for a high hit rate; and there is still room for efficiency and hit rate improvements in Memcached’s implementation. Toward the last point, we make several suggestions that address the exposed deficiencies.

Related Publications

All Publications

Federated Learning for User Privacy and Data Confidentiality Workshop At ICML - July 24, 2021

Federated Learning with Buffered Asynchronous Aggregation

John Nguyen, Kshitiz Malik, Hongyuan Zhan, Ashkan Yousefpour, Michael Rabbat, Mani Malek, Dzmitry Huba

TSE - June 29, 2021

Learning From Mistakes: Machine Learning Enhanced Human Expert Effort Estimates

Federica Sarro, Rebecca Moussa, Alessio Petrozziello, Mark Harman

IEEE ICIP - September 19, 2021

Rate Estimation Techniques for Encoder Parallelization

Gaurang Chaudhari, Hsiao-Chiang Chuang, Igor Koba, Hariharan Lalgudi

RecSys - September 27, 2021

Jointly Optimize Capacity, Latency and Engagement in Large-scale Recommendation Systems

Hitesh Khandelwal, Viet Ha-Thuc, Avishek Dutta, Yining Lu, Nan Du, Zhihao Li, Qi Huang

To help personalize content, tailor and measure ads, and provide a safer experience, we use cookies. By clicking or navigating the site, you agree to allow our collection of information on and off Facebook through cookies. Learn more, including about available controls: Cookies Policy