Publication

Coordinated Priority-aware Charging of Distributed Batteries in Oversubscribed Data Centers

IEEE/ACM International Symposium on Microarchitecture (MICRO)


Abstract

Data centers employ batteries for uninterruptible operation during maintenance and power failures, for example, when switching to diesel generator power after a utility power failure. Depleted batteries start to recharge once the input power is back, creating a sudden power spike in the power hierarchy. If not properly controlled, a sustained power overload can potentially trip circuit breakers, leading to service outages. Power overloads due to battery recharging are even more likely in oversubscribed data centers where the power infrastructure is aggressively provisioned for high utilization. The problem caused by simultaneous recharging of batteries in a data center has not been extensively studied and no real-world solutions have been proposed in the literature.

In this paper, we identify the problem due to battery recharging with case studies from Facebook’s data centers. We describe the solutions we have developed to coordinate charging of batteries without exceeding the circuit breaker power limit. We explain in detail, the variable battery charging algorithm built into the distributed battery charger hardware deployed in Facebook data centers, and the system design considerations necessary on a large scale. The new variable charger is able to reduce battery recharge power by up to 80%. We further leverage individual battery charging control mechanism to coordinate the charging process such that we charge the batteries according to priorities of applications running on the servers supported by the batteries. We evaluate our coordinated priority-aware battery charging algorithm by building a prototype in a Facebook production data center as well as through simulation experiments using production power traces. Our results show that we are able to meet reliability service level agreements by using our battery recharging algorithm, while satisfying given power constraints.

Related Publications

All Publications

The CacheLib Caching Engine: Design and Experiences at Scale

Benjamin Berg, Daniel S. Berger, Sara McAllister, Isaac Grosof, Sathya Gunasekar, Jimmy Lu, Michael Uhlar, Jim Carrig, Nathan Beckmann, Mor Harchol-Balter, Gregory G. Ganger

OSDI - November 4, 2020

Virtual Consensus in Delos

Mahesh Balakrishnan, Jason Flinn, Chen Shen, Mihir Dharamshi, Ahmed Jafri, Xiao Shi, Santosh Ghosh, Hazem Hassan, Aaryaman Sagar, Rhed Shi, Jingming Liu, Filip Gruszczynski, Xianan Zhang, Huy Hoang, Ahmed Yossef, Francois Richard, Yee Jiun Song

OSDI - November 4, 2020

Twine: A Unified Cluster Management System for Shared Infrastructure

Chunqiang (CQ) Tang, Kenny Yu, Kaushik Veeraraghavan, Jonathan Kaldor, Scott Michelson, Thawan Kooburat, Aravind Anbudurai, Matthew Clark, Kabir Gogia, Long Cheng, Ben Christensen, Alex Gartrell, Maxim Khutornenko, Sachin Kulkarni, Marcin Pawlowski, Tuomas Pelkonen, Andre Rodrigues, Rounak Tibrewal, Vaishnavi Venkatesan, Peter Zhang

OSDI - November 4, 2020

FlightTracker: Consistency across Read-Optimized Online Stores at Facebook

Xiao Shi, Scott Pruett, Kevin Doherty, Jinyu Han, Dmitri Petrov, Jim Carrig, John Hugg, Nathan Bronson

OSDI - November 4, 2020

To help personalize content, tailor and measure ads, and provide a safer experience, we use cookies. By clicking or navigating the site, you agree to allow our collection of information on and off Facebook through cookies. Learn more, including about available controls: Cookies Policy