Publication

UBIS: Utilization-aware cluster scheduling

International Parallel and Distributed Processing Symposium (IPDPS)


Abstract

Data center costs are among the major enterprise expenses, and any improvement in data center resource utilization translates into significant monetary savings. We focus on the problem of scheduling jobs in distributed execution environments to improve resource utilization. Cluster schedulers like YARN and Mesos base their scheduling decisions on resource requirements provided by end users. It is hard for end users to predict the exact amount of resources required for a task or job, especially since resource utilization can vary significantly over time and across tasks. In practice, users pick highly conservative estimates of peak utilization across all tasks of a job to ensure job completion, leading to resource fragmentation and severe underutilization in production clusters. We present UBIS, a utilization-aware approach to cluster scheduling, to address resource fragmentation and to improve cluster utilization and job throughput. UBIS considers the actual usage of running tasks and schedules opportunistic work on underutilized nodes. It monitors resource usage on these nodes and preempts opportunistic containers when over-subscription becomes untenable. In doing so, UBIS utilizes wasted resources while minimizing adverse effects on regularly scheduled tasks. Our implementation of UBIS on YARN yields improvements of up to 30% in makespan for representative workloads and 25% in individual job durations.
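The mechanism described in the abstract (place opportunistic work according to measured usage, then preempt it when a node becomes over-subscribed) can be illustrated with a minimal sketch. This is not the UBIS implementation and does not use any YARN API. The `Node` class, both function names, the two threshold values (0.8 and 0.9), and the choice to preempt the most recently placed container first are all hypothetical assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical model of one cluster node (not a YARN type)."""
    capacity: float                    # total resource, e.g. GB of memory
    guaranteed_usage: float = 0.0      # measured usage of regular tasks
    opportunistic: list = field(default_factory=list)  # sizes of opportunistic containers

    def usage(self) -> float:
        return self.guaranteed_usage + sum(self.opportunistic)

    def utilization(self) -> float:
        return self.usage() / self.capacity


def try_schedule_opportunistic(node: Node, demand: float,
                               max_utilization: float = 0.8) -> bool:
    """Place an opportunistic container only if measured utilization
    would stay at or below the (assumed) scheduling threshold."""
    if (node.usage() + demand) / node.capacity <= max_utilization:
        node.opportunistic.append(demand)
        return True
    return False


def preempt_if_needed(node: Node, preempt_threshold: float = 0.9) -> list:
    """Remove opportunistic containers, most recently placed first, until
    utilization is at or below the (assumed) preemption threshold.
    Regular tasks are never touched."""
    preempted = []
    while node.opportunistic and node.utilization() > preempt_threshold:
        preempted.append(node.opportunistic.pop())
    return preempted


# Usage: regular tasks were allocated the whole node but use only 30 of 100 units.
node = Node(capacity=100.0, guaranteed_usage=30.0)
first_placed = try_schedule_opportunistic(node, 40.0)   # fits under the threshold
second_placed = try_schedule_opportunistic(node, 20.0)  # would exceed the threshold

# Regular tasks later ramp up, so the node becomes over-subscribed.
node.guaranteed_usage = 70.0
evicted = preempt_if_needed(node)
```

Using a scheduling threshold lower than the preemption threshold leaves some headroom, so a small rise in regular-task usage does not immediately trigger preemption. A real scheduler would track several resource types and time-varying measurements rather than a single number.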

