Rethinking Concurrency Control for In-Memory OLAP DBMSs

IEEE International Conference on Data Engineering (ICDE 2018)

By: Pedro Pedreira, Yinghai Lu, Sergey Pershin, Amit Dutta, Chris Crosswhite


Although OLTP and OLAP database systems have fundamentally disparate architectures, most research work on concurrency control is geared towards transactional systems and simply adopted by OLAP DBMSs. In this paper we describe a new concurrency control protocol specifically designed for analytical DBMSs that can provide Snapshot Isolation for distributed in-memory OLAP database systems, called Append-Only Snapshot Isolation (AOSI). Unlike previous work, which is based on either multiversion concurrency control (MVCC) or Two-Phase Locking (2PL), AOSI is completely lock-free and always maintains a single version of each data item. In addition, it removes the need for the per-record timestamps of traditional MVCC implementations and thus considerably reduces the memory overhead incurred by concurrency control. In order to support these characteristics, the protocol sacrifices flexibility and removes support for a few operations, particularly record updates and single record deletions; however, we argue that even though these operations are essential in a pure transactional system, they are not strictly required in most analytic pipelines and OLAP systems. We also present an experimental evaluation of AOSI’s current implementation within the Cubrick in-memory OLAP DBMS at Facebook, and show that lock-free single-version Snapshot Isolation can be achieved with low memory overhead and minor impact on query latency.