June 23, 2020 By Diego Didona
Nikolas Ioannou
3 min read

Over the past 15 years, key-value (KV) stores have emerged as a popular solution for building large-scale data platforms.

KV stores expose a basic data model that maps unique keys to values, much like a hash map. This data model comes with the promise of high flexibility and scalability, which are generally considered to be pain points of traditional transaction processing systems based on the relational model. In relational systems, sharding the data set effectively across several cores or machines, and efficiently supporting operations such as joins over several tables, are very challenging tasks. As a result, KV stores have been adopted in many deployments as a replacement for traditional database systems.
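To make the data model concrete, here is a minimal sketch of a hash-map-like KV interface whose keys are hash-partitioned across shards. The class name `ShardedKVStore`, its methods, and the use of CRC32 for placement are assumptions made for illustration; they are not the API of any system mentioned in this article.

```python
import zlib


class ShardedKVStore:
    """Toy in-memory KV store that hash-partitions keys across shards.

    Hypothetical example: each shard is a plain dict standing in for a
    core or a machine.
    """

    def __init__(self, num_shards=4):
        if num_shards <= 0:
            raise ValueError("num_shards must be positive")
        self.shards = [{} for _ in range(num_shards)]

    def _shard_index(self, key):
        # crc32 is deterministic across runs, unlike Python's built-in hash().
        return zlib.crc32(key.encode("utf-8")) % len(self.shards)

    def put(self, key, value):
        self.shards[self._shard_index(key)][key] = value

    def get(self, key, default=None):
        return self.shards[self._shard_index(key)].get(key, default)

    def delete(self, key):
        # Returns True if the key existed and was removed.
        return self.shards[self._shard_index(key)].pop(key, None) is not None


store = ShardedKVStore(num_shards=4)
store.put("user:42", {"name": "Ada"})
print(store.get("user:42"))
```

Because every operation touches exactly one key, each request is routed to a single shard, which is what makes this model straightforward to scale out. A join over several tables has no such single-shard routing.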

KV stores embedded as storage engines within database systems

Over time, KV store designs have matured and gained additional features, such as efficient range queries, durability, and atomicity. These are key requirements for databases and are typically implemented from scratch in such systems. The growing set of features that KV stores provide natively has led database designers to reconsider the historical opposition between KV stores and database systems, and to embrace designs where KV stores are embedded as storage engines within database systems. In these designs, the transaction processing logic is decoupled from the data storage and retrieval subsystem, which is handled by the KV store.
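Range queries depend on keys being kept in sorted order, which a plain hash map does not provide. The following minimal sketch shows the idea with a sorted key list and binary search; `OrderedKVStore` and `scan` are hypothetical names chosen for this example, and production KV stores typically use tree or log-structured indexes instead.

```python
import bisect


class OrderedKVStore:
    """Toy KV store that keeps keys sorted to support range scans."""

    def __init__(self):
        self._keys = []   # keys in sorted order
        self._data = {}   # key -> value

    def put(self, key, value):
        if key not in self._data:
            # Insert the new key at its sorted position.
            bisect.insort(self._keys, key)
        self._data[key] = value

    def get(self, key, default=None):
        return self._data.get(key, default)

    def scan(self, start, end):
        """Return (key, value) pairs with start <= key < end, in key order."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, end)
        return [(k, self._data[k]) for k in self._keys[lo:hi]]


kv = OrderedKVStore()
for name in ["order:003", "order:001", "user:007", "order:002"]:
    kv.put(name, name.upper())
print(kv.scan("order:", "order:~"))
```

A database layer built on top of such an interface could, for example, map each table row to a key with a shared prefix and read a whole table or index range with a single scan.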

Examples of database systems that embrace this design are Microsoft’s Deuteronomy, which is built around the Bw-tree KV store; Apple’s FoundationDB, which has employed SQLite and is now shipping with the new Redwood KV store; Google’s Spanner, which has been layered on top of a Bigtable-based KV store; MongoDB, which uses WiredTiger; and MariaDB, which uses Facebook’s RocksDB.

This decoupled design allows for a clear separation of concerns, making performance improvements to the KV store available to the database system as a whole, with no further integration effort.

Efficient and high-performance KV store designs

In light of this design shift, proposing efficient and high-performance KV store designs becomes relevant for a very broad set of data platform systems and use cases, and IBM Research is highly involved in projects that pursue this line of research.

Researchers from the Zurich lab have designed and implemented uDepot, a KV store built from the ground up to enable high throughput, low latency, and high efficiency with emerging NVM storage devices, such as Intel 3DXP SSDs, and with NAND flash devices, which are widely available in cloud environments.

uDepot achieves high performance and resource efficiency thanks to the synergistic implementation of two main techniques:

  1. A lightweight task-based runtime that can handle multiple I/O operations in parallel, thus saturating the available I/O bandwidth.
  2. A two-level main-memory index that dynamically adjusts its DRAM footprint to the current data-set size, while providing fast lookups and insertions thanks to high cache efficiency and a low-overhead concurrency control scheme.
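The first technique, keeping many I/O operations in flight at once, can be illustrated with cooperative tasks. This is only a sketch of the general idea using Python's `asyncio`, not uDepot's runtime: `read_block`, `read_many`, and `run_demo` are hypothetical names, and the device read is simulated with a sleep.

```python
import asyncio


async def read_block(device_id, block, latency_s=0.01):
    # Stand-in for an asynchronous device read (assumption: the sleep
    # models device latency; no real I/O is performed).
    await asyncio.sleep(latency_s)
    return f"dev{device_id}:blk{block}".encode()


async def read_many(requests, max_in_flight=32):
    # The semaphore bounds how many reads are outstanding at once,
    # similar in spirit to a device queue depth.
    sem = asyncio.Semaphore(max_in_flight)

    async def bounded(device_id, block):
        async with sem:
            return await read_block(device_id, block)

    # gather() runs the tasks concurrently and returns results in
    # the same order as the requests.
    return await asyncio.gather(*(bounded(d, b) for d, b in requests))


def run_demo():
    requests = [(dev, blk) for dev in range(4) for blk in range(8)]
    return asyncio.run(read_many(requests))


results = run_demo()
print(len(results), results[0])
```

Because each task yields while its read is pending, a single thread can keep many requests outstanding across several devices instead of waiting for them one at a time.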

Overall, uDepot delivers up to 2x higher throughput than existing systems on the YCSB workloads and is able to fully utilize the available I/O bandwidth, even when deployed on top of 20 storage devices. Additional details on uDepot can be found in the paper published at the USENIX FAST conference.

Other contributions

IBM Research is also active in the development of FoundationDB and has already contributed two improvements to its KV store component. The first contribution is the design of an in-memory KV data structure that is used by the storage layer of FoundationDB to buffer incoming writes before committing them to disk. This data structure is based on the Adaptive Radix Tree and replaces the existing red-black tree implementation, providing up to 20% higher write throughput.

The second contribution is the implementation of a new page caching scheme that adopts the least recently used (LRU) replacement policy. This policy aims to retain frequently accessed pages in memory and provides up to 10% higher hit rates on skewed workloads compared to the default random replacement policy. Both contributions have been made directly to the public repository and in collaboration with the FoundationDB community.
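For readers unfamiliar with the replacement policy, the following is a minimal sketch of a least recently used page cache. It is not the FoundationDB implementation: `LRUPageCache`, its `get` method, and the `load_page` callback are hypothetical names used only for this example.

```python
from collections import OrderedDict


class LRUPageCache:
    """Toy page cache that evicts the least recently used page when full."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._pages = OrderedDict()  # page_id -> page, oldest access first
        self.hits = 0
        self.misses = 0

    def get(self, page_id, load_page):
        if page_id in self._pages:
            # Hit: mark the page as most recently used.
            self._pages.move_to_end(page_id)
            self.hits += 1
            return self._pages[page_id]
        # Miss: load the page (e.g. from disk) and cache it.
        self.misses += 1
        page = load_page(page_id)
        self._pages[page_id] = page
        if len(self._pages) > self.capacity:
            # Evict the page whose last access is the oldest.
            self._pages.popitem(last=False)
        return page

    def __contains__(self, page_id):
        return page_id in self._pages


cache = LRUPageCache(capacity=2)
for pid in [1, 2, 1, 3]:
    cache.get(pid, load_page=lambda p: f"page-{p}")
print(1 in cache, 2 in cache, 3 in cache)
```

In this run, page 1 is re-read before page 3 arrives, so page 2 is the one evicted. A random policy could just as easily have evicted page 1, which is why LRU tends to do better when a small set of pages receives most of the accesses.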

Additional details on the involvement of IBM Research in the FoundationDB project can be found in this presentation, given at the 2019 FoundationDB summit.
