Over the past 15 years, key-value (KV) stores have emerged as a popular solution for building large-scale data platforms.

KV stores expose a basic data model that maps unique keys to values, much like a hash map. This data model comes with the promise of high flexibility and scalability, which are generally considered to be pain points of traditional transaction processing systems based on the relational model. In fact, in relational systems it is very challenging both to shard the data set effectively across several cores or machines and to support efficient operations, such as joins, that span several tables. As a result, KV stores have been adopted in many deployments as a replacement for traditional database systems.
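As a rough illustration of this data model, here is a minimal sketch of an in-memory KV store that hash-partitions keys across shards. The class name `ShardedKVStore`, its methods, and the choice of SHA-256 for placement are assumptions made for this example; they are not taken from any system discussed in this post.

```python
import hashlib
from typing import Optional


class ShardedKVStore:
    """Toy in-memory KV store that hash-partitions keys across shards."""

    def __init__(self, num_shards: int = 4) -> None:
        if num_shards <= 0:
            raise ValueError("num_shards must be positive")
        # Each shard is a plain dict here; in a real deployment it could be
        # a core or a machine.
        self._shards = [dict() for _ in range(num_shards)]

    def _shard_for(self, key: bytes) -> dict:
        # Use a stable hash so that placement does not change across runs.
        digest = hashlib.sha256(key).digest()
        index = int.from_bytes(digest[:8], "big") % len(self._shards)
        return self._shards[index]

    def put(self, key: bytes, value: bytes) -> None:
        self._shard_for(key)[key] = value

    def get(self, key: bytes) -> Optional[bytes]:
        return self._shard_for(key).get(key)

    def delete(self, key: bytes) -> None:
        self._shard_for(key).pop(key, None)


store = ShardedKVStore(num_shards=4)
store.put(b"user:42", b"alice")
print(store.get(b"user:42"))  # b'alice'
store.delete(b"user:42")
print(store.get(b"user:42"))  # None
```

Because every operation touches exactly one key, each request can be routed to a single shard without coordination. A join over several tables has no such property, which is part of why partitioning is harder in the relational setting.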

KV stores embedded as storage engines within database systems

With time, KV store designs have become more mature, providing additional functionality such as efficient range queries, durability, and atomicity. These are key requirements for databases and are typically implemented from scratch in such systems. The growing set of native KV functionality has led database designers to reconsider the historical opposition between KV stores and database systems and to embrace designs where KV stores are embedded as storage engines within database systems. In these designs, the transaction processing logic is decoupled from the data storage and retrieval subsystem, which is handled by the KV store.

Examples of database systems that embrace this design are Microsoft’s Deuteronomy, which is built around the Bw-tree KV store; Apple’s FoundationDB, which has employed SQLite and is now shipping with the new Redwood KV store; Google’s Spanner, which has been layered on top of a Bigtable-based KV store; MongoDB, which uses WiredTiger; and MariaDB, which uses Facebook’s RocksDB.

This decoupled design allows for a clear separation of concerns, making performance improvements to the KV store available for the database system as a whole, with no further integration efforts.
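The layering described in this section can be sketched in a few lines. The example below is only an illustration under stated assumptions: `OrderedKVStore`, `TableLayer`, and the key encoding are hypothetical names and choices, and none of the systems listed above necessarily work this way. It shows a toy ordered KV store with range scans, and an upper layer that maps table rows onto KV pairs without knowing how the store keeps its data.

```python
import bisect
from typing import Dict, Iterator, List, Optional, Tuple


class OrderedKVStore:
    """Toy ordered KV store: keys are kept sorted to allow range scans."""

    def __init__(self) -> None:
        self._keys: List[bytes] = []
        self._values: Dict[bytes, bytes] = {}

    def put(self, key: bytes, value: bytes) -> None:
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def get(self, key: bytes) -> Optional[bytes]:
        return self._values.get(key)

    def scan(self, start: bytes, end: bytes) -> Iterator[Tuple[bytes, bytes]]:
        # Yield pairs with start <= key < end, in key order.
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, end)
        for key in self._keys[lo:hi]:
            yield key, self._values[key]


class TableLayer:
    """Hypothetical upper layer: stores (table, primary key) rows as KV pairs."""

    def __init__(self, kv: OrderedKVStore) -> None:
        self._kv = kv

    @staticmethod
    def _encode(table: str, pk: int) -> bytes:
        # Fixed-width big-endian primary keys (non-negative, below 2**64)
        # keep the rows of one table contiguous and sorted by primary key.
        return table.encode() + b"/" + pk.to_bytes(8, "big")

    def insert(self, table: str, pk: int, row: bytes) -> None:
        self._kv.put(self._encode(table, pk), row)

    def select_range(self, table: str, lo: int, hi: int) -> List[bytes]:
        # Rows with lo <= pk < hi, served by a single KV range scan.
        start, end = self._encode(table, lo), self._encode(table, hi)
        return [value for _, value in self._kv.scan(start, end)]


table = TableLayer(OrderedKVStore())
for pk in (5, 1, 3):
    table.insert("users", pk, b"row-%d" % pk)
print(table.select_range("users", 1, 5))  # [b'row-1', b'row-3']
```

`TableLayer` only calls `put` and `scan`, so swapping `OrderedKVStore` for a faster engine with the same interface would speed up the upper layer without changing it.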

Efficient and high-performance KV store designs

In light of this design shift, proposing efficient and high-performance KV store designs becomes relevant for a very broad set of data platform systems and use cases, and IBM Research is highly involved in projects that pursue this line of research.

Researchers from the Zurich lab have designed and implemented uDepot, a KV store built from the ground up to enable high throughput, low latency, and high efficiency with emerging NVM storage devices, such as Intel 3DXP SSDs, and with NAND flash devices, which are widely available in Cloud environments. 

uDepot achieves high performance and resource efficiency thanks to the synergistic implementation of two main techniques:

  1. A lightweight task-based runtime that can handle multiple I/O operations in parallel, thus saturating the available I/O bandwidth.
  2. A two-level main-memory index that dynamically adjusts its DRAM footprint to the current data-set size, while providing fast lookups and insertions thanks to high cache efficiency and a low-overhead concurrency control scheme.
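To give a feel for the first technique, the sketch below keeps many I/O operations in flight from a single thread. It is not uDepot's runtime: it uses Python's `asyncio` as a stand-in, and `read_block` is a hypothetical function whose sleep merely simulates device latency.

```python
import asyncio
import time
from typing import Iterable, List


async def read_block(block_id: int, latency_s: float = 0.01) -> bytes:
    # Hypothetical device read: the sleep stands in for waiting on the device.
    await asyncio.sleep(latency_s)
    return b"block-%d" % block_id


async def read_many(block_ids: Iterable[int]) -> List[bytes]:
    # All reads are issued before any of them completes, so their waits
    # overlap. gather() returns results in the order the reads were passed in.
    return await asyncio.gather(*(read_block(b) for b in block_ids))


start = time.perf_counter()
blocks = asyncio.run(read_many(range(100)))
elapsed = time.perf_counter() - start
print(len(blocks), "reads completed in", round(elapsed, 3), "seconds")
```

Issued one after another, the 100 simulated reads would take at least one second in total; with overlapping waits, the elapsed time is closer to the latency of a single read. This is the general reason a task-based design can keep a fast device busy.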

Overall, uDepot delivers up to 2x higher throughput than existing systems on the YCSB workloads and is able to fully utilize the available I/O bandwidth, even when deployed on top of 20 storage devices. Additional details on uDepot can be found in the paper published at the USENIX FAST conference.

Other contributions

IBM Research is also active in the development of FoundationDB and has already contributed two improvements to its KV store component. The first contribution is the design of an in-memory KV data structure that is used by the storage layer of FoundationDB to buffer incoming writes before committing them to disk. This data structure is based on the Adaptive Radix Tree and replaces the existing red-black tree implementation, providing up to 20% higher write throughput.

The second contribution is the implementation of a new page caching scheme that adopts the least recently used (LRU) replacement policy. This policy aims to retain frequently accessed pages in memory and provides up to 10% higher hit rates on skewed workloads compared to the default random replacement policy. Both contributions have been made directly to the public repository and in collaboration with the FoundationDB community.
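For readers unfamiliar with least recently used (LRU) replacement, here is a minimal sketch of such a page cache. It is not the FoundationDB implementation: the class `LRUPageCache`, its `load_page` callback, and the hit and miss counters are assumptions made for this example.

```python
from collections import OrderedDict
from typing import Callable


class LRUPageCache:
    """Toy page cache that evicts the least recently used page when full."""

    def __init__(self, capacity: int, load_page: Callable[[int], bytes]) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._capacity = capacity
        self._load_page = load_page  # hypothetical callback that reads a page from disk
        self._pages: "OrderedDict[int, bytes]" = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, page_id: int) -> bytes:
        if page_id in self._pages:
            self.hits += 1
            # Move the page to the most recently used end of the order.
            self._pages.move_to_end(page_id)
            return self._pages[page_id]
        self.misses += 1
        page = self._load_page(page_id)
        self._pages[page_id] = page
        if len(self._pages) > self._capacity:
            # Evict from the least recently used end.
            self._pages.popitem(last=False)
        return page


cache = LRUPageCache(capacity=2, load_page=lambda pid: b"page-%d" % pid)
for pid in (1, 2, 1, 3, 1):
    cache.get(pid)
print(cache.hits, cache.misses)  # 2 3
```

In this access sequence, page 1 is requested repeatedly and stays cached, while page 2 is evicted to make room for page 3. A random policy could just as well have evicted page 1, which is the intuition behind the better hit rates on skewed workloads.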

Additional details on the involvement of IBM Research in the FoundationDB project can be found in this presentation, given at the 2019 FoundationDB summit.
