Make Big Data small with IBM dashDB Enterprise MPP

IBM dashDB™ Enterprise MPP is a high-performance, massively scalable cloud data warehouse service, fully managed by IBM. dashDB MPP enables simple and speedy information management, analytics and business intelligence operations in the cloud. One of its key features is making data small: with state-of-the-art compression technology, dashDB MPP can deliver substantial storage savings that improve both business value and query performance.

Columnar Technology

dashDB MPP is built on an innovative columnar technology. One of its biggest innovations is the ability to compress data at a very high rate. Two factors contribute to this high compression rate: the nature of native column organization, and the principle that “like data compresses better than unlike data.” If you think about it, a column represents a particular attribute, such as an item name or an item price. All the values in the column are of the same data type, typically a string or a number, and may be further constrained by range (e.g., the item prices may fall within a range of 9.99 – 19.99), possibly with many duplicates or similar-looking pieces of data. Contrast this with trying to compress a row in a row-based database, which can contain many different data types and patterns and an arbitrarily large number of columns. All of this makes row compression more difficult. On top of this, dashDB’s sophisticated algorithms are datatype-sensitive.
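To make the “like data compresses better than unlike data” principle concrete, here is a minimal sketch in Python. It uses simple run-length encoding as a stand-in; this is an illustrative assumption, not dashDB's actual algorithm, and the table contents are made up. The same eight rows are encoded twice: once with values interleaved row by row, and once column by column.

```python
from itertools import groupby

def run_length_encode(values):
    """Collapse consecutive equal values into (value, run_length) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]

# Hypothetical two-column table: item name and item price.
names = ["widget"] * 4 + ["gadget"] * 4
prices = [9.99] * 8

# Row organization: the values of each row are stored next to each other,
# so names and prices alternate and no two neighbors are ever equal.
row_stream = [value for row in zip(names, prices) for value in row]
row_runs = run_length_encode(row_stream)

# Column organization: each column is stored (and encoded) on its own,
# so duplicates sit side by side and collapse into long runs.
column_runs = run_length_encode(names) + run_length_encode(prices)

print(len(row_runs), "runs when stored by row")
print(len(column_runs), "runs when stored by column")
```

With this toy data, the row layout produces 16 runs (nothing collapses), while the column layout produces only 3. Real data is messier, but the effect comes from the same place: a column keeps similar values together.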

Super Compress

The compression technology in dashDB MPP optimizes compression based on how frequently each data value occurs. That is, values that repeat more often are compressed more tightly. For example, a common last name like “Smith” will be compressed more tightly than uncommon last names. Moreover, the compressed values are packed as tightly as possible into a collection of bits that best fits the register width of the CPU. dashDB MPP can compress a column value to as little as 1 bit!
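One well-known way to give frequent values shorter codes is Huffman coding. The sketch below uses it purely as an illustration of frequency-based compression; this post does not say which coding scheme dashDB MPP uses, and the function name and sample data are hypothetical.

```python
import heapq
from collections import Counter
from itertools import count

def code_lengths(values):
    """Return the Huffman code length in bits for each distinct value."""
    frequency = Counter(values)
    if len(frequency) == 1:
        # A single distinct value still needs one bit per occurrence here.
        return {next(iter(frequency)): 1}
    tie_breaker = count()  # keeps heap entries comparable when counts tie
    heap = [(n, next(tie_breaker), {value: 0}) for value, n in frequency.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        n1, _, depths1 = heapq.heappop(heap)
        n2, _, depths2 = heapq.heappop(heap)
        # Merging two subtrees pushes every value in them one level deeper.
        merged = {v: depth + 1 for v, depth in {**depths1, **depths2}.items()}
        heapq.heappush(heap, (n1 + n2, next(tie_breaker), merged))
    return heap[0][2]

# Hypothetical last-name column: "Smith" is by far the most common value.
last_names = ["Smith"] * 6 + ["Jones"] * 2 + ["Zubrowski", "Quigley"]
lengths = code_lengths(last_names)
total_bits = sum(lengths[name] for name in last_names)

for name, bits in sorted(lengths.items(), key=lambda item: item[1]):
    print(f"{name}: {bits} bit(s)")
print("total:", total_bits, "bits for", len(last_names), "values")
```

In this example “Smith” gets a 1-bit code, “Jones” 2 bits, and the two rare names 3 bits each, so the whole ten-value column fits in 16 bits.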

In internal testing, we have observed that dashDB MPP can compress a representative BI database by a factor of 10 from the pre-loaded data size, which is 2 times better than another major cloud database service we have evaluated*.

Improve Memory Utilization

In addition to the storage savings, dashDB MPP can keep data in its bufferpool (i.e., memory) in compressed format. This fits more data into the same amount of memory, significantly increasing in-memory data density and improving query performance.

Actionable Compression

The state-of-the-art compression technology in dashDB MPP enables “actionable compression”; in other words, many analytical operations (such as predicate evaluation, joins and aggregates) can be performed directly on the compressed data, without decompressing it first. Imagine how this can save CPU cycles and further speed up your query processing.
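As a rough illustration of evaluating a predicate on compressed data, the sketch below uses an order-preserving dictionary encoding. This is an assumption chosen for clarity, not a description of dashDB internals, and all names and values are hypothetical. Because the codes sort in the same order as the values they stand for, a comparison such as price <= 14.99 can be answered by comparing small integer codes.

```python
from bisect import bisect_right

def dictionary_encode(column):
    """Replace each value with its rank in the sorted list of distinct values."""
    dictionary = sorted(set(column))
    code_of = {value: code for code, value in enumerate(dictionary)}
    return dictionary, [code_of[value] for value in column]

# Hypothetical price column.
prices = [9.99, 19.99, 14.99, 9.99, 12.49, 19.99]
dictionary, codes = dictionary_encode(prices)

# Translate the predicate constant into code space once...
limit = 14.99
limit_code = bisect_right(dictionary, limit) - 1

# ...then scan the compressed column, comparing integer codes only.
matching_rows = [row for row, code in enumerate(codes) if code <= limit_code]

print("codes:", codes)
print("rows with price <=", limit, ":", matching_rows)
```

The scan never touches the original price values; only the single constant in the predicate is looked up in the dictionary.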

IBM dashDB™ Enterprise MPP is truly built for Big Data – speedy, scalable, and small.

* Disclaimer: Performance and compression data is based on measurements and projections using IBM benchmarks in a controlled environment. The actual throughput, performance or compression that any user will experience will vary depending upon many factors, including considerations such as the amount of multiprogramming in the user’s job stream, the I/O configuration, the storage configuration, and the workload processed. Therefore, no assurance can be given that an individual user will achieve results similar to those stated here.