Row compression in DB2 9

Analysis of a DSS database environment

Row compression is a new feature of DB2 9 that yields storage capacity savings. This article examines the storage savings and performance impact of this feature. It describes the workload characteristics and environments in which row compression will thrive, as well as providing general guidelines that you can follow to efficiently use this feature.

Victor Chang (vicchang@ca.ibm.com), DB2 Performance Analyst, EMC

Victor Chang is a DB2 Performance Analyst who joined IBM in July 2001. He has worked in the area of improving DB2 recovery performance, competed in several DB2/PeopleSoft benchmarks, and has worked on fixing performance problems on both DSS and OLTP workloads over Linux, Unix and Windows platforms. His current focus includes the role of team-lead for a fundamental performance workload, performance regression in SAP and performance analysis of some critical DB2 line items. He is currently pursuing an MBA degree at UofT's Rotman Business School.



Yun Han Lee (yunhanl@ca.ibm.com), DB2 Performance Analyst, EMC

Yun Han Lee is a DB2 Performance Analyst and IBM Certified Solution Expert in DB2 for Linux, UNIX, and Windows. He has been working on DB2 performance for more than six years. He has been actively involved several world-record TPC-C benchmarks on AIX. His current focus is DB2 Performance for AIX and DB2 Connect to z/OS. You can reach him at yunhanl@ca.ibm.com



Nailah Bissoon (nbissoon@ca.ibm.com), Performance Engineer, EMC

Nailah Bissoon photoNailah Bissoon, MSc. is currently a member of the DB2 solutions development and benchmarking team providing DB2 solutions for business needs and helping to publish leading DB2 benchmark performance results. Her tasks also include promoting and verifying new and existing DB2 features, one of them being data compression.



05 October 2006

Row compression is a new feature of IBM® DB2® 9 that yields storage capacity savings. This article examines the storage savings and performance impact of this feature. It describes the workload characteristics and environments in which row compression will thrive, as well as providing general guidelines that you can follow to efficiently use this feature.

In this article

  • Introduction
  • Experimental design
  • Results: Compressed versus uncompressed
  • Best practices for row compression through DB2 REORG
  • Conclusion

Executive summary

Every business model looks for the best way to minimize production costs in order to maximize profits. In the summer of 2006, IBM® released the latest version of its data server software technology, DB2® 9. One of the most prominent features introduced in this version is the ability to perform row compression. The goal of row compression is to reduce table storage. When using this feature, DB2 is able to shrink row sizes by building and utilizing a compression dictionary, which is used as an alphabet to compress the rows of data in the database tables. Since more data can reside on a data page, fewer data pages are allocated per table. The net effect is a reduced rate of storage consumption which may lead to fewer disks needed when allocating storage requirements. Since disk storage is one of the most expensive components on any data server, this feature can significantly reduce Total Cost of Ownership (TCO).

Since row compression can reduce TCO, what effect does it have on performance? Due to the nature of compression and expansion algorithms, row compression naturally consumes additional CPU cycles beyond the equivalent uncompressed database activity. In systems with a typical amount of idle CPU, the performance effect of row compression may be negligible and in some cases be more efficient depending upon the compression ratio and the amount of data being retrieved from disk. However, for workloads that consume significant amount of CPU, it may be necessary to increase CPU capacity in order to maintain desirable performance results.

In a 362 GB DSS database, overall database storage consumption was reduced by 40% when row compression was enabled. As a result, the compressed database was restructured to use 50% fewer data disks than the original uncompressed database used. Regardless, the compressed database was able to achieve throughput similar to the uncompressed database when running a single-stream workload, and even experience a 25% throughput improvement in a multi-stream scenario. For row compression to perform efficiently, free CPU resources are indispensable because they are necessary to account for the overhead of compressing and expanding rows. Row compression should provide a performance enhancement to a database system if complex select queries that require mostly sequential data access create the bulk of its workload. If these requirements are met, the benefits of decreased production costs and increased process efficiency can both be attained through the use of this feature.

Download

DescriptionNameSize
Article in PDF formatRow_Compression.pdf297KB

Resources

Learn

Get products and technologies

  • Download a free trial version of DB2 Enterprise 9.
  • Now you can use DB2 for free. Download DB2 Express-C, a no-charge version of DB2 Express Edition for the community that offers the same core data features as DB2 Express Edtion and provides a solid base to build and deploy applications.
  • Download IBM product evaluation versions and get your hands on application development tools and middleware products from DB2, Lotus®, Rational®, Tivoli®, and WebSphere®.

Discuss

static.content.url=http://www.ibm.com/developerworks/js/artrating/
SITE_ID=1
Zone=Information Management
ArticleID=165665
SummaryTitle=Row compression in DB2 9
publish-date=10052006