Skip to main content

By clicking Submit, you agree to the developerWorks terms of use.

The first time you sign into developerWorks, a profile is created for you. Select information in your developerWorks profile is displayed to the public, but you may edit the information at any time. Your first name, last name (unless you choose to hide them), and display name will accompany the content that you post.

All information submitted is secure.

  • Close [x]

The first time you sign in to developerWorks, a profile is created for you, so you need to choose a display name. Your display name accompanies the content you post on developerworks.

Please choose a display name between 3-31 characters. Your display name must be unique in the developerWorks community and should not be your email address for privacy reasons.

By clicking Submit, you agree to the developerWorks terms of use.

All information submitted is secure.

  • Close [x]

Pure Concentrated Data

Informix Dynamic Server compresses, consolidates data to boost data warehousing

Bob Scheier, Contributing writer, IBM Data Management magazine
Bob Scheier is a veteran IT writer based in Swampscott, Massachusetts

Summary:  This article reviews recent Informix enhancements, including tools to help customers model, schedule, and execute the data transformation and data flows required to create data warehouses. Most recently, IBM also announced new data compression and consolidation features that reduce the cost (and boost the performance) of both BI and OLTP applications on IDS.

Date:  27 May 2009
Level:  Introductory
Also available in:   Chinese

Activity:  5638 views
Comments:  


IDS story in the print edition of IBM Data Management magazine

Mention Informix Dynamic Server (IDS) and most DBAs think of a fast, reliable, and low-maintenance platform for online transaction processing (OLTP). But those transactions hold valuable insights into business trends, leading many organizations to also use IDS as a data warehouse against which they can run business intelligence (BI) queries.

To meet those needs, Informix has unveiled a series of enhancements that bring its long-standing speed and ease-of-use benefits to data warehousing and BI. These include tools to help customers model, schedule, and execute the data transformation and data flows required to create data warehouses. Most recently, IBM announced new data compression and consolidation features that reduce the cost (and boost the performance) of both BI and OLTP applications on IDS. The key new technology is the IDS Storage Optimization Feature. It reduces the size of not only OLTP data stores but also data warehouses that, if they grow too large, can make business analysis overly complicated and expensive.

Introducing Storage Optimization

Based on technology used in IBM DB2, Storage Optimization compresses and consolidates the data within IDS. Early customer trials show that Storage Optimization reduces the amount of space required to store data either in memory or on disk by an average of 50 percent. This can cut the time required to process queries by as much as 20 percent because more data can be kept in memory, reducing the number of I/O operations to slower mechanical disk drives. Cutting the database size can also, of course, delay or even eliminate the need to upgrade disk storage.

The first of the three Storage Optimization components is compression, which examines each row in the database for recurring patterns of data. It stores the individual recurring patterns in a dictionary, replacing those patterns with shorter strings of symbols. Unlike other compression techniques that scan only a portion of each row for repeating patterns, Storage Optimization scans the entire row, regardless of how many columns it intersects. By scanning a larger area than other compression techniques, it can find and compress more repeating patterns, and thus achieve very high compression ratios (see Figure 1).


Figure 1: Saving repeating patterns as shorter character strings allows Storage Optimization to achieve very high compression ratios.
Figure 1: Saving repeating patterns as shorter character strings allows Storage Optimization to achieve very high compression ratios.

Repack (or Coalesce), the second component, consolidates the free space created within each partition, while the final capability, Shrink, removes the unused portion of the partition and returns it for reuse by IDS. These larger, contiguous spaces are much easier for IDS to reuse than smaller, isolated free spaces. It is this compression and consolidation that speeds query performance, while holding down the amount of physical disk space required for data warehousing.


Compression on autopilot

Database compression isn't new, but Storage Optimization automates key tasks to make it easier to use. For example, it builds the compression dictionary automatically, instead of requiring the DBA to manually specify the patterns to be compressed. Informix also provides guidelines (see sidebar, "Resources") for when optimization would be least useful, such as for small tables that fit into memory without compression and for temporary tables and system catalogues. But even in those cases, the CPU cycles needed to compress the data and the table scans required to repack and shrink data have little impact on overall performance.

While compression and decompression do require CPU cycles, "We're finding the price you pay in CPU overhead is outweighed by the benefit you get from having fewer pages and fewer I/Os," says IDS Chief Architect Kevin Brown. "If before, your database couldn't fit in memory, and now it can, you're going to see a really big performance boost."

In cases where compression might compromise application performance, administrators can specify which fragments to compress. "If a table has 10 fragments, for example, you might want to leave the most recently created fragments uncompressed for performance purposes, and compress the older fragments, since they're less likely to be accessed," says Brown. A recompress option allows administrators to rebuild the compression dictionary to ensure optimal compression of the tables, which is useful if the data within them has changed significantly over time.

DBAs can use any or all of the Storage Optimization capabilities while the database is running, thus preventing interruption to either transactions or queries. They can also manage Storage Optimization from within the same IDS OpenAdmin Tool (OAT) they use to manage the size of volumes and to monitor performance. Graphical tools show administrators or business users how much space they can save by running any of the functions within Storage Optimization. Storage Optimization can also be controlled through a command-line interface, which is useful for the many ISVs that bundle IDS within their own applications.


The benefits of the bundle

Storage Optimization is available in a new data warehouse platform package called IBM Informix Warehouse, which includes IDS, its new Storage Optimization Feature, and the IBM Informix Warehouse Feature V11.50. Informix Warehouse Feature V11.50 includes the SQL Warehouse (SQW) Client with Design Studio for data modeling, schema design, data transformation design, and data flow design; SQW Warehouse Server, with an administration console to schedule and manage data flows; and SQL Warehouse runtime to perform data transformation within the IDS data server.

The Informix Warehouse infrastructure also supports integrated external tooling for BI analytics from IBM Cognos, for managing data growth with IBM Optim, and for data transformation and cleansing with IBM InfoSphere DataStage and InfoSphere QualityStage.

With the new IDS data warehousing tools, DBAs can perform business analysis on their operational data stores to ensure access to the most current data. They can also create separate data warehouses within IDS to prevent analytic queries from slowing production applications or run a "shared disk" configuration in which separate servers access the same data for OLTP and business analysis. The bottom line: businesses that have been using IDS for data warehousing and BI now have better, more targeted tools to achieve their goals. "We give organizations the choice so they can make the best use of their data and infrastructure based on their technical and business requirements," says Brown.


Resources

Data warehousing with Informix
IDS Storage Optimization
IBM Informix Warehouse
IBM Informix Warehouse Feature

About the author

Bob Scheier is a veteran IT writer based in Swampscott, Massachusetts

Report abuse help

Report abuse

Thank you. This entry has been flagged for moderator attention.


Report abuse help

Report abuse

Report abuse submission failed. Please try again later.


developerWorks: Sign in


Need an IBM ID?
Forgot your IBM ID?


Forgot your password?
Change your password

By clicking Submit, you agree to the developerWorks terms of use.

 


The first time you sign into developerWorks, a profile is created for you. Select information in your developerWorks profile is displayed to the public, but you may edit the information at any time. Your first name, last name (unless you choose to hide them), and display name will accompany the content that you post.

Choose your display name

The first time you sign in to developerWorks, a profile is created for you, so you need to choose a display name. Your display name accompanies the content you post on developerWorks.

Please choose a display name between 3-31 characters. Your display name must be unique in the developerWorks community and should not be your email address for privacy reasons.

(Must be between 3 – 31 characters.)

By clicking Submit, you agree to the developerWorks terms of use.

 


Rate this article

Comments

Help: Update or add to My dW interests

What's this?

This little timesaver lets you update your My developerWorks profile with just one click! The general subject of this content (AIX and UNIX, Information Management, Lotus, Rational, Tivoli, WebSphere, Java, Linux, Open source, SOA and Web services, Web development, or XML) will be added to the interests section of your profile, if it's not there already. You only need to be logged in to My developerWorks.

And what's the point of adding your interests to your profile? That's how you find other users with the same interests as yours, and see what they're reading and contributing to the community. Your interests also help us recommend relevant developerWorks content to you.

View your My developerWorks profile

Return from help

Help: Remove from My dW interests

What's this?

Removing this interest does not alter your profile, but rather removes this piece of content from a list of all content for which you've indicated interest. In a future enhancement to My developerWorks, you'll be able to see a record of that content.

View your My developerWorks profile

Return from help

static.content.url=http://www.ibm.com/developerworks/js/artrating/
SITE_ID=1
Zone=Information Management
ArticleID=392262
ArticleTitle=Pure Concentrated Data
publish-date=05272009
author1-email=bob.scheier@tdagroup.com
author1-email-cc=Author1 cc address

Tags

Help
Use the search field to find all types of content in My developerWorks with that tag.

Use the slider bar to see more or fewer tags.

For articles in technology zones (such as Java technology, Linux, Open source, XML), Popular tags shows the top tags for all technology zones. For articles in product zones (such as Info Mgmt, Rational, WebSphere), Popular tags shows the top tags for just that product zone.

For articles in technology zones (such as Java technology, Linux, Open source, XML), My tags shows your tags for all technology zones. For articles in product zones (such as Info Mgmt, Rational, WebSphere), My tags shows your tags for just that product zone.

Use the search field to find all types of content in My developerWorks with that tag. Popular tags shows the top tags for this particular content zone (for example, Java technology, Linux, WebSphere). My tags shows your tags for this particular content zone (for example, Java technology, Linux, WebSphere).

Try IBM PureSystems. No charge.

Special offers