Mention Informix Dynamic Server (IDS) and most DBAs think of a fast, reliable, and low-maintenance platform for online transaction processing (OLTP). But those transactions hold valuable insights into business trends, leading many organizations to also use IDS as a data warehouse against which they can run business intelligence (BI) queries.
To meet those needs, Informix has unveiled a series of enhancements that bring its long-standing speed and ease-of-use benefits to data warehousing and BI. These include tools to help customers model, schedule, and execute the data transformation and data flows required to create data warehouses. Most recently, IBM announced new data compression and consolidation features that reduce the cost (and boost the performance) of both BI and OLTP applications on IDS. The key new technology is the IDS Storage Optimization Feature. It reduces the size of not only OLTP data stores but also data warehouses that, if they grow too large, can make business analysis overly complicated and expensive.
Introducing Storage Optimization
Based on technology used in IBM DB2, Storage Optimization compresses and consolidates the data within IDS. Early customer trials show that Storage Optimization reduces the amount of space required to store data either in memory or on disk by an average of 50 percent. This can cut the time required to process queries by as much as 20 percent because more data can be kept in memory, reducing the number of I/O operations to slower mechanical disk drives. Cutting the database size can also, of course, delay or even eliminate the need to upgrade disk storage.
The first of the three Storage Optimization components is compression, which examines each row in the database for recurring patterns of data. It stores the individual recurring patterns in a dictionary, replacing those patterns with shorter strings of symbols. Unlike other compression techniques that scan only a portion of each row for repeating patterns, Storage Optimization scans the entire row, regardless of how many columns it intersects. By scanning a larger area than other compression techniques, it can find and compress more repeating patterns, and thus achieve very high compression ratios (see Figure 1).
Figure 1: Saving repeating patterns as shorter character strings allows Storage Optimization to achieve very high compression ratios.
Repack (or Coalesce), the second component, consolidates the free space created within each partition, while the final capability, Shrink, removes the unused portion of the partition and returns it for reuse by IDS. These larger, contiguous spaces are much easier for IDS to reuse than smaller, isolated free spaces. It is this compression and consolidation that speeds query performance, while holding down the amount of physical disk space required for data warehousing.
Database compression isn't new, but Storage Optimization automates key tasks to make it easier to use. For example, it builds the compression dictionary automatically, instead of requiring the DBA to manually specify the patterns to be compressed. Informix also provides guidelines (see sidebar, "Resources") for when optimization would be least useful, such as for small tables that fit into memory without compression and for temporary tables and system catalogues. But even in those cases, the CPU cycles needed to compress the data and the table scans required to repack and shrink data have little impact on overall performance.
While compression and decompression do require CPU cycles, "We're finding the price you pay in CPU overhead is outweighed by the benefit you get from having fewer pages and fewer I/Os," says IDS Chief Architect Kevin Brown. "If before, your database couldn't fit in memory, and now it can, you're going to see a really big performance boost."
In cases where compression might compromise application performance, administrators can specify which fragments to compress. "If a table has 10 fragments, for example, you might want to leave the most recently created fragments uncompressed for performance purposes, and compress the older fragments, since they're less likely to be accessed," says Brown. A recompress option allows administrators to rebuild the compression dictionary to ensure optimal compression of the tables, which is useful if the data within them has changed significantly over time.
DBAs can use any or all of the Storage Optimization capabilities while the database is running, thus preventing interruption to either transactions or queries. They can also manage Storage Optimization from within the same IDS OpenAdmin Tool (OAT) they use to manage the size of volumes and to monitor performance. Graphical tools show administrators or business users how much space they can save by running any of the functions within Storage Optimization. Storage Optimization can also be controlled through a command-line interface, which is useful for the many ISVs that bundle IDS within their own applications.
Storage Optimization is available in a new data warehouse platform package called IBM Informix Warehouse, which includes IDS, its new Storage Optimization Feature, and the IBM Informix Warehouse Feature V11.50. Informix Warehouse Feature V11.50 includes the SQL Warehouse (SQW) Client with Design Studio for data modeling, schema design, data transformation design, and data flow design; SQW Warehouse Server, with an administration console to schedule and manage data flows; and SQL Warehouse runtime to perform data transformation within the IDS data server.
The Informix Warehouse infrastructure also supports integrated external tooling for BI analytics from IBM Cognos, for managing data growth with IBM Optim, and for data transformation and cleansing with IBM InfoSphere DataStage and InfoSphere QualityStage.
With the new IDS data warehousing tools, DBAs can perform business analysis on their operational data stores to ensure access to the most current data. They can also create separate data warehouses within IDS to prevent analytic queries from slowing production applications or run a "shared disk" configuration in which separate servers access the same data for OLTP and business analysis. The bottom line: businesses that have been using IDS for data warehousing and BI now have better, more targeted tools to achieve their goals. "We give organizations the choice so they can make the best use of their data and infrastructure based on their technical and business requirements," says Brown.
IDS Storage Optimization
IBM Informix Warehouse
IBM Informix Warehouse Feature




