At the end of March 2011, IBM® announced the new Informix Warehouse Accelerator (IWA) for Informix 11.70.xC2. I have been beta testing this accelerator for a while and would like to share some of my findings with you. IWA is an exciting breakthrough—combining new data warehouse technology with traditional Informix relational database servers—that results in very fast performance. My benchmark run of 16 queries went from 9 hours and 40 minutes to just over 15 minutes using this new technology. The chart in Table 1 shows the results of my benchmark testing.
Table 1. Performance of Informix with and without Informix Warehouse Accelerator
|Run||Query||Type of Query||IDS only||IDS & IWA||Improvement multiple||Rows returned|
|1||Query12||Scan - Select First 1000||105m7.730||9m48.071||10.7261||1000|
|2||Query13||Index - by Dim||0m0.116||0m11.605||0.0100||9|
|3||Query14||Index - by Dim||0m0.055||0m11.971||0.0046||9|
|4||Query15||Index - by Dim||0m8.345||0m4.469||1.8673||9|
|5||Query0||Scan - Select First 1000||106m44.474||0m20.946||305.7612||1000|
|6||Query10||Index - by Dim||0m4.817||0m6.062||0.7946||1|
|7||Query11||Index - by Dim||0m3.430||0m6.068||0.5653||1|
|8||Query1||Scan - Select First 100||63m33.083||1m13.501||51.8780||100|
|9||Query2||Scan - Select First 100||65m17.075||1m29.518||43.7574||100|
|10||Query3||Scan - Sum by Dim||87m47.842||0m17.207||306.1453||3107|
|11||Query4||Scan - Select First 100||53m57.169||0m14.162||228.5813||100|
|12||Query5||Index - by Dim||0m0.049||0m12.692||0.0039||0|
|13||Query6||Scan - Select First 1000||1m48.886||0m9.495||11.4677||1000|
|14||Query7||Scan - by Dim w Group by||0m15.855||0m9.171||1.7288||26937|
|15||Query8||Scan - by Dim w Group by||0m14.846||0m2.347||6.3255||28358|
|16||Query9||Scan - by Dim w Group by||79m14.253||0m42.011||113.1669||523499|
|Total in hours||9 hr 40 min||15 min 32 sec|
IWA technology loads the records from a data warehouse star schema into memory in a compressed format, and organizes the data using new columnar storage technology. The records are not stored as rows but as columns of data that can be very quickly accessed using this new technology. When an application sends a query to Informix, the Informix optimizer decides whether the query should be sent to IWA. When the Informix server gets the data back from IWA, it returns the results to your application—only much faster than before (Figure 1). Your applications do not need to change or be recompiled to benefit from the speed of IWA. My favorite business intelligence query tools continue to work unchanged with Informix: Cognos, Hyperion, Pentaho—they all just work. All ODBC, JDBC, or ESQL/C applications connect to Informix and do not even need to know that they are connecting to IWA.
From one hour to nine minutes
Based on my testing, one-hour queries run in two to three minutes with IWA. Disk access is the slowest part of a data warehouse query, and when you need to read millions of records to get a result set for a query, the performance of your disks determines how fast you can get data back. The best performance gains with IWA are in table scans of large fact tables, where disk activity is normally the limiting factor.
Over the years, I have built up a midsize data warehouse star schema that I use for testing new releases of Informix, and I used it for this benchmark. It contains a fact table of 630 million records, a customer dimension of 2.5 million customers, and three small dimension tables for organization, programs, and commodities. It uses real data on tracking government payments to farmers, which is available to the public by request.
The 16 queries in the benchmark were based on real application query requests I have seen from business users. Some seem simple, like the first one in the chart in Table 1. The objective is to find the top 1,000 farmers that received payments. This query requires reading all 630 million records, sorting and summing the payments by farmer, and returning 1,000 records. This query can take hours to run on big servers with expensive storage systems.
Running this query without IWA on Informix took more than an hour to read, gather, sort, and sum all the data. The data set was too large to fit into Informix's memory buffers. Running this query with IWA took nine minutes, and we were able to reduce the size of the Informix memory buffers. Both of these tests were from a cold start with no data in the memory buffers. Running the same query a second time on Informix alone did not speed things up much because all the data would not fit in memory and the buffers were continually thrashing. In contrast, the same query run a second time with IWA finished in less than a minute.
Figure 1. Informix managing the Informix Warehouse Accelerator
When Informix takes over
For some queries, IWA does not improve performance. As the chart in Table 1 shows, most queries that returned nine or fewer records and can use an index are faster without IWA. Most were very fast queries, and Informix returned the results in less than a second. IWA does not handle all types of queries yet, such as correlated sub-queries or queries using temp tables. One of my tests used a query to select records into a temp table and then a second query to further refine the results from the temp tables. IWA could send the results from the first query into a temp table, and that was very fast. But since the temp tables had not been defined and realized in IWA, Informix handled the second query as a normal query. Any query that cannot be optimized by IWA is optimized and handled by Informix without users even knowing it, except that they will notice the slower response time.
A database must be designed using a dimensional model star schema to work with IWA, and the tables must be defined and loaded into IWA to take advantage of its performance. Load times were very fast, comparable to creating some of the summary tables I traditionally build in a data warehouse. In fact, I see IWA replacing the need for many of the summary tables I design in a data warehouse.
I have been building data warehouses with Informix for more than 18 years, and it has been getting better and faster with each new release. The Informix Warehouse Accelerator really raises the performance level that we can get and expect from a data warehouse. It is also exciting to see IBM invest so heavily in Informix and its role as a data warehouse engine.
- Learn about BenchmarkSQL, an easy-to-use JDBC benchmark from SourceForge.
- Download a trial version of Informix to try it for yourself.