IBM Informix - SQL & NoSQL
keshavamurthy | Tags: david jeopardy ferrucci informix watson warehouse accelerator
This week, we have Watson to play with at the IIUG Conference. Andrew Ford posted a photo of it.
Yesterday, newly minted IBM Fellow Dr. David Ferrucci gave a terrific speech at the Informix User Conference. Notice the Informix T-shirt Dr. Ferrucci is wearing.
He talked about the nuances of creating software for NLP, which is different from database queries and Google queries. The not-so-publicized result of Jeopardy! is that, because of this project, IBM is leading in all the categories of NLP, such as categorization, coherence, and missing link analysis. Google, meanwhile, dominates search and text-related applications (e.g. translation, search phrase prediction).
I know JPG recorded part of it... I hope someone has a full recording and posts it somewhere. I was pretty excited to meet Dr. Ferrucci and got his autograph on the Final Jeopardy book. He signed with: "Out of Jeopardy, Finally!" :-)
On this occasion, just for fun, I've tried to identify some common characteristics of, and differences between, Watson and the Informix Warehouse Accelerator. See below for the comparison table.
keshavamurthy | Tags: michael warehouse accelerator warehousing stonebraker daa informix
The May 2011 edition of Communications of the ACM has an article by Michael Stonebraker. It can also be found on the ACM blog below.
Over the last year, and especially with the release of Informix Warehouse Accelerator, we've improved the performance and lowered the TCO.
As many of you know, Stonebraker is associated with Vertica, a columnar database company recently bought by HP.
Even though his blog is admittedly biased (towards Vertica?), his contributions to RDBMS are too numerous to mention in this blog, but you can see them here.
Informix bought his company Illustra, and at one point he was CTO of Informix as well.
I reviewed his assertions against what we've done. My comment is in blue.
(Keshav) I couldn’t agree more. Using star and snowflake schemas has become best practice in the data warehousing world. Informix 11.70 has added star and snowflake query optimizations to improve performance. The IWA implementation focuses on star and snowflake schemas. The data mart design, the compression techniques, and the query processing are all optimized for star and snowflake schemas.
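To make the star schema idea concrete, here is a minimal Python sketch of a star join. The table names, columns, and the `revenue` helper are hypothetical and chosen only for illustration; this is not how Informix or IWA implements the join. It only shows the shape of the query: filter the small dimension tables first, then scan the large fact table once.

```python
# Dimension tables: small lookup tables keyed by a surrogate id (hypothetical data).
stores = {1: {"region": "EAST"}, 2: {"region": "WEST"}}
products = {10: {"category": "toys"}, 11: {"category": "books"}}

# Fact table: one row per sale, holding foreign keys plus a numeric measure.
sales = [
    {"store_id": 1, "product_id": 10, "amount": 5.0},
    {"store_id": 1, "product_id": 11, "amount": 7.5},
    {"store_id": 2, "product_id": 10, "amount": 3.0},
    {"store_id": 1, "product_id": 10, "amount": 2.5},
]


def revenue(region, category):
    """Sum sales amounts for one region and one product category."""
    # Step 1: apply the predicates to the small dimension tables.
    store_ids = {key for key, row in stores.items() if row["region"] == region}
    product_ids = {key for key, row in products.items() if row["category"] == category}
    # Step 2: scan the fact table once, keeping rows whose keys survived step 1.
    return sum(
        row["amount"]
        for row in sales
        if row["store_id"] in store_ids and row["product_id"] in product_ids
    )


print(revenue("EAST", "toys"))
```

A snowflake schema has the same shape, except that a dimension table may itself point to further lookup tables.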
In the last few years, column stores and columnar access have improved warehouse query performance. The benefit of a column store comes to the fore when you combine it with compression. The compression can and will be done on the values instead of on bit streams.
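As a rough illustration of compressing values rather than bit streams, the sketch below dictionary-encodes a single column and evaluates an equality predicate directly on the integer codes. Dictionary encoding is used here as one common value-based technique; the function names are hypothetical, and nothing here is taken from the actual IWA implementation.

```python
def dictionary_encode(column):
    """Replace each value with a small integer code; return (dictionary, codes)."""
    dictionary = sorted(set(column))
    code_of = {value: code for code, value in enumerate(dictionary)}
    codes = [code_of[value] for value in column]
    return dictionary, codes


def count_equal(dictionary, codes, wanted):
    """Count rows equal to `wanted` by comparing codes, never decoding the column."""
    if wanted not in dictionary:
        return 0
    wanted_code = dictionary.index(wanted)
    return sum(1 for code in codes if code == wanted_code)


# A low-cardinality column, as is typical for warehouse dimensions.
region = ["EAST", "WEST", "EAST", "NORTH", "EAST", "WEST"]
dictionary, codes = dictionary_encode(region)
print(dictionary, codes, count_equal(dictionary, codes, "EAST"))
```

Because a column holds values of one type and often few distinct values, the codes are much smaller than the original strings, and the predicate is translated to a code once instead of being compared against every stored value.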
In warehouses, size matters. You can have terabytes of data in your warehouse. The hardware vendors recognize this and are increasing capacity and the prices are falling on this. Recently, IBM announced eX5 servers with up to 6 terabytes capacity. Intel announced Westmere with 10-cores.
MPP for warehousing has been proven in many contexts, configurations, hardware platforms and vendors. At the same time, multi-core processors have added a lot of CPU power to a single node. Depending on your data warehouse size and performance requirements, an SMP system can provide a very good return on investment. SMP also brings simplicity.
IWA is a no-knob accelerator: no indexes, no statistics, nothing to tune. You tell IWA how much memory and CPU to use. Then, you simply load the data and start querying immediately. When you cannot tune, there’s nothing to tune.
Informix Ultimate Warehouse Edition with Informix Warehouse Accelerator is available in the following configurations.
• IWA on Linux on Intel x86_64 (RHEL 5 or SUSE SLES 11)
• IDS 11.70 + IWA code modules including IDS Stored Procedures
– Linux on Intel (64 bit)
– AIX on Power (64 bit)
– HPUX on Itanium (64 bit)
– Solaris on Sparc (64bit)
We give the software. You choose the right hardware for you.
ROW stores are well suited for OLTP and COLUMN stores are well suited for warehousing. So far, we’ve seen databases support either ROW store or COLUMN store… Hence Stonebraker’s comment.
The Informix database server uses a ROW store. Informix OLTP performance is well known and proven. It also has a number of features for warehouse management, like time-cyclic data management (via fragmentation features) and query optimization (hash joins, multi-index scans, star and snowflake join optimization).
IWA uses deep columnar storage, optimized for extreme performance. See this paper for details.
So, with Informix Ultimate Edition, you do get the best of both worlds. You can run just OLTP, hybrid or just warehouse workloads on Informix. And do it all very well.
Informix has the best HA solution in the database industry. HDR and ER technology has been used by our customers for 15 years. MACH11 has increased this presence in the four years since its release. Flexible Grid in Informix 11.7 takes this to a new level by enabling easier management and replication of schema changes along with data.
Once you off-load the data to IWA from the primary, you can accelerate queries from any node in the cluster or HA server.
Stonebraker’s idea here is that DBMS instances should be elastic ONLINE.
The snapshot of the data from Informix to IWA is done ONLINE. When you need to reprovision the number of workers, you simply disable the marts, change the number of nodes and reload the data ONLINE. IWA still runs on a single machine… But the underlying architecture and implementation will eventually support an MPP environment.
The performance issues are due to two factors: IO performance and CPU sharing. IWA does not have any IO, so that’s one factor taken care of. Virtualization is typically deployed when, or because, you have excess CPU capacity. IWA maximizes CPU usage. We’ve tested IWA in virtualized and cloud environments and found you still get these incredible speeds… We have internal users and customers validating this in virtualized and cloud environments.
So, we score quite well on these assertions!