These are two concepts I've been reading about lately in a book by Eliyahu M. Goldratt (The Goal).
It's interesting to read that a system's throughput is determined by its slowest component. Of course, that's something we are familiar with in database management: we want to optimize the I/O to get better performance. What I found more interesting is that when an event is delayed, it can have a direct impact on the overall system throughput. For example, if the slowest component is delayed, that delay is a direct loss to the whole system. In other cases, other components can take a long time to catch up after a delay.
One key to all this is to look at improving the entire system, and the way to do it is to find out where the bottlenecks are. Once they are found, we must make sure they are never idle waiting for something to happen and that they don't do extra work.
This is a lot of what an Informix DBA does when there are performance questions. I could easily point to disk fragmentation by expression, use of prepared statements and so on. The thing is that I've also seen situations where people point to the database as the source of the bottleneck, only to find out that the problem is outside the database. I've seen network issues, and recently a customer told me that they must have a specific response time from the database because the transaction already takes three times that long outside of IDS. IDS has to sprint because the other components jog.
In another situation, I found that what the customer was seeing as one database request turned out to be over 100 SQL statements. The kicker was that most statements were unnecessary.
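To make the bottleneck idea concrete, here is a minimal Python sketch. It assumes a simple serial pipeline where each stage has a fixed processing rate; the stage names and rates are hypothetical and only serve the illustration.

```python
def pipeline_throughput(stage_rates):
    """Steady-state throughput of a serial pipeline is the slowest stage's rate."""
    return min(stage_rates.values())

def bottleneck(stage_rates):
    """Name of the slowest stage."""
    return min(stage_rates, key=stage_rates.get)

def units_lost(stage_rates, idle_hours):
    """Work the whole system loses while the bottleneck sits idle."""
    return pipeline_throughput(stage_rates) * idle_hours

def catch_up_hours(stage_rates, stage, delay_hours):
    """Hours a delayed stage needs to clear its backlog using only spare capacity."""
    flow = pipeline_throughput(stage_rates)
    spare = stage_rates[stage] - flow
    if spare == 0:
        return float("inf")  # the bottleneck itself never catches up
    return flow * delay_hours / spare

# Hypothetical rates, in units of work per hour
rates = {"application": 500, "network": 150, "disk_io": 120}

print(bottleneck(rates), pipeline_throughput(rates))
print(units_lost(rates, 2))
print(catch_up_hours(rates, "network", 1))
print(catch_up_hours(rates, "disk_io", 1))
```

In this simplified model, a stage with little spare capacity needs several hours to recover from a one-hour delay, and a delay at the bottleneck is never recovered.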
Next time people point to the database as the problem, make sure to get the complete picture from end to end.
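As a rough illustration of the end-to-end picture, the Python sketch below breaks a response time into components and counts round trips. All the numbers are hypothetical: a 100 ms database budget, the rest of the transaction taking three times that, and 2 ms per SQL round trip.

```python
def time_shares(timings_ms):
    """Return each component's fraction of the end-to-end response time."""
    total = sum(timings_ms.values())
    return {name: ms / total for name, ms in timings_ms.items()}

def round_trip_cost_ms(statements, per_statement_ms):
    """Time spent on round trips alone for one logical request."""
    return statements * per_statement_ms

# Hypothetical timings: the database budget versus everything outside it
timings = {"database": 100, "everything_else": 300}
shares = time_shares(timings)
print(f"database share of response time: {shares['database']:.0%}")

# One "request" that is really 100 statements versus a single statement
print(round_trip_cost_ms(100, 2), "ms versus", round_trip_cost_ms(1, 2), "ms")
```

With numbers like these, the database accounts for only a quarter of the response time, so tuning it alone cannot fix the transaction.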
I've been silent for quite a while. That does not mean I have not been busy!
A lot of effort has been put into TimeSeries over 11.70.xC3 and 11.70.xC4 and we are still going full steam ahead. We continue to improve its performance, scalability, usability and functionality.
I wanted to put together a repository of information so people can find it all (or most of it) in one place. For this purpose, I put together a wiki on developerWorks that is dedicated to smart meter support. It is still a work in progress but I believe it is a good start. You can find it using the tinyurl: tinyurl.com/InformixSmartMeterCentral
Let me know what you think.
I've been saying for quite a while now that smart meters represent BIG DATA and that Informix TimeSeries is the optimal solution for an operational data store.
We can complement the Informix capabilities with other IBM products. When it comes to real-time processing of huge amounts of data, the IBM solution is InfoSphere Streams.
It happens that Streams can interface with Informix as a data source or as a target (sink).
If you want to know more in this area, go take a look at the new information added to the Smart Meter Central wiki on Streams.
Two pages were added: one with a quick overview of Streams (with a YouTube video) and another on setting up the environment.
The exact page URLs are:
More to come as we go deeper into BIG DATA!
Wednesday started with an Informix "eat and meet" breakfast followed by nine different Informix sessions spread throughout the day. My favorite session was: "How Hildebrand and IBM bring smart metering to homes across Britain". It was very interesting to see a real-time system where people can see their power consumption and compare it to a pool of similar houses to see how they are doing. The system does not only measure the total consumption of a home but can break it down to specific outlets. For example, some people were able to find out that their energy consumption was greatly impacted by their use of hair-straightening devices. Another person found out that they spent around 250 pounds per year to run their old refrigerator. Buying a new one for 200 pounds made it pay for itself pretty quickly.
Of course, the other presentations were also interesting. They covered areas such as building data warehouses, grid-based replication, Informix in the cloud and more.
An additional 11 sessions were held on Thursday to wrap up the conference.
The one thing that is hard to measure at a conference like this is the value of the interactions with other people: discussions on different interests and new challenges, and also on how Informix has been used. This ties into what I mentioned in this blog on Oct 9. Good ideas come from people interacting. The conference provided a good environment for that. This was a great conference and you can expect interesting things coming out of the Informix lab in the future. I'm sure we'll have a lot to say next time we meet: The International Informix Users Group (IIUG) conference in Overland Park, Kansas, that will be held between May 15 and 18, 2011.
I came back from the Informix conference Thursday night and woke up thinking about an analogy about why we use Informix Dynamic Server. More on that in a minute.
I've been using databases for a long time. I believe that the first formal database system I used was back in 1984. It was a hierarchical database. I developed an inventory system for the Canadian Coast Guard. Over the following years, I used and supported multiple database systems, some looking more like C-ISAM and others relational. I still remember the good old days when I had to debug Oracle installation scripts :-)
So, why Informix? Isn't a database a database?
I used to use a car analogy: people buy cars and they get used to whatever happens to them. If they have to go to the shop to get the car fixed or tuned every other month, that's just the way cars are. Who would believe that you could buy a car and only have to put gas in it year after year without having to waste time in the shop? The car is used to get you from point A to point B day after day. That almost makes it invisible, but not quite, since you still have to drive it. It's not the same with a database system: it can really be invisible.
I woke up Friday with this thought: You can write just about any application in any computer language you want. Why don't we all use COBOL? Way back, I knew a guy who could do EVERYTHING in COBOL. He was even doing system programming! An object-oriented version of COBOL has been available for years, but why? Isn't the "vintage" version of COBOL good enough? If I'm not mistaken, the number of lines of COBOL code in production still surpasses that of any other programming language. That should be enough of an argument to standardize on it.
It seems to me that many people apply this line of reasoning to database systems. The trend is to look at databases as a commodity. Who cares that one barely requires any attention? Who cares that it provides easy continuous availability? Who cares that it has great storage optimization? The difference is only more overhead. That translates only into more costs. Those significant costs are easy to hide, so why worry about them? Everybody does it, so no need to be more efficient...
Well, me, I'm old school. I come from an era where memory was measured in kilobytes and disk drives in megabytes. Yes, memory is much bigger now and not that expensive. Disk drives are so much bigger and not very expensive. Computers are so fast now. It seems to me that we should stop the insanity and pay attention to efficiency. Isn't that what cloud computing, virtualization and being green is all about?
No matter how I try to slice it, to me, Informix is number 1.
Someone asked me the following question:
"How do I keep passwords in the database so nobody can get them?"
It means that we cannot keep the passwords in plain text in the database. Informix has a few functions that can be used for encryption: ENCRYPT_AES and ENCRYPT_TDES. It would be easy to create a table and encrypt the column that contains the passwords.
The next statement that came up was: "..but, if someone has the encryption password, he can get all the passwords. We need to protect the passwords from internal access".
This means that we need to use a different password to protect each password in the table. The solution I proposed was to use the password to encrypt itself. Let's look at an example:
CREATE TABLE passwd (
INSERT INTO passwd VALUES(1, ENCRYPT_AES("Jacques", "Jacques"));
INSERT INTO passwd VALUES(2, ENCRYPT_AES("Lance", "Lance0"));
INSERT INTO passwd VALUES(3, ENCRYPT_AES("Daniel", "Daniel"));
INSERT INTO passwd VALUES(4, ENCRYPT_AES("Umut", "Umut01"));
The values inserted look as follows:
SELECT * FROM passwd
I can now test if someone has the right password for user 1 by using the password value to decrypt itself:
SELECT col1, DECRYPT_CHAR(col2, "Jacques") FROM passwd WHERE col1 = 1;
If I use the improper password, I receive an error:
SELECT col1, DECRYPT_CHAR(col2, "Jacques") FROM passwd WHERE col1 = 3;
26008: The internal decryption function failed
One more thing. Note that the encryption password must be at least six characters long. This is why I padded some of the encryption passwords in the example. An easy way to work around this limit would be to always add padding to make sure we meet that minimum size. Keep in mind that the maximum size of an encryption key is 128 bytes.
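The padding rule could be applied on the client side before calling ENCRYPT_AES or DECRYPT_CHAR. Here is a minimal Python sketch of that idea. The helper name and the "0" filler character are assumptions made for the illustration; only the six-character minimum and the 128-byte maximum come from the text above.

```python
MIN_KEY_LEN = 6      # minimum key length, in characters
MAX_KEY_BYTES = 128  # maximum key size, in bytes
PAD_CHAR = "0"       # assumption: any agreed-upon filler character would do

def encryption_key(password: str) -> str:
    """Derive the key used to encrypt a password with itself (hypothetical helper)."""
    # Pad short passwords on the right until they reach the minimum length
    key = password.ljust(MIN_KEY_LEN, PAD_CHAR)
    if len(key.encode("utf-8")) > MAX_KEY_BYTES:
        raise ValueError("encryption key exceeds 128 bytes")
    return key

for pw in ("Jacques", "Lance", "Umut"):
    print(pw, "->", encryption_key(pw))
```

The same function must be used when storing and when checking a password, otherwise the decryption key will not match. With this filler, "Umut" becomes "Umut00" rather than the "Umut01" used in the example above.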
With this approach, we can keep passwords in the database and keep them secret.
The new Informix, version 12.10 was announced last week. It is time to start talking about the new features in TimeSeries.
The Informix team has added a public version of a fast loading mechanism. It allows loading into existing TimeSeries that are defined as part of a container.
This loader API was previously undocumented. It was only available to use as part of the Tooling. A lot of work went into it since its internal implementation. You should not try to use the older internal version since it disappears in 12.10 in favor of this new one.
You can find a description of its use in the "Informix Smart Meter Central" in the page Loading fastest with the loader API
You should also refer to the Informix documentation for more details.
Since the Loader API is an SQL API, it can be used by any client, including InfoSphere Streams.
For more information on how to use Streams with the Loader API, please see the Informix Smart Meter Central wiki: Streams and the TimeSeries Loader API
More to come. Don't forget, the IIUG conference is just around the corner. This is the perfect place to learn about all the new features in Informix 12.10: Simply powerful.
The day started with a Q&A with IBM executives: Alys Passarelli, Inhi Cho Suh and Rob Thomas. There were a number of good questions and it was an opportunity for the executives to state their commitment to Informix and describe many of the efforts in progress.
Once again, there was a good mix of sessions on subjects ranging from customer case studies, database administration, security and best practices to the use of tools such as Eclipse.
The day ended at 4:30 after the delivery of 25 sessions.
This was a great conference with lots of good information and fantastic networking.
When I was in school I wanted to know why I had to learn something: Why learn about history? It’s about a bunch of dead people, often from far away. I would also ask: Why would I ever learn English. . .
I feel that the computer industry not only forgets about history but is quick to discard what has been done before. Just remember that when object databases came out, the trade magazines were trumpeting the death of relational databases.
There is a disconnect between the object-oriented (OO) approach and the use of relational databases. This will be the subject of the next few entries. Let's start with an example:
An object person will look at the employees of a company and see managers, full-time employees, part-time employees and contractors. This will lead to the following model:
With the definition of the multiple types of employees, we can easily see that they will want multiple tables, one per defined object. Of course, a database person sees something like:
CREATE TABLE employee (
Empno int PRIMARY KEY,
mgrNo int ,
. . .
As you can see, a "data access expert" can already start some discussions with the OO architects and programmers.
Don’t get me wrong. I like OO. I think it is a wonderful approach but, just like anything, it can be abused. See what you think of: http://csis.pace.edu/~bergin/patterns/ppoop.html
I'm always looking for interesting information to stimulate my thinking.
My morning routine usually starts at around 5:30am and I use my tablet to look at news, blogs, tweets, and some web sites.
The tweets I get include some from a site called TED. I've talked about TED before. Take a look at my blog entry for January 2011: Happy new year!
In this blog entry, I recommended no less than four TED presentations.
For people who don't know TED, it is an organization that holds conferences on all sorts of subjects. The presentations used to have to be 17 minutes long.
Now, you can also find presentations that are much shorter. TED's tagline is: "Ideas worth spreading".
So, in the morning, I often check what's new on TED to see if there is something interesting to watch during breakfast (of course, when I have breakfast alone...).
I recently came across one that I thought was interesting considering everything we've been hearing over the last 4-5 years about the global economy.
Of course, the fact that it talks about complexity and emergence is just a bonus.
Here is the link to this presentation: Who controls the world?
I recently ran into the mention of INT8 and, by association, SERIAL8 by Informix engineers and a recent redbook. I want to make a quick comment on that.
These two types were added a long time ago to support eight-byte (64-bit) integers. They are defined as a 10-byte structure that includes two "standard" integers. It was done this way so eight-byte integers could be supported on 32-bit operating systems. Now most operating systems support a native 64-bit integer. For this reason, new data types were added in Informix version 11.50 (fixpack 1). The new types, BIGINT and BIGSERIAL, take less space and perform better. Here is what the release notes say:
Improved Query Performance for Large Integers and Serial Data
The BIGINT and BIGSERIAL data types, which are provided as alternatives to the INT8 and SERIAL8 data types, can provide better performance than the INT8 and SERIAL8 data types.
So, let's forget about INT8 and SERIAL8 and let's use BIGINT and BIGSERIAL available in Informix 11.50.
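To put a number on "take less space", here is a small back-of-the-envelope sketch in Python. It assumes the only difference is the per-value size (10 bytes for INT8, 8 bytes for a native 64-bit BIGINT) and ignores index, page and row overhead; the row count is hypothetical.

```python
INT8_BYTES = 10   # per the text: a 10-byte structure
BIGINT_BYTES = 8  # a native eight-byte (64-bit) integer

# Largest value that fits in a signed 64-bit integer
SIGNED_64_MAX = 2**63 - 1

def bytes_saved(rows: int, columns: int = 1) -> int:
    """Raw column storage saved by switching from INT8 to BIGINT."""
    return rows * columns * (INT8_BYTES - BIGINT_BYTES)

# Hypothetical table: 100 million rows with one 64-bit key column
print(bytes_saved(100_000_000), "bytes saved")
print(SIGNED_64_MAX)
```

The saving applies to every column and every index key that stores such a value, so it adds up on large tables.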
I did forget to mention the keynote presentations on Monday evening. Rob Thomas gave us his view of the Informix business and a glimpse at his plan for continuing successes. It was followed by a presentation by Dr. Arvind Krishna, general manager of IBM Information Management. Great information!
Tuesday, sessions started at 8:00 AM. I was the moderator for the session on disk level encryption. It was about a flexible product that can protect your database data at rest. It works for file-based dbspaces (cooked files) or raw devices. All that, transparently to Informix. Mark Jamison did a great job at presenting the product in a clear and concise manner.
There were many sessions on application development, including sessions on Groovy, Perl, Python, and PHP. Of course, there were also more sessions on different aspects of tuning, as well as presentations on embeddability, replication, warehousing and so on.
With the sessions starting at 8:00 AM, we had a total of 35 sessions for the day.
This was followed by a casino night. What a full day!
The Informix development team has put a lot of effort over the last year or so into continuing to improve the product's capabilities.
We strongly believe that this new release will help everyone, customers and partners alike, address the challenges and changing needs of data management.
Will it be faster? Will it be easier to manage? Will it include new functionality? Will it be smarter to accommodate a smarter planet?
What about big data and analytics?
You're in for a treat! Here is the webcast information:
The New IBM Informix: It's Simply Powerful
Date: Tuesday, March 26, 2013
Time: 10:00 AM PDT
Don't miss it.
I dare add to this: to me, the new IBM Informix is simply wonderful!
Another year is coming to an end.
All in all, not a bad year. Informix released 11.70.xC5 and 11.70.xC6 while continuing to work on the next major version of the product. You can find the latest Informix release notes at: Informix 11.70 Information center
We continue to see more acceptance of features like IWA and TimeSeries. The Informix group also delivered many presentations and demos at the IIUG and IOD conferences. We can add to that support for regional Informix user groups, a new Redbook, and so on.
Well... stay tuned. 2013 is lining up to be another good one for Informix. But what about ourselves? Are we improving over time like good wine or...
Here are some of my new year resolutions:
- Get back in shape.
In 2012, I neglected this a bit but I am already getting back to it by running regularly and continuing to train in Brazilian Jiujitsu.
- Learn new things
Informix does not operate in a vacuum. It needs an ecosystem. For me, I need to look into what it takes to integrate Informix more with the IBM BigData products. I already started. You can find some information on using Informix with InfoSphere Streams in the SmartMeterCentral wiki.
I've also slowly started using Twitter (@jroy58). I retweeted two tweets and I will post my first tweet as soon as I'm done with this blog entry.
What about your new year resolutions?
For one, are you using the best Informix you could use? Resolve to upgrade to Informix 11.70.xC6 as soon as possible.
We are seeing more and more interest in using both InfoSphere Streams and Informix together.
This is in the context of "Big Data".
InfoSphere Streams is a platform that allows you to add operators as you see fit.
In our case, there are already a few operators that can be used to read from or write to Informix from InfoSphere Streams.
There is a new developerWorks article that describes how this can be done. With these basic examples you should be
able to integrate Informix in a Streams environment (or vice versa) in no time.
This week we have the International Informix Users' Group annual conference. It is being held at the Overland Park Marriott.
Here's a bit of trivia for you: In 2008, Overland Park was listed as the 9th best place to live in the US. No doubt having the Informix lab close helped its ranking :-). You can check it at: CNNMoney.com
Sunday was a day of tutorials with eight tracks running at once. I arrived in the early afternoon. It is amazing how quickly you can get into interesting conversations, discussing different projects and business solutions.
The evening reception was a success with lots of networking and good food. This is a great start to what should be a very interesting and useful conference. Monday starts with a keynote from Dr. Arvind Krishna and continues with five tracks on a variety of subjects.
We left off with an insert through the virtual table view. We created a container, a row type, a table, and a virtual table. What if we could simplify this? What if we could avoid creating a container?
One reason why you might not want to create containers yourself is that you have a lot of data to load and would need a lot of containers. Wouldn't it be nice if Informix could help you with that? Informix can! In the Informix 11.70.xC3 release, we added a capability that does just that.
The new feature is referred to as auto create container. When you insert a new time series in a table and no container is specified, Informix will create one for you if needed. For example, let's take the following table:
CREATE TABLE jroy (
loc_esi_id char(20) NOT NULL PRIMARY KEY,
) LOCK MODE ROW;
We can insert a new TimeSeries without specifying a container:
INSERT INTO jroy
VALUES(1, "origin(2010-11-10 00:00:00.00000),calendar(tst15min),threshold(0),regular,");
If there is no container available, a container is created as we can see in the tscontainertable table:
SELECT * FROM tscontainertable
partitiondesc autopool00000000 datadbs 16 16 4194538
This feature goes a few steps further. If the table is partitioned over multiple dbspaces, Informix will create one container per dbspace and put them in a pool called autopool. Subsequent inserts can then go through the pool in a round-robin fashion to evenly distribute your time series over multiple containers and dbspaces.
If you prefer to manage your containers yourself, you can create your own containers and put them in a specific pool so you can take advantage of a container pool. You can even create your own policy to decide where new time series should be located.
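The round-robin distribution can be pictured with a short Python sketch. This is only an illustration of the placement policy, not how Informix implements it. The container names beyond autopool00000000 and the series ids are hypothetical.

```python
from itertools import cycle

def assign_round_robin(series_ids, containers):
    """Place each new time series in the next container of the pool, in turn."""
    pool = cycle(containers)
    return {sid: next(pool) for sid in series_ids}

# Hypothetical pool: one container per dbspace
containers = ["autopool00000000", "autopool00000001", "autopool00000002"]

placement = assign_round_robin(range(1, 7), containers)
for sid, container in placement.items():
    print(sid, "->", container)
```

With six time series and three containers, each container ends up holding two series, which is the even spread the pool is meant to provide.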
There is more to know about these capabilities. You can find out more in the information center starting at:
A while back, I started reading a book called "Thinking, Fast and Slow" by Daniel Kahneman.
Daniel Kahneman is a professor of psychology who won a Nobel Prize in economics.
I have to admit, I am not done reading it. I need more "plane" time.
What I read so far is fascinating. This is the type of book that can be read multiple times.
Today, I just want to relate some parts of chapter 14 where he put together a test to see how people would classify individuals
based on some personality descriptions. Here is the description:
"Tom W is of high intelligence, although lacking in true creativity.
He has a need for order and clarity, and for neat and tidy systems
in which every detail finds its appropriate place. His writing is
rather dull and mechanical, occasionally enlivened by somewhat
corny puns and flashes of imagination of the sci-fi type. He has a
strong drive for competence. He seems to have little feel and little
sympathy for other people and does not enjoy interacting with
others. Self-centered, he nonetheless has a deep moral sense."
After reading the description, the subject was asked to figure out which field of study Tom was most likely in.
The description was actually designed so people should rank computer science among the best fitting
because of 'hints of nerdiness ("corny puns")'.
I laughed out loud when I read that part. I immediately thought of one of my co-workers, Robert U., who
reminds me regularly that I make corny jokes during my presentations. And yes, I graduated in computer science.
For those who read this blog, if you make corny jokes/puns and graduated in computer science, rejoice.
Embrace your nerdiness. You picked the right major.
The book is full of interesting information including the fact that even statisticians can misuse/misinterpret statistics.
One I really like is:
"you dispose of a limited budget of attention that you can allocate to activities. . .
You can do several things at once, but only if they are easy and undemanding."
My conclusion: if someone tells you he/she's multitasking, they do trivial work.
Happy new year everyone!
The Informix team is always hard at work improving the Informix products.
It turns out that, while working on V.next, a feature escaped and made it into version 11.70.xC5 and above (xC6 being the current release as of October 2012).
It concerns loading data into TimeSeries using a relational view of a TimeSeries (also known as VTI interface). To take advantage of this new feature, you simply
use the TS_VTI_ELEM_INSERT (128) flag when you create the relational view with the TSCreateVirtualTab() procedure.
A simple test showed that this feature loads data 3.6 times faster than previously. Of course, your "mileage" will vary depending on your environment. To know more on how you can
use this new feature, consult the following link from the Informix Smart Meter Central wiki:
A few years ago, IBM started talking about a smarter planet: Instrumented, interconnected, intelligent.
We are seeing more and more uses of sensors, from your smart phone and its many sensors (GPS, proximity, temperature, barometer, etc.) to the electric meter at your house. Add to that all the other sensors used in many industrial plants and even sensors on rails!
How can we convert this deluge of data into information?
This leads to issues related to two ways to handle data: in-motion and at-rest.
It happens that IBM has a mix of products that can handle these two "states" of the data:
For data in motion, we can use InfoSphere Streams for real-time analytics based on more in depth analysis on historical data (analytics models).
For data at rest, the problems are how fast we can store it and how fast we can retrieve the information, especially when many users are making requests. This would be an operational data store environment. Then, of course, there is the issue of "in-depth" analysis that requires fast access to large amounts of data.
Informix has the combined solution with its TimeSeries capabilities and the Informix Warehouse Accelerator.
Learn more about the use of Informix to solve this big data problem in the following webcast:
Solving the Big Data Challenge of Sensor Data
Date: June 26, 2013
Time: 1:00 PM EDT / 10:00 AM PDT
Register at: https://event.on24.com/eventRegistration/EventLobbyServlet?target=registration.jsp&eventid=641115&sessionid=1&key=AA3293E3AC9715CF3D602D0DEAE4D52B&sourcepage=register