Big data in motion
I've been silent for quite a while. That does not mean I have not been busy!
A lot of effort has been put into TimeSeries in 11.70.xC3 and 11.70.xC4, and we are still going full steam ahead. We continue to improve its performance, scalability, usability, and functionality.
I wanted to put together a repository of information so people can find it all (or most of it) in one place. For this purpose, I put together a wiki on developerWorks that is dedicated to smart meter support. It is still a work in progress, but I believe it is a good start. You can find it using this tinyurl: tinyurl.com/InformixSmartMeterCentral.
Let me know what you think.
JacquesRoy 120000A2MS 4,245 Views
These are two concepts I've been reading about lately in a book from Eliyahu M. Goldratt (The Goal).
It's interesting to read that a system throughput is determined by its slowest component. Of course, that's something we are familiar with in database management: we want to optimize the I/O to get better performance. What I found more interesting is that when an event is delayed, it can have a direct impact on the overall system throughput. For example, if the slowest component is delayed, it represents a direct loss to the system. In other cases, other components can take a long time to catch up after a delay.
One key to all this is to look at improving the entire system, and the way to do it is to find out where the bottlenecks are. Once they are found, we must figure out how to make sure they are not idle waiting for something to happen and that they don't do extra work.
This seems to be a lot of what an Informix DBA does when there are performance questions. I could easily point to fragmentation by expression, the use of prepared statements, and so on. The thing is that I've also seen situations where people point to the database as the source of the bottleneck, only to find out that it is outside the database. I've seen network issues, and recently a customer told me that they needed a specific response time from IDS because the transaction already takes three times that amount outside of IDS. IDS has to sprint because the other components jog.
In another situation, I found that what the customer saw as one database request turned out to be over 100 SQL statements. The kicker was that most of the statements were unnecessary.
Next time people point to the database as the problem, make sure to get the complete picture from end to end.
Wednesday started with an Informix "eat and meet" breakfast followed by nine different Informix sessions spread throughout the day. My favorite session was "How Hildebrand and IBM bring smart metering to homes across Britain". It was very interesting to see a real-time system where people can see their power consumption and compare it to a pool of similar houses to see how they are doing. The system not only measures the total consumption of a home but can break it down to specific outlets. For example, some people were able to find out that their energy consumption was greatly impacted by their use of hair straightening devices. Another person found out that they spent around 250 pounds per year to run their old refrigerator. Buying a new one for 200 pounds made it pay for itself pretty quickly.
Of course, the other presentations were also interesting. They covered areas such as building data warehouses, grid-based replication, Informix in the cloud, and more.
An additional 11 sessions were held on Thursday to wrap up the conference.
The one thing that is hard to measure at a conference like this is the value of the interactions with other people: discussions on different interests and new challenges, and also on how Informix has been used. This ties into what I mentioned in this blog on Oct 9: good ideas come from people interacting. The conference provided a good environment for that. This was a great conference, and you can expect interesting things coming out of the Informix lab in the future. I'm sure we'll have a lot to say next time we meet: the International Informix Users Group (IIUG) conference in Overland Park, Kansas, which will be held between May 15 and 18, 2011.
I've been saying for quite a while now that smart meters represent BIG DATA and that Informix TimeSeries is the optimal solution for an operational data store.
We can complement the Informix capabilities with other IBM products. When it comes to real-time processing of huge amounts of data, the IBM solution is InfoSphere Streams.
It happens that Streams can interface with Informix as a data source or as a target (sink).
If you want to know more in this area, go take a look at the new information added to the Smart Meter Central wiki on Streams.
Two pages were added: one with a quick overview of Streams (with a YouTube video) and another on setting up the environment.
The exact page URLs are:
The wiki URL of the welcome page is: https://www.ibm.com/developerworks/mydeveloperworks/wikis/home?lang=en#/wiki/Informix%20smart%20meter%20central/page/Welcome
Make sure to bookmark it.
More to come as we go deeper into BIG DATA!
Someone asked me the following question:
"How do I keep passwords in the database so nobody can get them?"
It means that we cannot keep the passwords in plain text in the database. Informix has a few functions that can be used for encryption: ENCRYPT_AES and ENCRYPT_TDES. It would be easy to create a table and encrypt the column that contains the passwords.
The next statement that came up was: "..but, if someone has the encryption password, he can get all the passwords. We need to protect the passwords from internal access".
This means that we need to use a different password to protect each password in the table. The solution I proposed was to use the password to encrypt itself. Let's look at an example:
CREATE TABLE passwd (
  col1 int,
  col2 lvarchar(128)  -- column definitions reconstructed; the encrypted value needs room to grow
);

INSERT INTO passwd VALUES(1, ENCRYPT_AES("Jacques", "Jacques"));
INSERT INTO passwd VALUES(2, ENCRYPT_AES("Lance", "Lance0"));
INSERT INTO passwd VALUES(3, ENCRYPT_AES("Daniel", "Daniel"));
INSERT INTO passwd VALUES(4, ENCRYPT_AES("Umut", "Umut01"));
The inserted values look as follows:

SELECT * FROM passwd
I can now test if someone has the right password for user 1 by using the password value to decrypt itself:

SELECT col1, DECRYPT_CHAR(col2, "Jacques") FROM passwd WHERE col1 = 1;
If I use the wrong password, I receive an error:

SELECT col1, DECRYPT_CHAR(col2, "Jacques") FROM passwd WHERE col1 = 3;

26008: The internal decryption function failed
One more thing. Note that the encryption password must be at least six characters long. This is why, in the example, I padded some encryption passwords. An easy way to work around this would be to always add padding to make sure we meet that minimum size. Keep in mind that the maximum size of an encryption key is 128 bytes.
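To make the padding systematic, we can apply it in SQL before encrypting. This is only a sketch: the RPAD built-in exists in recent Informix versions, but the pad character and the extra row are my own choices, not part of the example above.

```sql
-- Pad every password to the 6-byte minimum with a fixed character,
-- then use the padded value both as the data and as the encryption key
INSERT INTO passwd VALUES(5, ENCRYPT_AES(RPAD('Al', 6, '#'), RPAD('Al', 6, '#')));

-- Verification applies the same padding before decrypting
SELECT col1, DECRYPT_CHAR(col2, RPAD('Al', 6, '#'))
  FROM passwd
 WHERE col1 = 5;
```

With a fixed padding rule, even very short passwords satisfy the six-character minimum without changing what the user has to type.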
With this approach, we can keep passwords in the database and keep them secret.
I came back from the Informix conference Thursday night and woke up thinking about an analogy about why we use Informix Dynamic Server. More on that in a minute.
I've been using databases for a long time. I believe that the first formal database system I used was back in 1984. It was a hierarchical database. I developed an inventory system for the Canadian Coast Guard. Over the following years, I used and supported multiple database systems, some looking more like C-ISAM and others relational. I still remember the good old days where I had to debug Oracle installation scripts :-)
So, why Informix? Isn't a database a database?
I used to use a car analogy: people buy cars and are used to what happens with them. If they have to go to the shop to get the car fixed or tuned every other month, that's just the way cars are. Who would believe that you could buy a car and only have to put gas in it, year after year, without having to waste time in the shop? The car is used to get you from point A to point B day after day. That almost makes it invisible, but not quite, since you still have to drive it. It's not the same with a database system: it can really be invisible.
I woke up Friday with this thought: you can write just about any application in any computer language you want. So why don't we all use COBOL? Way back, I knew a guy who could do EVERYTHING in COBOL. He was even doing system programming! An object-oriented version of COBOL has been available for years, but why bother? Isn't the "vintage" version of COBOL good enough? If I'm not mistaken, the number of COBOL lines of code in production still surpasses that of any other programming language. That should be enough of an argument to standardize on it.
It seems to me that many people apply this line of reasoning to database systems. The trend is to look at databases as a commodity. Who cares that one barely requires any attention? Who cares that it provides easy continuous availability? Who cares that it has great storage optimization? The difference is only more overhead, and that translates only into more costs. Those significant costs are easy to hide, so why worry about them? Everybody does it, so no need to be more efficient...
Well, me, I'm old school. I come from an era where memory was measured in kilobytes and disk drives in megabytes. Yes, memory is much bigger now and not that expensive. Disk drives are so much bigger and not very expensive. Computers are so fast now. It seems to me that we should stop the insanity and pay attention to efficiency. Isn't that what cloud computing, virtualization and being green is all about?
No matter how I try to slice it, to me, Informix is number 1.
The day started with a Q&A with IBM executives: Alys Passarelli, Inhi Cho Suh, and Rob Thomas. There were a number of good questions, and it was an opportunity for the executives to state their commitment to Informix and describe many of the efforts in progress.
Once again, there was a good mix of sessions on subjects ranging from customer case studies, database administration, security, and best practices to the use of tools such as Eclipse.
The day ended at 4:30 after the delivery of 25 sessions.
This was a great conference with lots of good information and fantastic networking.
The new Informix, version 12.10, was announced last week. It is time to start talking about the new features in TimeSeries.
The Informix team has added a public version of a fast-loading mechanism. It allows you to load data into existing TimeSeries that are stored in containers.
This loader API was previously undocumented and only available as part of the tooling. A lot of work has gone into it since its original internal implementation. Do not try to use the older internal version: it disappears in 12.10 in favor of this new one.
You can find a description of its use in the "Informix Smart Meter Central" in the page Loading fastest with the loader API
You should also refer to the Informix documentation for more details.
Since the Loader API is an SQL API, it can be used by any clients including InfoSphere Streams.
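As a taste of what that SQL interface looks like, here is a minimal sketch. The TSL_* function names come from my reading of the 12.10 documentation, but the table name, column name, and data format below are illustrative, so check the exact signatures and formats against the documentation for your release.

```sql
-- Open a loader session for table 'jroy', TimeSeries column 'rawdata'
-- (the 'table|column' handle string is the documented convention)
EXECUTE FUNCTION TSL_Attach('jroy|rawdata');
EXECUTE FUNCTION TSL_Init('jroy|rawdata');

-- Queue one element: primary key, timestamp, value (format is illustrative)
EXECUTE FUNCTION TSL_Put('jroy|rawdata', '1|2010-11-10 00:15:00|102.5');

-- Write the queued data, then close the session and detach
EXECUTE FUNCTION TSL_FlushAll('jroy|rawdata');
EXECUTE FUNCTION TSL_SessionClose('jroy|rawdata');
EXECUTE FUNCTION TSL_Detach('jroy|rawdata');
```

Because these are plain EXECUTE FUNCTION statements, any client that can issue SQL, including an InfoSphere Streams operator, can drive the loader.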
For more information on how to use Streams with the loader api, please see the Informix Smart Meter Central wiki: Streams and the TimeSeries Loader API
More to come. Don't forget, the IIUG conference is just around the corner. This is the perfect place to learn about all the new features in Informix 12.10: Simply powerful.
I did forget to mention the keynote presentations of Monday evening. Rob Thomas gave us his view of the Informix business and a glimpse at his plan for continued success. It was followed by a presentation by Dr. Arvind Krishna, general manager of IBM Information Management. Great information!
Tuesday, sessions started at 8:00 AM. I was the moderator for the session on disk-level encryption. It was about a flexible product that can protect your database data at rest. It works for file-based dbspaces (cooked files) or raw devices. All that, transparently to Informix. Mark Jamison did a great job of presenting the product in a clear and concise manner.
There were many sessions on application development, including sessions on Groovy, Perl, Python, and PHP. Of course, there were also more sessions on different aspects of tuning, as well as presentations on embeddability, replication, warehousing, and so on.
With the sessions starting at 8:00 AM, we had a total of 35 sessions for the day.
This was followed by a casino night. What a full day!
I recently ran into the mention of INT8 and, by association, SERIAL8 by Informix engineers and a recent redbook. I want to make a quick comment on that.
These two types were added a long time ago to support eight-byte (64-bit) integers. They are defined as a 10-byte structure that includes two "standard" integers. It was done this way so eight-byte integers could be supported on 32-bit operating systems. Now most operating systems support a native 64-bit integer. For this reason, new data types were added in Informix version 11.50 (fixpack 1). The new types, BIGINT and BIGSERIAL, take less space and perform better. Here is what the release notice says:
Improved Query Performance for Large Integers and Serial Data
The BIGINT and BIGSERIAL data types, which are provided as alternatives to the INT8 and SERIAL8 data types, can provide better performance than the INT8 and SERIAL8 data types.
So, let's forget about INT8 and SERIAL8 and let's use BIGINT and BIGSERIAL available in Informix 11.50.
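Switching is just a matter of choosing the newer types in your DDL. A minimal illustration (the table and column names are mine):

```sql
-- Old style: 10-byte structures emulating 64-bit integers
CREATE TABLE events_old (
  id  SERIAL8,
  val INT8
);

-- Preferred since 11.50.xC1: native 64-bit types, smaller and faster
CREATE TABLE events (
  id  BIGSERIAL,
  val BIGINT
);
```

The application code stays the same; only the column type names change.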
The Informix development team has put a lot of effort over the last year or so into continuing to improve the product's capabilities.
We strongly believe that this new release will help everyone, customers and partners alike, address the challenges and changing needs of data management.
Will it be faster? Will it be easier to manage? Will it include new functionality? Will it be smarter to accommodate a smarter planet?
What about big data and analytics?
You're in for a treat! Here is the webcast information:
The New IBM Informix: It's Simply Powerful
Date: Tuesday, March 26, 2013
Time: 10:00 AM PDT
Don't miss it.
I dare add that, to me, the new IBM Informix is simply wonderful!
I'm always looking for interesting information to stimulate my thinking.
My morning routine usually starts at around 5:30am and I use my tablet to look at news, blogs, tweets, and some web sites.
The tweets I get include some from a site called TED. I've talked about TED before; take a look at my blog entry for January 2011: Happy new year!
In this blog entry, I recommended no less than four TED presentations.
For people who don't know TED, it is an organization that organizes conferences on all sorts of subjects. Presentations used to have to be 17 minutes.
Now you can also find presentations that are much shorter. TED's tagline is: "Ideas worth spreading".
So, in the morning, I often check what's new on TED to see if there is something interesting to watch during breakfast (of course, when I have breakfast alone...).
I recently came across one that I thought was interesting considering everything we've been hearing over the last 4-5 years about the global economy.
Of course, the fact that it talks about complexity and emergence is just a bonus.
Here is the link to this presentation: Who controls the world?
When I was in school I wanted to know why I had to learn something: Why learn about history? It’s about a bunch of dead people, often from far away. I would also ask: Why would I ever learn English. . .
I feel that the computer industry not only forgets about history but is quick to discard what has been done before. Just remember when object databases came out: the trade magazines were trumpeting the death of relational databases.
There is a disconnect between the object-oriented (OO) approach and the use of relational databases. This will be the subject of the next few entries. Let's start with an example:
An object person will look at the employees of a company and see managers, full-time employees, part-time employees and contractors. This will lead to the following model:
With the definition of the multiple types of employees, we can easily see that they will want multiple tables, one per defined object. Of course, for a database person, we see something like:
CREATE TABLE employee (
  empno int PRIMARY KEY,
  mgrNo int,
  . . .
);
As you can see, a "data access expert" can already start some discussions with the OO architects and programmers. Don't get me wrong: I like OO. I think it is a wonderful approach, but just like anything, it can be abused. See what you think of: http://csis.pace.edu/~bergin/patterns/ppoop.html
Another year is coming to an end.
All in all, not a bad year. Informix released 11.70.xC5 and 11.70.xC6 while continuing to work on the next major version of the product. You can find the latest Informix release notes at: Informix 11.70 Information center
We continue to see more acceptance of features like IWA and TimeSeries. The Informix group also delivered many presentations and demos at the IIUG and IOD conferences. We can add to that support for regional Informix user groups, a new redbook, and so on.
Well... stay tuned. 2013 is lining up to be another good one for Informix. But what about ourselves? Are we improving over time like good wine, or...
Here are some of my new year resolutions:
What about your new year resolutions?
For one, are you using the best Informix you could use? Resolve to upgrade to Informix 11.70.xC6 as soon as possible.
We are seeing more and more interest in using both InfoSphere Streams and Informix together.
This is in the context of "Big Data".
InfoSphere Streams is a platform that allows you to add operators as you see fit.
In our case, there are already a few operators that can be used to read from or write to Informix from InfoSphere Streams.
There is a new developerWorks article that describes how this can be done. With these basic examples, you should be able to integrate Informix in a Streams environment (or vice versa) in no time.
Here's the link to the article: Using InfoSphere Streams with Informix
We left off with an insert through the virtual table view. We created a container, a row type, a table, and a virtual table. What if we could simplify this? What if we could avoid creating a container?
One reason you might not want to create containers is that you have a lot of data to load and would need a lot of containers. Wouldn't it be nice if Informix could help you with that? Informix can! In the Informix 11.70.xC3 release, we added a capability that does just that.
The new feature is referred to as auto create container. When you insert a new time series into a table and no container is specified, Informix creates one for you if needed. For example, let's take the following table:

CREATE TABLE jroy (
  loc_esi_id char(20) NOT NULL PRIMARY KEY,
  rawdata TimeSeries(meter_data)  -- TimeSeries column; the name and row type are assumed, the original line was lost
) LOCK MODE ROW;
We can insert a new TimeSeries without specifying a container:

INSERT INTO jroy VALUES(1, "origin(2010-11-10 00:00:00.00000),calendar(tst15min),threshold(0),regular,[]");
If there is no container available, a container is created, as we can see in the tscontainertable table:

SELECT * FROM tscontainertable
partitiondesc autopool00000000 datadbs 16 16 4194538
This feature goes a few steps further. If the table is partitioned over multiple dbspaces, Informix creates one container per dbspace and puts them in a pool called autopool. Subsequent inserts can go through the pool in a round-robin fashion to evenly distribute your time series over multiple containers and dbspaces.
If you prefer to manage your containers yourself, you can create your own containers and put them in a specific pool so you can take advantage of a container pool. You can even create your own policy to decide where new time series should be located.
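For the manual route, the container functions look roughly like this. TSContainerCreate is the long-standing creation routine and TSContainerSetPool is the 11.70.xC3-era pool function, but the container name, dbspace, row type, and extent sizes here are illustrative, so check the argument details against your documentation:

```sql
-- Create a container for rows of type meter_data in dbspace datadbs,
-- with a 1000 KB first extent and 500 KB next extents
EXECUTE PROCEDURE TSContainerCreate('mycontainer', 'datadbs',
                                    'meter_data', 1000, 500);

-- Place the container in a named pool so inserts can draw from the pool
EXECUTE PROCEDURE TSContainerSetPool('mycontainer', 'mypool');
```

This keeps container placement under your control while still letting inserts pick a container automatically from the pool.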
There is more to know about these capabilities. You can find out more in the information center starting at:
This week we have the International Informix Users' Group annual conference. It is being held at the Overland Park Marriott.
Here's a bit of trivia for you: In 2008, Overland Park was listed as the 9th best place to live in the US. No doubt having the Informix lab close helped its ranking :-). You can check it at:CNNMoney.com
Sunday was a day of tutorials with eight tracks running at once. I arrived in the early afternoon. It is amazing how quickly you can get into interesting conversations, discussing different projects and business solutions.
The evening reception was a success with lots of networking and good food. This is a great start to what should be a very interesting and useful conference. Monday starts with a keynote from Dr. Arvind Krishna and continues with five tracks on a variety of subjects.
2010 has been a great year with the release of Informix 11.70 and 2011 is lining up to be a busy year with plenty of activities and execution on the plans of v.next. I also hope that in 2011 we'll see even more participation from the Informix community to continue to make Informix and the solutions around it better and more exciting.
Part of making Informix and its ecosystem better is to share ideas and be exposed to ideas that may or may not be related to Informix. If you remember, in my blog entry of October 9, I talked about where good ideas come from. It is time that I divulge the source of those comments: a site called TED. That specific presentation is Where good ideas come from.
Here are a few other presentations I enjoyed from www.ted.com:
There are many more interesting talks in there. I hope you'll enjoy these short presentations. Who knows, by exploring these presentations and others, you may come back with a new outlook on how we can use Informix to make our world a smarter planet.
I listened to a presentation on this subject recently.
What I found interesting is that the research found that good ideas do not come from a Eureka moment. For example, Darwin recounts his Eureka moment in his autobiography, but further study of his personal journals shows that he had the full theory of natural selection many months before his stated Eureka moment.
According to research, most good ideas come from discussions:
Another interesting point was that many good ideas come from the connection of people that share their thoughts to form a complete idea that is worth pursuing.
How can we generate good ideas we can act upon and make our environment better?
We need to interact with people in a situation that is conducive to generating these ideas. We have such an opportunity in just a few weeks: the Information on Demand (IOD) conference in Las Vegas, October 24-28.
Think about it: we will be with a bunch of people who have technical problems to solve around the use of technology in general and Informix in particular. We'll listen to presentations on new features, solutions in different industries, and best practices; attend birds-of-a-feather sessions; and mingle in social settings such as the Informix celebration on Monday night.
Let's take advantage of this great opportunity! See you in Las Vegas!
Happy new year everyone!
The Informix team is always hard at work improving the Informix products.
It turns out that, while working on V.next, a feature escaped and made it into version 11.70.xC5 and above (xC6 being the current release as of October 2012).
It concerns loading data into TimeSeries using a relational view of a TimeSeries (also known as the VTI interface). To take advantage of this new feature, you simply use the TS_VTI_ELEM_INSERT (128) flag when you create the relational view with the TSCreateVirtualTab() procedure.
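In SQL, that looks something like the following. The virtual table and base table names are mine, and the exact TSCreateVirtualTab() signature varies a little between releases, so treat this as a sketch and confirm against the documentation for your fixpack:

```sql
-- Create a relational (VTI) view of the TimeSeries column, passing
-- TS_VTI_ELEM_INSERT (128) to enable the fast element-insert path
EXECUTE PROCEDURE TSCreateVirtualTab('jroy_v', 'jroy', 128);

-- Loads then go through plain SQL inserts against the virtual table:
-- one row per element (key, timestamp, value columns are illustrative)
INSERT INTO jroy_v VALUES('1', '2012-10-10 08:15:00', 102.5);
```

The nice part is that any tool that can issue INSERT statements gets the faster path without code changes.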
A simple test showed that this feature loads data 3.6 times faster than before. Of course, your "mileage" will vary depending on your environment. To learn more about how you can use this new feature, consult the following link from the Informix Smart Meter Central wiki:
You can find additional details in the Informix information center in the following pages:
I ran into a simple problem the other day: I got an error while creating an index because the key was too big to fit in my index. As you may remember, the maximum size of an index key on a standard Unix/Linux system is 387 bytes.
Why do we have this limit?
This is a function of the page size and the way a B-tree index works. With the limit of 387 bytes on a 2KB page, we can have at least 5 keys per page. This way, we divide the data into at least 5 parts at each level. The end result is eliminating comparisons to get to our result faster. If we had only one key per page, it would be the equivalent of doing a sequential scan, so the index would be useless.
In IDS version 10.0 (2005), Informix introduced the configurable page size. From that point on, it has been possible to create dbspaces with page sizes of up to 16KB. The page size has to be a multiple of the basic page size: 2KB or 4KB.
These larger pages can provide better performance when you have a wide table where the row size could be, let's say, 12KB. This way, you can fit an entire row in a page instead of using page chaining to support these larger rows. The savings in I/O can make a noticeable difference in performance in many situations.
Coming back to my indexing problem, I can fix it by using a larger page size. According to the documentation, the maximum index key size is as follows for each page size:
If your key fits in a 2KB page (shorter than 387 bytes), you could still use a larger page size for your index. The difference is that more keys would fit in one page, so the index would not be as deep, which could provide additional performance.
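Putting the pieces together with the SQL administration API (the dbspace name, path, and sizes below are illustrative; the classic alternative is the onspaces utility):

```sql
-- Create a dbspace with a 16KB page size through the sysadmin database
-- (arguments: name, path, size, offset, page size in KB)
EXECUTE FUNCTION sysadmin:task('create dbspace', 'dbs16k',
                               '/ifxdata/dbs16k', '200 MB', '0', '16');

-- Build the index with the large key in the new dbspace
CREATE INDEX wide_idx ON mytab(widecol) IN dbs16k;
```

Remember that the new dbspace needs a buffer pool for its page size, which the server can create automatically or you can configure explicitly.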
Why not simply use the 16KB page size everywhere?
The short answer is that you could waste space on the page used for a table. A page can include a maximum of 255 rows. If your page size is 16KB and your row contains only two integers (2 x 4 bytes), you could, in theory, have over 2000 rows in that page. Since we are limited to 255 rows, we are wasting over 14,000 bytes.
Why not use four or five different page sizes?
Each page size requires its own buffer pool. We have to decide how much memory to allocate for each of these pools. Our decision may not result in the optimal memory allocation. The result is that some pools will have too much memory and others would benefit from more. Bottom line, this would make system administration more complex.
I would suggest limiting ourselves to two page sizes: the default page size and one other. The second page size depends on the environment's requirements. I would also look at the size of the I/Os on the particular machine and how many requests do multiple I/Os on sequential data.
If you haven't looked at the configurable page size in IDS, maybe it is a good time to do so now.
In my blog entry of February 17, 2010, I had to put out a retraction about the common drivers. As I said then, I had to start lobbying for their inclusion with Informix.
I am happy to report that the Informix client SDK version 11.50.xC7 includes these drivers. Note that they are not included in Informix 11.50.xC7. I would expect that future releases of Informix will include a CSDK that has the common drivers.
There is a bit of work to do after the CSDK installation to complete the common drivers installation. I'll cover that in another blog entry later. If you need information on how to use the common drivers, I suggest you download:
"Informix Dynamic Server Application Development: Getting Started":
If you are interested in application development, the following URL is also of interest:
Until next time...
A few years ago, IBM started talking about a smarter planet: Instrumented, interconnected, intelligent.
We are seeing more and more uses of sensors, from your smart phone and its many sensors (GPS, proximity, temperature, barometer, etc.) to the electric meter at your house. Add to that all the other sensors used in industrial plants, and even sensors on rails!
How can we convert this deluge of data into information?
This leads to issues related to two ways to handle data: in-motion and at-rest.
It happens that IBM has a mix of products that can handle these two "states" of the data:
For data in motion, we can use InfoSphere Streams for real-time analytics, based on more in-depth analysis of historical data (analytics models).
For data at rest, there are the problems of how fast we can store it and how fast we can retrieve information, especially when many users are making requests. This would be an operational data store environment. Then, of course, there is the issue of "in-depth" analysis that requires fast access to large amounts of data.
Informix has the combined solution with its TimeSeries capabilities and the Informix Warehouse Accelerator.
Learn more about the use of Informix to solve this big data problem in the following webcast:
Yes, a new version of Informix is now available: Informix 11.70.
There are a lot of great features in this release. I could talk about the flexible grid that allows you to manage many machines like one and support rolling upgrades. I could talk about the new analytics features where we've seen speed up of warehouse-type queries of around 50%. I could talk about storage provisioning, improved installation and embeddability features. Yes, I could talk about all this but at this time, I want to talk about some features that should interest application developers.
I have to admit I am a little biased since my group is called application development services. However, the features I want to talk about were either requested by customers or have had a very positive reception in early mention under non-disclosure or during the beta period.
The first one will facilitate porting schemas from other databases to Informix. Let me first show an example:
CREATE TABLE tab (
  col1 int NOT NULL DEFAULT 0,
  col2 int NULL,
  col3 int REFERENCES other_tab CONSTRAINT fk_col3 ON DELETE CASCADE
);
-- column definitions reconstructed to match the improvements described below; the original example was lost
The first improvement is the ability to change the order of constraints and default values. Before Informix 11.70, the col1 definition would have returned an error since the default clause had to be located before the NOT NULL constraint.
The second improvement is the ability to explicitly say that a column can accept NULL values. Before, it was implied if the NOT NULL constraint was not there.
The last improvement shown in the example above shows that we can add "ON DELETE CASCADE" after the constraint name.
Another improvement in the DDL area is the ability to conditionally execute CREATE and DROP statements. Here are two examples:
CREATE TABLE IF NOT EXISTS tab ( . . .);
If, for example, you want to make sure a table is re-created, you could always say:
DROP TABLE IF EXISTS tab;
If you want to make sure that you keep the table if it already exists, then don't do the "DROP IF EXISTS" and simply use "CREATE TABLE IF NOT EXISTS".
Finally, here's another DDL feature that was in great demand. It is not really an application development feature but it has been requested a lot: The ability to define the EXTENT size in a CREATE INDEX statement:
CREATE INDEX myidx ON tab(col1) EXTENT SIZE 8 NEXT SIZE 8;
Don't forget to read the release notice since there are many other improvements on the INDEX capabilities.
On the DML side, we are now able to use expressions in the COUNT aggregate function. This can be useful if you want multiple aggregates in one statement:
SELECT COUNT(*) total, COUNT(CASE WHEN sex = 'M' THEN 1 ELSE NULL END) males
Without this capability, you would have to solve this problem with three separate statements. For example:
SELECT * FROM
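The example above was cut off, but it presumably combined per-aggregate subqueries. A sketch of what it may have looked like, with an assumed emp table and sex column:

```sql
-- Each aggregate computed in its own derived table, then combined
SELECT t.total, m.males
FROM (SELECT COUNT(*) AS total FROM emp) AS t,
     (SELECT COUNT(*) AS males FROM emp WHERE sex = 'M') AS m;
```

With COUNT over a CASE expression, the same result comes from a single pass over the table instead of one scan per aggregate.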
A while back, I started reading a book called "Thinking, Fast and Slow" from Daniel Kahneman.
Daniel Kahneman is a professor of psychology who won a Nobel prize in economics.
I have to admit, I am not done reading it. I need more "plane" time.
Today, I just want to relate some parts of chapter 14, where he put together a test to see how people would classify individuals based on a short personality sketch.
"Tom W is of high intelligence, although lacking in true creativity. . ."
After reading the description, the subject was asked to figure out which field of study Tom was most likely in.
The description was actually designed so people would rank computer science among the best-fitting fields.
I laughed out loud when I read that part. I immediately thought of one of my co-workers, Robert U.
For those who read this blog: if you make corny jokes/puns and graduated in computer science, rejoice.
The book is full of interesting information including the fact that even statisticians can misuse/misinterpret statistics.
"You dispose of a limited budget of attention that you can allocate to activities. . ."
My conclusion: if someone tells you they're multitasking, they're doing trivial work.
As you know, the IIUG has a page devoted to open source products that can run on Informix. There has been a new addition recently. It is a patch to have Drupal version 6.16 run on Informix.
Drupal is a popular product that provides capabilities to publish, manage and organize a wide variety of content on a web site. To run it with Informix, you need:
You still need to download the code from the Drupal site (www.drupal.org) and then apply the changes provided at www.iiug.org/opensource.
We are now living in a world that is more and more instrumented, intelligent, and interconnected. That is actually the IBM definition of a smarter planet.
This opens the door to many possibilities to better use natural resources and improve many things.
Recently I ran into an interesting video on TED called Tracking the trackers.
It is basically about how sites that you may never have visited can track you and your information, without your having any say in it.
You can find this video at: Tracking the trackers
It is about a largely unregulated industry doing what is called "behavioral tracking". Apparently, it is a $39B industry!
It is easy to jump from stealth tracking to security concerns: What is going on in your network?
Maybe it's time to review this URL: http://publib.boulder.ibm.com/infocenter/idshelp/v117/topic/com.ibm.sec.doc/SEC_wrapper.htm
First, let me put an end to the rumor that the IIUG conference was moved to San Diego to accommodate me.
It is true, I live in that area. It is also true that I am presenting my fair share of material, but I can assure you that not even one passing thought on my location was part of the decision.
This being said, the conference is approaching quickly. One more week in March and then a few weeks in April and we're there.
As usual the conference organizers are trying to outdo themselves year after year. This year is no exception. What happened since last year?
For one, Informix 11.70.xC3 was just out then. Since then, we've seen xC4 come out. Can we hope for xC5 soon?
On my side, I am giving four sessions on various subjects:
Take a look at the list of sessions and hands-on labs at: http://www.iiug.org/conf/2012/iiug/sessions.php
See you there!
I'm currently in Paris in the second week of a business trip. For a two-week trip, it is pretty common to have some clothes laundered; otherwise, it makes for a lot of stuff to lug around.
I took a look at what was offered at my hotel: to launder one men's shirt, they charge 8.50 euros (around 12.37 US dollars). As I was leaving the hotel, I saw a hotel employee with a laundry bag in her hands. Looking at the size of the bag, I could just imagine the small fortune spent by the guest.
As I was walking to the IBM office, I passed a dry cleaner that advertised cleaning and pressing men's shirts for 2.20 euros per shirt for 5 shirts. The hotel price was over 3.8 times that. With a little knowledge and a 5-minute walk, the hotel guest could save a significant amount of money: for 5 shirts, the price goes from 42.50 euros to 11 euros. For a company with many employees who use that type of service, this can add up to significant savings.
Of course, that made me think of Informix. It is well known that IDS provides a high level of performance and scalability and requires minimal resources for its administration. In some cases, one database administrator can manage thousands of instances. Of course, it is much easier to go with a safe choice, use as much hardware as needed, and hire as many employees and consultants as the situation requires for the management of the environment and business application development. This is simply the cost of doing business...
It seems to me that with a little knowledge and a little effort, that cost of doing business could be greatly optimized.
There is a new redbook available for people who want to get started with the TimeSeries feature.
It focuses on "how to" and nicely complements the Informix documentation.
The redbook can be found at: http://www.redbooks.ibm.com/abstracts/sg248021.html
Other resources to help you include:
I just had a need for a function that takes a datetime year to second and returns the number of seconds since January 1, 1970.
That would be easy to do by writing a "C" UDR but I did not want to deal with compiling and installing a shared library so I decided to approach it as an SPL routine.
Not that it is a great thing but I thought I'd share it with whoever needs it. Let me know if you find this useful:
CREATE FUNCTION epoch(dt datetime year to second)
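To give an idea of the approach, here is a sketch of the full routine. The cast chain from interval to integer is one common SPL idiom for this; the original implementation may differ in its details:

```sql
CREATE FUNCTION epoch(dt DATETIME YEAR TO SECOND)
RETURNING INTEGER;
  -- Subtract the Unix epoch, then convert the resulting
  -- interval to a whole number of seconds via a character cast.
  RETURN ((dt - DATETIME(1970-01-01 00:00:00) YEAR TO SECOND)
          ::INTERVAL SECOND(9) TO SECOND)::CHAR(12)::INTEGER;
END FUNCTION;
```

Note that this treats the input as if it were in UTC; if your datetimes are in local time, the result is offset by your timezone.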
You can use it either in an SQL statement or directly with EXECUTE FUNCTION. For example:
EXECUTE FUNCTION epoch("2008-11-25 08:32:45");
1 row(s) retrieved.
Informix TimeSeries is a specialized storage and retrieval mechanism that optimizes the processing usually done on time series data. For this purpose, it includes specialized storage areas called "containers". A container is created in a dbspace; in fact, multiple containers can be created in the same dbspace. A container is created using the TSContainerCreate procedure:
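Reconstructing the call from the description that follows, it would look like this:

```sql
-- container name, dbspace, element row type,
-- initial size, growth size (0 = use the default)
EXECUTE PROCEDURE TSContainerCreate('meter_cont', 'datadbs',
                                    'meter_data', 0, 0);
```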
This command creates a new container called meter_cont in the datadbs dbspace. It is created specifically for time series elements of type meter_data (row type). Since we are talking about a row type, it could include anything a row type accepts. The only restriction is that the first column has to be a datetime year to fraction(5). Here's a simple row type that could be used:
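For example, something along these lines (the reading column is just an illustration; any additional columns a row type accepts would do):

```sql
-- The first column must be DATETIME YEAR TO FRACTION(5)
CREATE ROW TYPE meter_data (
  tstamp  DATETIME YEAR TO FRACTION(5),
  reading DECIMAL(14,3)
);
```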
The last two arguments represent the initial space allocation and the growth space allocation. This is similar to a first extent and a next extent. A value of 0 resolves to the default of 16KB.
With this in place, we can create TimeSeries in a table. Let's start with the following definition:
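The definition is not shown above, but based on the rest of the discussion it would be along these lines (the meter_id column is an assumption; the ts_data table name is mentioned later in this entry):

```sql
CREATE TABLE ts_data (
  meter_id  INTEGER PRIMARY KEY,
  readings  TIMESERIES(meter_data)
);
```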
We can insert a row in a table with an empty TimeSeries as follows:
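One way to do it is with a time series literal. This is a sketch: it assumes a 15-minute calendar named ts_15min exists, uses the meter_cont container created earlier, and uses hypothetical table and column definitions:

```sql
-- The empty brackets [] mean "no elements yet"
INSERT INTO ts_data VALUES (
  1001,
  'origin(2011-01-01 00:00:00.00000), calendar(ts_15min), container(meter_cont), threshold(0), regular, []'
);
```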
We now have a row in the table with an empty TimeSeries column. This is different from a NULL value in that column.
Now, you may say: "Whoa! How do I insert data in that TimeSeries? Must be difficult".
The TimeSeries functionality includes a way to create a relational view on a table that contains a TimeSeries column. If the table were to include multiple TimeSeries columns, you could create multiple "views", one for each TimeSeries column. This capability is provided through an Informix feature called the Virtual Table Interface (VTI), which allows Informix users to make something look like a standard relational table. At this point, there is no need to describe this interface further. Informix TimeSeries provides a stored procedure that facilitates the creation of that virtual table. For example, we can create a relational view on our ts_data table as follows:
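The procedure is TSCreateVirtualTab; a minimal call could look like this (the virtual table name ts_data_v is my choice, not the original's):

```sql
-- arguments: virtual table name, base table name
EXECUTE PROCEDURE TSCreateVirtualTab('ts_data_v', 'ts_data');
```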
This creates a virtual table that exposes each time series element as a regular relational row.
If you want to insert into a time series, you simply use a standard INSERT statement. If the row does not exist, it gets created; if it is already there, the TimeSeries column gets updated. Here's a simple insert example:
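For instance, going through the virtual table (the table and column names here are assumptions for illustration):

```sql
-- Inserting through the virtual table adds an element
-- to the time series of meter 1001
INSERT INTO ts_data_v (meter_id, tstamp, reading)
  VALUES (1001, '2011-01-01 08:15:00.00000', 12.5);
```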
A simple standard SQL insert... How easy can it be?
We have a lot more to talk about. Next time, we'll start introducing some 11.70.xC3 capabilities. This is starting to get exciting! See you next time!
In 11.70.xC3, we added some new time series capabilities. Why would you care?
Time series are found everywhere. It is simply data that is collected over time. It could be changes in stock price and transaction volumes. It could also be readings from your house's electric meter. Readings could be done every 15 minutes, for example, to provide a much more accurate picture of how electricity is being used. Other time series examples include weather information, network traffic, thermal readings in a large data center, and so on.
One key characteristic of time series is that the processing always includes a time component. For example, you want to get all the meter readings for one month for a specific customer. With this data, you can calculate daily consumption, running averages, etc. To do this type of processing, you need quick access to the specific range of data you want to analyze, and you also need to get it in time order.
Informix provides a data type that is used specifically to optimize time series data. It also comes with an extensive set of functions used to manipulate these time series. Informix TimeSeries provides three major benefits:
Informix TimeSeries also provides the ability to create relational views on top of your time series data. This opens the door to the use of standard off the shelf products to do things like reporting.
With this very brief introduction, we are now ready to talk about the improvements made in 11.70.xC3. This will have to wait until next time.
There was a big change for me this year: I left the Informix CTE group to lead a new group. I am now a manager... and architect.
My new group is called Application Development Services. This means that my group looks at IDS from a programmer's point of view. Let me give you an example of what that means. Let's look at the major features included in IDS 11.50.xC6:
I care about these features, but my attention goes to a feature of the new Client SDK that deserved only a one-line mention in its release notice:
"When you install Client SDK or IConnect, you have the option to install IBM Data Server Driver version 9.7. For more information, see the Client Products Installation Guide."
As you may remember, the long-term direction for client applications is to use the DRDA interface to IDS. With this one-line statement, I can now write programs using CLI (ODBC) without having to figure out where to get the driver. Since IBM has multiple packages available, I could have easily made the mistake of thinking that I needed to download the entire DB2 client (about 600MB) to get this functionality.
In addition, this is all I need to build PDO_IBM for PHP applications or the ibm_db gem for Ruby on Rails development.
As far as what my group will do, we can start by figuring out and prioritizing which features will make Informix more attractive to developers and programmers. It's not just features in the server; we have to consider everything, even documentation.
I'm sure I'll have more to say about this later this year. Hopefully I'll have interesting results to report by the time I see some of you at the IIUG conference in April.
The Informix team is putting a lot of energy behind this conference. The team is also putting together a Customer Advisory Council meeting on June 2nd where there will be discussions on product directions and features prioritization.
For more information on the conference, please see:
The call for speakers is going on until February 13. This is a great opportunity to participate with the EMEA Informix community and get some exposure for yourself and your company. Take advantage of it.
Find out more at the URL mentioned above. Like it says on that site: Register Today!
JacquesRoy 120000A2MS 2,199 Views
IDS 11.50.xC5 became generally available (eGA) on July 24th. It includes several new features, including the "CONNECT BY" syntax and the MERGE command. There were other improvements in multiple areas such as administration, usability, and continuous availability, including Enterprise Replication (ER).
For more information, look at:
In the last few blog entries, I've been talking about TimeSeries. This time, I'd like to digress a little for a change. Still, there is a tie to TimeSeries.
About a year ago, I went to an E&U conference. As you may know, Informix is making a push in this industry due to the advantages that TimeSeries can provide. In one of the sessions I attended, the presenter mentioned in passing the "Did you know?" video on YouTube. Just the context in which it was mentioned made me pay attention. I took a note and decided to look it up later. Last time I checked, it had had over 14 million views!
"Did you know?" starts with a global view of the world ("If you are 1 in a million in China, there are 1300 people just like you") and continues to talk about the evolution of the impact of technologies on our lives and its impact in the future.
Some other highlights:
Like it says in the video, we are living in exponential times.
Take a look at it, it's only 5 minutes of your time: http://www.youtube.com/watch?v=cL9Wu2kWwSY
Looks like I jumped to conclusions too quickly. I won't give you any details or extenuating circumstances. I simply did not check properly. It looks like we do have something on the Windows platform but not on the others.
I simply have to start lobbying for the data server drivers as part of CSDK on all platforms. In the meantime, you can download the common drivers starting at this URL:
The one you want is the IBM Data Server Driver Package (DS Driver). On Linux, it is a 24MB download.
More on how to use it later.
I recently received a note about the IOD conference, October 25-29, at the Mandalay Bay in Las Vegas. If you register by August 31, you can get the early bird hotel rate!
Please go to the Conference Site to learn more about the IOD conference and register. Here are the top reasons provided to attend:
More on the conference later.
I think Terri is pulling my leg. She is apparently receiving concerned emails about what happened in Brussels. It was a humorous situation that I wanted to relate in a fun way. I guess I have a future in fiction writing :-).
Really, nothing happened. She took a picture, and the police courteously told us that the American embassy did not want people to take pictures. Terri deleted the picture from her camera while having a pleasant time with the officers. We then left and laughed about it.
So, don't worry, Terri is doing fine and we all had a good time in Brussels. I strongly encourage people to come and visit.
What does it mean to be green? Is it using solar panels to power our computers? What if we put pedals under the desk of each employee so they generate the power they need? That would have the additional benefit of raising fitness levels and maybe improving overall health... I remember Marty Lurie and his son demonstrating something about a bicycle and an Informix database at the last IIUG conference (or was it IOD?). Maybe they were onto something :-)
Kidding aside, we know that Informix was green before people cared about green. The product was built from the ground up to take full advantage of hardware resources. To me, that's not even half the story. Informix IDS is "set it and forget it". What type of impact does that have on the overall carbon footprint? We often hear customers talk about a 1-to-10 ratio between Informix DBAs and those of competitor products. And there are the extensibility features. You know me, I could go on and on about those and the huge benefits they can provide, such as processing 100,000 trades per second on a 4-CPU Linux machine.
Lately there has been more happening with Informix to help make it greener by including it into a virtual appliance that can even be deployed in the Amazon EC2 Cloud. Guy Bowerman has talked about it on his blog and will cover this subject in his presentation next week at the Informix conference (Apr 26-29).
Take a look at the Green IT Report on developerWorks to see what IBM has to say about this subject.
Till next time.
Last week I stayed at a quaint hotel in Strasbourg. Since the room did not have an alarm clock, I decided to use my watch to wake me up on Monday morning. Considering that there is an eight-hour timezone difference between Denver and Strasbourg, using an alarm is a good idea.
I woke up on Monday 30 minutes before the alarm was supposed to ring. That's long enough to make it worthwhile falling asleep again, so I did. I woke up again with a start, picked up my watch, and looked at the time: the display was blank! I needed to find out what time it was in a hurry. Maybe I was late for the start of the class! Luckily for me, it turned out to be the time I was planning to get up anyway. I guess my brain kept track of the time as I was sleeping. That has worked in the past, but I don't find this method the most reliable. At this point, I started using my phone as my alarm clock.
Later that week, when I was in Paris, I had to go visit a partner. The sales specialist sent me the information. I wrote the address down on a piece of paper and went to grab a taxi. The taxi driver could not find the place, even with the use of a GPS device. I did not have access to my email from my laptop, I had not written down the partner's phone number, and I had no way to contact anybody. I was about to tell the driver to turn around when I remembered that I get my emails on my phone. Luckily, there was a phone number and we were able to get to the right location.
Twice in one week! Since I had to leave my hotel on Saturday at 5:00am, I did not want to take any chances: I set up a wake-up time on both my phone and on the television/alarm clock. Surely at least one of the two would work. It turns out that both worked that morning, and 20+ hours later I was back home (ahh! the glamor of travel). Now my laptop appears to be acting a little strange. I'd better do a backup...
That made me think: Do all Informix DBAs have a contingency plan? What happens if something goes wrong? How much does it cost the business for each hour of downtime?
IDS offers a lot of capabilities that can address the needs of a business environment. It starts with online backup, either full or incremental, and adds to it through the following:
All these options work together. Talk to your local IBM-Informix IT specialists if you want to know more about these capabilities.
Next Thursday afternoon (3/5/2009 1:00pm mountain time) I will participate in an internet radio show called DM Radio.
The URL for it is: http://www.information-management.com/dmradio/10015010-1.html
This edition of the radio show is described as follows:
Tune into this episode of DM Radio to find out! We'll talk to green guru John O'Brien of Dataupia, Bill Smoldt of STORserver, Bob Maness of Pillar Data Systems, and Jacques Roy of IBM.
Attendees will learn: How much money companies can save by going green; Why a multi-tiered storage strategy is key; How to leverage the Cloud in saving dough; Best practices for dovetailing initiatives.
I'll be on for about 10 minutes and will also be part of an expert panel discussion toward the end of the broadcast. Let's hope I can do Informix justice! :-)
This was the last day of the conference, with 35 sessions. I was surprised to see how many people attended the presentations until the end. I see this as a big endorsement of the value provided by these presentations.
On my part I delivered one presentation first thing in the morning and another one starting at 2:10 pm. Despite that, my session was well attended.
Overall, a very successful conference that was well worth attending.
You may not know it, but the Informix lab is extending a helping hand to universities around the world. One example of that was the hosting of university professors at the last Informix conference.
As part of this, I am on my way to the University of Strasbourg (France) to teach a 3-day seminar on subjects related to IDS. I had all the latitude I wanted (and more) to decide on the content. I will be delivering this seminar starting next Monday (June 8). We'll see how it is received. Watch for my blog entries after each day, network access permitting.
I'd like to come back to the book "The Goal" I mentioned in my last blog entry.
This book focuses on manufacturing environments, but the interview at the end of the book mentions that the concepts of the theory of constraints (TOC) can be applied to other fields. Looking back in the book, I found that they ask three basic questions about the impact of changes:
We can easily see that this makes sense to a financial person in manufacturing. Let's see how we can look at it when our concern is running a database.
Did you sell more?
"The cheapest, fastest and most reliable components of a computer system are those that aren’t there"
Did you reduce the number of people on the payroll?
I've met many customers that have mixed environments where we see a 1-to-10 ratio of Informix personnel compared to the personnel for the competitor's platform. Why not bring that up to the appropriate people? I'm sure your local IBM representative will be happy to help.
Did you reduce inventory?
I think these three questions are worth exploring no matter which environment you're in. That can be good for your company, for you, and for all the people who invest their efforts in the Informix products.
The day started with a keynote presentation by Dr. Anant Jhingran on "Cloud computing, databases and the role of IDS". He was assisted by our own Guy Bowerman. That was quite a good start to another great day of learning and networking.
There were 35 sessions covering subjects including the Gillani FourGen CASE tool, the Genero report writer, IDS tasks and sensors, performance tuning, backups, troubleshooting, Encryption Expert, and index enhancements. Quite a range of subjects, and that's not the half of it!
I had interesting conversations with some partners. One of them mentioned how the AGS Server Studio product transformed someone who knew nothing about databases into a database administrator in no time flat. It looks like the ease of use of IDS combined with the ease of use of AGS is an unbeatable combination. I also had a discussion about collecting and sharing information from sensors worldwide to monitor the health of the planet. Talk about a stimulating conversation.
There is one more day of this! I don't know how much more I can take :-).
The machine configurations caused problems in using Data Studio with WAS CE; I already mentioned that yesterday. This also meant that we could not do the web services lab. To work around this problem, I spent a few minutes showing the students what was involved in creating a web service using the VMware image on my laptop. Of course, it took a lot less time than would be required to do the lab since everything was already set up.
The rest of the class went well. It included a review of the enterprise features such as backup, SDS, HDR, RSS, CLR, ER, CDC (Change Data Capture), and MQ integration. I think we should add labs on shared disk and HDR since the labs appear to be very well received. They are more fun than just sitting there listening to a speaker. The class ended with a presentation on cloud computing.
I went through the evaluations and found that the class was a success. I know there are a few adjustments to make, but it was a good start. All in all, it was a good few days.
I took the train to Paris. It takes around 2 hours and 15 minutes to cover the 500 kilometers between Strasbourg and Paris. That's an average of over 220 km per hour, and the ride was so smooth. It is interesting to note that a plane ride would have taken one hour, but the train is actually faster since you can get there just a few minutes before departure and it drops you off in the middle of Paris instead of the "far away" Charles de Gaulle Airport. That's a reminder that we should always use the right tools for the right problem :-)
I mentioned the Informix Warehouse in my previous entry. There is the chat with the lab coming up. Here's something more: a new tutorial on developerWorks:
Then there are the Informix Warehouse product pages:
There's a book I read many years ago called "Controlling Software Projects" by Tom DeMarco. The first chapter starts with: "You can't control what you can't measure". This statement has popped up in my head regularly, especially since the IOD conference last October. While I was there, I had a long conversation with a customer that seems to be measuring everything and the results show.
IDS provides the ability to collect a lot of measurements. With the latest IDS version, we have even more capabilities than ever before. If we add the Open Admin Tool for IDS to the mix, monitoring has never been easier. Of course, monitoring and measuring are two different things. You need to collect the data provided in the monitoring and compare it over time.
Here's an example of measurement: Each time new features are added to the IDS code line, performance testing is done. The results are compared to previous performance runs. If the performance declines, it triggers an investigation to fix the performance issue. The IDS development team always aims at improving IDS performance.
So, we know IDS is always improving. We know that we can collect a lot of data on the IDS execution and tune the engine to get the most out of it. Is that enough?
My answer is NO. As data access experts, we need to get involved on the application side and monitor and measure what is being done there. How much data are they pulling out? What are they doing with it? There can be cases where implementing a stored procedure could greatly reduce data movement and yield both additional performance and scalability. This is just one example of what can be done.
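As a sketch of the idea (the orders table and its columns are hypothetical): instead of fetching every row to the client just to add up amounts, a stored procedure can return only the answer:

```sql
-- Server-side aggregation: one value crosses the network
-- instead of thousands of rows.
CREATE FUNCTION customer_total(cust INTEGER)
RETURNING DECIMAL(14,2);
  DEFINE t DECIMAL(14,2);
  SELECT SUM(amount) INTO t FROM orders
    WHERE customer_id = cust;
  RETURN t;
END FUNCTION;
```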
You can't control what you can't measure. Don't limit yourself to the database server. By lending your expertise to a larger team, you can make a significant difference in your work environment.