The day started with a Q&A with IBM executives: Alys Passarelli, Inhi Cho Suh and Rob Thomas. There were a number of good questions and it was an opportunity for the executives to state their commitment to Informix and describe many of the efforts in progress.
Once again, there was a good mix of sessions on subjects ranging from customer case studies to database administration, security, best practices, and the use of tools such as Eclipse.
The day ended at 4:30 after the delivery of 25 sessions.
This was a great conference with lots of good information and fantastic networking.
I did forget to mention the keynote presentations on Monday evening. Rob Thomas gave us his view of the Informix business and a glimpse at his plan for continuing successes. It was followed by a presentation by Dr. Arvind Krishna, general manager of IBM Information Management. Great information!
Tuesday, sessions started at 8:00 AM. I was the moderator for the session on disk level encryption. It was about a flexible product that can protect your database data at rest. It works for file-based dbspaces (cooked files) or raw devices. All that, transparently to Informix. Mark Jamison did a great job at presenting the product in a clear and concise manner.
There were many sessions on application development, including sessions on Groovy, Perl, Python, and PHP. Of course, there were also more sessions on different aspects of tuning, as well as presentations on embeddability, replication, warehousing and so on.
With the sessions starting at 8:00 AM, we had a total of 35 sessions for the day.
This was followed by a casino night. What a full day!
The day started with a keynote speech by Jerry Keesee. I believe the key to the presentation is that Informix is taking the next step to ensure the continuing growth of the product. There are some exciting things happening on that front.
The first session I attended was "Data Modeling" given by Jack Parker. There were a lot of interesting examples of how a data model greatly impacts your production system. I have to agree with Jack that the greatest performance gain is with the data model, not with tuning the database after the fact. This is why database experts should be involved at the beginning of a project. They can then also take advantage of database extensibility; something I could talk about for a long time...
There were five sessions going at once for a total of 30 sessions for Monday. There were a lot of good sessions on subjects including database administration, application development, programming, security and more. There were demos of all sorts and great conversations to be had. Someone told me that one of the benefits of this conference is that they can solve in minutes problems that have been bugging them for weeks.
Of course the evening event was also worth attending with a bunch of engineers coming from the lab just to talk to everyone. All in all a fantastic day!
Sunday was the beginning of the conference. Even though there were tutorials during the day, the real beginning was the welcome reception. The exhibitors' booths were ready and the attendees were cheerful. It was a great networking opportunity with good food and plenty to drink.
I have to admit I missed a lot during the reception because I was in multiple intense discussions.
All in all, a great start to the conference.
I ended my last entry with the IBM statement about smarter planet: instrumented, interconnected, intelligent. I'd like to comment a little more on the instrumented part.
In my last entry, I mentioned the use of active RFID to monitor a large data center. Of course, this monitoring compares data points at a specific point in time. Most likely, the readings are done at specific time intervals. The same would be true if you were monitoring the energy consumption and temperature of houses in different neighborhoods of a city, traffic information (cars or packets), or anything else you'd want to collect.
There are two main concerns in this type of processing:
- How do we quickly ingest the large amount of information generated into a database?
- How do we efficiently process the information?
The first concern is to be able to ingest the information without falling behind and with cycles to spare for analysis. In most cases, the information must be kept for further analysis or for historical comparisons and analysis.
The second one is about the analysis itself. Informix has addressed these issues for time-based data with the TimeSeries datablade. It provides an efficient way to store and process large amounts of data very quickly. If the ingestion rate is a concern, Informix also has the TimeSeries Real-Time Loader. As its name implies, you can ingest a large amount of data and make it available for analysis virtually immediately.
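To make the "readings at specific time intervals" idea concrete, here is a small Python sketch. It is only a generic illustration and not the TimeSeries datablade API or its storage format: the class name `RegularSeries` and its methods are hypothetical. The assumption is that readings arrive at a fixed interval, so only the values need to be stored and each timestamp can be derived from the position of the value.

```python
from datetime import datetime, timedelta


def _ceil_div(delta, interval):
    """Ceiling of delta / interval for timedelta values."""
    return -(delta // -interval)


class RegularSeries:
    """Hypothetical container for readings taken at a fixed interval."""

    def __init__(self, origin, interval):
        self.origin = origin        # timestamp of the first reading
        self.interval = interval    # fixed spacing between readings
        self.values = []            # values only; timestamps are implied

    def append(self, value):
        """Ingest the next reading; no timestamp needs to be stored."""
        self.values.append(value)

    def timestamp_at(self, index):
        """Derive the timestamp of a reading from its position."""
        return self.origin + index * self.interval

    def window(self, start, end):
        """Return the values whose timestamps satisfy start <= t < end."""
        first = max(0, _ceil_div(start - self.origin, self.interval))
        last = min(len(self.values), _ceil_div(end - self.origin, self.interval))
        return self.values[first:last]


# Example: rack temperature readings every 15 minutes (made-up numbers).
series = RegularSeries(datetime(2009, 4, 27, 0, 0), timedelta(minutes=15))
for reading in [20.0, 20.5, 21.0, 23.5, 22.0]:
    series.append(reading)

recent = series.window(datetime(2009, 4, 27, 0, 15), datetime(2009, 4, 27, 1, 0))
average = sum(recent) / len(recent)
```

Because a time range maps directly to a slice of positions, a range query does not need to scan or compare stored timestamps, which is one reason this kind of layout suits both fast ingestion and fast analysis.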
For a smarter planet, keep these efficient tools in mind. Informix is likely to be the answer to your needs.
I saw the cover of Computerworld the other day with a title of "Swinging toward centralization". I'm not one to be jumping on trends but I think this idea has merit. To me, it ties into virtualization, possibly cloud computing, and also the IBM concept of the smarter planet.
Centralized IT could mean first the optimization of hardware resources. The best approach is to use virtualization so all the hardware resources can be used optimally. For example, instead of having, let's say, 100 computers running at 50%-70% utilization, you can centralize, use virtualization, and either reduce the number of computers to around 70 or use the extra capacity for growth. This is a pretty conservative example. Just consider this quote from Computerworld, April 20, 2009:
"Austin Energy: With a new virtual environment, applications run on 150 servers instead of 600"
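The arithmetic behind the "around 70" figure can be sketched in a few lines of Python. The numbers here are assumptions for illustration: I take 60% as the midpoint of the 50%-70% range and pick 85% as a target utilization after consolidation; neither value comes from a measurement.

```python
import math


def servers_needed(current_servers, avg_utilization, target_utilization):
    """Servers required to carry the same total load at a higher utilization."""
    total_load = current_servers * avg_utilization  # in "fully busy server" units
    return math.ceil(total_load / target_utilization)


def reduction(before, after):
    """Fraction of servers eliminated by a consolidation."""
    return 1 - after / before


# 100 servers at an assumed 60% average, consolidated to an assumed 85% target.
conservative = servers_needed(100, 0.60, 0.85)

# The Austin Energy quote: 600 servers down to 150.
austin = reduction(600, 150)
```

Under these assumptions the conservative case needs 71 servers, while the Austin Energy quote corresponds to a 75% reduction, which shows how conservative the first example is.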
Centralization gives you this opportunity. Note that I'm talking about centralizing the hardware resources. If you centralize processing for one large application, you'll likely need the help of advanced features such as the IDS Continuous Availability Feature (CAF) and the integrated replication capabilities (HDR and ER).
Centralization does not mean that the personnel must also be centralized. Today, network access is pretty much a fact of life (I so wanted to use the word ubiquitous!). All the application and system management can be done from anywhere. For IDS, just consider the Open Admin Tool for IDS (OAT) or management tools from our partners such as AGS and CobraSonic. Managers can consider these resources as part of a "cloud".
What a nice segue to my next point.
We hear a lot about cloud computing. You can buy time on some machines in the cloud. We could also mention software as a service like in the case of LotusLive (see https://www.lotuslive.com/en/) or the IBM cloud offering. This does not mean that you have to go outside to have a cloud. You could create a cloud from your centralized data center and provide capacity on-demand based on resource optimization.
When we talk about a large centralized data center, the server consolidation is only part of the savings. The savings in energy can be significant. The other day, I listened to a presentation by an IBMer who manages a large data center providing services worldwide. Here is the type of thing he did:
His team installed active RFID sensors to monitor the temperature and humidity levels in different areas of his data center, including multiple locations in the racks, and at different times. With this information, he was able to clearly identify machine needs. At one point, he was able to identify that if he installed a (raised) floor tile with holes at a specific location, he could eliminate his "hot spot" without increasing his air conditioning needs. He even figured out the correlation between applications and machine heat output. So he can regulate the room temperature based on which application is running!
Talk about a great example of a smarter planet: instrumented, interconnected, intelligent (devices).
I ran into a simple problem the other day: I got an error while creating an index because the key was too big to fit in my index. As you may remember, the maximum size of an index key on a standard Unix/Linux system is 387 bytes.
Why do we have this limit?
This is a function of the page size and the way a B-tree index works. With the limit of 387 bytes on a 2K page, we can have at least 5 keys per page. This way, we divide the data into at least 5 parts at each level. The end result is that we eliminate comparisons and get to our result faster. If we had only one key per page, it would be the equivalent of doing a sequential scan, so the index would be useless.
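Here is a rough Python sketch of that reasoning. It is a simplification under stated assumptions: it ignores the page header and any per-entry overhead, and it treats every page as holding exactly the same number of keys. The function names are mine, not anything from IDS.

```python
def min_fanout(page_size, max_key_size):
    """Lower bound on keys per page, ignoring page and entry overhead."""
    return page_size // max_key_size


def levels_needed(num_keys, fanout):
    """Smallest number of B-tree levels that can address num_keys keys."""
    if fanout < 2:
        raise ValueError("a fan-out below 2 degenerates into a sequential scan")
    levels, reachable = 1, fanout
    while reachable < num_keys:
        reachable *= fanout
        levels += 1
    return levels


worst_case_fanout = min_fanout(2048, 387)                      # 2K page, 387-byte key
deep_index = levels_needed(1_000_000, worst_case_fanout)
shallow_index = levels_needed(1_000_000, 100)                  # assumed shorter keys
```

With the worst-case fan-out of 5, a million keys need 9 levels; with an assumed fan-out of 100 they need only 3. Each level is a page to read, so a higher fan-out translates directly into fewer page reads per lookup.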
In IDS version 10.0 (2005), Informix introduced the configurable page size. From that point on, it is possible to create dbspaces with page sizes of up to 16KB. The available page sizes have to be multiples of the base page size: 2KB or 4KB.
These larger pages can provide better performance when you have a wide table where the row size could be, let's say, 12KB. This way, you can fit an entire row in a page instead of using page chaining to support these larger rows. The savings in I/O could make a noticeable difference in performance in many situations.
Coming back to my indexing problem, I can fix it by using a larger page size. According to the documentation, the maximum index key size is as follows for each page size:
max key size
If your key fits in a 2KB page (shorter than 387 bytes), you could still use a larger page size for your index. The difference is that more keys would fit in one page, so the index would not be as deep, which could provide additional performance.
Why not simply use the 16KB page size everywhere?
The short answer is that you could waste space on the page used for a table. A page can include a maximum of 255 rows. If your page size is 16KB and your row contains only two integers (2 x 4 bytes), you could, in theory, have over 2000 rows in that page. Since we are limited to 255 rows, we are wasting over 14,000 bytes.
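The numbers above can be checked with a short Python sketch. It is a simplification: it ignores the page header and the per-row slot entries, so the real figures would differ slightly. The function name is hypothetical.

```python
MAX_ROWS_PER_PAGE = 255  # limit stated above


def wasted_bytes(page_size, row_size):
    """Bytes left unused on a page once the row-count limit is applied."""
    rows_by_size = page_size // row_size
    rows_stored = min(rows_by_size, MAX_ROWS_PER_PAGE)
    return page_size - rows_stored * row_size


rows_in_theory = (16 * 1024) // 8            # two 4-byte integers per row
waste_16k = wasted_bytes(16 * 1024, 8)
waste_2k = wasted_bytes(2 * 1024, 8)
```

For the two-integer row, a 16KB page could hold 2048 rows in theory but leaves 14,344 bytes unused, while a 2KB page leaves only 8 bytes unused under the same simplification.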
Why not use four or five different page sizes?
Each page size requires its own buffer pool. We have to decide how much memory to allocate for each of these pools. Our decision may not result in the optimal memory allocation. The result is that some pools will have too much memory and others would benefit from more. Bottom line, this would make system administration more complex.
I would suggest limiting ourselves to two page sizes: the default page size and one other. The second page size depends on the requirements of the environment. I would also look at the size of the I/O on the particular machine and at how many requests do multiple I/Os on sequential data.
If you haven't looked at the configurable page size in IDS, maybe it is a good time to do so now.
Since I've been on a common driver kick lately, might as well keep on going...
There was a chat with the lab on Feb 25th that talked about the common Java JDBC driver (referred to as the JCC driver): Top 10 reasons to consider IBM Data Server Driver for JDBC and SQLJ for IDS
You can use the JCC driver with IDS when connecting using the DRDA protocol. Some of the benefits include:
- Better integration with WebSphere
- Ability to use the capabilities of PureQuery
- Better tracing and debugging
- Full IDS clustering support
- Superior performance over the Informix JDBC driver
All this is significant:
- PureQuery can increase the performance of SQL statements by analyzing their usage and making changes transparently to the application. For example, it can detect the use of the same statement with different literals and convert it under the covers into a prepared statement.
- Full IDS clustering support includes working with the connection manager to automatically and transparently connect to an alternate server when the primary fails.
- Superior performance: It provides a 5% to 10% performance boost over the Informix JDBC driver.
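To illustrate the idea behind the literal-to-prepared-statement conversion mentioned above, here is a toy Python sketch. It is not how PureQuery works internally and it is not a real SQL parser: it only recognizes single-quoted strings and plain numbers with a regular expression. The function name `parameterize` is hypothetical.

```python
import re

# Single-quoted string literals ('' is an escaped quote) or standalone numbers.
_LITERAL = re.compile(r"'(?:[^']|'')*'|\b\d+(?:\.\d+)?\b")


def parameterize(sql):
    """Replace literals with ? markers and return (template, parameters)."""
    params = []

    def replace(match):
        text = match.group(0)
        if text.startswith("'"):
            params.append(text[1:-1].replace("''", "'"))
        elif "." in text:
            params.append(float(text))
        else:
            params.append(int(text))
        return "?"

    return _LITERAL.sub(replace, sql), params


first = parameterize("SELECT name FROM customer WHERE id = 101 AND state = 'CA'")
second = parameterize("SELECT name FROM customer WHERE id = 202 AND state = 'NY'")
same_template = first[0] == second[0]
```

Both statements reduce to the same template, `SELECT name FROM customer WHERE id = ? AND state = ?`, so the statement could be prepared once and executed with different parameter lists instead of being parsed again for every new literal.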
If you are using Java, maybe it is time to start looking into the JCC driver. You can download it from the IBM site at (10MB):
Here's where you can find more information on this chat with the lab:
Looks like I jumped to conclusions too quickly. I won't give you any details or attenuating circumstances. I simply did not check properly. It looks like we do have something on the Windows platform but not on the others.
I simply have to start lobbying for the data server drivers as part of CSDK on all platforms. In the meantime, you can download the common drivers starting at this URL:
The one you want is the IBM Data Server Driver Package (DS Driver). On Linux, it is a 24MB download.
More on how to use it later.
There was a big change for me this year: I left the Informix CTE group to lead a new group. I am now a manager... and architect.
My new group is called Application Development Services. This means that my group looks at IDS from a programmer's point of view. Let me give you an example of what that means. Let's look at the major features included in IDS 11.50.xC6:
- Backup from an RSS server
- Dynamic listener threads
- View event alarms
- Basic Text Search enhancements
- MERGE statement enhancements
I care about these features but my attention goes to a feature of the new Client SDK that deserved a one-line mention in its release notice:
"When you install Client SDK or IConnect, you have the option to install IBM Data Server Driver version 9.7. For more information, see the Client Products Installation Guide."
As you may remember, the long-term direction for client applications is to use the DRDA interface to IDS. With this one-line statement, I can now write programs using CLI (ODBC) without having to figure out where to get the driver. Since IBM has multiple packages available, I could have easily made the mistake of thinking that I need to download the entire DB2 client (about 600MB) to get this functionality.
In addition, this is all I need to build PDO_IBM for PHP applications or IBM_DB gem file for Ruby and Rails development.
As far as what my group will do, we can start by figuring out and prioritizing what features will make Informix more attractive to developers/programmers. It's not just features in the server. It has to consider everything. Even documentation.
I'm sure I'll have more to say about this later this year. Hopefully I'll have interesting results to report by the time I see some of you at the IIUG conference in April.
There's a children's book that I used to read to my kids. It was about a boy who was lying on the grass when he saw a fly go by. What followed was a bunch of animals chasing each other.
Lately, I took a break from blogging (I hope you've noticed!), like a little boy lying on the grass, enjoying a sunny day. During that time, IDS 11.50.xC6 came out. Here are a few interesting features:
- External table: an SQL interface to files to allow for very fast load and unload.
- XA transactions on secondary servers
- Backup on the RSS server: You can make an archive of an instance on an RSS server
- Dynamic listener threads: You can start, stop, and restart listener threads for the soctcp or tlitcp protocols without interrupting existing connections.
There are also enhancements to the MERGE statement and the attach/detach capability, among other things. You can find out more about the xC6 release in the release notice.
Some of you may remember that Lester Knutsen (Advanced DataTools Corporation) had a "fastest DBA" contest at the IIUG conference last April. When I was at the IOD conference, I picked up a copy of the Data Management magazine and found an article from Lester summarizing the tuning approaches. You can find the article on the web at:
I did not notice one session on Wednesday. Luckily, I went to it Thursday morning. It was: "Tuning Informix in a Sandbox Environment" by Russell Glancy from GSN Digital.
Russell covered in detail how iReplay, a product from exactsolutions, allows him to test new configurations, versions, and tuning in a safe environment using the same workload as his production machine. This way, he knows exactly what will happen when he makes the changes to the production environment.
I also co-presented the session "Keeping costs low and maximizing flexibility for Jamaica using IDS" with Walt Brown, senior manager at FSL Jamaica. My role was mainly to introduce Walt and let him present his environment. Walt went into detail about their environment and explained that they basically run all the Jamaican government systems, including tax collection, which stayed active and in use even during a hurricane.
There were several other sessions including:
- A deep dive into the IBM Informix 4GL Service Oriented Architecture Feature, Gaga Mahesshwari, IBM
- Dimensional modeling for IBM Informix warehouse users, Fred Ho, IBM, Sandra Tucker, IBM
- Managing IDS configuration and performance with Server Studio and Sentinel, Keshava Murthy, IBM, Anatole Vichon, AGS Ltd
And several more... All that on the last day of the conference!
The conference is over. It is now time to go back to work.
Once again, another full day. There were Informix sessions on embeddability, virtualization/cloud computing, security, and zero-downtime upgrade. We also heard a great presentation on database tuning from Rick Rabe and Tom Girsch from Hilton Hotels.
Great sessions altogether. Now on to Thursday.
In Arvind Krishna's featured keynote, titled "Reduce Your Data Management costs with Workload-optimized System", we heard about Cisco Systems. They mentioned that they chose Informix a few years ago after looking at all the possibilities for embedded databases, including open-source ones.
I spent some time with Walt Brown (from FSL) and Cathy Elliott to fine-tune his presentation. More on that Thursday.
There were several interesting sessions today:
- SOA Enablement on IBM Informix 4GL, Gagin Maheshwari, IBM
- Building Data Warehouses with Informix, Lester Knutsen, Advanced Data Tools
- Hands on lab on end-to-end security with Informix, Ted Wasserman, IBM
- Open Admin Tool for IDS, John Miller III, IBM
- All About IDS CAF, Connection Manager, and Failover, Ron Privett, IBM
- Using Informix in Telecommunications, Kevin Brown, IBM
- Secure and available public finances with IDS continuous availability, Cesar Jiminez, Jalesco Mexico Government
And, of course, demos, discussions and food on the expo floor and in the networking event in the evening.