Isn't it awesome? Well, you may not be so excited until you visit the ebooks site here, take a look at the ebook summaries, and download and enjoy them. Visit the site periodically to check whether new ebooks are available.
I remember a slogan from an IBM product (Lotus Notes, I guess) which says "Information has no value until it's shared". Well, that's what Raul Chong, a DB2 expert and evangelist, is doing with a program called DB2 on Campus, which offers high-quality DB2 training, including virtual courses and free ebooks. You heard (read) it right: free! He and the DB2 on Campus team, made up of several experts, wrote (and are still writing) books not only about DB2, but also about Java, Perl, database design, PHP, Ruby, etc. Of the planned series of 27 e-books, as I write this post there are "only" 7 e-books done and available. In fact, I have one myself, on paper, called "Getting started with IBM Data Studio for DB2", which they sent me for attending a webinar (also free). Not many people around the world do this kind of work, so we should all be thankful to the DB2 on Campus team.
Triton Consulting, a consulting firm based in England and expert in DB2 consultancy and training, conducted a study of six of the most common DBA activities and compared the complexity of these tasks in both DB2 and Oracle. The result is astonishing: DB2 won on every one of the tasks tested, most of the time being between 80% and 100% easier!
The 6 DBA tasks measured were:
High availability is a very critical point for the enterprise, so it should be configured, tuned and tested. Most database vendors' customers, including DB2's, do not take full advantage of their database engine's capabilities, so this is a great opportunity to understand how DB2 high availability gets our job as IT admins done. DB2 high availability features will be explained and demonstrated in this interesting chat, presented by Sal Vella and Dale McInnis. These sessions are very practical and interactive, and this one won't be the exception. They'll talk about HADR, automated failover, and more features and enhancements available in the latest releases of DB2. To join the web session, please register here. To stay updated about upcoming DB2-related sessions with the IBM Toronto Software Lab, please check this blog frequently or visit ibm.com/db2/labchats.
DB2 pureScale is the DB2 cluster solution which provides high availability, scalability and easy administration, among other characteristics. Recently IBM updated pureScale, and it's now available on Intel-based hardware. Tomorrow, August 31, IBM Chat with the Lab will host a webcast where pureScale features will be covered. I invite you to get more information and register here. See ya there!
All the previous chat sessions are available here.
When it comes to databases, storage is one of the most common issues that DBAs and sysadmins deal with daily. "Reduce the backup filesystem size. We haven't used it, anyway". More rational solutions like adding more disks may sound easier to some people, but for today's SMBs it's not the most desirable option; besides, by the time the disks arrive it could be too late and you'll find your sysadmin hanging from a rack. On top of that, as the database grows we start to notice response times increasing almost at the same speed.
DB2 allows you to compress tables by more than 70% in some cases, and compression can even increase the performance of your queries. To get started, we first need to choose which tables are eligible for compression. DB2 builds a dictionary from the table: common column values, and even repeated patterns within values, are stored in this dictionary and assigned an identifier, and that identifier then replaces the values in the table (you can take a look at how it works here). So tables with common and repeated data are great candidates; read-only tables or tables with a low update ratio can also be perfect candidates. Queries that fetch data sequentially can get a noticeable performance boost: since the row size is smaller, the database engine can retrieve more rows per fetch. However, frequently updated tables may not behave as fast as you expect, because their dictionary will probably become stale and there will be continuous I/O and CPU use to compress and decompress data, so they're not the best candidates for compression.
Large tables are almost always ideal candidates. However, remember that "large" is relative; for example, 1 million rows may be "large" for your company but not for your competitor, who actually uses DB2 ;)
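If you want help shortlisting candidates, DB2 can estimate the savings for you. This is a sketch assuming a DB2 9.7-style call to SYSPROC.ADMIN_GET_TAB_COMPRESS_INFO (the function name, arguments and output columns vary between releases, so check the documentation for yours; MYSCHEMA and TABLE_A are placeholder names):

```sql
-- Ask DB2 to estimate compression savings for one table.
-- 'ESTIMATE' mode scans the table data, so run it off-peak.
SELECT TABNAME,
       PAGES_SAVED_PERCENT,
       BYTES_SAVED_PERCENT
FROM TABLE (SYSPROC.ADMIN_GET_TAB_COMPRESS_INFO('MYSCHEMA', 'TABLE_A', 'ESTIMATE')) AS T;
```

Tables reporting a high savings percentage are the ones worth the REORG effort described below.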
Each table or partition has its own dictionary, stored in the table at the end of the existing data, so they will be close to each other. The dictionary is not created immediately when you create the table. In fact, we first have to activate the feature for the desired table(s), which we do through the ALTER TABLE statement:
ALTER TABLE TABLE_A COMPRESS YES
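To confirm the flag took effect, you can check the system catalog. A quick sketch (MYSCHEMA is a placeholder; COMPRESSION is a one-character column whose exact values depend on your DB2 release, so verify the meaning against its documentation):

```sql
-- List the compression flag for every table in a schema
SELECT TABNAME, COMPRESSION
FROM SYSCAT.TABLES
WHERE TABSCHEMA = 'MYSCHEMA';
```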
This tells the DB2 engine to make TABLE_A eligible for compression. Unfortunately, the table does not get compressed immediately, but there are several easy ways to get this done. If we decide to do nothing after the ALTER TABLE statement, DB2 will use the Automatic Dictionary Creation (ADC) feature, which creates a dictionary automatically once the table grows to about 1-2 MB. This is actually not recommended, because the dictionary will be built from a small number of rows at the beginning of the table, which probably won't represent the real data distribution. The quality of the compression dictionary is based on the data used to create it. The "easiest" way to create the dictionary is with an offline REORG:
REORG TABLE TABLE_A
In addition to reorganizing the table, DB2 will create the dictionary based on the whole table. Depending on the size of the table, the REORG could take some time. Instead of building the compression dictionary from the whole table, we can take a representative sample of the data, create a dictionary from that sample, and use it for the original table:
CREATE TABLE TABLE_COPY LIKE TABLE_ORIG IN TABLESPACE_X;
DECLARE CUR1 CURSOR FOR SELECT * FROM TABLE_ORIG TABLESAMPLE BERNOULLI(10);
LOAD FROM CUR1 OF CURSOR INSERT INTO TABLE_COPY;
REORG TABLE TABLE_COPY RESETDICTIONARY;
DECLARE CUR2 CURSOR FOR SELECT * FROM TABLE_ORIG;
LOAD FROM CUR2 OF CURSOR REPLACE INTO TABLE_COPY;
First, we create a new empty table (TABLE_COPY) with the same structure as TABLE_ORIG. Then we take a 10-percent Bernoulli sample from the original table and load it into the still-empty TABLE_COPY. Next we run REORG on this small table, which also creates its dictionary; REORG offers the option to keep the old dictionary (if there is one) with KEEPDICTIONARY, or to create a new one with RESETDICTIONARY (for more info on the REORG command, see here). The last step is to load the full data from TABLE_ORIG into TABLE_COPY; the LOAD operation takes advantage of the dictionary during the insertion of the records. However, if the table has no compression dictionary, LOAD won't create one.
REORG is the most commonly used way to create the dictionary, but it is not the only one. We can also use the INSPECT command:
DB2 INSPECT ROWCOMPESTIMATE TABLE NAME table_name RESULTS KEEP file_name
This option will not only create the compression dictionary but also estimate the storage savings, placing the results in a binary file. This file can later be converted to a readable text file containing the percentage of data saved by compression:
db2inspf file_name output_file_name
If you want to know how much space your tables or dictionaries use, you can run this query:
SELECT TABNAME, DATA_OBJECT_P_SIZE, DICTIONARY_SIZE FROM TABLE
(SYSPROC.ADMIN_GET_TAB_INFO('schema_name', '')) AS T ORDER BY TABNAME
If the DICTIONARY_SIZE column is NULL or zero, then no dictionary has been created for that table yet.
Tables are not the only objects that can be compressed in DB2: LOBs, indexes, temporary tables and even XML columns can be compressed, so depending on your database (for example, a data warehouse with huge read-only tables and a lot of indexes) you can get a high compression ratio. DB2's compression capabilities mean not only money saved on additional disks or even floor space, but also less time spent on storage administration and even power savings.
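As a taste of what this looks like for indexes (index compression arrived in DB2 9.7), here is a sketch assuming an existing table TABLE_A with a column COL1 and a hypothetical index named IX_TABLE_A:

```sql
-- Create a new index with compression enabled from the start...
CREATE INDEX IX_TABLE_A ON TABLE_A (COL1) COMPRESS YES;

-- ...or enable compression on an existing index, then rebuild it
-- so the existing index pages actually get compressed.
ALTER INDEX IX_TABLE_A COMPRESS YES;
REORG INDEXES ALL FOR TABLE TABLE_A;
```

As with tables, flipping the flag only marks the index as eligible; the rebuild is what makes the savings real.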