3 replies Latest Post - ‏2014-09-12T18:40:22Z by tmhajas

Pinned topic: database compression applied by default?

‏2014-08-25T23:11:22Z |

hi everyone,

I wonder if DB2 applies any compression by default, without row or value compression being explicitly enabled?


Here is one thing which puzzles me. I'm trying to do an estimation (database sizing) based on some of our tables which are expected to grow the most (rows are constantly inserted, every 5 min or more often, but data is also deleted periodically, e.g. every 2 weeks, 1 month, or 3 months; these are all factored into my formula).

Using either:

  • NEW - IBM Data Studio client (it shows the average row size, e.g. 200 bytes)
  • OLD - DB2 Control Center (it shows the average row size, but can also show table size on disk)

I could roughly determine the size of my entire DB. But based on a simple calculation of "average row size" x "number of rows" for one of the largest tables, the result does not match the size DB2 Control Center returns.

On the other hand, using get_dbsize_info() 

eg.  db2 => call get_dbsize_info(?,?,?,0)

  Value of output parameters
  Parameter Name  : SNAPSHOTTIMESTAMP
  Parameter Value : 2014-08-25-

  Parameter Name  : DATABASESIZE
  Parameter Value : 2395136000

  Parameter Name  : DATABASECAPACITY
  Parameter Value : 10164322304

  Return Status = 0

the size comes out much smaller, about 20% of my calculation. Based on my calculation (worst-case scenario) the DB size should be around 500GB, while get_dbsize_info() reports only 120GB.

So what can be wrong? My first thought was that some compression might be going on by default (especially for the many zero-value inserts; value compression?), but I checked with our application DB admin: the tables are not created with any compression.

Is the Data Studio client unreliable?

Is the DB2 Control Center unreliable?

Am I miscalculating something in my formula so badly?


NOTE: We are currently using DB2 9.7 FP4 (and newer FP levels) and DB2 10.1 FP2 (and newer FP levels).

Any feedback is greatly appreciated ! Thanks


Updated on 2014-09-10T17:45:23Z by tmhajas
  • mor

    Re: database compression ?

    ‏2014-08-26T09:00:37Z  in response to tmhajas

    DB2 v9.7 / v10.1 does not automatically use row compression or value compression.

    If licensed, row compression applies only to those tables/indexes that are explicitly compressed (either in their CREATE TABLE / CREATE INDEX DDL or via ALTER TABLE).
    The same is true for value compression: it must be explicitly requested.

    If your runstats are up to date, you don't need any GUI to determine table/index sizes; instead you can SUM + GROUP BY the various columns in sysibmadm.admintabinfo.

    Control Center is deprecated in current DB2 versions.
    Data Studio: a sledgehammer to crack a nut.

    • tmhajas

      Re: database compression ?

      ‏2014-09-10T23:08:03Z  in response to mor

      hi Mike,

      Thanks for the reply. I think I've read that compression requires separate licensing, which is not included in our DB2 ESE license.

      The sysibmadm.admintabinfo view is certainly useful, but it's overkill to calculate what I need (too many tables to take into account). I just need to validate my DB sizing roughly; I'm not looking for an exact match, but when it's 120GB in reality and 500GB in theory, that's a problem hard to overlook.


      I've done some more testing with admintabinfo in a smaller-scale environment we have, and something is not consistent:

      db2 => select tabschema, SUM(data_object_p_size), SUM(index_object_p_size), SUM(long_object_p_size), SUM(lob_object_p_size), SUM(xml_object_p_size) from sysibmadm.admintabinfo group by tabschema

      TABSCHEMA   SUM(DATA_P)   SUM(INDEX_P)   SUM(LONG_P)   SUM(LOB_P)   SUM(XML_P)
      ---------   -----------   ------------   -----------   ----------   ----------
      TPC             2124608         239744           832          704            0

        1 record(s) selected.

      where the total is 2365888, which according to the documentation is in kilobytes, i.e. roughly 2.3 GB.
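As a quick sanity check on the arithmetic above (plain Python; the values are copied from the query output, and the KB-to-GB conversion is the only assumption):

```python
# Physical sizes reported by sysibmadm.admintabinfo for schema TPC, in KB
# (DATA, INDEX, LONG, LOB and XML object physical sizes).
sizes_kb = {
    "data": 2124608,
    "index": 239744,
    "long": 832,
    "lob": 704,
    "xml": 0,
}

total_kb = sum(sizes_kb.values())
total_gb = total_kb / 1024 / 1024  # admintabinfo sizes are in kilobytes

print(total_kb)            # 2365888
print(round(total_gb, 2))  # 2.26
```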

      Then if i check this number against get_dbsize_info :

      db2 => call get_dbsize_info(?,?,?,0)

        Value of output parameters
        Parameter Name  : SNAPSHOTTIMESTAMP
        Parameter Value : 2014-09-10-

        Parameter Name  : DATABASESIZE
        Parameter Value : 3173998592

        Parameter Name  : DATABASECAPACITY
        Parameter Value : 10166206464

        Return Status = 0

      which is approximately 3GB. That matches the size on disk under c:\DB2, but it's again off by ~30% from the admintabinfo total.
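The ~30% gap can be checked the same way (plain Python; the two figures are copied from the outputs above, with get_dbsize_info in bytes and admintabinfo in KB):

```python
# DATABASESIZE from get_dbsize_info, reported in bytes.
dbsize_bytes = 3_173_998_592
# Sum of the *_OBJECT_P_SIZE columns from admintabinfo, reported in KB.
admintab_kb = 2_365_888

dbsize_gb = dbsize_bytes / 1024 ** 3
admintab_gb = admintab_kb / 1024 ** 2

print(round(dbsize_gb, 2))                    # ~2.96 GB
print(round(admintab_gb, 2))                  # ~2.26 GB
print(round(dbsize_gb / admintab_gb - 1, 2))  # ~0.31, i.e. the ~30% gap
```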


      Finally here is my formula that I use to estimate the DB size (based on our application):

      (# bytes in DB) = R x (S x V x ((24 x CR x R1) + (24 x R2) + R3))

       R = performance record size; for a subsystem approximately 200 bytes
       S = number of subsystems
       V = number of volumes
       CR = Collection Rate; number of performance samples in an hour, for example, a 5 min interval = 12 samples
       R1 = retention period for sample data, in days
       R2 = retention period for hourly data, in days
       R3= retention period for daily data, in days

      Now, this 200 bytes is the average from db2cc or the Data Studio client (performance data tab) for the tables where I expect the most growth; I checked briefly throughout the tables and it's 200 bytes on average.

      Again, if it were only a 10-20% difference, that would be fine. Let me know if you have any thoughts. Thanks

      • tmhajas

        Re: database compression ?

        ‏2014-09-12T18:40:22Z  in response to tmhajas

        Apparently my formula was wrong: the number of subsystems (S) should not have been part of the calculation. Without that variable, the database sizing matches the size on disk. Thanks
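For reference, the corrected estimate (with the subsystem count S dropped) can be sketched in plain Python; the example inputs below are illustrative assumptions, not values from this thread:

```python
# Corrected sizing formula from the thread: S (subsystem count) is dropped.
# R  = performance record size in bytes (~200 per the thread)
# V  = number of volumes
# CR = collection rate: samples per hour (5-min interval -> 12)
# R1, R2, R3 = retention periods in days for sample / hourly / daily data
def estimate_db_bytes(R, V, CR, R1, R2, R3):
    return R * V * ((24 * CR * R1) + (24 * R2) + R3)

# Illustrative inputs (assumed, not from the thread): 200-byte records,
# 1000 volumes, 5-min sampling, 14/30/90-day retention periods.
size = estimate_db_bytes(R=200, V=1000, CR=12, R1=14, R2=30, R3=90)
print(size)              # estimated total bytes
print(size / 1024 ** 3)  # the same figure in GiB
```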