Skip to main content

New features in DB2 9.5 to help your business grow

Redesigned redistribute utility, enhancements to row compression, and database roles

Kevin Yeung-Kuen See (see@ca.ibm.com), Software Developer, IBM, Software Group
Kevin See photo
Kevin Yeung-Kuen See, CISSP, has been a software developer at the IBM Toronto Laboratory for the last 11 years. He is an IBM Certified Solutions Developer for XML and Related Technologies and an IBM Certified Solutions Expert (DB2 Database Administration for IBM OS/390; DB2 Database Administration for Linux, UNIX, and Windows; DB2 Advanced Database Administration for Linux, UNIX, and Windows; and DB2 Family Application Development). He is also an ISC2 Certified Information Systems Security Professional (CISSP). He has written numerous IBM developerWorks articles and is a co-author of the book Understanding DB2 9 Security (IBM Press, 2006). In his spare time, he enjoys hiking, learning something new, and trying to figure out the world according to Justin, his toddler son, and Natalie, his infant daughter.

Summary:  As organizations grow, systems and database administrators face many challenges; among them being how to increase the capacity of the database, how to store more data effectively, and how to manage privileges and authorities for a growing numbers of users. This article shows how you can leverage three important features in DB2® 9.5 to overcome these challenges: the new redesigned redistribute utility, enhancements to row compression, and database roles.

Date:  21 Jun 2007
Level:  Introductory
Activity:  1797 views

Introduction

As your organization grows, there are more users and more customers who require access to data inside your databases. Therefore, you need to increase the capacity to meet this demand. At the same time, increasing the capacity indirectly increases the need for more hardware to physically store the growing amount of information and data. Even though hard disks are much cheaper compared to 15 years ago when a 250 megabyte hard disk cost a couple hundred dollars, adding disk drives still incurs a cost to the organization.

Finally, with the growing number of internal users and customers, you need to ensure that privileges are granted at the right levels to the right people to avoid potential security breaches to your databases.

DB2 9.5 provides a newly redesigned offline redistribute database partition group utility that allows capacity to be increased without using as large amounts of database log space or time as may have been required for the earlier versions of the REDISTRIBUTE DATABASE PARTITON GROUP.

Row compression was introduced in DB2 9. One advantage of row compression is that you can improve the performance of your queries if the compression is effective because the prefetcher can fit more data into the bufferpool. However, you can only create a compression dictionary using either the offline reorg table command or the inspect utility. In DB2 9.5, the automatic dictionary creation feature is introduced to further enhance the row compression offering.

Finally, the database roles feature is now available with DB2 9.5. This feature allows you to model your authority and privileges closer to the organization structure. It also eases some of the pain caused by group privilege restrictions.


1. Redesigned redistribute database partition group utility

To invoke the newly redesigned utility, issue the REDISTRIBUTE DATABASE PARTITON GROUP command as before, but there are now more options that you can specify (as demonstrated in section 1.2). This feature is explained in detail in section 1.3.

To add any new database partition, first update the instance to include the new partition(s). As with previous releases, you can do so by using the db2start add node command if the database manager is already stopped, or if you are willing to take the database manager offline through db2stop. If the database instance is still running, you can use either the sqleaddn API or the add dbpartitionnum command.

Once the new database partition is added to the instance, every database in the instance can take advantage of the newly added database partition(s). You can choose the database partition group where you want to add this extra partition. You are advised to issue the ALTER DATABASE PARTITION GROUP statement to customize the tablespace setting and to control the container layout in the new partition. If you want to copy the setting from the first partition of the group, you can take advantage of the new add dbpartitionnum option within the REDISTRIBUTE DATABASE PARTITON GROUP command.

Just as before, there are three ways to generate a database partition map: uniform, weighted through a distribution array (using distfile), and user specified (using targetmap).

The old version of the redistribute utility uses special insert and delete with sub-select statements. It commits on every redistributed table. Thus, it generates a significant amount of database log activity. A small database log file size can run out of log space and cause the redistribution to fail. For every row that moves, there are two actions — insert and delete. Data is moved on a per row basis. Thus, it increases the redistribution time significantly.

The new version of the redistribute utility uses the minimum logging model and page-level data movement (as explained in section 1.1). Therefore, decreasing the database log file consumption and improving the speed of moving data.

Performance is improved in three ways:

  • Eliminating row-by-row operations (use page level data movement instead)
  • Supporting concurrent tables
  • Streamlining multiple operations into one

Generally, after a redistribution of a database partition group, you need to perform an offline table reorg, index recreation (if necessary), and runstats on the redistributed table. Each of these operations: redistribute, table reorg, index create, and runstats require scanning the table data once. As a result, being able to combine all three of these operations into one can greatly improve performance. In addition, logging is minimized so that the redistribute utility does not need to be broken down into multiple passes and you do not need to manage the excessive amount of active log space and log archiving space. Lastly, the usability aspect of the redistribute utility has improved by allowing you to specify table order and to stop and continue redistributing data table by table. In addition, a user friendly progress monitoring feature is introduced with a SQL interface and can be accessed remotely from any client. The progress information can be retrieved through the command line processor (CLP) (list utilities and list utilities show details) or administrative view (SNAPUTIL and SNAPUTIL_PROGRESS) or the administrative table user-defined function (UDF) (SNAP_GET_UTIL and SNAP_GET_UTIL_PROGRESS).


1.1 Non-rollforward recoverable nature

Since all updates related to the data movement in every object type are not logged, the redistribute utility is not rollforward-recoverable. This means that rolling forward through a redistribute operation has the effect of marking all tables that were redistributed as "invalid." There is nothing that can be done with such tables except drop them.

It is recommended that you take a full database backup before running the offline redistribute utility. At a minimum, ensure that you have sufficient tablespace backups taken prior to using the redistribute utility so that you can restore the database to a point in time just prior to the redistribute operation, if necessary.

When the utility starts to redistribute a table, it places all tablespaces (data tablespaces, index tablespaces, and long tablespaces) that are associated with the table into the BACKUP PENDING state and the table itself is marked as unavailable and redistribute_pending. After the table is redistributed (the table goes back to the normal state), you are required to take a tablespace level backup in order to acquire full access to the table again.

In theory, you can perform select, insert, update, or delete statements against a table that is not yet redistributed in the database partition group. Also, while the redistribute utility still running, the redistributed table can also do the same if tablespace backup is performed on the tablespaces associated with the table. However, as redistribute is not rollforward recoverable, any transaction that happened concurrently during redistribute can be lost if for any reason the redistribute recovery options — continue or abort — are not able to continue or abort a previous failed redistribute. Therefore, it is a good idea that you do not perform any operation other than read only operations (such as SELECT), as it is not rollforward recoverable.

*******************************************************************
Tips:
Do not rollfoward through logs where there is a redistribute operation that is in the middle of that log!

Always take a backup before and after using the redistribute utility.
*******************************************************************


1.2 Syntax for the newly redesigned utility


Listing 1. Syntax diagram
                
>>-REDISTRIBUTE DATABASE PARTITION GROUP--database partition group-->

>--NOT ROLLFORWARD RECOVERABLE-------------------------------------->
 
>--+-+-UNIFORM------------------+--+--+----------------------------+->
   | '-USING DISTFILE--distfile-' |   '--| ADD/DROP DBPARTITION |--+
   +-USING TARGETMAP--targetmap------------------------------------|
   +-CONTINUE------------------------------------------------------+
   '-ABORT---------------------------------------------------------'

>--+---------------------------------------+------------------------>
   |          .-,----------.               |
   |          v            |    ,--ONLY-,  |
   '--TABLE-(---table name-+-)--+-------+--+
                                '-FIRST-,

>--+----------------------------+----------------------------------><
   '--| REDISTRIBUTE OPTIONS |--'  

ADD/DROP DATABASE PARTITION SPEC:
>--+-----------------------------------------------------+---------->
   |                             .-,----------------.    |
   |                             v                  |    |
   +--ADD--+-DBPARTITIONNUM--+-(---n--+---------+--+-)--+
           '-DBPARTITIONNUMS-'         '--TO m--'

>--+-----------------------------------------------------+---------->
   |                              .-,---------------.    |
   |                              v                 |    |
   +--DROP--+-DBPARTITIONNUM--+-(---n--+--------+--+-)--+
            '-DBPARTITIONNUMS-'         '--TO m--'

REDISTRIBUTE OPTIONS:
|--+---------------------------------------------+------------------>
   |                                             |
   +---PARALLEL TABLE --n------------------------+
   |                                             |
   |   ,--COMPACT ON---,                         |
   +---+               +-------------------------+
   |   '--COMPACT OFF--'                         |
   |                                             |
   |   ,--INDEXING MODE AUTOSELECT--,            |
   +---+                            +------------+
   |   +--INDEXING MODE INCREMENTAL-+            |
   |   +--INDEXING MODE REBUILD-----+            |
   |   '--INDEXING MODE DEFERRED----'            |
   |                                             |
   +---DATA BUFFER --n---------------------------+
   |                                             |
   |   ,--STATISTICS USE PROFILE--,              |
   +---+                          +--------------+
   |   '--STATISTICS NONE---------'              |
   |                                             |
   '---STOP AT--local-isotime-------------------'
      


Note that there is a keyword change. The old keyword ROLLBACK has been deprecated and replaced by the new keyword ABORT. This is because ROLLBACK can easily be confused with the transaction rollback and ABORT is more consistent with other utilities.


1.3 New features within the newly redesigned utility

The first feature is the ability to add or drop database partitions to or from an existing database partition group and perform the redistribute on the same command. This used to be a two step process where you were required to issue an ALTER DATABASE PARTITION GROUP statement to add or drop the partitions first. If you are adding database partitions, you will be asked to specify the tablespace information for the new partitions.

You might specify to both add and drop database partitions on the same command but the add must come before the drop. Otherwise, the syntax is not recognized. You are likely to specify both if you are trying to retire some old machines and adding more powerful machines into your database DPF cluster. Note that, as described earlier, the add partition option automatically creates the tablespace on the new partition based on the tablespace setting from the first partition of the group.

There are eight new options that you may specify when issuing a REDISTRIBUTE DATABASE PARTITION GROUP command: TABLE FIRST, TABLE ONLY, PARALLEL TABLE, COMPACT, INDEXING MODE, DATA BUFFER, STATISTICS, and STOP AT. You can apply all of these options to the initial redistribute command or subsequent redistribute continue or redistribute abort commands.

TABLE FIRST
The TABLE FIRST option allows you to specify a list of tables that should have precedence over the others during data redistribution. After this set of tables is redistributed, the remaining tables that reside inside the database partition group are redistributed using an order as determined by DB2. All tables that reside in the group are redistributed if the command succeeds and returns sqlcode 0.

TABLE ONLY
The TABLE ONLY option allows you to specify a list of tables that will be redistributed during the data redistribution. After this set of tables is redistributed, the remaining tables that reside inside the database partition group will not be redistributed. As a result, the group is only partially redistributed, the remaining tables can be redistributed at a later time by using the redistribute command with the continue option.

PARALLEL TABLE
The PARALLEL TABLE option tells the redistribute utility to perform parallelism at the table level where multiple tables are redistributed at the same time. If this option is not explicitly specified, DB2 by default uses two table parallelism (as if you have specified the parallel table 2 option).

COMPACT
COMPACT ON is the default for the redistribute utility, where it performs space compaction much like an offline table reorg does. You have an option to specify it explicitly or turn it off, as you desired.

INDEXING MODE
By default, the indexing mode autoselect is used when the data is being redistributed. This means that DB2 chooses either the rebuild or incremental index maintain approach, whichever is the best fit. You can also specify the rebuild or incremental options, as you desire. The index is recreated if it is not valid to use using the specified approach. Should you prefer to mark all the indexes bad and recreate them later, you can specify the deferred option.

DATA BUFFER
Like the load utility, you can specify how much memory you want to grasp from the utility heap to run this utility. If you do not specify a value, DB2 determines a best fit value and uses that value.

STATISTICS
If a table already has a statistics profile built from a previously run runstats operation, you can specify the statistics use profile (or leave the default). The utility automatically performs a runstats against any table with an existing statistics profile after the data has been moved to the correct partitions. You can turn this option off by specifying STATISTICS NONE.

There is a possible condition where the table has a statistics profile but the utility is unable to perform statistics collection, despite having the statistics option specified. If all partitions are in a combo state where they are both sending and receiving data, then statistics cannot be collected. This is generally known as xcross data movement. For example, redistribute with a target map that has data moving as such:
partition 11 -> partition 11, 19, 51
partition 19 -> partition 11, 19, 51
partition 51 -> partition 11, 19, 51
Therefore, each partition is sending data to all other partitions, while leaving some data behind.

STOP AT
STOP AT allows you to specify a time that you would like the redistribute utility to stop even if it has not completely finished redistributing all the tables. The redistribute utility estimates how long it takes to redistribute a table and compares the current time and the stop time (if one is specified) to see if it can finish within the allowable time frame. If so, it kicks off redistributing that table. Since it is an estimate, it might not necessary stop at the specified time but is as close as possible.


1.4 Monitor the redistribute progress

DB2 9.5 allows you to monitor the progress of the redistribute utility by using existing utility monitoring interfaces: CLP (LIST UTILITIES and LIST UTILITIES SHOW DETAILS commands) or administrative view (SNAPUTIL and SNAPUTIL_PROGRESS) or administrative table UDF (SNAP_GET_UTIL and SNAP_GET_UTIL_PROGRESS).

The following output results from issuing db2 LIST UTILITIES SHOW DETAILS while data redistribution is in progress. The database partition group RDST_V10_015 is being redistributed with a new partition being added. Other options specified are COMPACT ON (default), INDEXING MODE INCREMENTAL, and PARALLEL TABLE 3. Also note that the output only shows one of the partitions. Actual output gives you each partition, including the newly added partition.


Listing 2. Sample Output from monitoring redistribute progress
                
db2 list utilities show detail

ID                               = 1
Type                             = REDISTRIBUTE
Database Name                    = RDST819
Partition Number                 = 11
Description                      = RDST_V10_015 UNIFORM ADD NODES
                                   COMPACT ON SPACE REUSE RECORD LEVEL
                                   INDEXING MODE INCREMENTAL
Start Time                       = 02-20-2007 23:21:33.785819
State                            = Executing
Invocation Type                  = User
Progress Monitoring:
   Estimated Percentage Complete = 8
   Summary:
      Total Work                 = 1965600
      Completed Work             = 155221
      Total Number Of Tables     = 15
      Tables Completed           = 0
      Tables In Progress         = 3

   Current Table 1:
      Description                = "NEWTON  "."RDST_V10_015A"
      Total Work                 = 655200 bytes
      Completed Work             = 55001 bytes

   Current Table 2:
      Description                = "NEWTON  "."RDST_V10_015B"
      Total Work                 = 450200 bytes
      Completed Work             = 54220 bytes

   Current Table 3:
      Description                = "NEWTON  "."RDST_V10_015C"
      Total Work                 = 978901 bytes
      Completed Work             = 46000 bytes


1.5 Sample Scenario

Online book seller "XYZ" has a three-partition DPF cluster (assume for simplicity that they use partition numbers 11, 19, and 51) for their backend DB2 database that stores customer transactions (in tablespace CUSTOMER), internal HR information, and many other data. They notice that one of the partitions is not performing as well as the others as it is an older, less powerful machine.

After making a successful business case, the IT department purchased two new machines. The system administrator or DBA was given the task of getting these two machines (say, the partition number is decided to be 171 and 999) into the cluster and retiring the old machine (say, partition number 19).

As a certain portion of the database is literally used 24x7, the change must be made gradually for each database partition group and each table in the system, as the maintenance window is very small and the system needs to be up and running immediately. The statistics need to be collected and all invalid indexes rebuilt.

First of all, the two new partitions must be added using add dbpartitionnums and modify the db2nodes.cfg to include the two new partitions. Now, the instance level work is done.

The next step is to decide on the priority of tables for the redistribution. Obviously in this situation, any tables that deal with the online book sales are priority. Assume that the database partition group SALES contains all the tables involved in the sales transactions and the table TRANS (containing customer credit card transaction) are more important than the table CATALOG (containing more static inventory information) and other related tables.

First issue the following commands:

redistribute database partition group SALES uniform add dbpartitionnums (171,999)
drop dbpartitionnums (19) table (TRANS) first stop at xxxx compact on
statistics use profile indexing mode rebuild

Next, perform a tablespace level backup for the CUSTOMER tablespace in order to be able to regain full access to both the TRANS and CATALOG tables.

A few weeks have gone by. At the next maintenance window, you decide that you only want to redistribute the CATALOG table, as you have other higher priority items to do. You would likely perform this:

redistribute database partition group SALES CONTINUE table (CATALOG) ONLY
compact on statistics use profile indexing mode rebuild

You need to perform a tablespace level backup for the tablespace CUSTOMER in order to be able to regain full access to both the TRANS and CATALOG tables.

You want to redistribute remaining tables but you have a constrained time, so you would use the STOP AT option:

redistribute database partition group SALES CONTINUE stop at xxx
compact on statistics use profile indexing mode rebuild

Continue to use the above redistribute continue command with the STOP AT option until you finished redistributing all tables. If there are more tables to redistribute, but the redistribute was stopped due to the STOP AT time constraint, the command returns sqlcode +1379.

Here is an example of sqlcode +1379:

SQL1379W  Database partition group "RDST_V10_017" has been partially
redistributed. The number of tables redistributed is "1", and the 
number of tables yet to be redistributed is "2". Reason code = "2".

If all tables are finally redistributed, the last redistribute continue with STOP AT option command will returned:

DB20000I  The REDISTRIBUTE DATABASE PARTITON GROUP command completed. successfully.  

Every time the redistribute command is complete, remember to perform a tablespace level backup on the tablespace where the redistributed table resides. Otherwise, the tablespace remains in the backup pending state.

Once all the tables and database partition groups are moved out of the partition that resided on the old machine, you can then use drop dbpartitionnums to remove the partition completely from the instance.

Note: Remember to redistribute the IBMDEFAULTGROUP database partition group too. IBMDEFAULTGROUP is automatically created when the database is created.



2. Enhancement to row compression

The row compression feature was one of the most popular features in DB2 9. This feature complemented the existing column level compression, system default value compression, and value compression options. However, to take advantage of row compression, you must either use the DB2 inspect utility or offline table reorg to build a dictionary first before any further data entered into the table are compressed. In DB2 9.5, a new feature called Automatic Dictionary Creation (ADC) eases this requirement.


2.1 Introduction to ADC

ADC, as named, is the processing whereby a compression dictionary is automatically created. The table COMPRESS attribute must be enabled in order for ADC to be applicable. Additionally, ADC only occurs if no compression dictionary exists within the physical table data object or partition. Once a dictionary exists, ADC is no longer active. The premise behind ADC is to automatically create a representative compression dictionary based on sampling a small subset of table data. Once the dictionary has been created and automatically inserted into the table, all future population of data is subject to being compressed. The ADC concept is demonstrated in the Figure 1.


Figure 1. ADC


2.2 More ways to perform row compression

In addition to the inspect utility and offline table reorg, you can now build a dictionary (automatically) during LOAD INSERT, LOAD REPLACE, REDISTRIBUTE DATABASE PARTITION GROUP, INSERT, or UPDATE as long as the COMPRESS table attribute is turned on and a sufficient amount of data is available (data already exists in the table, plus the new data row or expansion of data row).

Just like the offline table reorg, LOAD REPLACE has both the KEEPDICTIONARY and RESETDICTIONARY options. All you have to do is to specify the KEEPDICTIONARY or RESETDICTIONARY keyword after the REPLACE keyword. If neither keyword is specified, the KEEPDICTIONARY behavior will be used if the table COMPRESS attribute is turned on.

In order to synchronize with the new ADC feature, the behavior of KEEPDICTIONARY for the offline table reorg has changed. For more information, see the "2.4 Behavior changes in row compression using offline reorg utility" section.

If you use the offline table reorg or rowcompestimate option in the inspect tool, data in the table is fully compressed. For other interfaces, the sampling data (used for building the ADC dictionary) remains uncompressed, as seen in Figure 1, until an offline table reorg is performed.


2.3 Factors affecting ADC

There are two factors that affect ADC: a minimum (threshold) amount of data is available for sampling (ADC_THRESHOLD) and a minimum percentage of useful data in the sampling buffer (ADC_MINPCT). These values are preset by DB2 internally.

These two factors are used differently among the dictionary building interfaces. ADC kicks off when the ADC_THRESHOLD is reached. However, during sampling, the load and redistribute utilities do not use ADC_MINPCT.

At the end of using the load utility to add data into a table and the redistribute utility to redistribute the data to a new partition of the table, both utilities examine the amount of incoming new data to see if there is at least ADC_THRESHOLD * ADC_MINPCT amount of data. If so, they still generate a compression dictionary, but in this case, there is no compressed record, as all data has already been moved into the table.

Dictionary building for the load and redistribute utilities are asynchronous, so data can be loaded or moved into the new partition while the dictionary is being built. Once the dictionary is built, then any new incoming data is subject to row compression. Similar conditions can occur if you have two concurrent threads performing inserts to a table. One of the threads might be building the dictionary, while another thread continues inserting into the table without the row being compressed until the dictionary is built.


2.4 Behavior changes in row compression using the offline reorg utility

In DB2 9, the default compression dictionary directive for the REORG TABLE command is KEEPDICTIONARY. If the table COMPRESS attribute is set to YES, and no compression dictionary currently exists in the table, the KEEPDICTIONARY directive attempts to build and insert a compression dictionary into the table independent of the size or volume of data in the table. As long as one valid record exists in the table, an attempt is made to create a dictionary in this scenario.

With the introduction of ADC, the behavior of the REORG TABLE KEEPDICTIONARY scenario has changed. A dictionary is only inserted into a table provided that the table is of a size equal to the ADC threshold, and that a sufficient volume of data exists in the table when it reaches this threshold.


2.5 Behavior changes in row compression for tables that have data rows less than or equal to the minimum record length

In DB2 9, any data row that is equal to or less than the database manager minimum record length is not used in the sampling when building the dictionary, and is not compressed during row compression. As of DB2 9.5, all data rows participate in the sampling. However, data rows that have a length less than or equal to the database manager minimum record length are not compressed, even if the dictionary has been created.


2.6 The new table UDF and view for compression information

DB2 9.5 also includes a new table UDF SYSPROC.ADMIN_GET_TAB_COMPRESS_INFO and a new administrative view SYSIBMADM.ADMINTABCOMPRESSINFO.

The new administrative view SYSIBMADM.ADMINTABCOMPRESSINFO returns a report of compression information for all tables in all schemas of a database. Since it is a view, you can add in your predicates to constrain the output result to what you need.

The new table UDF SYSPROC.ADMIN_GET_TAB_COMPRESS_INFO has two functionalities: to report the compression information for the specified table(s) and to generate an estimate of new compression information based on current table data. The UDF takes three input parameters: tabschema, tabname, and execmode. You are allowed to use an empty string or NULL in the first two parameters to perform a wild card (*) search. If an empty string or NULL is used in the last parameter, it assumes the default option, which is REPORT.

The syntax of the SYSPROC.ADMIN_GET_TAB_COMPRESS_INFO UDF:

>>- SYSPROC.ADMIN_GET_TAB_COMPRESS_INFO- -(--tabschema--,--tabname--,--execmode--)--<


Here is an example of how to use the UDF to obtain the compression statistics report for table ROWCOMP.MDC:

select * from table(SYSPROC.ADMIN_GET_TAB_COMPRESS_INFO('ROWCOMP','MDC','REPORT'
)) as T
TABSCHEMA
                                                 TABNAME

                   DBPARTITIONNUM DATA_PARTITION_ID COMPRESS_ATTR DICT_BUILDER
                 DICT_BUILD_TIMESTAMP       COMPRESS_DICT_SIZE   EXPAND_DICT_SIZ
E     ROWS_SAMPLED PAGES_SAVED_PERCENT BYTES_SAVED_PERCENT AVG_COMPRESS_REC_LENG
TH
--------------------------------------------------------------------------------
------------------------------------------------ -------------------------------
--------------------------------------------------------------------------------
------------------ -------------- ----------------- ------------- --------------
---------------- -------------------------- -------------------- ---------------
----- ------------ ------------------- ------------------- ---------------------
--
ROWCOMP                                                                         
                                                 MDC                            
                                                                                
                                0                 0 Y             TABLE GROWTH  
                 2007-05-25-09.15.43.000000      66432                32768      
   2041                  66                  66                     176

  1 record(s) selected.

By looking at the pages_saved_percent and bytes_saved_percent, you can see that this table is nicely compression with 66 percent of saving in storage spaces.

The ESTIMATE capability is useful for determining what kind of storage saving you can obtain if the table were to be compressed. Note that the table does not require the COMPRESS attribute to be set to yes to use this capability. With the data returned, you can make a decision on whether you would like to perform an offline table reorg to take advantage of row compression. If the table is a range clustered table (that does not support row compression), you can consider switching it to become a range partitioned table (that supports row compression). Another advantage of the ESTIMATE capability is it allows you to compare the effectiveness of the existing dictionary with the new one that is built with the latest table content.



3. Database roles

As the number of employees in your organization grows, it can be a challenge to maintain the appropriate access control of your data. It is even worse if your organization has high turnover or staff moving from one area to another area of the organization. One intuitive thought will be to associate privilege and authority with a group and any person that requires that set of privileges or authorities can request to become a member of the group through the system administrator externally.

This sounds good on paper. However, there are two deficiencies:

  • You need the system administrator (if you are not in both roles) to define the group and grant membership to each person who must belong to the group.
  • There are a few restrictions where authorities or privileges granted to a group are not honored. To find out more about these restrictions, refer to the "Group privileges restriction" section in "Understanding DB2 9 security", found in the Resources section.

3.1 Introduction to database roles

The simplest description one can give for a database role is "in database group." The more formal definition for a role is a "database object that may group together one or more privileges or database authorities, and may be granted to users, groups, PUBLIC, or other roles."

Database roles can be used to close the gaps left behind by the external group membership. The database security administrator has tight control on the role membership. Note that a database administrator or system administrator does not possess the database security administrator authority by default. The database security administrator authority must be granted explicitly by a DBA or SYSADM to an individual.

Unlike group privileges, privileges associated with a role are also considered during privilege or authority checking. However, privileges associated with a role that are granted to a group observe the same restrictions as group privileges.

Role membership is managed by the database security administrator. The database security administrator can optionally delegate this ability to others when the role membership is granted. For example, if the database security administrator want user henry (with authorization ID HENRY) to administrate the ROLE CE_SECURITY and be a member of that role. This SQL statement performs both tasks:

    GRANT ROLE CE_SECURITY TO USER HENRY WITH ADMIN OPTION

Omit the WITH ADMIN OPTION clause if only membership is being granted. When a role membership is revoked from a user, a group, or a role, the ability to administrate the "revoked" role is also removed at the same time.

Instance level authorities: SYSADM, SYSMAINT, SYSCTRL and SYSQUIESCE cannot be assigned to a role. You must still rely on external group membership for these authorities.

During the establishment of the database connection, your role membership (along with your group membership) is obtained. Once successfully authenticated, any roles that are granted to the connected user are activated automatically. There is no way to explicitly enable or disable any granted role in DB2 9.5.

In DB2 9.5, there are two new table functions that allow you to obtain privilege and authority information. The AUTH_LIST_ROLES_FOR_AUTHID table function returns the list of roles in which the given authorization ID is a member. The AUTH_LIST_AUTHORITIES_FOR_AUTHID table function returns all authorities held by the authorization ID granted directly or indirectly (this is to replace the GET AUTHORIZATIONS command and the sqluadau API that are being deprecated).

In a hospital setting, the role INTERN is granted to the role DOCTOR and the role DOCTOR is granted to the role SPECIALIST, then the role SPECIALIST is granted to user JUSTIN. JUSTIN belongs to group STAFF and the role EMPLOYEE is granted to the group STAFF. JUSTIN also belongs to a special group PUBLIC and the role VISITOR is granted to PUBLIC. Retrieve all roles granted to user Justin. Listing 3 shows the syntax on how to get all the role memberships granted to an authorization ID using the AUTH_LIST_ROLES_FOR_AUTHID table function.


Listing 3. Same query and output on the AUTH_LIST_ROLES_FOR_AUTHID table function
                
SELECT substr(1, 7, GRANTOR) as GRANTOR, GRANTORTYPE, substr(1, 10, GRANTEE) as GRANTEE, 
GRANTEETYPE, substr(1, 10, ROLENAME) as ROLENAME, ADMIN FROM 
TABLE (SYSPROC.AUTH_LIST_ROLES_FOR_AUTHID ('JUSTIN', 'U') ) AS T

GRANTOR GRANTORTYPE GRANTEE    GRANTEETYPE ROLENAME   ADMIN
------- ----------- ---------- ----------- ---------- -----
SECADM1 U           DOCTOR     R           INTERN     N    
SECADM1 U           SPECIALIST R           DOCTOR     N    
SECADM1 U           JUSTIN     U           SPECIALIST N    
SECADM2 U           STAFF      G           EMPLOYEE   N    
SECADM3 U           PUBLIC     G           VISITOR    N 

Assume user authorization ID 'SEE' is the instance owner. Listing 4 shows the syntax of how to get the database and instance level authorities for a given authorization ID using the AUTH_LIST_AUTHORITIES_FOR_AUTHID table function.


Listing 4. Same query and output on AUTH_LIST_AUTHORITIES_FOR_AUTHID table function
                
SELECT * FROM TABLE (SYSPROC.AUTH_LIST_AUTHORITIES_FOR_AUTHID ('SEE', 'U') ) AS T 
ORDER BY AUTHORITY

AUTHORITY                 D_USER D_GROUP D_PUBLIC ROLE_USER ROLE_GROUP ROLE_PUBLIC D_ROLE
------------------------- ------ ------- -------- --------- ---------- ----------- ------
BINDADD                   Y      N       Y        N         N          N           *     
CONNECT                   Y      N       Y        N         N          N           *     
CREATE_EXTERNAL_ROUTINE   Y      N       N        N         N          N           *     
CREATE_NOT_FENCED_ROUTINE Y      N       N        N         N          N           *     
CREATETAB                 Y      N       Y        N         N          N           *     
DBADM                     Y      N       N        N         N          N           *     
IMPLICIT_SCHEMA           Y      N       Y        N         N          N           *     
LOAD                      Y      N       N        N         N          N           *     
QUIESCE_CONNECT           Y      N       N        N         N          N           *     
SECADM                    N      N       N        N         N          N           *     
SYSADM                    *      Y       *        *         *          *           *     
SYSCTRL                   *      N       *        *         *          *           *     
SYSMAINT                  *      N       *        *         *          *           *     
SYSMON                    *      N       *        *         *          *           * 


3.2 Role hierarchy and your organization chart

As long as it does not result in a cycle, membership of a role can be granted to another role. You can create a role hierarchy that mirrors your organization chart and associate privileges and authorities that are common to a "job role" to a role object. Any user who is classified to be working in a particular "job role" can then be granted the membership of that role object. Obviously, an additional requirement of privileges or authorities can still be received through a grant to USER, GROUP or PUBLIC.


3.3 Role and LBAC

As part of the DB2 9.5 database roles enhancement to Label-based access control (LBAC), you might grant or revoke security labels or exemptions to or from a role or a group. By default, privileges on a security label or exemption that are granted to a role or a group are not considered when a policy is created. You need to alter the security policy using the new ALTER SECURITY POLICY statement to enable them.


3.4 Sample Scenario

Take a look at a simple scenario.


Figure 2. Sample organization chart


Given an organization structure shown in Figure 2, a user is a member of one of the following types: Director, System Test Manager, Functional Test Manager, Integration Test Manager or a tester in one of the three testing areas.

One of the typical challenges for a quality assurance organization is in gathering the overall testing status and progress. Assume this organization uses a tool such as IBM® Rational® Enterprise ClearQuest® test management, which executes and stores the test results into a backend DB2 database.

Obviously, it makes sense that the director be able to see the overall progress while the tester should see his or her own progress. By building a role hierarchy that mirrors the organization chart and granting the access to the required data in the table, the director (which was granted the role DIRECTOR) automatically sees the data.

To build this role hierarchy, the following SQL statements should be executed:

CREATE ROLE DIRECTOR
CREATE ROLE SVT_MANAGER
CREATE ROLE FVT_MANAGER
CREATE ROLE INTEGRATION_MANAGER
CREATE ROLE SVT_TESTER
CREATE ROLE FVT_TESTER
CREATE ROLE INTEGATION_TESTER

GRANT ROLE SVT_MANAGER TO DIRECTOR
GRANT ROLE FVT_MANAGER TO DIRECTOR
GRANT ROLE INTEGRATION_MANAGER TO DIRECTOR

GRANT ROLE INTEGATION_TESTER TO INTEGRATION_MANAGER
GRANT ROLE FVT_TESTER TO FVT_MANAGER
GRANT ROLE SVT_TESTER TO SVT_MANAGER

Assume there is an underlying base table that stores all the test case names (NAME), test results (VERDICT), related defects (DEFECTS), and comments (NOTES) for each individual tester. The only relevant information to bring up the chain is the column VERDICT.

Each individual tester grants SELECT on the VERDICT column to the role in which they belong. For example, tester KEVIN enters:

GRANT SELECT ON TABLE KEVIN.MYTEST (VERDICT) TO ROLE FVT_TESTER.

Then, all verdicts become accessible by the management and they can do something like the following to deduce the overall successful rate.

VALUES ((
SELECT SUM(VERDICT) FROM KEVIN.MYTEST,... all the other tester table .... 
WHERE VERDICT = 'PASS') / (
SELECT SUM(VERDICT) FROM KEVIN.MYTEST,... all the other tester table .... ))



Conclusion

After reading this article, you should have a better understanding of how to take advantage of the three new DB2 9.5 features to ease or solve the three common challenges: increasing the capacity of the database, storing more data effectively, and managing the privileges and authority for growing numbers of users.



Resources

Learn

Get products and technologies

  • Download IBM product evaluation versions and get your hands on application development tools and middleware products from DB2, Lotus®, Rational®, Tivoli®, and WebSphere®.

Discuss

About the author

Kevin See photo

Kevin Yeung-Kuen See, CISSP, has been a software developer at the IBM Toronto Laboratory for the last 11 years. He is an IBM Certified Solutions Developer for XML and Related Technologies and an IBM Certified Solutions Expert (DB2 Database Administration for IBM OS/390; DB2 Database Administration for Linux, UNIX, and Windows; DB2 Advanced Database Administration for Linux, UNIX, and Windows; and DB2 Family Application Development). He is also an ISC2 Certified Information Systems Security Professional (CISSP). He has written numerous IBM developerWorks articles and is a co-author of the book Understanding DB2 9 Security (IBM Press, 2006). In his spare time, he enjoys hiking, learning something new, and trying to figure out the world according to Justin, his toddler son, and Natalie, his infant daughter.

Comments (Undergoing maintenance)



Trademarks  |  My developerWorks terms and conditions

Help: Update or add to My dW interests

What's this?

This little timesaver lets you update your My developerWorks profile with just one click! The general subject of this content (AIX and UNIX, Information Management, Lotus, Rational, Tivoli, WebSphere, Java, Linux, Open source, SOA and Web services, Web development, or XML) will be added to the interests section of your profile, if it's not there already. You only need to be logged in to My developerWorks.

And what's the point of adding your interests to your profile? That's how you find other users with the same interests as yours, and see what they're reading and contributing to the community. Your interests also help us recommend relevant developerWorks content to you.

View your My developerWorks profile

Return from help

Help: Remove from My dW interests

What's this?

Removing this interest does not alter your profile, but rather removes this piece of content from a list of all content for which you've indicated interest. In a future enhancement to My developerWorks, you'll be able to see a record of that content.

View your My developerWorks profile

Return from help

static.content.url=http://www.ibm.com/developerworks/js/artrating/
SITE_ID=1
Zone=Information Management
ArticleID=232247
ArticleTitle=New features in DB2 9.5 to help your business grow
publish-date=06212007
author1-email=see@ca.ibm.com
author1-email-cc=

My developerWorks community

Tags

Help
Use the search field to find all types of content in My developerWorks with that tag.

Use the slider bar to see more or fewer tags.

Popular tags shows the top tags for this particular content zone (for example, Java technology, Linux, WebSphere).

My tags shows your tags for this particular content zone (for example, Java technology, Linux, WebSphere).

Use the search field to find all types of content in My developerWorks with that tag. Popular tags shows the top tags for this particular content zone (for example, Java technology, Linux, WebSphere). My tags shows your tags for this particular content zone (for example, Java technology, Linux, WebSphere).

Special offers