Migrate terabytes of data from IBM Balanced Warehouse to IBM Smart Analytics System

A simple guide to accomplish a complex task

In today's ever-demanding world, data warehouse environments continue to grow exponentially, both in data volume and in real-time data processing requirements. To meet these needs, organizations have to make the right decision to move applications to the right platform and, more importantly, at the right time. Reckitt Benckiser Group plc was an early adopter of the IBM Balanced Configuration Unit (BCU) warehouse and recently upgraded to the next-generation IBM Smart Analytics System (ISAS) to provide customers with a better user experience while providing higher data capacity.

Mohankumar Saraswatipura (mohankumarsp@gmail.com), DB2 Database Specialist, Reckitt Benckiser plc

  Mohan works as a lead DB2 database administrator at Reckitt Benckiser Group plc, focusing on IBM Smart Analytics System business intelligence solutions, application performance tuning, and SAP DB2 implementation/migration. While with the IBM Software Lab, India, he worked with the High Performance On Demand Solutions (HiPODS) team to provide performance solutions to clients. Earlier, he worked with the DB2 product development team, India. Mohan is an IBM certified DB2 Advanced Database Administrator, DB2 Application Developer, and DB2 Problem Determination Master. Mohan completed his M Tech (Master of Technology) in computer science in 2004 and an executive MBA from IIM Calcutta in 2007.



12 September 2013

Introduction

Database migration in DB2 is a simple task in a single-partition database environment. However, there are additional decisions you need to make in a large data warehouse environment that holds terabytes of data using the Database Partitioning Feature (DPF). This article looks at the data migration options available in DB2 and takes a deep look at the procedure for migrating the data modules from a BCU D5100 to an ISAS 5600 R2.


Understanding the systems

The DBAs and the database architects need to understand the source and target server configurations before planning the migration activity. Figure 1 shows the DB2 elements of an Intel-based source D5100 v2 system:

Figure 1. D5100 Balanced Warehouse node configuration
D5100 Balanced Warehouse Node Configuration

Figure 2 shows the DB2 elements of an Intel-based target ISAS 5600 R2 system:

Figure 2. 5600 R2 IBM Smart Analytics System node configuration
5600 R2 IBM Smart Analytics System Node Configuration

From the database administration and migration perspective, the major difference between the two systems is the number of database partitions. Table 1 lists the database partition group distribution between the two systems and the post-migration change requirements:

Table 1. Database partition group distribution
Database partition group | BCU D5100 distribution | ISAS 5600 distribution | Post migration
IBMDEFAULTGROUP | 41 (NODE0000-NODE0040) | 65 (NODE0000-NODE0064) | Partition group expansion is required
IBMCATGROUP | 1 (NODE0000) | 1 (NODE0000) | No change to the system catalog partition group
IBMTEMPGROUP | 41 (NODE0000-NODE0040) | 65 (NODE0000-NODE0064) | Partition group expansion is not required; however, you need to add the container paths
DWECONTROLGROUP | 1 (NODE0000) | 1 (NODE0000) | No changes
SDPG | 1 (NODE0000) | 1 (NODE0000) | No changes
PDPG | 41 (NODE0000-NODE0040) | 65 (NODE0000-NODE0064) | Partition group expansion and container path creation are required

Effective database migration strategy

Organizations consider a migration plan effective and robust when there is minimal business impact of the tasks, such as minimal planned outages, zero data integrity problems, and optimized resource cost. Figure 3 portrays the process and associated broader steps for a successful database migration:

Figure 3. Effective database migration strategy
Effective database migration strategy

Analyze is the first stage in the database migration, and this phase broadly includes understanding the following components of the source and target systems:

  • The database node differences and the migration options (shown in Table 2)
  • The Tivoli System Automation for Multiplatform configuration
  • The business requirements
  • The staging area requirements
  • The data integrity check requirement
  • The database migration process
Table 2. Homogeneous database migration options
Number | Migration option | Pros | Cons
1 | Database restore | A well-known migration strategy among homogeneous systems | The challenge is the difference in the number of logical partitions between the source and target systems: the source BCU system has 41 database partitions and the target ISAS system has 65, so it is not just a straight database restore.
2 | IBM InfoSphere Optim High Performance Unload (HPU) | High-speed automatic repartitioning and load with minimal DBA intervention | You have to buy an HPU license for both the source and target systems.
3 | In-house data migration scripts | An easy way of migrating the data from source to destination | The DBA needs to have a complete understanding of the system and business needs.

In the Prepare phase, you start creating the migration plan document and migration scripts, and you prepare the system integration diagram to perform the application interface post migration job tests.

In the Validate phase, you test the end-to-end process in a UAT environment and record the execution time for each step along with success/failure messages. In case of any failure, document the procedure used to fix it. The key element in this phase is to cross-verify each step's results against the acceptance criteria.

In the Implement phase, you freeze the source server application activities by bringing the interfaces and processing jobs down. Perform the migration; upon successful completion of the migration, check all the interfaces against the acceptance criteria.

Being a market leader in the FMCG space, Reckitt Benckiser wanted to keep the system outage to a minimum (in other words, less than 4 hours for reporting capabilities and 20 hours for data loading capabilities). The company decided to use option 1 to perform the migration to meet the business needs. The migration steps are discussed in greater detail in the next section.


A step-by-step migration procedure

Considering the steps from all the phases of the migration strategy, Table 3 drills down into the database migration option of a database restore from BCU to ISAS, successfully eliminating the node differences between the systems:

Table 3. Migration procedure
Step 1 (BCU): Stop ETL and reporting jobs
Non-DB2 commands

Step 2 (BCU): Quiesce the DB2 instance
QUIESCE INSTANCE bculinux IMMEDIATE

Step 3 (BCU): Deactivate the database
DEACTIVATE DB BCUDB

Step 4 (BCU): Create soft links from the admin node export directory ($HOME/db2backup/) to all the data nodes
ln -s /db2fs/bculinux/NODE0001 NODE1
ln -s /db2fs/bculinux/NODE0002 NODE2
............
ln -s /db2fs/bculinux/NODE0040 NODE40
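The per-node ln commands above follow a fixed pattern, so they can be generated with a short loop. This is a sketch only: make_node_links is a hypothetical helper, and the $HOME/db2backup and /db2fs/bculinux paths are taken from this article, so adjust them to your layout.

```shell
#!/bin/sh
# Sketch: create the step-4 soft links in a loop instead of forty hand-typed
# ln commands. Paths follow the article; verify them against your layout.
make_node_links() {
    backup_home=$1      # for example, $HOME/db2backup
    data_root=$2        # for example, /db2fs/bculinux
    n=1
    while [ "$n" -le 40 ]; do
        # NODE0001 .. NODE0040 become NODE1 .. NODE40 under the backup home
        ln -s "$data_root/$(printf 'NODE%04d' "$n")" "$backup_home/NODE$n"
        n=$((n + 1))
    done
}

# Usage on the admin node:
# make_node_links "$HOME/db2backup" /db2fs/bculinux
```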
Step 5 (BCU): Take an offline database backup to local disk
Catalog partition:
db2_all "\"<<+0< db2 BACKUP DATABASE BCUDB TO $HOME/db2backup/NODE##/backup COMPRESS WITHOUT PROMPTING"

Non-catalog partitions in parallel:
db2_all "||\"<<-0< db2 BACKUP DATABASE BCUDB TO $HOME/db2backup/NODE##/backup COMPRESS WITHOUT PROMPTING"
Step 6 (BCU): Validate the database backup images
You can use the following script to trigger the check backup (db2ckbkp) command on all the partitions in parallel:
#!/bin/ksh
node=$(db2 -x "select current dbpartitionnum from sysibm.sysdummy1")
if [[ $node -ge 10 ]]; then
     db2ckbkp $HOME/db2backup/NODE${node}/backup/*NODE00${node}* > $HOME/db2backup/NODE${node}/backup/check${node}.out
else
     db2ckbkp $HOME/db2backup/NODE${node}/backup/*NODE000${node}* > $HOME/db2backup/NODE${node}/backup/check${node}.out
fi
exit 0

Validate the checkout files and make sure all the backups are fine.
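If you script this validation, the per-partition check<N>.out files can also be scanned automatically. The following sketch assumes db2ckbkp's usual "Image Verification Complete" success text; scan_ckbkp_logs is a hypothetical helper, and you should confirm the exact message text at your DB2 level before relying on it.

```shell
#!/bin/sh
# Sketch: scan the per-partition check<N>.out files produced in step 6 and
# report any that lack db2ckbkp's success message. Assumes the standard
# "Image Verification Complete" final line; confirm for your DB2 level.
scan_ckbkp_logs() {
    dir=$1      # for example, $HOME/db2backup
    rc=0
    for f in "$dir"/NODE*/backup/check*.out; do
        [ -f "$f" ] || continue
        if grep -q "Image Verification Complete" "$f"; then
            echo "OK:   $f"
        else
            echo "FAIL: $f"
            rc=1
        fi
    done
    return $rc
}

# scan_ckbkp_logs "$HOME/db2backup" || echo "At least one backup image failed verification"
```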
Step 7 (BCU): Copy the backup images over to ISAS
Use the scp or sftp command within a script to copy the database backup images from the source to the target server.
Step 8 (BCU):
  • Unquiesce the instance
  • Start the Cognos services to facilitate reporting access for the business
  • No new data loads are permitted
  • Communicate the status to the business
UNQUIESCE INSTANCE and Cognos-related commands
Step 9 (ISAS): Disable the high availability component (TSA)
  • Run the command samctrl -M T on the administration node as the root user and validate it by running the lssam command
    Sample Output:
    Online IBM.ResourceGroup:SA-nfsserver-rg Automation=Manual Nominal=Online
    Online IBM.AgFileSystem:shared_db2home
  • Run the change resource command to instruct Reliable Scalable Cluster Technology (RSCT) not to reboot a node in case of any communication problem affecting the cluster
    chrsrc -c IBM.PeerNode CritRsrcProtMethod=5
  • Disable DB2 cluster integration
    db2 UPDATE DBM CFG USING CLUSTER_MGR NULL; db2stop; db2start
    Note: The target ISAS was prebuilt and already has all resources, groups and equivalency defined for the 65 node database.
Step 10 (ISAS): Back up all the pre-change configurations from the target system:
  • db2nodes.cfg
  • DB2 database manager configuration parameter
  • DB2 database configuration parameters
  • DB2 registry variables
  • Instance profile
  • TSA policy
  • db2cfexp
  • sampolicy -s sampolicy_save.xml
  • cp -p ~/sqllib/cfg/db2ha.sys db2ha.sys.save
  • db2hareg -dump > hareg.dump.save
Step 11 (ISAS): Drop the existing database shell and stop the instance
DROP DATABASE BCUDB; db2stop
Step 12 (ISAS): Reduce the number of database partitions to 41 and start the instance
  • Modify the db2nodes.cfg file; a sample file looks like this:
    0 isasadm001 0 isasadm001
    1 isasdata001 1 isasdata001
    2 isasdata001 2 isasdata001
    .......
    .......
    40 isasdata005 7 isasdata005
  • db2start
Step 13 (ISAS): Apply the pre-restore configuration changes
Update the database manager configuration parameters DIAGLEVEL to 4 and HEALTH_MON to OFF.
Step 14 (ISAS): Restore database partition 0 and then the remaining 40 database partitions (NODE0001 to NODE0040)
Catalog partition:
db2_all "\"<<+0< db2 RESTORE DATABASE BCUDB FROM $HOME/db2backup/NODE##/backup WITHOUT ROLLING FORWARD WITHOUT PROMPTING"

Non Catalog Partition in parallel:
db2_all "||\"<<-0< db2 RESTORE DATABASE BCUDB FROM $HOME/db2backup/NODE##/backup WITHOUT ROLLING FORWARD WITHOUT PROMPTING"
Step 15 (ISAS): Activate the database and do preliminary checks
ACTIVATE DATABASE BCUDB;
Check all the table space states and table data accessibility.
Step 16 (ISAS): Identify the materialized query tables (MQTs), extract the DDL, and drop them
  • List the MQT tables
    SELECT TRIM(TABSCHEMA)||'.'||TRIM(TABNAME) FROM SYSCAT.TABLES WHERE TYPE='S';
  • Extract the DDL for all the MQT objects
    db2look -d BCUDB -e -xd -z <tabschema> -t <mqt-table> -o <tabschema>.<mqt-table>.ddl
  • Drop the MQT Tables
    db2 "SELECT 'DROP TABLE '|| TRIM(TABSCHEMA)||'.'||TRIM(TABNAME)||';' from SYSCAT.TABLES where TYPE='S' " > dropMQT
    db2 -tvf dropMQT
Step 17 (ISAS):
  • Check for abnormal state tables and rectify if any
  • Stop event monitors on the database
  • Enable any DMS table space that has auto-resize disabled
  • SELECT SUBSTR(RTRIM(TABSCHEMA),1,20), SUBSTR(RTRIM(TABNAME),1,30), AVAILABLE FROM SYSIBMADM.ADMINTABINFO WHERE TABNAME = '<tbname>';
    SELECT substr(rtrim(TABSCHEMA)||'.'||rtrim(TABNAME),1,50) FROM SYSCAT.TABLES WHERE STATUS = 'C';
  • SET EVENT MONITOR <eventmon> STATE 0;
  • ALTER TABLESPACE <tbspace> AUTORESIZE YES;
Step 18 (ISAS): Prune the recovery history file and deactivate the database
PRUNE HISTORY <date>; DEACTIVATE DB BCUDB;
Step 19 (ISAS): Add new database partitions 41 to 64
  • db2stop; db2start
  • NODE0041 to NODE0048
    db2start DBPARTITIONNUM 41 ADD DBPARTITIONNUM HOSTNAME isasdata006 PORT 0 WITHOUT TABLESPACES
    .......
    .......
    db2start DBPARTITIONNUM 48 ADD DBPARTITIONNUM HOSTNAME isasdata006 PORT 7 WITHOUT TABLESPACES
  • NODE0049 to NODE0056
    db2start DBPARTITIONNUM 49 ADD DBPARTITIONNUM HOSTNAME isasdata007 PORT 0 WITHOUT TABLESPACES
    .......
    .......
    db2start DBPARTITIONNUM 56 ADD DBPARTITIONNUM HOSTNAME isasdata007 PORT 7 WITHOUT TABLESPACES
  • NODE0057 to NODE0064
    db2start DBPARTITIONNUM 57 ADD DBPARTITIONNUM HOSTNAME isasdata008 PORT 0 WITHOUT TABLESPACES
    .......
    .......
    db2start DBPARTITIONNUM 64 ADD DBPARTITIONNUM HOSTNAME isasdata008 PORT 7 WITHOUT TABLESPACES
Sample output:
08/06/2013 21:04:26 8 0 SQL1489I The add database partition server operation was successful. The new database partition server "41" is active.
Note: It is suggested that you do a full db2stop and db2start after adding the first partition on each new server; in this case, after adding partitions 41, 49, and 57.
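Rather than typing the 24 db2start ... ADD DBPARTITIONNUM commands by hand, you can generate them and review the output first. This sketch hard-codes the article's layout (hosts isasdata006 to isasdata008, eight logical partitions per host); gen_add_partition_cmds is a hypothetical helper, and you should check the generated list against your own db2nodes.cfg plan before running any of it.

```shell
#!/bin/sh
# Sketch: generate the step-19 "add partition" commands for partitions 41-64.
# Host names and the 8-partitions-per-host layout come from the article.
gen_add_partition_cmds() {
    part=41
    for host in isasdata006 isasdata007 isasdata008; do
        port=0
        while [ "$port" -le 7 ]; do
            echo "db2start DBPARTITIONNUM $part ADD DBPARTITIONNUM HOSTNAME $host PORT $port WITHOUT TABLESPACES"
            part=$((part + 1))
            port=$((port + 1))
        done
    done
}

# Review the output first, then run the commands one at a time:
# gen_add_partition_cmds
```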
Step 20 (ISAS): Confirm that the new database partitions have been added to the instance by checking the db2nodes.cfg file
You should now see nodes 41 to 64 in the db2nodes.cfg configuration file:
0 isasadm001 0 isasadm001
1 isasdata001 1 isasdata001
2 isasdata001 2 isasdata001
.......
.......
63 isasdata008 6 isasdata008
64 isasdata008 7 isasdata008
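A quick sanity check of the partition count in db2nodes.cfg can be scripted as well. This is a sketch; check_partition_count is a hypothetical helper that simply counts non-empty lines in the file you pass it (65 lines for partitions 0 to 64).

```shell
#!/bin/sh
# Sketch: verify that a db2nodes.cfg file defines the expected number of
# database partitions. On the ISAS admin node the file is typically
# ~/sqllib/db2nodes.cfg for the instance owner.
check_partition_count() {
    cfg=$1
    expected=$2
    count=$(grep -c . "$cfg")          # count non-empty lines
    if [ "$count" -eq "$expected" ]; then
        echo "OK: $count partitions defined in $cfg"
    else
        echo "WARNING: expected $expected partitions, found $count in $cfg"
    fi
}

# Usage: check_partition_count ~/sqllib/db2nodes.cfg 65
```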
Step 21 (ISAS): Add temporary table space containers on the newly added nodes
ALTER TABLESPACE "DB2TMP" ADD (FILE '/db2fs/bculinux/NODE00 $N/BCUDB/temp/db2tmp_001' 80000) ON DBPARTITIONNUMS (41 TO 64) AUTORESIZE NO;
Step 22 (ISAS): Alter the IBMDEFAULTGROUP and PDPG database partition groups
  • ALTER DATABASE PARTITION GROUP IBMDEFAULTGROUP ADD DBPARTITIONNUMS (41 to 64) WITHOUT TABLESPACES;
  • ALTER DATABASE PARTITION GROUP PDPG ADD DBPARTITIONNUMS (41 to 64) WITHOUT TABLESPACES;
Sample output:
SQL1759W Redistribute database partition group is required to change database partitioning for objects in database partition group "PDPG" to include some added database partitions or exclude some dropped database partitions. SQLSTATE=01618
Step 23 (ISAS): Add container paths to the extended PDPG partitions
ALTER TABLESPACE "TS_LARGE_DATA" ADD (FILE '/db2fs/bculinux/NODE00 $N/BCUDB/data/ts_large_data' 90000) ON DBPARTITIONNUMS (41 TO 64) AUTORESIZE YES;
Step 24 (ISAS): Take an offline backup of database BCUDB to the local disk
This step saves all the work you have done up to this point. The backup images can be used for a restore in case of any problems during the data redistribution step and could save many hours.
Step 25 (ISAS): Data redistribution prerequisites
  • Increase the UTIL_HEAP_SZ database configuration parameter from 65536 to 200000
  • Increase SORTHEAP database configuration parameter from 8000 to 48000
  • Stop Optim Performance Manager (OPM)
  • Disable the transaction log archival process by setting database configuration parameter LOGARCHMETH1 to OFF
Step 26 (ISAS): Redistribute the data across the expanded database partitions for the partition groups IBMDEFAULTGROUP and PDPG
  • On IBMDEFAULTGROUP Partition Group
    REDISTRIBUTE DATABASE PARTITION GROUP IBMDEFAULTGROUP NOT ROLLFORWARD RECOVERABLE UNIFORM DATA BUFFER 180000 INDEXING MODE DEFERRED STATISTICS NONE QUIESCE DATABASE YES;
  • On PDPG Partition Group
    REDISTRIBUTE DATABASE PARTITION GROUP PDPG NOT ROLLFORWARD RECOVERABLE UNIFORM DATA BUFFER 180000 INDEXING MODE DEFERRED STATISTICS NONE QUIESCE DATABASE YES;
  • Access the tables or restart the database to activate index rebuild at the end of data redistribution.
The data redistribution can easily be monitored using the command LIST UTILITIES SHOW DETAIL.
Step 27 (ISAS): Take an offline backup of database BCUDB to the local disk
This backup is optional; however, it is recommended that you keep a copy of the latest redistributed database.
Step 28 (ISAS): Perform the post-redistribution tasks
  • Reduce the table space high water mark
    ALTER TABLESPACE <tbspace> LOWER HIGH WATER MARK;
    ALTER TABLESPACE <tbspace> REDUCE (ALL x G);
         where x represents the amount of space to reclaim.
  • Update the configuration parameters LOGARCHMETH1, SORTHEAP, UTIL_HEAP_SZ and restart the instance
  • Recreate the MQT tables using the DDL scripts extracted in step 16
  • Switch the event monitor ON
  • Perform RUNSTATS on all the tables
Step 29 (ISAS): Take the first backup of the database to TSM
BACKUP DATABASE BCUDB ON ALL DBPARTITIONNUMS USE TSM WITHOUT PROMPTING;
If you cannot facilitate as many TSM mount points as required, back up the database in chunks using the DBPARTITIONNUMS (db-partition-number [TO db-partition-number], …) clause of the backup command.
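The chunked backup commands can likewise be generated rather than hand-written. In this sketch, gen_backup_chunks is a hypothetical helper, the chunk size of 16 is only an illustration, and catalog partition 0 is assumed to be backed up separately first.

```shell
#!/bin/sh
# Sketch: generate chunked BACKUP commands for data partitions 1-64 when TSM
# mount points are limited. Back up catalog partition 0 first, on its own.
gen_backup_chunks() {
    chunk=${1:-16}
    first=1
    while [ "$first" -le 64 ]; do
        last=$((first + chunk - 1))
        if [ "$last" -gt 64 ]; then last=64; fi
        echo "db2 \"BACKUP DATABASE BCUDB ON DBPARTITIONNUMS ($first TO $last) USE TSM WITHOUT PROMPTING\""
        first=$((last + 1))
    done
}

# gen_backup_chunks 16    # partitions 1-64 in four chunks of 16
```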
Step 30 (ISAS): Enable the high availability component (TSA) and connect applications
  • Run the command samctrl -M F on the administration node as the root user and validate it by running the lssam command
    Sample Output:
    Online IBM.ResourceGroup:SA-nfsserver-rg Nominal=Online
    Online IBM.AgFileSystem:shared_db2home
  • Run the change resource command to revert Reliable Scalable Cluster Technology (RSCT) to the critical resource protection method.
    chrsrc -c IBM.PeerNode CritRsrcProtMethod=0
  • Test application connections and communicate the ISAS system availability to the business

Furthermore

In case you are migrating a database from a source system with a higher number of database partitions to a target system with a smaller number, add the additional partitions to the target system at step 12, and drop them after a successful database restore using the following set of commands at step 20:

  1. Redistribute the data uniformly across the other partitions away from partition (x), where x is the database partition you intend to drop:
    REDISTRIBUTE DATABASE PARTITION GROUP PDPG UNIFORM DROP DBPARTITIONNUM (40);
  2. Force all the application connections off and verify:
    FORCE APPLICATION ALL;
  3. Perform the partition drop verify before dropping it:
    export DB2NODE=40;
    TERMINATE; DROP DBPARTITIONNUM VERIFY;
    SQL6034W Database partition "40" is not being used by any databases.
    If the intended partition is in use, you will get the SQL6035W warning message. In such cases, db2diag.log is your guide to what is stopping the DROP DBPARTITIONNUM command from completing successfully.
  4. Actual database partitions delete statement:
    export DB2NODE=
    TERMINATE; STOP DATABASE MANAGER DROP DBPARTITIONNUM 40;
    SQL6076W Warning! This command will remove all database files on the node for this instance. Before continuing, ensure that there is no user data on this node by running the DROP NODE VERIFY command.
    Do you want to continue ? (y/n) y
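When automating this drop procedure, the SQL6034W versus SQL6035W outcome can be classified from the captured command output. This is a sketch; classify_drop_verify is a hypothetical helper that only inspects the message text quoted above, and the db2 invocation in the usage comment is left to you.

```shell
#!/bin/sh
# Sketch: classify the output of DROP DBPARTITIONNUM VERIFY when scripting a
# partition drop. SQL6034W means the partition is free; SQL6035W means it is
# still in use and db2diag.log should be checked.
classify_drop_verify() {
    output=$1
    case $output in
        *SQL6034W*) echo "free" ;;      # safe to drop
        *SQL6035W*) echo "in-use" ;;    # investigate db2diag.log
        *)          echo "unknown" ;;
    esac
}

# Usage:
# out=$(db2 "DROP DBPARTITIONNUM VERIFY" 2>&1)
# [ "$(classify_drop_verify "$out")" = "free" ] && echo "OK to drop partition"
```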

You can also follow the process of decreasing the number of source database partitions to match the target server, then taking the backup and restoring it on the target system. However, remember that this leads to a much bigger outage compared to the process demonstrated earlier.


Conclusion

This article describes how to use DB2's database restore and DPF scale-out techniques to migrate a multi-terabyte database from BCU to ISAS while meeting business requirements. You should now be able to adapt these steps to your own database migration requirements, irrespective of the number of nodes and the configuration.


Acknowledgement

Special thanks to some wonderful people for helping me complete the article:

  • Colin A Chapman (DB2 Expert, IBM UK)
  • Simon Woodcock (Data Warehouse Expert, IBM Hursley Lab UK)
  • Nelson Coish (President, Coish Consulting Inc, Canada)
  • John Rutter (Systems Analyst, Reckitt Benckiser UK)
  • Karthikraj Kanagaraj (Sify Technologies, UK)
  • Matt Gee (SBA, Reckitt Benckiser, UK)

Resources

Get products and technologies

  • Evaluate IBM products in the way that suits you best: Download a product trial, try a product online, use a product in a cloud environment, or spend a few hours in the DB2 Technology Preview learning what's new in DB2 10.5.

Discuss

  • Get involved in the My developerWorks community. Connect with other developerWorks users while exploring the developer-driven blogs, forums, groups, and wikis.
