Skip to main content

Data preservation and data migration in Data Studio Administrator V1.2

Use Data Studio Administrator to preserve and migrate data during a complex database change

Carolyn Henry (henryca@us.ibm.com), Information Developer, IBM, Software Group
Carolyn Henry photo
Carolyn Henry is an Information Developer in the DB2 and IMS Tools group at IBM Silicon Valley Laboratory in San Jose, CA. She has worked on the DB2 Change Management Expert team since 2004
Sangeetha Srinivasan
Sangeetha Srinivasan is a developer in the DB2 Tools group at IBM Silicon Valley Laboratory in San Jose, CA. She has been working on DB2 Change Management Expert since 2006. Prior to this position, she was working on DB2 for Z/OS native stress testing. She holds a Masters in computer engineering from San Jose State University.
Jayashree Ramachandran is a Developer in the Information Management Tools group at IBM Silicon Valley Laboratory in San Jose, CA. She has worked on the DB2 Change Management Expert team since 2004.
Tina Chen, Data Studio Enablement, IBM
Author Photo: Tina Chen
Tina Chen is solution architect from IBM's Data Studio development organization, based in IBM Silicon Valley Lab in San Jose, Calif. Tina is responsible for the technical development of partnerships and customer advocates for IBM Data Studio.

Summary:  Do you need to make complex changes to your database, such as moving columns to another table, while at same time preserving your existing data? With Data Studio Administrator for DB2® for Linux®, UNIX®, and Windows® (formerly known as DB2 Change Management Expert) you can preserve your valuable data while making complex changes to the database. This article follows a scenario in which a DBA uses Data Studio Administrator to create a new table and then populates it with data from an existing table.

Date:  25 Sep 2008 (Published 23 Aug 2007)
Level:  Intermediate PDF:  A4 and Letter (1500KB | 24 pages)Get Adobe® Reader®
Activity:  981 views
Comments:  

Note: This article is an update of an article about DB2 Change Management Expert. This article reflects the new name, IBM Data Studio Administrator for DB2 for Linux, UNIX, and Windows, and other changes that are available with Data Studio Administrator Version 1.2.

Introduction

This article illustrates how a DBA at a fictitious company, JK Enterprises, uses IBM Data Studio Administrator to make a change to their DB2 database while preserving the data. The DBA also uses Data Studio Administrator to verify and test changes on target models before they are deployed to the production database.

First, review the steps that Data Studio Administrator will help you perform when making complex changes:

  • Quickly identify which database objects you want to change using a rich, interactive user interface
  • Create a deployment script to manage the changes
  • Implement those changes on a model
  • Identify which views, functions, procedures, and triggers are rendered inoperative by the changes and which ones are still operative after the change
  • Use the deployment script to generate SQL to apply the changes
  • Preserve the data by using export and import commands
  • Deploy changes to the database

Prerequisites

This scenario uses Data Studio Administrator Version 1, Release 2. To complete this scenario, you must have Data Studio Administrator installed on your machine (see the Resources section).

Important: This scenario uses the DSADEMO database. You can install the Data Definition Language (DDL) (CreateDSADEMO.chx) to create and set up this database by downloading the sample02.zip file from the Download section of this article and extracting the contents of the sample02.zip file into a local directory. The following are the steps for setting up the database:

  1. Import the project named test from sample02.zip by selecting File > Import > General > Existing Projects into Workspace.
  2. Click Select archive file, and browse for the location sample02.zip. Click the check box next to test in the Project field.

    Figure 1. Import project into workspace
    Import project into workspace

  3. Click Finish to complete the import of the project test.
  4. In the Data Project Explorer view of Data Studio Administrator, expand the contents of the SQL Scripts folder in the test project, right-click the CreateDSADEMO.chx file and select Run SQL.
  5. Verify that the appropriate database version is selected. Enter your username and password, leave the check box for Create Deployment Project and Script file unchecked, and click Finish.
  6. In the Database Explorer view, verify that the DSADEMO database was created and that a connection exists. You can now continue to complete the steps throughout this article.

Scenario overview

JK Enterprises has determined that they need to separate the salary and compensation information for their employees from the rest of the employee information. For example, they have decided that only management should have access to the sensitive compensation information of their employees. If anyone else wants to run employee reports, this information will not be available. You, as the JK Enterprises DBA, must now make a change to the database to add a new table with four new columns that contain the data from the original table. With just a few clicks, you can do this within Data Studio Administrator and ensure that the original data is preserved so that you can later load that data into the new table.

This article is divided into two parts to reflect the two main tasks:

  1. Creating the new table to hold sensitive salary information and migrating data from the old table to this new one.
  2. Deleting the sensitive columns from the old table and preserving the data from those columns for audit purposes or to ensure the ability to do an undo.

Part 1: Create and populate the new table

To separate the salary and compensation information from the employee table, you need to create a change-in-place deployment script that performs the following steps:

You must complete the following steps to create the new EMPLOYEEPAYINFO table and populate the four new columns:

  1. Create a new table called EMPLOYEEINFO. The EMPLOYEEINFO table will have four columns: EMPO, SALARY, BONUS, and COMM.
  2. Create a foreign key to link this new table with the existing employee table.
  3. Migrate the salary data to populate this table from the EMPLOYEE table.

Complete the following steps to create a new Data Design Project, the EMPLOYEEPAYINFO table, and to populate the four new columns from another table:

  1. Create a new deployment script named EmployeeChanges.deployxml.
    1. From the menu, select File > New > Deployment Script to start the Deployment Script wizard.
    2. Enter EmployeeChangeManagement for the project name and EmployeeChanges for the deployment script name.
    3. Click Next to continue.

      Optional: You may be prompted for a Database Connection. Select DSADEMO as the database you would like to use for the Deployment Script creation.

    4. Choose Change in Place as the process and then select the option Migrate Table Data.

      Figure 2. The Deployment Script Process selection page
      The Deployment Script Process selection page

    5. Click Next to continue.
    6. Select the HR schema when prompted to select a schema.
    7. Click Finish to create a new deployment script for the DSADEMO database.

    After you have created a deployment script using Data Studio Administrator, extend the Data Model Folder to see that the base and target models were also created in the EmployeeChangeManagement project.

    The Overview Page, sometimes referred to as the Deployment Script Editor, is your central resource for managing your changes. The four main areas of the Overview Page of the deployment script contain the following information:

    • Data models
    • Change commands
    • Undo changes



    Figure 3. Use the Overview Page as a guide through the change process
    Use the Overview Page as a guide through the change process

  2. On the Overview Page, clickEdit under Target Model. This opens the physical model, which is called EmployeeChanges_target.dbm.
  3. In the Outline View, located on the right, expand the node next to the DSADEMO database. Right-click the HR schema and select New > Table.
  4. The Data Object Properties pane appears in the main section of Data Studio Administrator. To give the table a name, type EMPLOYEEPAYINFO in the name field.
  5. In the EMPLOYEEPAYINFO table, add the four columns: EMPNO, SALARY, BONUS, and COMM.
    1. Click the Columns tab in the Data Object Properties panel.
    2. Click Add New Button (Add New) for each column that you want to create. Create the following columns: EMPNO, SALARY, BONUS, and COMM.
    3. Specify the data types of the columns by double clicking in the data type field and selecting from the drop down list. For data types that have length specifications, double-click and type the values. Use the following list to specify the data types for the columns in the table:
      • The EMPNO column is the SMALLINT data type.
      • The SALARY column is data type DECIMAL (9 length, 2 scale).
      • The BONUS column is data type DECIMAL (9 length , 2 scale).
      • The COMM column is the INTEGER data type.


      Figure 4. The EMPLOYEEPAYINFO table with four additional columns and their properties
      The EMPLOYEEPAYINFO table with four additional columns and their properties

    4. Select the Tables Spaces tab in the Data Object Properties panel.
    5. From the Regular table space drop-down menu, select USERSPACE1 as the table space for this table.
  6. Add a foreign key on the EMPNO column of the EMPLOYEEPAYINFO table that references the EMPNO column of the EMPLOYEE table.
    1. In the Data Object Properties panel, right-click on EMPLOYEEPAYINFO and select New > Foreign Key, as shown in Figure 5.

      Figure 5. The foreign key on EMPLOYEEPAYINFO table
      The foreign key on EMPLOYEEPAYINFO table

    2. In the Add Foreign Key wizard, select the EMPLOYEE table as a reference and click Next to continue.
    3. Select EMPNO in the left-hand Primary Key Columns field. Hold Shift on the keyboard and select EMPNO in the right-hand Foreign Key Columns. Click the Pair Button (Pair) button to pair up the selections.

      Figure 6. Pair the EMPNO columns in EMPLOYEE and EMPLOYEEPAYINFO
      Pair the EMPNO columns in EMPLOYEE and EMPLOYEEPAYINFO

    4. Click Finish.

      Figure 7. The EMPLOYEEPAYINF_FK foreign key that references the EMPNO column primary key
      The EMPLOYEEPAYINF_FK foreign key that references the EMPNO column primary key

  7. On the main menu, click File > Save to save the target model. Remember to save your changes often.
  8. Return to the deployment script by clicking the on the EmployeeChanges.deployxml tab. EmployeeChanges.deployxml tab
  9. Click Generate in the process flow diagram to generate the change commands for the change that you have just specified. The Generate Change Commands wizard opens.
  10. Figure 8 shows the Generate Change Commands wizard with the DDL for your change so that you can review the change commands that are generated.

    Figure 8. The Generate Change Commands wizard with the DDL for your change
    The Generate Change Commands wizard with the DDL for your change

  11. Data Studio Administrator lets data be loaded and unloaded if there are destructive changes. Since you are creating a new table, there are no destructive changes. Click Next when this wizard appears.
  12. The Unload and Reload Information wizard lets you use auto cast to resolve data type conflicts. Since you do not have any destructive changes, click Next to continue.
  13. On the DB2 Maintenance Commands page, the following options are available:
    • Generate Reorg Commands
    • Generate Runstats Commands
    • Generate Flush Package Commands
    • Generate Rebind Commands

    During the change management process, database packages might become inoperative and statistics might become inaccurate. For example, DB2 marks packages as invalid or inoperative when you drop objects. When you generate change commands from the deployment script editor, you can specify that Data Studio Administrator issue rebind commands to recreate packages and run RUNSTATS to update the statistics, which saves you time and effort.

    Figure 9. The DB2 Maintenance Commands page
    The DB2 Maintenance Commands page

    Click Next to continue.
  14. On the last page, review the change commands that are generated, and click Finish.
  15. Data Studio Administrator lets you to view the change report that gives a summary of changes made to the database.

    Optional: Click Yes to view the change report, or else click No to continue.

  16. To migrate data from the EMPLOYEE table to the EMPLOYEEPAYINFO table, click Migrate Data on the Overview Page Panel.
    1. When the Migrate Data Wizard appears, choose the option Add the generated data migration commands to the deployment script, and click Next.

      Figure 9a. Add data migration commands to deployment script
      Add data migration commands to deployment script

    2. When prompted for a database connection, use an existing connection and select the connection DSADEMO. Click Next to continue.
    3. Select the default data preservation providers that you want to use. For this scenario, select EXPORT for DEL data format in the Data Unload Provider field and LOAD in the Data Load Provider field, as shown in Figure 9b.

      Note: If you are unloading large amounts of data, you can use IBM DB2 High Performance Unload as the unload provider. IBM DB2 High Performance Unload can provide you with high speed data unload capability and improved performance. To learn more about IBM DB2 High Performance Unload, see the IBM DB2 High Performance Unload product page.



    Figure 9b. The correct data preservation options for this change
    The correct data preservation options for this change

    Click Next to continue.
  17. On the Map Unload and Reload Tables (Figure 10), pair the original EMPLOYEE table with the new EMPLOYEEPAYINFO table.
    1. Click the Add icon Add table mapping entries in the upper right-hand corner of the page to add a new entry HR.EMPLOYEE.
    2. From the Unload Table drop-down menu, select HR.EMPLOYEE and double-click the Reload Table drop-down. Then, select the HR.EMPLOYEEPAYINFO and click the Pair icon Pair table mapping entries in the upper right-hand corner of the page to pair the two tables. You must complete this pairing to import the data from the old EMPLOYEE table into the new EMPLOYEEPAYINFO table.

      Figure 10. The mapping tables stage of the data migration process
      The mapping tables stage of the data migration process

      Click Next to continue.

  18. The Customize Unload Commands page shows the export commands for your data. Here, select the columns from the Employee table to be exported to the select statement. Navigate to HR > Employee, and drag the EMPNO, SALARY, BONUS, and COMM columns from Database Objects for Unload field to the Customize Query field.

    Figure 10a. Customizing unload commands
    Customizing unload commands

    Click Next to continue.
  19. On the Customize Reload Commands page, column types being unloaded from the EMPLOYEE table must match the column types being reloaded in the EMPLOYEEPAYINFO table. The column type for COMM in Unload Column is a decimal while the column type in Reload Column is an integer. To remove this discrepancy, you must cast the data type.

    Examining the HR.EMPLOYEEPAYINFO table, you should notice Error next to the table. This indicates that there is a mismatch of column types. Right-click the COMM column and select Autocast so that the decimal data is exported as integers that match the data type of the COMM table in the EMPLOYEEPAYINFO table. Figure 11 shows this process.



    Figure 11. Autocasting the data in the original column to match the data type of the new column
    Autocasting the data in the original column to match the data type of the new column

  20. Click Finish.
  21. Optional: To review the change commands:
    1. Under Change Commands click the Customize link on the Overview Page.
    2. Double-click the change command file /EmployeeChangeManagement/EmployeeChanges_deltadd.chx. This opens the file in the Change Command Editor.

      Figure 12. Review the generated commands
      Review the generated commands

      Note: The commands to migrate data have been interleaved in the generated commands in the proper order.

    3. To return to the Overview Page, click the EmployeChanges.deployxml tab, on the bottom tabs, click Overview.
  22. To deploy the changes to your database:
    1. Click Deploy on the Overview Page to deploy Changes to the Target Database.
    2. Complete the steps in the Deploy Change Commands wizard to deploy your changes to the database.

Part 2: Preserve your data after a destructive change

This scenario covers the data preservation feature in Data Studio Administrator. Now that you have created the new EMPLOYEEINFO table, you need to create a separate change-in-place deployment script that lets you delete the salary columns while preserving the data from those columns.

  1. Create a new deployment script named EmployeeChanges_2.deployxml:
    1. In the Data Project Explorer select the EmployeeChangeManagement project, then right-click on the project, and select New > Deployment Script to start the Deployment Script wizard.
    2. Select EmployeeChangeManagement as the project name and enter EmployeeChanges_2 as the deployment script name.
    3. Select DSADEMO as the Database connection, and click Next.
    4. Choose Change in Place as the process, then click Next.
    5. Select HR as the schema to reverse engineer.
    6. Click Finish to complete the deployment script.
  2. Open the target model in the Data Model Editor by clicking Edit on the Overview Page diagram.
  3. In the Outline view located on the right panel, navigate to the EMPLOYEE table.
  4. Click the columns tab to view the columns in the HR.EMPLOYEE table. To delete the columns SALARY, BONUS, and COMM from the HR.EMPLOYEE table, select the column and click Delete Delete Button.

    Figure 13. Deleting the salary information from the HR
    Deleting the salary information from the HR

  5. Save the file using CTRL-S.
  6. Return to the Overview Page Panel by clicking on the EmployeeChanges_2.deployxml tab.
  7. Generate the change commands by clicking Generate on the Overview Page diagram.

    Note: The data preservation commands are interleaved with the DDL commands. The data preservation commands are generated using the default providers (EXPORT for DEL data format and LOAD).

    Data Studio Administrator lets you preview all the possible commands that can be generated. Click Next to continue.

    1. To override the data preservation, select Shadow table, reload using LOAD FROM CURSOR as the Unload provider in the Specify Data Unload and Reload Information wizard, as shown in Figure 14.

      Figure 14. Select Data Preservation using LOAD FROM CURSOR
      Select Data Preservation using LOAD FROM CURSOR

    2. Click Next until you reach the last page, were you can review the generate change commands, as shown in Figure 15.

      Figure 15. Review the generated change commands
      Review the generated change commands

      Note: The data preservation commands and the maintenance commands are interleaved. When you choose Internal Data Preservation, the data is saved within DB2 in a shadow table. The RENAME command is used to create the shadow table, the data is reloaded into the EMPLOYEE table using the LOAD FROM CURSOR command.

    3. Review the commands and click on Finish. You can also review the commands from the Change Commands tab in the Deployment Script Editor. The easy way to get to it is by clicking Customize Commands in the process flow diagram.
  8. Deploy the commands by clicking Deploy in the Overview Page diagram.

Conclusion

You have now provided the solution to the problem that your company identified. You separated the sensitive employee salary information from the other employee data by moving the sensitive data to a new table and deleting the sensitive columns from the original table without losing any of the original data. By using Data Studio Administrator, you avoid the laborious tasks of manually changing your database, identifying and handling all of the dependencies of your data on other tables in the database, and creating a strategy to preserve data on your own. Data Studio Administrator handles all of these contingencies for you and provides you with easy-to-use wizards that walk you through every step of the process.



Download

DescriptionNameSizeDownload method
Files to create and set up the sample databasesample02.zip5KB HTTP

Information about download methods


Resources

Learn

Get products and technologies

Discuss

About the authors

Carolyn Henry photo

Carolyn Henry is an Information Developer in the DB2 and IMS Tools group at IBM Silicon Valley Laboratory in San Jose, CA. She has worked on the DB2 Change Management Expert team since 2004

Sangeetha Srinivasan

Sangeetha Srinivasan is a developer in the DB2 Tools group at IBM Silicon Valley Laboratory in San Jose, CA. She has been working on DB2 Change Management Expert since 2006. Prior to this position, she was working on DB2 for Z/OS native stress testing. She holds a Masters in computer engineering from San Jose State University.

Jayashree Ramachandran is a Developer in the Information Management Tools group at IBM Silicon Valley Laboratory in San Jose, CA. She has worked on the DB2 Change Management Expert team since 2004.

Author Photo: Tina Chen

Tina Chen is solution architect from IBM's Data Studio development organization, based in IBM Silicon Valley Lab in San Jose, Calif. Tina is responsible for the technical development of partnerships and customer advocates for IBM Data Studio.

Comments



Trademarks  |  My developerWorks terms and conditions

Help: Update or add to My dW interests

What's this?

This little timesaver lets you update your My developerWorks profile with just one click! The general subject of this content (AIX and UNIX, Information Management, Lotus, Rational, Tivoli, WebSphere, Java, Linux, Open source, SOA and Web services, Web development, or XML) will be added to the interests section of your profile, if it's not there already. You only need to be logged in to My developerWorks.

And what's the point of adding your interests to your profile? That's how you find other users with the same interests as yours, and see what they're reading and contributing to the community. Your interests also help us recommend relevant developerWorks content to you.

View your My developerWorks profile

Return from help

Help: Remove from My dW interests

What's this?

Removing this interest does not alter your profile, but rather removes this piece of content from a list of all content for which you've indicated interest. In a future enhancement to My developerWorks, you'll be able to see a record of that content.

View your My developerWorks profile

Return from help

static.content.url=http://www.ibm.com/developerworks/js/artrating/
SITE_ID=1
Zone=Information Management
ArticleID=250338
ArticleTitle=Data preservation and data migration in Data Studio Administrator V1.2
publish-date=09252008
author1-email=henryca@us.ibm.com
author1-email-cc=
author2-email=srinivsa@us.ibm.com
author2-email-cc=
author3-email=jayshree@us.ibm.com
author3-email-cc=
author4-email=chenti@us.ibm.com
author4-email-cc=

My developerWorks community

Tags

Help
Use the search field to find all types of content in My developerWorks with that tag.

Use the slider bar to see more or fewer tags.

Popular tags shows the top tags for this particular content zone (for example, Java technology, Linux, WebSphere).

My tags shows your tags for this particular content zone (for example, Java technology, Linux, WebSphere).

Use the search field to find all types of content in My developerWorks with that tag. Popular tags shows the top tags for this particular content zone (for example, Java technology, Linux, WebSphere). My tags shows your tags for this particular content zone (for example, Java technology, Linux, WebSphere).

Special offers