IBM Support

Checksum difference investigation for db_migrate

General Page

In case of migrating data, there always exists a possibility that it will not be migrated in the right way. The following section presents the short description of the checksum calculation method, checksum difference investigation reasons and the practical example of its usage.

What is a checksum calculation?

When you migrate data, you need to ensure whether the whole process was done correctly. The method performing that is called a checksum calculation. Checksum is calculated by using SQL statements, fired at both ends - source and target. Separate script, db_checksum, does the process internally. 

As a prerequisite, to perform full checksum calculation you need to install Netezza SQL Extension toolkit and enable it on the source database or schema. You can choose to examine all columns or only a specified subset of them. By default, just a number of records is compared.  It is also required that source and target table structure are the same.

Checksum is based on full content of fields with the following data types:

  • BOOLEAN
  • BYTEINT
  • SMALLINT
  • INTEGER
  • BIGINT
  • NUMERIC
  • CHARACTER
  • DATE
  • TIMESTAMP
  • TIME

 

Checksum difference investigation

There are two possible reasons for the checksum difference: 

  1. The same table re-migrates without the cleanup. Probably the option -truncateTargetTable or -recreateTargetTable is missing and the rowcount might differ. 
  2. Incorrect representation in Db2 causes Latin9 data migration with loader load. 

 

When there is no clear reason recognized, further analysis is necessary. The analysis should lead to finding out which record or column causes the issue. 

Checksums can be compared for the data which is already migrated with a use of the db_checksum script. The script can be also useful for the investigation process. 

 

Example process:

The following screenshot presents an observed difference. 

image

To overcome the issue, run the db_checksum tool for a subset of columns (-ckstype and -cksColList options) by using the following command:

image

In this way, you can narrow down affected columns. The resulting checksum in the above example is equal, which means that there is no difference observed. Then, the next set of columns might be checked, and you can continue with the process until you notice checksum difference.

[{"Business Unit":{"code":"BU059","label":"IBM Software w\/o TPS"},"Product":{"code":"SSHRBY","label":"IBM Integrated Analytics System"},"Component":"db_migrate","Platform":[{"code":"PF004","label":"Appliance"}],"Version":"All Versions","Edition":"","Line of Business":{"code":"LOB10","label":"Data and AI"}}]

Document Information

Modified date:
17 October 2019

UID

ibm10728211