FKEY migration script
The FKEY migration script is a fix to avoid potential rare collisions in the FKEY file identifier when ingesting billions of records.
About this task
The fix consists of a script that you run in the Db2 terminal of Data Cataloging.
Procedure
- Run the following command to open a remote shell on the c-isd-db2u-0 pod as the db2inst1 user.
oc -n ibm-data-cataloging rsh c-isd-db2u-0 su - db2inst1
- Create a file named FKEY_Updater.sh with the FKEY migration script. The script iterates over the list of data sources passed as arguments and, for each data source, fixes the FKEY of all the files that were ingested by previous Data Cataloging service versions.
# ========================================================
# How to invoke
# Send the list of datasources to fix as follows:
# ./FKEY_Updater.sh datasource1 datasource2 datasource3
# ========================================================
start_date=`date +%s.%N`
datasourcearray=( "$@" )

echo "Connecting to database..."
db2 connect to bludb
echo "Connected to BLUDB."
printf '\n'
echo "List of datasources to update:"
printf ' - %s\n' "${datasourcearray[@]}"
printf '\n'

for datasource in "${datasourcearray[@]}"
do
  echo "Starting update procedures for datasource '${datasource}'..."
  printf '\n'

  echo "Updating table ACESMAPLOADBASE..."
  db2 "update bluadmin.acesmaploadbase amlb set amlb.fkey=(mo.cluster || '_' || mo.datasource || '_' || mo.inode) from bluadmin.metaocean mo where amlb.fkey=mo.fkey and mo.datasource='${datasource}';"
  echo "ACESMAPLOADBASE table updated successfully."
  printf '\n'

  echo "Updating table ACOGMAPLOADBASE..."
  db2 "update bluadmin.acogmaploadbase acmlb set acmlb.fkey=(mo.cluster || '_' || mo.datasource || '_' || mo.inode) from bluadmin.metaocean mo where acmlb.fkey=mo.fkey and mo.datasource='${datasource}';"
  echo "ACOGMAPLOADBASE table updated successfully."
  printf '\n'

  echo "Updating table ACESMAP (This action could take several minutes)..."
  db2 "update bluadmin.acesmap am set am.fkey=(mo.cluster || '_' || mo.datasource || '_' || mo.inode) from bluadmin.metaocean mo where am.fkey=mo.fkey and mo.datasource='${datasource}';"
  echo "ACESMAP table updated successfully."
  printf '\n'

  echo "Updating table ACOGMAP (This action could take several minutes)..."
  db2 "update bluadmin.acogmap acm set acm.fkey=(mo.cluster || '_' || mo.datasource || '_' || mo.inode) from bluadmin.metaocean mo where acm.fkey=mo.fkey and mo.datasource='${datasource}';"
  echo "ACOGMAP table updated successfully."
  printf '\n'

  echo "Updating table METAOCEAN (This action could take several minutes)..."
  db2 "update bluadmin.metaocean mo set mo.fkey=(mo.cluster || '_' || mo.datasource || '_' || mo.inode) where not REGEXP_LIKE(mo.fkey, mo.cluster || '_' || mo.datasource || '_' || mo.inode) and mo.datasource='${datasource}';"
  echo "METAOCEAN table updated successfully."
  printf '\n'

  echo "Updates on datasource '${datasource}' done successfully."
  printf '\n'
done

printf '\n'
end_date=`date +%s.%N`
runtime=$(echo "$end_date - $start_date" | bc -l)
echo "Execution time was $runtime seconds."
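- Run the script, passing the data sources to fix as arguments. If the file is not executable, run chmod +x FKEY_Updater.sh first.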
./FKEY_Updater.sh datasource1 datasource2 datasource3
The final execution time of the script depends on the number of files that are ingested in the database.
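To make the new identifier format concrete, the following sketch reproduces in plain bash the concatenation that the script's SQL performs (cluster || '_' || datasource || '_' || inode). The function name build_fkey and all sample values are hypothetical and exist only for illustration; the real values come from the CLUSTER, DATASOURCE, and INODE columns of the METAOCEAN table.

```shell
#!/usr/bin/env bash
# Illustrative sketch only: mirrors the FKEY expression used in the SQL updates,
#   cluster || '_' || datasource || '_' || inode
# build_fkey is a hypothetical helper and is not part of the migration script.

build_fkey() {
  local cluster="$1" datasource="$2" inode="$3"
  printf '%s_%s_%s\n' "$cluster" "$datasource" "$inode"
}

# Hypothetical sample values: the same inode in two different data sources.
build_fkey "cluster1" "datasource1" "12345"
build_fkey "cluster1" "datasource2" "12345"
```

Because the data source name is part of the key, two files that share an inode number but belong to different data sources receive different FKEY values under this scheme.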