Automatic Hive catalog syncing to the Db2 Big SQL catalog
Tables that are created, altered, or dropped by Hive clients can have the associated catalog changes automatically applied to Db2® Big SQL.
Enabling automatic syncing
Automatic syncing of the Db2 Big SQL catalog and the Hive metastore is enabled by default. If you have disabled automatic syncing (see Disabling automatic syncing), you can enable this feature again . You can use the bigsql-admin Db2 Big SQL cluster administration utility to enable automatic syncing of the Db2 Big SQL catalog and the Hive metastore.
db2 "CALL SYSPROC.ADMIN_TASK_ADD ('Synchronise MetaData Changes from Hive',
NULL, NULL, NULL, NULL, 'SYSHADOOP', 'HCAT_AUTOSYNC_OBJECTS', NULL, NULL, NULL)
db2 "CALL SYSHADOOP.HCAT_SYNC_OBJECTS('.*', '.*', 'a', 'REPLACE', 'CONTINUE', 'TRANSFER OWNERSHIP TO HIVEOWNER')"
- Non-transactional bucketed tables are not supported in Db2 Big SQL and therefore are not synchronized.
- Tables that are created under the Hive default schema are not automatically synced; you must synchronize these tables manually if you want them in Db2 Big SQL.
Monitoring the synchronization
The bigsql.log file receives log information directly from the HCAT_AUTOSYNC_OBJECTS process.
Automatic syncing uses a scheduled DB2® administrative task
scheduler (ATS) sync procedure. The description of the sync procedure is Synchronise MetaData
Changes from Hive
. You can monitor this procedure like any other ATS scheduled procedures. For
information about ATS and how to monitor such procedures, see Administrative task scheduler.
By default, event files are placed in the HDFS under /user/bigsql/sync. The file pattern for those files is fixed, and only files that match this pattern are processed. The general file pattern is DDL-timestamp-UUID.json. For example: DDL-[0-9]{17}-[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}.json. If files are left in this directory for some time, it is good practice to investigate. If you disable automatic syncing, files collect in this directory but are not processed. You must determine whether or not such files should be retained.
Disabling automatic syncing
db2 "CALL SYSPROC.ADMIN_TASK_REMOVE ('Synchronise MetaData Changes from Hive',NULL)"
Synchronization errors
In the event of a synchronization failure, the associated event-file is moved from the events-directory (by default: /biginsights/bigsql/sync) to an "errors" sub-directory (by default: /biginsights/bigsql/sync/errors). It is good practice to monitor this directory for any synchronization errors and address any issues where necessary.
When any issues have been addressed, and where synchronization for the associated object(s) is still required, the relevant event-files can then be moved back into the events-directory (by default: /biginsights/bigsql/sync) where automatic synchronization will take place.
For more information, see Automatic syncing of the Db2 Big SQL and Hive catalogs.