Db2 Big SQL Hive metastore

The Big SQL Hive Metastore (HMS) is a service used to store and retrieve metadata associated with Datalake tables.

The HMS consists of two components:

  • A versioned schema (named SYSHIVE) within the Db2 database where metadata is stored as regular Db2 tables.
  • An Apache Thrift server to interface with the metadata in those regular Db2 tables. The service is a java process running in the Db2 head node container alongside the Big SQL Scheduler process.

Some of the information handled by the HMS (for example, table structure) is also available in the Db2 system catalogs, but other details are only applicable to Datalake tables. The HMS provides a means of storing and interacting with this metadata in a standardized manner. This includes:

  • Table structure (names, columns, data types, partitioning)
  • Data location
  • File and table format (Parquet, ORC, Iceberg, and so on)
  • Table properties
The Big SQL Scheduler interacts with the HMS by using thrift APIs that use an endpoint defined in the Big SQL configuration as shown by running the following example command:
db2 "select varchar(PROPERTY_NAME,20) PROPERTY_NAME, varchar(PROPERTY_VALUE,128) PROPERTY_VALUE from table(SYSHADOOP.GET_Datalake_CONFIG('BIGSQL')) where PROPERTY_NAME = 'hive.metastore.uris'"
resulting in the following example return:

PROPERTY_NAME        PROPERTY_VALUE
-------------------- ------------------------------------------------------
hive.metastore.uris  thrift://c-jun06a-db2u-0.c-jun06a-db2u-internal:9083  

  1 record(s) selected.
The HMS server interacts with Db2 using a JDBC connection by using details stored in the hive configuration as shown by running the following example command:
db2 "select varchar(PROPERTY_NAME,40) PROPERTY_NAME, varchar(PROPERTY_VALUE,128) PROPERTY_VALUE from table(SYSHADOOP.GET_Datalake_CONFIG('HIVE')) where PROPERTY_NAME like 'javax.jdo.option.Connection%'"
resulting in the following example return:

PROPERTY_NAME                            PROPERTY_VALUE                                                                                                                  
---------------------------------------- ---------------------------------------------------------------------------------------------------------------------
javax.jdo.option.ConnectionUserName      hive 
javax.jdo.option.ConnectionDriverName    com.ibm.db2.jcc.DB2Driver 
javax.jdo.option.ConnectionURL           jdbc:db2://c-jun06a-db2u-0.c-jun06a-db2u-internal:50000/BLUDB:currentSchema=SYSHIVE;clientProgramName=HiveMetastore;  

  3 record(s) selected.