SHOW_HIVE_COMPACTIONS stored procedure

The SHOW_HIVE_COMPACTIONS stored procedure returns information about all currently running or recently completed compactions. Compaction is the aggregation of small delta directories and files (which are created when data in a transactional table is updated) into a single directory, thereby preventing the performance degradations that might otherwise occur.

Authorization

Only the bigsql user or a user with Db2® Big SQL administrative privileges can run this Hadoop procedure. However, the bigsql user can grant EXECUTE privileges to any user, group, or role.

Syntax

Description

The function returns the following fields:

Table 1. Information returned by the SHOW_HIVE_COMPACTIONS stored procedure
Field	Description
COMPACTION_ID	Identifier for a compaction.
HIVE_SCHEMA	Name of the Hive schema (or database) to which the compaction applies.
HIVE_TABLE	Name of the Hive table to which the compaction applies.
PARTITION	For partitioned tables, the `partition-spec` of the partition to which the compaction applies. For nonpartitioned tables, this field contains the null value.
TYPE	The type of compaction that is being applied. Valid values are MINOR and MAJOR.
STATE	The current state of the compaction. Valid values are: attempted failed initiated ready for cleaning succeeded working
WORKER_ID	Thread ID of the Java thread for a currently running compaction. If the state is not `working`, this field contains the null value.
START_TIME	Timestamp corresponding to the time at which the compaction started.
DURATION	Length of time, in milliseconds, taken to complete a successful compaction. This field contains the null value for any compactions that have not yet completed or that were unsuccessful.
HADOOP_JOB_ID	Identifier for the Hadoop job that is used to run the compaction.

Usage notes

The number of completed compactions that are reported is determined by the following Hive configuration settings:

hive.compactor.history.retention.succeeded: Specifies the number of successful compactions (per table or partition) to retain in the compaction history (the default is 3).
hive.compactor.history.retention.failed: Specifies the number of unsuccessful compactions (per table or partition) to retain in the compaction history (the default is 3).
hive.compactor.history.retention.attempted: Specifies the number of "failed-to-start" compaction attempts to retain in the compaction history (the default is 3).
hive.compactor.history.reaper.interval: Specifies how often the compaction history is to be purged (the default is 2m).

Example

Retrieve information about all currently running or recently completed compactions.

CALL SYSHADOOP.SHOW_HIVE_COMPACTIONS;

  Result set 1
  --------------

  COMPACTION_ID HIVE_SCHEMA HIVE_TABLE PARTITION     TYPE   STATE      WORKER_ID           START_TIME           DURATION HADOOP_JOB_ID          
  ------------- ----------- ---------- ------------- ------ ---------- ------------------- -------------------- -------- -----------------------
  118           sales       orders     region=SOUTH  MINOR  working    example.ibm.com-35  2020-01-27-17.30.00  -        job_1579190585432_0016 
  117           sales       orders     region=NORTH  MINOR  working    example.ibm.com-41  2020-01-27-17.29.50  -        job_1579190585432_0015 
  116           sales       orders     region=WEST   MINOR  working    example.ibm.com-38  2020-01-27-17.29.45  -        job_1579190585432_0014 
  115           sales       orders     region=EAST   MINOR  working    example.ibm.com-40  2020-01-27-17.29.15  -        job_1579190585432_0013 
  114           hr          emp        -             MAJOR  succeeded  -                   2020-01-27-17.18.30  4000     None                   
  113           hr          emp        -             MAJOR  succeeded  -                   2020-01-27-17.18.30  644000   job_1579190585432_0007 

  6 record(s) selected.