SHOW_HIVE_COMPACTIONS stored procedure

The SHOW_HIVE_COMPACTIONS stored procedure returns information about all currently running or recently completed compactions. Compaction is the aggregation of small delta directories and files (which are created when data in a transactional table is updated) into a single directory, thereby preventing the performance degradations that might otherwise occur.

Authorization

Only the bigsql user or a user with Db2® Big SQL administrative privileges can run this Hadoop procedure. However, the bigsql user can grant EXECUTE privileges to any user, group, or role.

Syntax

Read syntax diagramSkip visual syntax diagramSHOW_HIVE_COMPACTIONS

Description

The function returns the following fields:

Table 1. Information returned by the SHOW_HIVE_COMPACTIONS stored procedure
Field Description
COMPACTION_ID Identifier for a compaction.
HIVE_SCHEMA Name of the Hive schema (or database) to which the compaction applies.
HIVE_TABLE Name of the Hive table to which the compaction applies.
PARTITION For partitioned tables, the partition-spec of the partition to which the compaction applies. For nonpartitioned tables, this field contains the null value.
TYPE The type of compaction that is being applied. Valid values are MINOR and MAJOR.
STATE The current state of the compaction. Valid values are:
  • attempted
  • failed
  • initiated
  • ready for cleaning
  • succeeded
  • working
WORKER_ID Thread ID of the Java thread for a currently running compaction. If the state is not working, this field contains the null value.
START_TIME Timestamp corresponding to the time at which the compaction started.
DURATION Length of time, in milliseconds, taken to complete a successful compaction. This field contains the null value for any compactions that have not yet completed or that were unsuccessful.
HADOOP_JOB_ID Identifier for the Hadoop job that is used to run the compaction.

Usage notes

The number of completed compactions that are reported is determined by the following Hive configuration settings:
hive.compactor.history.retention.succeeded
Specifies the number of successful compactions (per table or partition) to retain in the compaction history (the default is 3).
hive.compactor.history.retention.failed
Specifies the number of unsuccessful compactions (per table or partition) to retain in the compaction history (the default is 3).
hive.compactor.history.retention.attempted
Specifies the number of "failed-to-start" compaction attempts to retain in the compaction history (the default is 3).
hive.compactor.history.reaper.interval
Specifies how often the compaction history is to be purged (the default is 2m).

Example

  • Retrieve information about all currently running or recently completed compactions.
    CALL SYSHADOOP.SHOW_HIVE_COMPACTIONS;
    
      Result set 1
      --------------
    
      COMPACTION_ID HIVE_SCHEMA HIVE_TABLE PARTITION     TYPE   STATE      WORKER_ID           START_TIME           DURATION HADOOP_JOB_ID          
      ------------- ----------- ---------- ------------- ------ ---------- ------------------- -------------------- -------- -----------------------
      118           sales       orders     region=SOUTH  MINOR  working    example.ibm.com-35  2020-01-27-17.30.00  -        job_1579190585432_0016 
      117           sales       orders     region=NORTH  MINOR  working    example.ibm.com-41  2020-01-27-17.29.50  -        job_1579190585432_0015 
      116           sales       orders     region=WEST   MINOR  working    example.ibm.com-38  2020-01-27-17.29.45  -        job_1579190585432_0014 
      115           sales       orders     region=EAST   MINOR  working    example.ibm.com-40  2020-01-27-17.29.15  -        job_1579190585432_0013 
      114           hr          emp        -             MAJOR  succeeded  -                   2020-01-27-17.18.30  4000     None                   
      113           hr          emp        -             MAJOR  succeeded  -                   2020-01-27-17.18.30  644000   job_1579190585432_0007 
    
      6 record(s) selected.