GET_CACHE_FILE_INFO table function

The SYSHADOOP.GET_CACHE_FILE_INFO table function provides a method for retrieving information about the files for tables that are cataloged in the Hive metastore and that might be cached in the scheduler cache.

This information, together with output from the GET_CACHE_TABLE_INFO table function, can be useful for the analysis of scenarios where performance might be impacted by data fragmentation or suboptimal data layout.

Syntax

Read syntax diagramSkip visual syntax diagram GET_CACHE_FILE_INFO ( schema-nameregex-patternnull,table-nameregex-patternnull )

Description

schema-name | regex-pattern
Specifies a valid schema name or a regular expression pattern to match against table schemas in the scheduler cache.
table-name | regex-pattern
Specifies a valid table name or a regular expression pattern to match against table names in the scheduler cache.
The function returns one row per file for all data files in one or more tables. Each row contains the following columns:
Table 1. Information returned by the GET_CACHE_FILE_INFO table function
Column name Data type Description
TABSCHEMA VARCHAR(128) Schema name of the table as it appears in the Db2® catalog
TABNAME VARCHAR(128) Table name as it appears in the Db2 catalog
LOCATION VARCHAR(1000) Directory location of a data file for the table
FILE_NAME VARCHAR(1000) Name of the data file
FILE_SIZE BIGINT Size, in bytes, of the data file
MODIFICATION_TIME TIMESTAMP(3) Time at which the file was last modified, as reported by the file system

Usage

If the arguments identify a fully qualified single table (neither argument is a regular expression or the null value), and the metadata for that table is not available in the cache, the table is loaded from the file system to collect metadata for each file.

If either argument is a regular expression, results are returned only for matching tables that are currently cached. Null input arguments or empty strings are treated as regular expressions that match any schema or table name.

If a table does not exist, or no tables that match the input criteria are found, an empty result set is returned.

This table function does not return any information for regular Db2 tables that are not cataloged in the Hive metastore.

Examples

The following example shows you how to get information about all data files for a specific table, regardless of whether or not the table is currently cached, by explicitly specifying both input parameters:
SELECT
    VARCHAR(TABSCHEMA,12) AS TABSCHEMA,
    VARCHAR(TABNAME,24) AS TABNAME,
    VARCHAR(LOCATION,110) AS LOCATION,
    VARCHAR(FILE_NAME, 60) AS FILE_NAME,
    FILE_SIZE
  FROM TABLE(SYSHADOOP.GET_CACHE_FILE_INFO('my_schema', 'partition_example'));

TABSCHEMA    TABNAME                  LOCATION                                                                                                       FILE_NAME                                                    FILE_SIZE           
------------ ------------------------ -------------------------------------------------------------------------------------------------------------- ------------------------------------------------------------ --------------------
MY_SCHEMA    PARTITION_EXAMPLE        hdfs://eikon1.fyre.ibm.com:8020/warehouse/tablespace/external/hive/my_schema.db/partition_example/part=2       i_1588258707476_-1821197269_20200430080734642_2.0                             521
MY_SCHEMA    PARTITION_EXAMPLE        hdfs://eikon1.fyre.ibm.com:8020/warehouse/tablespace/external/hive/my_schema.db/partition_example/part=2       i_1588258707476_911464508_20200430080608522_2.0                               487
MY_SCHEMA    PARTITION_EXAMPLE        hdfs://eikon1.fyre.ibm.com:8020/warehouse/tablespace/external/hive/my_schema.db/partition_example/part=2       l1587042108424-84-r-00000                                                     423
MY_SCHEMA    PARTITION_EXAMPLE        hdfs://eikon1.fyre.ibm.com:8020/warehouse/tablespace/external/hive/my_schema.db/partition_example/part=1       i_1588258707476_-1821197269_20200430080734642_2.0                             457
MY_SCHEMA    PARTITION_EXAMPLE        hdfs://eikon1.fyre.ibm.com:8020/warehouse/tablespace/external/hive/my_schema.db/partition_example/part=1       i_1588258707476_911464508_20200430080608522_2.0                               432
MY_SCHEMA    PARTITION_EXAMPLE        hdfs://eikon1.fyre.ibm.com:8020/warehouse/tablespace/external/hive/my_schema.db/partition_example/part=1       l1587042108424-84-r-00002                                                     380
MY_SCHEMA    PARTITION_EXAMPLE        hdfs://eikon1.fyre.ibm.com:8020/warehouse/tablespace/external/hive/my_schema.db/partition_example/part=3       i_1588258707476_-1821197269_20200430080734642_2.0                             348
MY_SCHEMA    PARTITION_EXAMPLE        hdfs://eikon1.fyre.ibm.com:8020/warehouse/tablespace/external/hive/my_schema.db/partition_example/part=3       i_1588258707476_911464508_20200430080608522_2.0                               348

  8 record(s) selected.
The following example shows you how to get information about data files for one or more tables that are currently in the cache, by using regular expressions:
SELECT
    VARCHAR(TABSCHEMA,12) AS TABSCHEMA,
    VARCHAR(TABNAME,24) AS TABNAME,
    VARCHAR(FILE_NAME,60) AS FILE_NAME,
    FILE_SIZE
  FROM TABLE(SYSHADOOP.GET_CACHE_FILE_INFO('.*', '(simp|parq).*_example'));

TABSCHEMA    TABNAME                  FILE_NAME                                                    FILE_SIZE           
------------ ------------------------ ------------------------------------------------------------ --------------------
MY_SCHEMA    SIMPLE_EXAMPLE           i_1588258707476_-255077949_20200430080445125_2.0                               82
MY_SCHEMA    SIMPLE_EXAMPLE           i_1588258707476_1307227570_20200430080601151_2.0                              126
MY_SCHEMA    SIMPLE_EXAMPLE           l1587042108424-83-m-00000                                                      44
MY_SCHEMA    PARQUET_EXAMPLE          i_1588258707476_-1266060962_20200430080743210_2.0                             942

  4 record(s) selected.