VIRTUALIZECOSOBJECT stored procedure

Virtualizes a table from a file or directory on a Cloud Object Storage (COS) (Parquet, ORC, .csv, .tsv, .JSON).

Input parameters

URL
The object or directory URL on the Cloud Object Storage in this format: s3a//<bucket_name>/<path to file or directory>.
OPTIONSCSV
Specifies the comma-separated list of strings for extra options. The following are supported options:
  • INCPARTS: This option type is BOOLEAN (True/False). Specifies whether to include partition columns in the results. The default value is False.
  • COLNAMES: This option type is BOOLEAN (True/False). Specifies whether the first row should be interpreted as column names. The default value is False.
  • QUOTECHAR: This option type is CHAR. A single character that you use for delimiting string columns. The value cannot be a comma, tab, backslash or space (blank).
  • UCASENAMES: This option type is BOOLEAN (True/False). If VIRTDEF is False, then this option specifies simple column names [A-Za-z0-9_] to be uppercase in the exposed virtual table.
  • STRING_TO_VARCHAR_SIZE: This option type is INTEGER(32672). Automatically convert STRING type columns to VARCHAR with the specified value as the length of characters.
VIRTSCHEMA
The type of this required parameter is VARCHAR(128). Specifies the local schema of the virtualized table.
VIRTNAME
The type of this required parameter is VARCHAR(128). Specifies the local name of the virtualized table.
VIRTDEF
The type of this optional parameter is VARCHAR(32672). Specifies a definition of the virtualized table. If this parameter is empty or NULL, then the virtual table definition (the column names and types) is inferred. Follow these guidelines to specify the definition:
  • Ensure the value is a non-bracketed CSV list of elements with the format: column name + Db2 column type, i.e. <column_name_1> <column_type_1>[,<column_name_n> <column_type_n>]*. For more information on Db2 column types, see primitive-type.
  • The value can be different from the column names and types in the file source.
  • The number of elements must be equal to either the number of non-partitioning columns or the total number of columns (non-partitioning and partitioning) from the object. You cannot specify subsets of columns.
  • Column names containing non alpha-numeric characters should be wrapped in double quotation marks, particularly if they contain white space characters or nested double quotation marks. In addition, internal double quotation marks must be escaped by doubling them. For example:
    • Original column name: My "best" column1.
    • Wrapped column name (as specified in VIRTDEF): "My ""best"" column1".
REPLACE
The type of this required parameter is CHAR(1). Valid values are:
  • Y: Replaces the nickname if it exists.
  • N: Does not replace the existing nickname.

Output parameters

DIAGS
The type of this parameter is VARCHAR(32672). Represents the diagnostic messages.

Syntax

Use the following syntax and replace the parameters for this stored procedure:

call dvsys.virtualizeCosObject('s3a//<bucket_name>/<path to file or directory>', '<OPTIONSCSV>', '<VIRTSCHEMA>', '<VIRTNAME>', '<VIRTDEF>', '<REPLACE Y/N>', ?);

Example

call dvsys.virtualizeCosObject('s3a://my-bucket/test-files/', 'COLNAMES=true', 'VIRT_SCHEMA', 'VIRT_COS_TABLE', '', 'Y', ?);