CREATE DATALAKE TABLE statement
The CREATE DATALAKE TABLE statement defines a Db2® table that is based on a Hive table for the Datalake environment. While tables typically reside in a database, a Datalake table resides outside the database, in an external file or group of files that can be in ORC, PARQUET, or TEXTFILE format. You can use Datalake tables to do the following tasks:
- Store data outside the database while retaining the ability to query that data. To export data from the Db2 database into a Datalake table, specify the Datalake table as the target table in one of the following SQL statements:
- INSERT SQL
- SELECT INTO SQL
- CREATE DATALAKE TABLE AS SELECT SQL
- Import data from a Datalake table into a table in the database. You can perform operations such as casts, joins, and dropping columns to manipulate data during the import. To import data into the database from a Datalake table, use a FROM clause in a SELECT SQL statement as you would for any other table.
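As a sketch of both directions (the table and column names here are hypothetical, not part of this reference):

```sql
-- Export: write rows from a local Db2 table into a Datalake table
INSERT INTO sales_dl SELECT * FROM sales_local;

-- Import: read a Datalake table like any other table,
-- applying casts and joins during the load
INSERT INTO sales_summary
  SELECT s.region_id, CAST(s.amount AS DECIMAL(10,2))
  FROM sales_dl s JOIN regions r ON s.region_id = r.region_id;
```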
- Hive
- A Hive Datalake table does not support ACID properties or the UPDATE and DELETE operations. This type of Datalake table is recommended for read-only or append-only workloads.
- Iceberg
- Iceberg tables support ACID table properties and are recommended for scenarios that require more flexibility in data manipulation operations. At this time, Db2 does not yet support all of the capabilities of Apache Iceberg. For more information about Iceberg tables, see Using Datalake tables.
To create an Iceberg Datalake table, use the STORED BY ICEBERG clause.
The definition must include its name and the names and attributes of its columns. The definition can include other attributes of the table, such as its primary key or check constraints, although the constraints are informational only. For more information, see Informational constraints.
Invocation
This statement can only be executed as a dynamic statement using EXECUTE IMMEDIATE. It is an executable statement that can be dynamically prepared only if DYNAMICRULES run behavior is in effect for the package (SQLSTATE 42509).
Authorization
- One of the following privileges or authorities:
- SYSADM authority
- SYSCTRL authority
- Plus one of these privileges or authorities:
- CREATEIN privilege on the schema, if the schema name of the table refers to an existing schema
- SCHEMAADM authority on the schema, if the schema name of the table refers to an existing schema
- REFERENCES privilege on the table
- REFERENCES privilege on each column of the specified parent key
- CONTROL privilege on the table
- SCHEMAADM authority on the schema, if the schema name of the parent table refers to an existing schema
- DBADM authority
- DATAACCESS authority
- The following privileges on the files and the container (bucket):
- READ and WRITE privileges on the container and all files in the file path, including files in any file paths that might be recursively specified in the LOCATION clause
- EXECUTE privileges on the container and all the file paths in the specified LOCATION clause
Syntax
- 1 LIKE is not supported for ICEBERG tables (STORED BY ICEBERG).
- 2 Each clause can be used only once.
- 3 The LOCATION clause is mandatory.
- 4 You do not need to specify values for column-name when you use as-result-table because the column details can be inferred from the query.
- 5 The CREATE DATALAKE TABLE statement supports the data types that are described in Data types supported by DATALAKE tables and Data types. However, some data types are syntactically supported for compatibility with Hive but are treated as different data types within the Db2 runtime environment. For more information, see the mapping of data types that are used in the CREATE TABLE statement.
- 6 Because the STRING data type has no specific size associated with it, the SQL processor might assume that each value in a STRING column always contains 32,672 bytes of data, and performance might be impacted. If performance is a concern, use data types that have a specific length.
- 7 For Iceberg tables, the AS identifier clause is not supported. For non-Iceberg tables, it is required.
- 8 When using CTAS, an error can occur if the select statement references a data type that is not supported by the target table. For example, Iceberg does not support a TIMESTAMP with a precision greater than 6. Use a cast in the select statement to convert such values to a data type that the target table supports.
Description
- DATALAKE
- This keyword is required when creating a Datalake table. You can verify that you have defined a
Datalake table by querying the PROPERTY column in the SYSCAT.TABLES catalog view. For
example:
SELECT SUBSTRING(PROPERTY,22,1) FROM SYSCAT.TABLES WHERE TABNAME='<table-name>';
- IF NOT EXISTS
- Checks whether the specified table name already exists. If the name exists, no error is returned.
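For example, re-running a creation script does not fail if the table already exists (a sketch with a hypothetical table name and location):

```sql
-- A second execution of this statement is a no-op rather than an error
CREATE DATALAKE TABLE IF NOT EXISTS t_safe (i INT)
  STORED AS PARQUET
  LOCATION 'DB2REMOTE://odfdefault//defaultbucket/t_safe';
```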
- table-name
- Specifies a unique name for the new table. The table name must be a valid identifier.
You can use the RENAME DATALAKE TABLE statement to rename a Datalake table. When you rename a Datalake table, the name in both the local database catalog and the Hive catalog is updated.
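A minimal sketch of the rename operation (hypothetical table names):

```sql
-- Updates the table name in both the local database catalog and the Hive catalog
RENAME DATALAKE TABLE t1 TO t1_new;
```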
- element-list
- Defines the elements of a table, including the definition of columns and constraints on the table.
- column-definition
- Defines the attributes of the columns for a new Datalake table that is not derived from another
table by using the AS clause.
- column-name
- Names a column of the table. The name cannot be qualified, and the same name cannot be used for more than one column of the table. The column name must be a valid identifier.
- data-type
- Specifies the data type of the column.
- primitive-type
- There are several built-in data types:
- TINYINT
- A 1-byte integer with a range of -128 to 127.
Db2 automatically converts the TINYINT to a larger integer type, such as SMALLINT, INT, or BIGINT, or a floating-point type such as FLOAT or DOUBLE.
- SMALLINT | INT2
- A 2-byte integer with a precision of 5 digits and a range of -32,768 to 32,767.
- INTEGER | INT | INT4
- A 4-byte integer with a precision of 10 digits and a range of -2,147,483,648 to +2,147,483,647.
- BIGINT | INT8
- An 8-byte integer with a precision of 19 digits and a range of -9,223,372,036,854,775,808 to +9,223,372,036,854,775,807.
- FLOAT(integer) | FLOAT | REAL | FLOAT4 | DOUBLE | FLOAT8
- A single-precision or double-precision floating-point number. If the specified length is in the range:
- 1 - 24, the number uses single precision.
- 25 - 53, the number uses double precision.
A single-precision floating-point number is a 32-bit approximation of a real number. The number can be zero or can range from -3.4028234663852886e+38 to -1.1754943508222875e-38, or from 1.1754943508222875e-38 to 3.4028234663852886e+38.
A double-precision floating-point number is a 64-bit approximation of a real number. The number can be zero or can range from -1.7976931348623158e+308 to -2.2250738585072014e-308, or from 2.2250738585072014e-308 to 1.7976931348623158e+308.
A REAL number is a single-precision floating-point number. A DOUBLE number is a double-precision floating-point number.
- CHARACTER(integer) | CHAR(integer)
- A character string is a sequence of characters. The length of the string (integer) is the number of characters in the sequence. The default is 1.
- STRING | VARCHAR(integer) | CHARACTER VARYING(integer) | CHAR VARYING(integer)
- The STRING data type has no specific size associated with it. The default is VARCHAR(32672).
- TIMESTAMP | DATETIME
- A six- or seven-part value (year, month, day, hour, minute, second, and optional fractional
seconds) that designates a date and time.
If specified, integer must be between 0 and 9 and represents the precision of fractional seconds from 0 (seconds) to 9 (nanoseconds).
For Iceberg tables, the default precision is 6 and it is also the maximum for Apache Iceberg.
For Hive Datalake tables, the default precision is 9 and it is also the maximum.
Hive does not support specifying a precision on TIMESTAMP, which means that the underlying storage always has a precision of 6 for Iceberg tables and a precision of 9 for non-Iceberg tables.
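As a sketch of the precision rules above (hypothetical table names and locations); note that TIMESTAMP(9) would be rejected on the Iceberg table because 6 is the Apache Iceberg maximum:

```sql
-- Hive Datalake table: default and maximum precision is 9
CREATE DATALAKE TABLE events_hive (ts TIMESTAMP(9))
  STORED AS PARQUET
  LOCATION 'DB2REMOTE://odfdefault//defaultbucket/events_hive';

-- Iceberg table: precision defaults to 6, which is also the maximum
CREATE DATALAKE TABLE events_ice (ts TIMESTAMP(6))
  STORED BY ICEBERG
  LOCATION 'DB2REMOTE://odfdefault//defaultbucket/events_ice';
```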
- DATE
- A three-part value (year, month, and day) that designates a date. The range of the year part is 0001 to 9999. The range of the month part is 1 to 12. The range of the day part is 1 to x, where x depends on the month.
- DECIMAL | DEC | NUMERIC | NUM (precision-integer, scale-integer)
- A decimal number with defined precision and scale. The precision is the total number of digits and can range from 1 to 31. The scale is the number of digits to the right of the decimal point and can range from 0 to the precision of the number. The default precision is 10 and scale is 0.
- BINARY | VARBINARY (integer)
- A varying-length binary value, specified as VARBINARY(length) or BINARY(length), can be up to 32,672 bytes long. VARBINARY is an alias for BINARY, and they are functionally identical in every way. If no length is provided, the length is assumed to be 32,672.
However, it is recommended that you specify an explicit length. The Db2 engine works most efficiently when the table definition enables a row to fit within 32 KB of memory. When the calculated row size exceeds 32 KB, some queries might see a performance degradation.
- BOOLEAN
- A Boolean value.
- column-options
- You can define column options including a comment, the nullability attribute, or constraints.
- COMMENT column-comment
- A comment that provides additional information about the column and which is stored in the internal Hive metastore.
- NULL | NOT NULL
- NOT NULL specifies that the column cannot contain null values. Because the Datalake table is external to any local database server, Db2 enforces nullability at query execution time (SQL5104N). The nullability attribute of a column is also enforced during INSERT operations; do not assign a null value to a column that is defined as NOT NULL (SQLSTATE 23502).
- CONSTRAINT constraint-name
- Specifies the name of a constraint. Constraints defined on Datalake tables are informational
only. For more information, see Informational
constraints. You can define the following constraints on a column:
- PRIMARY KEY
- A primary key must be unique and must have the NOT NULL attribute. A table cannot have more than one primary key.
- UNIQUE
- A unique key prevents duplicate values in one or more NOT NULL columns of a table. A table can have multiple unique keys.
- REFERENCES table-name
- Ensures that required table relationships are maintained and that data entry rules are followed. The column on which this parameter is specified becomes the foreign key to the referenced parent table table-name.
- CHECK check-condition
- Sets restrictions on data that is added to a specific table. This parameter is enforced during
INSERT operations or on data that is
made available outside of the database manager (by using Hive operations, for example).
- search-condition
- The search-condition must be true or unknown for every row of the table, and
has the following restrictions:
- A column reference must pertain to a column of the table that is being created.
- The search condition cannot contain a TYPE predicate.
- The search condition cannot contain any of the following items (SQLSTATE 42621):
- Subqueries
- XMLQUERY or XMLEXISTS expressions
- Dereference operations or DEREF functions where the scoped reference argument is other than the object identifier (OID) column
- CAST specifications with a SCOPE clause
- Column functions
- Functions that are not deterministic
- Functions that are defined to have an external action
- User-defined functions that are defined with either MODIFIES SQL or READS SQL DATA
- Host variables
- Parameter markers
- sequence-references
- OLAP specifications
- Special registers and built-in functions that depend on the value of a special register
- Global variables
- References to generated columns other than the identity column
- References to columns of type XML (except in a VALIDATED predicate)
- An error tolerant nested-table-expression
Check constraints with search-condition are enforced when rows in the table are inserted or updated. A check constraint that is defined on a table automatically applies to all subtables of that table.
- functional-dependency
- Defines a functional dependency between columns or sets of columns.
The parent set of columns precedes the DETERMINED BY clause, and the child set of columns follows the DETERMINED BY clause. All restrictions on search-condition apply to both parent set and child set columns, and only simple column references are allowed in the column sets (SQLSTATE 42621). The same column must not be identified more than once in a functional dependency (SQLSTATE 42709). No column in the child set of columns can be a nullable column (SQLSTATE 42621).
A functional dependency is not enforced by the database manager during normal operations such as insert, update, or delete. The functional dependency might be used during query rewrite for optimization. Incorrect results might be returned if the integrity of a functional dependency is not maintained.
- constraint-attributes
- Defines attributes that are associated with primary key, unique, referential integrity, or check constraints.
- NOT ENFORCED | ENFORCED
- Specifies whether the constraint is enforced by the database manager during normal operations
such as insert, update, or delete.
- NOT ENFORCED
- Specifies that the constraint is not enforced by the database manager. This is the default for
Datalake tables. A primary key or unique constraint cannot be NOT ENFORCED if there is a dependent
ENFORCED referential constraint.
- TRUSTED | NOT TRUSTED
- Specifies whether the data can be trusted to conform to the constraint.
- TRUSTED
- Specifies that the data can be trusted to conform to the constraint. This is the default. Use this option only if the table data is independently known to conform to the constraint; otherwise, unpredictable results might occur.
- NOT TRUSTED
- Specifies that the data cannot be trusted to conform to the constraint. This option is valid only for referential integrity constraints (SQLSTATE 42613). Use this option when the data conforms to the constraint for most rows, but it is not independently known whether all current and future rows will conform. If a constraint is both not trusted and enabled for query optimization, it will not be used for any optimization that depends on the data being in complete conformity with the constraint.
- ENFORCED
- Although the ENFORCED keyword is available for compatibility with the syntax for local tables that are created with the CREATE TABLE statement, an error is returned when you specify this option for Datalake tables (SQLSTATE 42858).
- ENABLE QUERY OPTIMIZATION | DISABLE QUERY OPTIMIZATION
- Specifies whether the constraint or functional dependency can be used for query optimization
under appropriate circumstances.
- ENABLE QUERY OPTIMIZATION
- Specifies that the constraint is assumed to be true and can be used for query optimization. This is the default.
- DISABLE QUERY OPTIMIZATION
- Specifies that the constraint cannot be used for query optimization. This option cannot be specified for primary key and unique constraints (SQLSTATE 42613).
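To illustrate the constraint clauses above, the following sketch (hypothetical table and location) defines an informational primary key; NOT ENFORCED and ENABLE QUERY OPTIMIZATION are the defaults for Datalake tables and are spelled out here only for clarity:

```sql
CREATE DATALAKE TABLE customers (
  id   INT NOT NULL,
  name VARCHAR(50),
  CONSTRAINT pk_customers PRIMARY KEY (id)
    NOT ENFORCED ENABLE QUERY OPTIMIZATION
)
STORED AS PARQUET
LOCATION 'DB2REMOTE://odfdefault//defaultbucket/customers';
```

Because the constraint is informational only, duplicate id values are not rejected at insert time; the optimizer simply assumes uniqueness when rewriting queries.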
- column-name
- Names a column of the source table when you create a new Datalake (target) table by using the AS clause. If a column name is not specified, column names for the target table are derived from the result set of a query against the source table. You cannot explicitly specify the data types of these columns, because the types are derived from data types in the result set.
- unique-constraint
- Defines a unique key. This constraint does not allow duplicate values in one or more columns of the table. A table can have multiple unique keys. The columns that are specified in a unique constraint must be defined as NOT NULL.
- referential-constraint
- Defines a referential constraint.
- CONSTRAINT constraint-name
- Specifies a name for the referential constraint.
- FOREIGN KEY (column-name, ...)
- The foreign key of the referential constraint is composed of the identified columns. Each name in the list of column names must identify a column of the parent table, and the same column must not be identified more than once. The number of foreign key columns must be the same as the number of parent key columns, and the data types of the corresponding columns must be compatible (SQLSTATE 42830).
- check-constraint
- Defines a check constraint.
- constraint-name
- Specifies a name for the check constraint.
- check-condition
- See CHECK check-condition.
- LIKE table-name
-
- table-name
- Specifies that the table columns are to have exactly the same name and description as the columns of a table, view, or nickname that exists in the catalog, or a declared temporary table. The table name can refer only to another Datalake table (SQLSTATE 42858).
- COMMENT string-constant
- Specifies a comment that provides additional information about the table and which is stored in the internal Hive metastore. The maximum length of string-constant is 254 bytes.
- PARTITIONED BY partitioning-element-list
- Specifies how the data is to be partitioned. The action of including the PARTITIONED BY clause creates a file in the file path that contains data for each unique value in the column that is being partitioned. The partitioned value must be a sequence of Unicode characters with UTF-8 encoding, up to 1,024 bytes long, and is restricted to characters that can be included in the file path for object storage. Partition names are case-sensitive. At query time, Db2 tries to scan only the file paths, or partitions, that contain data matching the query's search criteria.
- column-definition
-
Specifies a new column definition that includes the column name, the column data type, and any column options. This will result in a new column being defined for the table and it will also be part of the partitioning-element-list. The following data types cannot be used as partitioning columns:
- BINARY
- BOOLEAN
- DOUBLE
- FLOAT
- REAL
- VARBINARY
Important: Although partitioning by TIMESTAMP is supported, it is not recommended. Partitioning on a TIMESTAMP value results in many small files, which can cause performance issues on reads and writes to the table.
Important: A partitioning key value cannot include any trailing blank spaces. This can be a problem if you plan to partition on a CHAR data type, because CHAR values are always padded to their full length with blank spaces. Db2 also does not support partitioning on a zero-length string.
- column-name
-
Specifies the name of a column that has been defined as part of the CREATE element-list and that is used as part of the partitioning-element-list.
- partition-expression
-
Specifies one or more partitioning expressions by which data that is inserted into the table is partitioned. The action of including the PARTITIONED BY clause creates a file in the file path that contains data for each unique value in the partitioning expression. At query time, Db2 tries to scan only the file paths, or partitions, that contain data that match the search criteria of the query. For Hive Datalake tables, this creates a generated column in the table for the expression. For more information, see Generated columns.
For Iceberg tables, this creates an implicitly hidden generated column in the table for the expression. For more information see, Generated columns and Hidden columns.
For Hive Datalake tables, the expression can reference one or more columns in the columns list. The AS clause specifies a name for the column that is derived from the expression and is required. The data type of this identifier matches the result type of the expression. If a query references the column that is derived from the expression, a new predicate is automatically injected into the query so that Db2 scans only the qualifying partitions.
For Iceberg tables, the AS clause is not required and is not supported. In addition, the only valid partition expressions that are supported by Iceberg Datalake tables are the following:
- identity
- Directly references an existing column in the table. For example, if a table has a column named C1, you can include a partition clause such as PARTITIONED BY (C1).
- truncate[W]
- Truncates the column value to the specified width (W) to create the partition value.
- year
- Extracts a date or timestamp year from the column value to create the partition value.
- month
- Extracts a date or timestamp month from the column value to create the partition value.
- day
- Extracts a date or timestamp day from the column value to create the partition value.
- hour
- Extracts a timestamp hour from the column value to create the partition value.
The partitioning expression cannot contain any of the following items (SQLSTATE 42621):
- Subqueries
- Column functions
- Dereference operations or DEREF functions
- User-defined or built-in functions that are non-deterministic
- User-defined functions that use the EXTERNAL ACTION option
- User-defined functions that are not defined with NO SQL
- Host variables or parameter markers
- Special registers or built-in functions that depend on the value of a special register
- Global variables
- An expression on a partitioning expression column
- ROW FORMAT
- Specifies the format of data rows.
- delimited-row-format
- Specifies a delimiter at the table level.
- DELIMITED
- Valid values for the FIELDS TERMINATED BY and ESCAPED BY parameters are the following items. Important: It is recommended to use different values for each of these parameters; otherwise, the behavior is undefined.
- A literal single-byte character such as vertical bar (|) or comma (,).
- A value of the format \nnn, where nnn is the octal value of a character. For example, \001 is the ASCII character that represents CTRL-A.
- FIELDS TERMINATED BY terminator-char
- Specifies a delimiter for structured fields. You can include any single character, but the default is \001, the ASCII character that represents CTRL-A. Characters that match the field terminator character can be part of your data if you specify the ESCAPED BY clause and those characters are properly escaped.
- LINES TERMINATED BY terminator-char
- Specifies a delimiter for data rows. The only valid delimiter is a newline character (\n). Newline characters must not exist within your data because they cannot be escaped.
- NULL DEFINED AS null-value
- Specifies a value to represent the null value. The default is a literal backslash character followed by an uppercase N (\N).
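A sketch that combines the delimited-row-format clauses above into one statement (hypothetical table name and location):

```sql
CREATE DATALAKE TABLE csv_data (id INT, note VARCHAR(100))
  ROW FORMAT DELIMITED
    FIELDS TERMINATED BY ','   -- overrides the \001 default
    NULL DEFINED AS '\N'       -- the default null marker, shown explicitly
  STORED AS TEXTFILE
  LOCATION 'DB2REMOTE://odfdefault//defaultbucket/csv_data';
```

Rows in the underlying files are newline-terminated; because newlines cannot be escaped, the data itself must not contain them.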
- serde-row-format
- Specifies the format of data rows as a SerDe class.
- STORED AS file-format
-
Specifies the format of the underlying data files. For more information about the supported file formats, see Supported file formats.
The file format can also be specified as Java input and output classes, and this is shown in the output of the db2look command.
- INPUTFORMAT class-name
- Specifies the name of the Java class that implements the input format interface.
- OUTPUTFORMAT class-name
-
Specifies the name of the Java class that implements the output format interface. For example, for a Datalake table STORED AS TEXTFILE, the following example output is returned after running the db2look command:
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' STORED AS INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
- STORED BY ICEBERG
- Specifies that the Datalake table is an Iceberg Datalake table.
- LOCATION string-constant
-
Specifies the location of the table data. The location must be a DB2REMOTE identifier containing an access storage alias that establishes connectivity to the remote storage where the table data is to reside. For more information, see Remote storage connectivity for Datalake tables. Access to the file path that is specified in string-constant can be subject to additional file permissions depending on the storage service being used.
Note: The LOCATION clause must always specify a directory that contains the files for the table that is being created. The files must have the same structure. If you have files with different structures and you want to create a different table for each structure, the files must be in separate directories.
- TBLPROPERTIES ('string-constant1' = 'string-constant2')
- Defines job properties that can configure input or output formats, and that you can use to specify more detailed information about the table. The properties are passed as-is to the underlying InputFormat and OutputFormat classes. Properties that are not implemented by these classes are ignored. Db2 does not validate the properties.
- as-result-table
- Specifies that the table is to be based on a SELECT statement.
- AS (fullselect)
- Specifies that, for each column in the derived result table of the
fullselect, a corresponding column is to be defined for the table. Each defined
column adopts the following attributes from its corresponding column of the result table (if
applicable to the data type):
- Column name
- Column description
- Data type, length, precision, and scale
- Nullability
The following attributes are not included (although the default value and identity attributes can be included by using copy-options):
- Default value
- Identity attributes
- Hidden attribute
- ROW CHANGE TIMESTAMP
- Any other optional attributes of the tables or views that are referenced in the fullselect
The following restrictions apply:
- Every select list element must have a unique name (SQLSTATE 42711). The AS clause can be used in the select clause to provide unique names.
- The fullselect cannot refer to host variables or include parameter markers.
- The data types of the result columns of the fullselect must be data types that are valid for columns of a table.
- If row or column level access control (RCAC) is activated for any table that is specified in the fullselect, RCAC is not cascaded to the new table.
- The fullselect cannot include a data-change-table-reference clause (SQLSTATE 428FL).
- Any valid fullselect that does not reference a typed table or a typed view can be specified.
- WITH NO DATA
- Specifies that the target table is not to be populated with data.
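A sketch that combines as-result-table with WITH NO DATA to copy only the column definitions (hypothetical names, assuming a source Datalake table t1):

```sql
-- Creates t_empty with t1's column names, types, and nullability, but no rows
CREATE DATALAKE TABLE t_empty
  LOCATION 'DB2REMOTE://odfdefault//defaultbucket/t_empty'
  AS (SELECT * FROM t1) WITH NO DATA;
```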
Usage notes
- The maximum number of Datalake tables that can be created in Db2 is 1,048,576.
- If the current schema, which might have been created with either the USE or SET SCHEMA command, does not exist in the Hive metastore, an attempt is made to automatically create the schema in Hive.
- All referential integrity constraints and nullability indicators are advisory only and are not always enforced.
-
When you use data types such as STRING and BINARY, the SQL processor might assume that such values always contain 32,672 bytes of data, which can impact performance. Instead, use data types that have a specific length; for example, to improve performance, use a VARCHAR(n) with a defined size instead of a STRING.
Altering a STRING column to VARCHAR(n) changes the definition of the table both in Db2 and in the Hive Metastore (HMS). If data in the VARCHAR(n) column exceeds n bytes, the data is returned, but it is truncated.
- If no precision is included on the TIMESTAMP definition, the default is TIMESTAMP(6) for Iceberg tables and TIMESTAMP(9) for Hive Datalake tables.
- If you create a table in a schema that does not exist, the schema is automatically created with the default name (your user ID).
- To prevent the risk of data loss, creation of Iceberg tables in a location where Iceberg metadata files already exist is blocked. To create a table in this location, delete the data and metadata from the storage first.
- CMX compression is supported in Db2. To enable CMX compression, set the following compression properties before running the CREATE DATALAKE TABLE statement:
SET HADOOP PROPERTY 'mapred.output.compress'='true';
SET HADOOP PROPERTY 'mapred.output.compression.type'='BLOCK';
SET HADOOP PROPERTY 'mapred.output.compression.codec'='org.apache.hadoop.io.compress.CmxCodec';
- For information about the data type mappings that are supported in the CREATE DATALAKE TABLE syntax, see Data types supported by Datalake tables.
-
Datalake tables and local Db2 tables all use the same namespace. If you synchronize a table from IBM watsonx.data to Db2, ensure that the table does not conflict with an existing Db2 table.
Restrictions
For information about restrictions, see Restrictions and limitations on Datalake tables.
Examples
- To create a Datalake table (t1) with a Parquet file format, run the following statement:
CREATE DATALAKE TABLE t1 (i INT, s VARCHAR(10)) STORED AS PARQUET LOCATION 'DB2REMOTE://odfdefault//defaultbucket/t1' TBLPROPERTIES ('external.table.purge'='true');
- To create a Datalake table (t2) by using the structure and contents of another Datalake table, run the following statement:
CREATE DATALAKE TABLE t2 LOCATION 'DB2REMOTE://odfdefault//defaultbucket/t2' AS (SELECT * FROM t1);
- To create a Datalake table (orders) with a partitioning column, run the following statement:
CREATE DATALAKE TABLE orders (orderkey INT NOT NULL, o_comment VARCHAR(20) COMMENT 'comment column' NOT NULL) PARTITIONED BY (orderdate VARCHAR(10)) LOCATION 'DB2REMOTE://odfdefault//defaultbucket/orders';
- To create an Iceberg table (t1_ice) with the default Parquet file format, run the following statement:
CREATE DATALAKE TABLE t1_ice (id INT, age INT, name VARCHAR(30), birth_date DATE) STORED BY ICEBERG LOCATION 'db2remote://default//t1_ice';
- To create an Iceberg table (t1_ice) with the Parquet file format specified explicitly, run the following statement:
CREATE DATALAKE TABLE t1_ice (id INT, age INT, name VARCHAR(30), birth_date DATE) STORED BY ICEBERG STORED AS PARQUET LOCATION 'db2remote://default//t1_ice';
- To create an Iceberg table (t1_ice) with the ORC file format, run the following statement:
CREATE DATALAKE TABLE t1_ice (id INT, age INT, name VARCHAR(30), birth_date DATE) STORED BY ICEBERG STORED AS ORC LOCATION 'db2remote://default//t1_ice';
- To create an Iceberg table (t1_ice) that specifies the external.table.purge property, run the following statement:
CREATE DATALAKE TABLE t1_ice (id INT, age INT, name VARCHAR(30), birth_date DATE) STORED BY ICEBERG LOCATION 'db2remote://default//t1_ice' TBLPROPERTIES ('external.table.purge'='true');
- To create an Iceberg table (t1_ice) that is partitioned by the age column, run the following statement:
CREATE DATALAKE TABLE t1_ice (id INT, age INT, name VARCHAR(30), birth_date DATE) PARTITIONED BY (age) STORED BY ICEBERG LOCATION 'db2remote://default//t1_ice';
- To create an Iceberg table (t1_ice) that is partitioned by the year expression applied to the birth_date column, run the following statement:
CREATE DATALAKE TABLE t1_ice (id INT, age INT, name VARCHAR(30), birth_date DATE) PARTITIONED BY (year(birth_date)) STORED BY ICEBERG LOCATION 'db2remote://default//t1_ice';
- To create an Iceberg table (t1_ice) that is partitioned by the month expression applied to the birth_date column, run the following statement:
CREATE DATALAKE TABLE t1_ice (id INT, age INT, name VARCHAR(30), birth_date DATE) PARTITIONED BY (month(birth_date)) STORED BY ICEBERG LOCATION 'db2remote://default//t1_ice';
- To create an Iceberg table (t1_ice) that is partitioned by the truncate expression applied to the name column, run the following statement:
CREATE DATALAKE TABLE t1_ice (id INT, age INT, name VARCHAR(30), birth_date DATE) PARTITIONED BY (truncate(10, name)) STORED BY ICEBERG LOCATION 'db2remote://default//t1_ice';