Restrictions and limitations on Datalake tables
There are a number of restrictions and limitations to be aware of when using Datalake tables.
General restrictions that apply to both regular and Iceberg tables
The following is a list of restrictions on Datalake tables:
- The data stored in Datalake tables is external to Db2, in cloud object storage. Datalake tables cannot be defined in a Db2 table space (that is, you cannot specify the IN tablespace clause). The TBSPACEID parameter in the SYSCAT.TABLES catalog shows a value of -12 for Datalake tables.
- No support for the following Hive data types:
- ARRAY
- FIXED
- LIST
- MAP
- ROW
- STRUCT
- TIMESTAMP WITH TIME ZONE
- UNION
- UNIONTYPE
- UUID
- The following Db2 data types are not supported for Datalake tables:
- BINARY. BINARY is a synonym for VARBINARY when used with Datalake tables.
- DECFLOAT
- GRAPHIC / VARGRAPHIC
- LOB types (CLOB, BLOB, DBCLOB)
- NCHAR
- NVARCHAR
- Spatial data types, such as:
- ST_Point
- ST_LineString
- ST_Polygon
- ST_GeomCollection
- ST_MultiPolygon
- ST_MultiLineString
- ST_MultiPoint
- TIME
- User-defined type (UDT)
- XML
- Other items not supported on Datalake tables:
- OCTETS, CODEUNITS16, CODEUNITS32, FOR BIT DATA, CCSID ASCII, CCSID UNICODE, CCSID 1208
- WITH DEFAULT clause
- INLINE LENGTH
- SYSPROC.DB2SECURITYLABEL
- UPDATE, DELETE, and MERGE statements are not supported for Hive Datalake tables.
- Text indexes cannot be created on Datalake tables
- Triggers cannot be created on or be a reference to Datalake tables
- Section actuals are not available for Datalake tables
- A Datalake table cannot be a:
- Range-partitioned table
- Multi-dimensional clustered table
- Typed table
- Materialized query table
- Staging table
- System period temporal table, application period temporal table, bi-temporal table, or a history table
- Unsupported functionality:
- DATASLICEID
- DBPARTITIONNUM()
- EXPORT command
- IMPORT command
- INGEST command
- LOAD command
- REORG TABLE command
- RID()
- RID_BIT()
- ROWID
- RUNSTATS
- Cannot ALTER a Datalake table to DROP a column
- Cannot ALTER a Datalake table that has been imported from an external metastore
- Cannot create triggers (with CREATE TRIGGER statement)
- Cannot create indexes (with CREATE INDEX statement)
- Cannot TRUNCATE a Datalake table
- The following Db2 ALTER DATALAKE TABLE features are not supported on externally managed Hive Datalake tables:
- Adding a column
- Changing column types
- Setting table properties
- Renaming a column
- Renaming the table
- Adding a partition
- Dropping a partition
- The DELETE DATA option is not supported for DROP DATALAKE TABLE or DROP PARTITION.
- Dropping an imported/exported table removes only the metadata from the Db2 catalog and the internal metastore; it does not remove the table data. The table metadata still exists in the external metastore, and the table data also still exists. Running EXT_METASTORE_SYNC with the REPLACE option re-fetches the table metadata from the external metastore into the Db2 catalog and local metastore, and the table data becomes visible again in Db2.
- For information on externally managed tables, see Terminology.
- Do not run DDL or DML statements against the same table concurrently, because the behavior of
such statements is undefined. Certain error conditions can result from such concurrent operations.
For example:
- SELECT or INSERT statements fail when they run concurrently with the ALTER DATALAKE TABLE…DROP…data-partition statement against the same table. For more information, see SELECT or INSERT fails when run concurrently with DROP PARTITION.
- The ALTER DATALAKE TABLE…ADD…data-partition statement does not retain the partition in the cache when it runs concurrently with the SELECT statement against the same table. For more information, see Incorrect results when querying recently refreshed or loaded partitioned tables.
- Table names used for Datalake tables must be unique across all tenants. This differs from Db2 local tables, whose names need to be unique only within a tenant.
- Identifiers (schema names, table names, column names) must be unique regardless of case differences. For example, the second table definition below is not valid:
  CREATE DATALAKE TABLE "Foo" (c1 INT);
  CREATE DATALAKE TABLE "FOO" (c1 INT);
  For more information, see Identifier handling for Datalake tables.
- The ENFORCED attribute for UNIQUE, PRIMARY KEY, FOREIGN KEY, and CHECK constraints cannot be applied to Db2 Datalake tables. Only constraints with the NOT ENFORCED attribute are allowed.
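As a sketch of the constraint restriction above, an informational (NOT ENFORCED) primary key is accepted while an enforced one is rejected; the table and column names here are illustrative, not from the product documentation:

```sql
-- Illustrative sketch: an enforced PRIMARY KEY would be rejected on a
-- Datalake table; only the informational NOT ENFORCED form is allowed.
CREATE DATALAKE TABLE orders (
    order_id INT NOT NULL,
    cust_id  INT,
    PRIMARY KEY (order_id) NOT ENFORCED
);
```

Because the constraint is not enforced, Db2 can use it for query optimization but does not validate the data against it.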
- The use of scrollable cursors for Datalake tables is not supported
- Generated columns cannot be added to existing Datalake tables
- Label-based access control (LBAC) cannot be used with Datalake tables
- The ROW CHANGE TIMESTAMP generated column option cannot be specified for columns in a Datalake table
- Db2 isolation levels are not applicable to Datalake tables.
- Schemas that include Datalake tables cannot be transported
- Event monitors cannot write results to Datalake tables
- Created global temporary tables cannot be Datalake tables
- Declared global temporary tables using NOT LOGGED ON ROLLBACK PRESERVE ROWS cannot be Datalake tables
- 'DATA CAPTURE CHANGES' is not supported on Datalake tables
Additional restrictions applicable to Iceberg tables
- You cannot create two Iceberg tables in the same base location.
- The only file formats supported by Iceberg are PARQUET, ORC, and AVRO.
- The CREATE DATALAKE TABLE statement using the LIKE clause is not supported.
- Partitioning by expression is restricted to Iceberg transform functions, except for bucket[N] and void, which are not supported.
- DELETE and UPDATE statements are supported for Iceberg Datalake tables, but MERGE statements are not supported.
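To make the Iceberg restrictions concrete, the following sketch creates an Iceberg Datalake table in one of the supported file formats. The table name, columns, and exact clause spelling are assumptions for illustration; check the CREATE DATALAKE TABLE statement reference for the syntax supported by your release:

```sql
-- Illustrative sketch: an Iceberg Datalake table stored as PARQUET
-- (one of the three supported formats). UPDATE and DELETE work against
-- this table, but MERGE does not.
CREATE DATALAKE TABLE sales_iceberg (
    order_id BIGINT,
    amount   DOUBLE
)
STORED BY ICEBERG
STORED AS PARQUET;
```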
General limitations
Before you start using Datalake tables, ensure that your system provides compatible functionality.
The following is a list of limitations on Datalake tables:
- Operations on Datalake tables are not within Db2 transactional control. For more information, see Transaction handling for Datalake tables.
- A value used as a partitioning key must be a sequence of Unicode characters with UTF-8 encoding up to 1,024 bytes long, and is restricted to characters that can be included in a file path for the object storage provider. Partition names are case sensitive.
- When Datalake tables are created, dropped, or altered after a backup, and when you attempt to restore the old backup, there will be an inconsistency between the metastore and the object store. This inconsistency might cause -5105 errors when you try to insert or select from the Datalake tables.