Restrictions and limitations on Datalake tables

There are a number of restrictions and limitations to be aware about when using Datalake tables.

General restrictions to both regular and Iceberg tables

The following is a list of restrictions on Datalake tables:
  • The data stored in Datalake tables is external to Db2 in cloud object storage. Datalake tables cannot be defined in a Db2 tablespace (that is, you cannot specify the IN tablespace clause). The TBSPACEID parameter in the SYSCAT.TABLES catalog shows a -12 for Datalake tables.
  • No support for the following Hive data types:
    • ARRAY
    • FIXED
    • LIST
    • MAP
    • ROW
    • STRUCT
    • TIMESTAMP WITH TIME ZONE
    • UNION
    • UNIONTYPE
    • UUID
  • The following Db2 data types are not supported for Datalake tables:
    • BINARY. BINARY is a synonym for VARBINARY when used with Datalake tables.
    • DECFLOAT
    • GRAPHIC / VARGRAPHIC
    • LOB types (CLOB, BLOB, DBCLOB)
    • NCHAR
    • NVARCHAR
    • Spatial data types, such as:
      • ST_Point
      • ST_LineString
      • ST_Polygon
      • ST_GeomCollection
      • ST_MultiPolygon
      • ST_MultiLineString
      • ST_MultiPoint
    • TIME
    • User-defined type (UDT)
    • XML
  • Other items not supported on Datalake tables:
    • OCTETS, CODEUNITS16, CODEUNITS32, FOR BIT DATA, CCSID ASCII, CCSID UNICODE, CCSID 1208
    • WITH DEFAULT clause
    • INLINE LENGTH
    • SYSPROC.DB2SECURITYLABEL
    • UPDATE, DELETE, and MERGE statements are not supported for Hive Datalake tables.
    • Text indexes cannot be created on Datalake tables
    • Triggers cannot be created on or be a reference to Datalake tables
    • Section actuals are not available for Datalake tables
  • A Datalake table cannot be a:
    • Range-partitioned table
    • Multi-dimensional clustered table
    • Typed table
    • Materialized query table
    • Staging table
    • System period temporal table, application period temporal table, bi-temporal table, or a history table
  • Unsupported functionality:
    • DATASLICEID
    • DBPARTITIONNUM()
    • EXPORT command
    • IMPORT command
    • INGEST command
    • LOAD command
    • REORG TABLE command
    • RID()
    • RID_BIT()
    • ROWID
    • RUNSTATS
    • Cannot ALTER a Datalake table to DROP a column
    • Cannot ALTER a Datalake table that has been imported from an external metastore
    • Cannot create triggers (with CREATE TRIGGER statement)
    • Cannot create indexes (with CREATE INDEX statement)
    • Cannot TRUNCATE a Datalake table
  • The following Db2 ALTER DATALAKE TABLE features are not supported for externally managed Datalake tables:
    • Features on externally managed Hive Datalake tables that are not supported:
      • Adding a column
      • Changing column types
      • Setting table properties
      • Renaming a column
      • Renaming the table
      • Adding a partition
      • Dropping a partition
    • The DELETE DATA option is not supported for DROP DATALAKE TABLE or DROP PARTITION.
    • Dropping an imported/exported table will only remove the meta data from the Db2 catalog and the internal metastore, but the drop will not remove the table data. The table metadata will still exist in the external metastore and the table data will also still exist. An EXT_METASTORE_SYNC with REPLACE option will re-fetch the table metadata from the external metastore to Db2 catalog and local meta store, and the table data will be visible again in Db2.
    • For information on externally managed tables, see Terminology.
  • Do not run DDL or DML statements against the same table concurrently, because the behavior of such statements is undefined. Certain error conditions can result from such concurrent operations. For example: For more information, see ALTER Datalake TABLE statement.
  • Table names used for Datalake tables must be unique across all tenants. This is different than Db2 local tables that only need to be unique within a tenant.
  • Identifiers (schema names, table names, column names) must be unique regardless of case difference. For example, the second table definition is not valid:
    CREATE Datalake TABLE "Foo" (c1 INT);
    CREATE Datalake TABLE "FOO" (c1 INT);
    For more information, see Identifier handling for Datalake tables.
  • The ENFORCED attribute for UNIQUE, PRIMARY KEY, FOREIGN KEY, and CHECK constraints cannot be applied to Db2 Datalake tables. Only constraints with the 'NOT ENFORCED' attribute are allowed.
  • The use of scrollable cursors for Datalake tables is not supported
  • Generated columns cannot be added to existing Datalake tables
  • Label-based access control (LBAC) cannot be used with Datalake tables
  • The ROW CHANGE TIMESTAMP generated column option cannot be specified for columns in a Datalake table
  • Db2 isolation levels are not applicable to Datalake tables.
  • Schemas that include Datalake tables cannot be transported
  • Event monitors cannot write results to Datalake tables
  • Created global temporary tables cannot be Datalake tables
  • Declared global temporary tables using NOT LOGGED ON ROLLBACK PRESERVE ROWS cannot be Datalake tables
  • 'DATA CAPTURE CHANGES' is not supported on Datalake tables

Additional restrictions applicable to Iceberg

  • You cannot create two Iceberg tables in the same base location.
  • The only file formats supported by Iceberg are PARQUET, ORC and AVRO.
  • The CREATE DATALAKE TABLE statement using the LIKE clause is not supported.
  • Partitioning by expression is restricted to Iceberg transform functions, except bucket[N] and void that are not supported.
  • DELETE and UPDATE statements are supported for Iceberg Datalake tables, but MERGE statements are not supported.

General limitations

Before you start using Datalake tables, ensure that you have a system with functionality that is compatible.

The following is a list of limitations on Datalake tables:
  • Operations on Datalake tables are not within Db2 transactional control. For more information, see Transaction handling for Datalake tables.
  • The value that can be used as a partitioning key must be a sequence of Unicode characters with UTF-8 encoding of up to 1,024 bytes long and are restricted to what can be included in the file path for object storage provider. Partition names are case sensitive.
  • When Datalake tables are created, dropped, or altered after a backup, and when you attempt to restore the old backup, there will be an inconsistency between the metastore and the object store. This inconsistency might cause -5105 errors when you try to insert or select from the Datalake tables.