Restrictions and limitations on Datalake tables

There are a number of restrictions and limitations to be aware about when using Datalake tables.

General restrictions to both regular and Iceberg tables

The following is a list of restrictions on Datalake tables:
  • The data stored in Datalake tables is external to Db2 in cloud object storage. Datalake tables cannot be defined in a Db2 tablespace (that is, you cannot specify the IN tablespace clause). The TBSPACEID parameter in the SYSCAT.TABLES catalog shows a -12 for Datalake tables.
  • No support for the following Hive data types:
    • ARRAY
    • FIXED
    • LIST
    • MAP
    • ROW
    • STRUCT
    • TMESTAMPZ
    • UNION
    • UNIONTYPE
    • UUID
  • The following Db2 data types are not supported for Datalake tables:
    • BINARY. BINARY is a synonym for VARBINARY when used with Datalake tables.
    • DECFLOAT
    • GRAPHIC / VARGRAPHIC
    • LOB types (CLOB, BLOB, DBCLOB)
    • NCHAR
    • NVARCHAR
    • Spatial data types, such as:
      • ST_Point
      • ST_LineString
      • ST_Polygon
      • ST_GeomCollection
      • ST_MultiPolygon
      • ST_MultiLineString
      • ST_MultiPoint
    • TIME
    • User-defined type (UDT)
    • XML
  • Other items not supported on Datalake tables:
    • OCTETS, CODEUNITS16, CODEUNITS32, FOR BIT DATA, CCSID ASCII, CCSID UNICODE, CCSID 1208
    • WITH DEFAULT clause
    • INLINE LENGTH
    • SYSPROC.DB2SECURITYLABEL
    • Update, delete, or merge statements
    • Text indexes cannot be created on Datalake tables
    • Triggers cannot be created on or be a reference to Datalake tables
    • Section actuals are not available for Datalake tables
  • A Datalake table cannot be a:
    • Range-partitioned table
    • Multi-dimensional clustered table
    • Typed table
    • Materialized query table
    • Staging table
    • System period temporal table, application period temporal table, bi-temporal table, or a history table
  • Unsupported functionality:
    • DATASLICEID
    • DBPARTITIONNUM()
    • EXPORT command
    • IMPORT command
    • INGEST command
    • LOAD command
    • REORG TABLE command
    • RID()
    • RID_BIT()
    • ROWID
    • RUNSTATS
    • Cannot ALTER a Datalake table to DROP a column
    • Cannot ALTER a Datalake table that has been imported from an external metastore
    • Cannot create triggers (with CREATE TRIGGER statement)
    • Cannot create indexes (with CREATE INDEX statement)
    • Cannot TRUNCATE a Datalake table
  • Externally managed tables have additional restrictions:
    • The following are not supported for ALTER DATALAKE TABLE:
      • Adding a column
      • Changing column names
      • Changing column types
      • Exceptions to this are that we allow the following operations for Iceberg Datalake tables only:
        • Changing STRING to VARCHAR
        • Changing the length of a VARCHAR
      • Renaming the table
      • Setting table properties
      • Setting the SerDe or SerDe properties
      • Setting the table location
      • Setting the file format
      • Add a partition
      • Dropping a partition
    • The table data cannot be deleted as part of DROP DATALAKE TABLE.
    • Dropping an imported/exported table will only remove the meta data from the Db2 catalog and the internal metastore, but it will not remove the table data. The table meta data will still exist in the external metastore and the table data also still exist. An EXT_METASTORE_SYNC with REPLACE mode will re-fetch the table meta data and its from external metastore to Db2 catalog and the table data will be visible.
    • For information on externally managed tables see, Terminology.
  • Do not run DDL or DML statements against the same table concurrently, because the behavior of such statements is undefined. Certain error conditions can result from such concurrent operations. For example: For more information, see ALTER Datalake TABLE statement.
  • Identifiers (schema names, table names, column names) must be unique regardless of case difference. For example, the second table definition is not valid:
    CREATE Datalake TABLE "Foo" (c1 INT);
    CREATE Datalake TABLE "FOO" (c1 INT);
    For more information, see Identifier handling for Datalake tables.
  • The ENFORCED attribute for UNIQUE, PRIMARY KEY, FOREIGN KEY, and CHECK constraints cannot be applied to Db2 Datalake tables. Only constraints with the 'NOT ENFORCED' attribute are allowed.
  • The use of scrollable cursors for Datalake tables is not supported
  • Generated columns cannot be added to existing Datalake tables
  • Label-based access control (LBAC) cannot be used with Datalake tables
  • Row and Column Access Control (RCAC) is not supported for Datalake tables
  • The ROW CHANGE TIMESTAMP generated column option cannot be specified for columns in a Datalake table
  • Db2 isolation levels are not applicable to Datalake tables.
  • Schemas that include Datalake tables cannot be transported
  • Event monitors cannot write results to Datalake tables
  • Created global temporary tables cannot be Datalake tables
  • Declared global temporary tables using NOT LOGGED ON ROLLBACK PRESERVE ROWS cannot be Datalake tables
  • 'DATA CAPTURE CHANGES' is not supported on Datalake tables

Additional restrictions applicable to Iceberg

  • You cannot create two Iceberg tables in the same base location.
  • The only file formats supported by Iceberg are PARQUET, ORC and AVRO.
  • No support for CREATE Datalake TABLE … LIKE for Iceberg tables
  • No support for time travel queries
  • Partitioning by expression is restricted to Iceberg transform functions, except bucket[N] and void that are not supported.
  • UPDATE and DELETE are not yet supported.

General limitations

Before you start using Datalake tables, ensure that you have a system with functionality that is compatible.

The following is a list of limitations on Datalake tables:
  • Operations on Datalake tables are not within Db2 transactional control. For more information, see Transaction handling for Datalake tables.
  • The value that can be used as a partitioning key must be a sequence of Unicode characters with UTF-8 encoding of up to 1,024 bytes long and are restricted to what can be included in the file path for object storage provider. Partition names are case sensitive.
  • When Datalake tables are created, dropped, or altered after a backup, and when you attempt to restore the old backup, there will be an inconsistency between the metastore and the object store. This inconsistency might cause -5105 errors when you try to insert or select from the Datalake tables.