Schema evolution support

Schema evolution is the process of modifying the structure of a database over time as new requirements and data models emerge. It involves changing the schema of tables to accommodate new data elements or changes to existing ones.

Db2 provides extensive support for schema evolution changes on local Db2 tables to accommodate the changing of the table while preserving the integrity and consistency of the data. Support for schema evolution of Datalake tables is more complex and more problematic given that the data is stored externally and is not owned by Db2.

Schema evolution can be challenging, particularly in large or complex databases, so it is recommended to handle such changes very carefully. Some of the challenges of schema evolution include:
  1. Data loss: Changes to the schema can result in the loss of data or errors accessing the data if not managed carefully.
  2. Compatibility issues: Changes to the schema can cause compatibility issues with existing applications.
  3. Performance issues: Changes to the schema can impact the performance of reads and writes to the table.
  4. Data inconsistencies: Changes to the schema can result in data inconsistencies if not managed carefully, which can lead to inaccurate or incomplete data.
For externally managed Datalake tables, schema evolution changes are quite restricted. Only the following changes are allowed:
  • Adding/dropping/altering Informational constraints
    • Unique constraint
      • Unique columns
      • Primary key
    • Referential (foreign key) constraint
    • Check constraint
  • Setting most column options
    • Column constraints
    • Column comment
  • Setting NOT NULL/NULL for a column
  • Enabling/disabling RCAC

For more information on externally managed tables see, Terminology.

For Iceberg tables, the only supported data type changes that are permitted for numeric data types are those that support expanding the size of data type. For example, changing a column defined as an INTEGER to a BIGINT. For CHAR and VARCHAR, an alter can be used to change the size of the column. These are also the only recommended data type changes for Hive Datalake tables but any alter supported by Hive is permitted. If you alter a table column to specify a new data type for the column, and the length of the new type is smaller than that of the original data type length, table data values might be returned in a truncated form. For more details regarding changes to the schema can be made on Datalake tables, see ALTER DATALAKE TABLE statement.