ALTER TABLE (HADOOP/HBASE) statement

The ALTER TABLE (HADOOP/HBASE) statement alters the definition of a Hadoop or HBase table.

Invocation

This statement can be embedded in an application program or issued by using dynamic SQL statements. It is an executable statement that can be dynamically prepared only if DYNAMICRULES run behavior is in effect for the package (SQLSTATE 42509).

Authorization

The privileges that are held by the authorization ID of the statement must include at least one of the following authorities:

ALTER privilege on the table to be altered
CONTROL privilege on the table to be altered
ALTERIN privilege on the schema of the table
SCHEMAADM authority on the schema of the table
DBADM authority

To use the LOCATION clause, the authorization ID of the statement must have one of the following privileges:

DATAACCESS authority
The following privileges on the files and the directory in the HDFS:
- READ and WRITE privileges on the directory and all files in the directory, including files in any subdirectories that might be recursively specified in the LOCATION clause
- EXECUTE privileges on all the directories and subdirectories in the specified LOCATION clause

If impersonation is not enabled, the bigsql user needs the same permissions that the Hive user needs for this operation. For example, with impersonation enabled in Hive, the bigsql user must have READ, WRITE, and EXECUTE privileges on the directory that is specified in the LOCATION clause, and READ and EXECUTE privileges on the parent directory.

Important: The LOCATION clause is not applicable to HBase tables.

Description

table-name

The table-name must identify an existing Hadoop or HBase table.

ADD column-definition

Defines the attributes of a new column. The column-name cannot be qualified, and the same name cannot be used for more than one column of the table. For a description of the valid data types, see data-type.

Important: When you add columns to an HBase table, you must specify the UPDATE HBASE COLUMN MAPPING clause.

column-options

You can define column options including a comment, the nullability attribute, or constraints.

COMMENT column-comment

A comment that provides additional information about the column.

NULL | NOT NULL

NOT NULL specifies that the column will not contain null values. Because the Hadoop table is external to any local database server, Db2 Big SQL enforces nullability at query execution time (SQL5104N). The nullability attribute of a column is enforced during an INSERT or LOAD USING operation. Do not assign a null value to a column that is defined as NOT NULL (SQLSTATE 23502).

CONSTRAINT constraint-name

Specifies the name of a constraint. You can define the following constraints on a column:

PRIMARY KEY: A primary key must be unique and must have the NOT NULL attribute. A table cannot have more than one primary key.
UNIQUE: A unique key prevents duplicate values in one or more NOT NULL columns of a table. A table can have multiple unique keys.
REFERENCES table-name: Ensures that required table relationships are maintained and that data entry rules are followed. The column on which this parameter is specified becomes the foreign key to the referenced parent table table-name.

constraint-attributes

Defines attributes that are associated with primary key, unique, referential integrity, or check constraints.

NOT ENFORCED | ENFORCED

Specifies whether the constraint is enforced by the database manager during normal operations such as insert, update, or delete.

NOT ENFORCED

Specifies that the constraint is not enforced by the database manager. This is the default for Hadoop tables. A primary key or unique constraint cannot be NOT ENFORCED if there is a dependent ENFORCED referential constraint.

TRUSTED | NOT TRUSTED

Specifies whether the data can be trusted to conform to the constraint.

TRUSTED: Specifies that the data can be trusted to conform to the constraint. This is the default. Use this option only if the table data is independently known to conform to the constraint; otherwise, unpredictable results might occur.
NOT TRUSTED: Specifies that the data cannot be trusted to conform to the constraint. This option is valid only for referential integrity constraints (SQLSTATE 42613). Use this option when the data conforms to the constraint for most rows, but it is not independently known whether all current and future rows will conform. If a constraint is both not trusted and enabled for query optimization, it will not be used for any optimization that depends on the data being in complete conformity with the constraint.

ENFORCED

Although the ENFORCED keyword is available for compatibility with the syntax for local tables that are created with the CREATE TABLE statement, an error is returned when you specify this option for Hadoop tables (SQLSTATE 42858).

ENABLE QUERY OPTIMIZATION | DISABLE QUERY OPTIMIZATION

Specifies whether the constraint or functional dependency can be used for query optimization under appropriate circumstances.

ENABLE QUERY OPTIMIZATION: Specifies that the constraint is assumed to be true and can be used for query optimization. This is the default.
DISABLE QUERY OPTIMIZATION: Specifies that the constraint cannot be used for query optimization. This option cannot be specified for primary key and unique constraints (SQLSTATE 42613).

hbase-column-mapping

Specifies how the column that is being added to the table should be mapped to an HBase column.

UPDATE HBASE COLUMN MAPPING

A column can be appended to an existing HBase column, or mapped to a new HBase column.

APPEND

Specifies that the new column is to be appended to an existing HBase column or to the HBase row key.

cfamily:cname: The existing HBase column to which the new column is to be appended is specified by its name and by the name of the column family to which it belongs. The values are case sensitive: cf:cq is not the same as cf:CQ, for example.
KEY: Specifies that the new column is to be appended to the HBase row key.

ADD

Specifies that the new column is to be mapped to a new HBase column that is not being referenced by an existing column mapping.

cfamily:cname

The new HBase column to which the new column is to be mapped is specified by its name and by the name of the column family to which it belongs. The values are case sensitive: cf:cq is not the same as cf:CQ, for example.

hbase-cfamily-opts

Specifies the HBase configuration options that are applied to a specific column family. You specify the options in the form of option name/value pairs. In the case of a Boolean option, you can omit the option value to specify true.

COMPRESSION

Specifies a compression algorithm for the column family. Valid values are NONE (default), GZ (GZIP), SNAPPY, and LZ4.

BLOOM FILTER

Bloom filters help to identify whether a specific row and column combination exists in a block without having to load it. Bloom filters use extra space but can improve lookup times and help to reduce block cache churn by loading only the blocks that are required.

NONE: Specifies that no bloom filter is maintained. This is the default.
ROW: Specifies that the hash of the row is added to the bloom filter on each key insert. Use this option when queries have predicates on row keys.
ROWCOL: Specifies that the hash of the row plus column family plus column family qualifier is added to the bloom filter on each key insert. Use this option when queries have column projections.

IN MEMORY

Specifies whether data is to be cached for as long as possible to improve performance. Valid values are true and false (default).

NO IN MEMORY

This clause is retained for compatibility with prior releases. Use the IN MEMORY clause instead.

BLOCKCACHE

Specifies whether storage memory blocks are to be cached. Block caching can improve read performance. Valid values are true and false. If block caching is enabled and some blocks have to be evicted to free up space, the column family blocks for whom IN MEMORY is set to true are the last blocks to be evicted.

BLOCKSIZE

Specifies the memory block size (in bytes). This value is used to control how much data HBase is required to read from storage files during retrieval and what is cached in memory for subsequent access. The default is 64 KB.

DATA_BLOCK_ENCODING

Specifies in-cache key compression, which enables you to take advantage of sorted key redundancy in an HFile block. You can choose to store only the differences between consecutive keys.

KEEP_DELETED_CELLS

Specifies that deleted cells can be retrieved with a GET or SCAN operation, as long as these operations have a specified time range that ends before the deletion timestamp. This allows for point-in-time queries even in the presence of delete operations. Valid values are true and false. The default is false.

VERSIONS

An integer value that specifies the number of different versions of column values to store in the HBase table. The default is 1. Db2 Big SQL reads only the latest version of a column value.

MIN_VERSIONS

Specifies the minimum number of row versions to keep for each column family. The default is 0, which disables the option. If the TTL clause is set to a specific value, the data is deleted after the TTL value is passed. You can use the MIN_VERSIONS clause if you want to retain at least a few versions of the data.

REPLICATION_SCOPE

Specifies whether replication is to be enabled on the column family to minimize the effects of system failure. The default is 0, which means that the column family is not replicated. A value of 1 means that the column family is replicated.

TTL

Specifies a time to live (TTL) value (in seconds) for the column family. HBase automatically deletes all versions of the row (including the current version) that have exceeded the specified TTL value. The default is forever.

ADD constraint

Constraints are not enforced, but are used to help optimize query performance.

unique-constraint

Defines a unique key. This constraint does not allow duplicate values in one or more columns of the table. A table can have multiple unique keys. The columns that are specified in a unique constraint must be defined as NOT NULL.

referential-constraint

Defines a referential constraint.

CONSTRAINT constraint-name: Specifies a name for the referential constraint.
FOREIGN KEY (column-name, ...): The foreign key of the referential constraint is composed of the identified columns. Each name in the list of column names must identify a column of the parent table, and the same column must not be identified more than once. The number of foreign key columns must be the same as the number of parent key columns, and the data types of the corresponding columns must be compatible (SQLSTATE 42830).

check-constraint

Defines a check constraint.

constraint-name

Specifies a name for the check constraint.

CHECK check-condition

Sets restrictions on data that is added to a specific table. This parameter is enforced during INSERT operations, but is not enforced during a LOAD HADOOP operation, or on data that is made available outside of the database manager (by using Hive operations, for example).

search-condition

The search-condition must be true or unknown for every row of the table, and has the following restrictions:

A column reference must pertain to a column of the table that is being created.
The search condition cannot contain a TYPE predicate.
The search condition cannot contain any of the following items (SQLSTATE 42621):
- Subqueries
- CAST specifications with a SCOPE clause
- Column functions
- Functions that are not deterministic
- Functions that are defined to have an external action
- User-defined functions that are defined with either MODIFIES SQL or READS SQL DATA
- OLAP specifications
- Global variables
- References to generated columns other than the identity column
- An error tolerant nested table expression

Check constraints with search-condition are enforced when rows in the table are inserted or updated. A check constraint that is defined on a table automatically applies to all subtables of that table.

functional-dependency

Defines a functional dependency between columns or sets of columns.

The parent set of columns precedes the DETERMINED BY clause, and the child set of columns follows the DETERMINED BY clause. All restrictions on search-condition apply to both parent set and child set columns, and only simple column references are allowed in the column sets (SQLSTATE 42621). The same column must not be identified more than once in a functional dependency (SQLSTATE 42709). No column in the child set of columns can be a nullable column (SQLSTATE 42621).

A functional dependency is not enforced by the database manager during normal operations such as insert, update, or delete. The functional dependency might be used during query rewrite for optimization. Incorrect results might be returned if the integrity of a functional dependency is not maintained.

ADD partition-spec

Specifies a new partition definition. For Hadoop tables, a partition is a split of table data into file system directories based on the partitioning values in the table. The table must already have one or more partitioning columns specified. All columns that are referenced in the partitioning key must be specified in the new partition value. If you specify a set of partition column values that already exists, an error is returned. Partitions are not valid for HBase tables.

IF NOT EXISTS

Checks whether the specified partition already exists.

PARTITION (column-name=constant-value)

Specifies a column name and a value on which you define the partition. The following data types cannot be used as partitioning columns:

ARRAY
BINARY
BOOLEAN
DOUBLE
FLOAT
MAP
REAL
ROW
STRUCT
TIMESTAMP
VARBINARY

Important: A partitioning key value cannot include any trailing blank spaces. This could be a problem if you plan to partition on a CHAR data type, because these values are always padded to their full length with blank spaces. Db2 Big SQL also does not support partitioning on a zero-length string.

LOCATION location

Specifies an HDFS location if the data is already in a nonstandard location. The location value must be delimited by single quotation marks. The default location for partitioned data in a partitioned table named tab1 with partition columns pcol1...pcoln and values pval1...pvaln is shown in the following example:

/apps/hive/warehouse/schema.dbname/tab1/pcol1=pval1/.../pcoln=pvaln/...

Use the LOCATION clause to specify a non-default location. If the specified directory does not already exist, a directory is created. A drop table operation removes the location directory.

DROP constraint

Drops an existing primary key, foreign key, unique, or check constraint from the table.

DROP partition-spec

Specifies a partition that is to be dropped.

IF EXISTS: Checks whether the specified partition exists.
PARTITION column-name=constant-value: Specifies a column name and a value that identifies the partition.
PURGE: Specifies that the partition is to be removed from the catalogs and that the data is not to be passed from an encrypted zone to the Hive metastore trash. Specify this option if the partition was initially created in an encrypted zone, because moving the data to a non-encrypted zone returns an error during the drop operation.

ALTER constraint

Changes an existing foreign key or check constraint.

constraint-alteration: For details, see constraint-attributes.

ALTER COLUMN column-alteration

Alters the definition of a column. Only the specified attributes are altered; others remain unchanged.

column-name: Specifies the name of the column that is to be altered. The column-name must identify an existing column of the table (SQLSTATE 42703). The name must not be qualified. The name must not identify a column that is otherwise being added, altered, or dropped in the same ALTER TABLE statement (SQLSTATE 42711).
SET DATA TYPE: Specifies a new data type for the column. The new data type must be castable from the existing data type of the column (SQLSTATE 42837). The column alteration must also be a data type change that is supported in Hive. When a column is altered in this way, only the metadata for the table is affected; existing data on disk remains unchanged. For a description of the supported data types, see data-type.
SET NOT NULL: Specifies that the column cannot contain null values. No value for this column in existing rows of the table should be null; otherwise, an error might be returned when the data is selected, even if the row with the null value is not included in the final result set. This clause is not allowed if the column is specified in the foreign key of a referential constraint with a DELETE rule of SET NULL, and no other nullable columns exist in the foreign key (SQLSTATE 42831).
DROP NOT NULL: Drops the NOT NULL attribute of the column, thereby enabling the column to have null values. This clause is not allowed if the column is specified in the primary key or in a unique constraint on the table (SQLSTATE 42831).

ACTIVATE | DEACTIVATE ROW ACCESS CONTROL

Specifies whether row-level access control is to be activated on the table.

ACTIVATE ROW ACCESS CONTROL

Activates row-level access control on the table. The table must not be a typed table, a catalog table (SQLSTATE 55019), a nickname (SQLSTATE 42809), or a view (SQLSTATE 42809).

A default row permission is implicitly created and allows no access to any rows of the table, unless permitted by a row permission explicitly created by a user with SECADM authority.

When the table is referenced in a data manipulation statement, all enabled row permissions that have been created for the table, including the default row permission, are applied implicitly by the database manager to control the set of rows in the table that are accessible.

If a materialized query table (or a staging table) that depends on the table (directly or indirectly through a view) for which row-level access control is being activated and that materialized query table (or a staging table) does not already have row-level access control activated, row-level access control is implicitly activated for the materialized query table (or a staging table). This restricts direct access to the contents of the materialized query table (or a staging table). A query that explicitly references the table before such a row permission is defined will return a warning that there is no data in the table (SQLSTATE 02000). To provide access to the materialized query table (or a staging table), an appropriate row permission can be created, or an ALTER TABLE DEACTIVATE ROW ACCESS CONTROL statement on the materialized query table (or a staging table) can be issued to remove the row-level protection, if that is appropriate.

This clause is ignored if row access control is already defined as activated for the table.

DEACTIVATE ROW ACCESS CONTROL

Deactivates row-level access control on the table. When the table is referenced in a data manipulation statement, any existing enabled row permissions defined on the table are not applied by the database manager to control the set of rows in the table that are accessible.

This clause is ignored if row access control is not activated for the table.

ACTIVATE | DEACTIVATE COLUMN ACCESS CONTROL

Specifies whether column-level access control is to be activated on the table.

ACTIVATE COLUMN ACCESS CONTROL

Activates column-level access control on the table. The table must not be a typed table, a catalog table (SQLSTATE 55019), a nickname (SQLSTATE 42809) or a view (SQLSTATE 42809).

The access to the table is not restricted but when the table is referenced in a data manipulation statement, all enabled column masks that have been created for the table are applied implicitly by the database manager to mask the values returned for the columns referenced in the final result table of the queries.

If a materialized query table that depends on the table (directly or indirectly through a view) for which column-level access control is being activated and that materialized query table does not already have row-level access control activated, row-level access control is implicitly activated for the materialized query table. This restricts direct access to the contents of the materialized query table. A query that explicitly references the table before such a row permission is defined returns a warning that there is no data in the table (SQLSTATE 02000). To provide access to the materialized query table, an appropriate row permission can be created, or an ALTER TABLE DEACTIVATE ROW ACCESS CONTROL statement on the materialized query table can be issued to remove the row-level protection, if that is appropriate.

This clause is ignored if column-level access control is already defined as activated for the table.

DEACTIVATE COLUMN ACCESS CONTROL

Deactivates column-level access control on the table. When the table is referenced in a data manipulation statement, any existing enabled column masks defined on the table are not applied by the database manager to control the values returned for the columns referenced in the final result table of the queries.

This clause is ignored if column access control is not activated for the table.

RENAME COLUMN source-column-name TO target-column-name

Renames the specified column with target-column-name.

RENAME TO table-name

Renames the table. You cannot change the schema name when you rename a table.

SET TBLPROPERTIES ('string-constant1'='string-constant2')

Defines job properties that can configure input or output formats, and that you can use to specify more detailed information about the table. The properties are passed as is to the underlying InputFormat and OutputFormat classes. Properties that are not implemented by these classes are ignored. Db2 Big SQL does not validate the properties.

SET SERDE serde-class

Specifies the name of a Java™ class in the Db2 Big SQL server CLASSPATH that implements the Hive SerDe interface org.apache.hadoop.hive.serde2.SerDe.

WITH SERDEPROPERTIES ('string-constant1'='string-constant2'): Associates SerDe properties with the SerDe class. For example, if you want to eventually load HBase data into a Hive table, create the table by using this option and the hbase.columns.mapping parameter.

SET SERDEPROPERTIES ('string-constant1'='string-constant2')

Associates SerDe properties with the SerDe class. For example, if you want to eventually load HBase data into a Hive table, create the table by using this option and the hbase.columns.mapping parameter.

SET TABLE OPTIONS table-opt-name=table-opt-value

You can set one or more of the following table options.

MEMSTORE_FLUSHSIZE=n

Specifies the maximum size (in bytes) of the region server main memory storage before its contents are flushed to the file system. The memory storage for all column families is flushed together. The default is 64 MB.

READONLY

Specifies whether the table can be updated. Valid values are true or false. The default is false.

COMPACTION_ENABLED

Specifies whether the table is enabled for HBase compaction. Valid values are true or false. The default is determined in the HBase configuration.

DURABILITY = ASYNC_WAL | FSYNC_WAL | SKIP_WAL | SYNC_WAL | USE_DEFAULT

Specifies the durability of the write-ahead log (WAL). The WAL contains a record of the data changes. Choose one of the following values:

ASYNC_WAL: Writes mutations to the WAL asynchronously.
FSYNC_WAL: Writes mutations to the WAL synchronously and forces the entries to disk.
SKIP_WAL: Does not write mutations to the WAL.
SYNC_WAL: Writes mutations to the WAL synchronously.
USE_DEFAULT: Uses the HBase global default value (SYNC_WAL).

MAX_FILESIZE = n

Sets the maximum size (in bytes) to which a table region can grow. A region represents a set of table rows that sort between a specific start key and end key. If this size is exceeded, a region split is triggered. The default is 256 MB (268435456 bytes).

SET COLUMN FAMILY OPTIONS FOR (hbase-cfamily hbase-cfamily-opts)

Changes the HBase column family options for hbase-cfamily. For details, see hbase-cfamily-opts.

SET LOCATION location

Changes the specific distributed file system directory that is used to store data files.

Remember: If you alter the location of a table or a partition that is cached, the old location is uncached first, and then the new location is cached automatically.

SET FILEFORMAT file-format

Specifies the data file format. For information about the supported file formats, see File formats supported by Db2 Big SQL.

CHANGE COLUMN

Specify this clause to change the attributes of an existing column.

column-name

Specifies the name of the column whose attributes are to be changed.

column-definition

column-name: Specifies a new name for the column. If you do not want to change the column name when you change other attributes of the column, you must set the new column name to be the same as the current column name.
data-type: Specifies a new data type for the column. The new data type must be castable from the existing data type of the column (SQLSTATE 42837). The column alteration must also be a data type change that is supported in Hive. When a column is altered in this way, only the metadata for the table is affected; existing data on disk remains unchanged. For a description of the supported data types, see data-type.
column-options: You can define column options including a comment, the nullability attribute, or constraints. For details, see column-options.

TOUCH partition-spec

Specifies that the timestamp of the last partition column modification be updated in the catalog to the current time.

PARTITION column-name=constant-value

Specifies a column name and a value on which you define the partition.

SET LOCATION location

For details, see LOCATION.

RENAME TO partition-spec

Updates the value of a partition column and renames the partition. If the partition is part of an external table, Db2 Big SQL renames the file system directory that contains the partition data if the following conditions are met:

The data location for the partition has not been set explicitly by using the LOCATION clause.
The data location is not in object storage.
The table property external.partition.rename has not been set to false.

The table property external.partition.rename is unset by default, but you can use it to override default behavior when partitions are being renamed. If the property is set to true, and an external table has partition data in object storage, Db2 Big SQL attempts to rename the data location.

COMPACT

Triggers a compaction operation on a transactional Hive table or partition. Compaction rewrites a transactional table's underlying data files and helps to maintain query performance. For more information, see File compaction. If the table is not defined as transactional, the statement completes successfully but has no effect. If the table is partitioned, a partition-spec must be specified; otherwise, an error is returned.

MAJOR | MINOR: If a major compaction is specified, all base and delta files for a table or partition are rewritten as a single base file. If a minor compaction is specified, all delta files for a table or partition are rewritten as a single delta file. Although a minor compaction is less expensive than a major compaction, the former is less effective because the data might remain distributed across both base files and delta files.
AND WAIT: Specifies that the statement is to run until the compaction operation has completed. If this option is omitted, compaction is queued as a background process and the statement returns immediately before compaction is complete. For more information, see File compaction.
WITH OVERWRITE TBLPROPERTIES ('string-constant1'='string-constant2'): Specifies that the table properties are to be updated. You can use this clause to change compaction properties for the table or to update any other table properties for the table.

EXCHANGE PARTITION

Db2 Big SQL 7.2 and later service This feature requires the Db2 Big SQL 7.2 and later service on IBM Cloud Pak for Data 4.0 and later. Exchanging multiple partitions is supported in Hive 1.2.2, 1.3.0, 2.0.0, and later.

Moves a partition from source-table-name to the table being altered (table-name) and updates each table's metadata in the Hive metastore. The source table's partition folder in HDFS is renamed to move it to the destination table's partition folder. When using this option, ensure that all of the following conditions are true:

The source and destination table have the same schema name.
The destination table does not contain the partition that is to be exchanged.
Neither the source table nor the destination table is a transactional table. (To move partitions across transactional tables, you can use the LOAD HADOOP statement or the INSERT…OVERWRITE statement.)
There is no index.

partition-spec: This can be either a full or partial partition specification.
WITH TABLE source-table-name: Specifies the table whose partition is to be moved.

For examples, see 13.

Usage notes

Altering a column in a Hive table to specify a new data type for the column does not modify the table data, and only the metadata for the table is affected. To change the data, you must update the files in HDFS or create a new table and populate that table with data in the new format.
If you alter a table column to specify a new data type for the column, and the domain of the new type is lower than that of the original data type, table data values might be returned in a truncated form. Ensure that the entire range of values in your table column fits within the domain of the new data type.
Column alterations to tables in the Avro file format require a matching update to either the 'avro.schema.literal' property or to the contents of the file that is referenced by the 'avro.schema.url' property. For more information, see File formats that are supported by Db2 Big SQL.
Because of a current incompatibility between Hive and Db2 Big SQL, altering a Hive table to drop a column is not supported. Create a new table that does not include the unneeded column.
You cannot alter an HBase column from one data type to another data type.
When you alter a table to add a column that is derived from an HBase data source, the new column and its data are created. In the case of external data sources (such as delimited files, for example), data for the column that is being added must already exist.
Foreign key constraints are possible between Hadoop and local database tables.
The following rules apply when you alter the data type of a ROW or ARRAY column:
- Any ARRAY data type can be converted to any other ARRAY data type, regardless of cardinality and element type.
- Any ROW data type can be converted to any other ROW data type.
- Any MAP data type can be converted to any other MAP data type.
When the ALTER TABLE…DROP partition-spec statement runs concurrently with a SELECT statement, the SELECT statement might return SQLCODE -5105, and you might see FileNotFoundException in the bigsql.log file on your worker hosts. For more information, see General SQL limitations and restrictions.
When the ALTER TABLE…DROP partition-spec statement runs concurrently with an INSERT statement, you might see FileNotFoundException in the bigsql-sched.log file on your scheduler host. For more information, see General SQL limitations and restrictions.
When the COMPACT...AND WAIT option is specified, a lock timeout (SQLCODE -911 with reason code 68) might occur if the compaction operation takes a long time to complete.
The following syntax alternatives are supported for compatibility with previous versions of this product and with other database products:
- You can use the ARRAY syntax that Hive uses to define an ordinary ARRAY data type. For example, you can specify ARRAY<VARCHAR(8)> instead of VARCHAR(8) ARRAY[ ].
- You can use the MAP syntax that Hive uses to define an associative ARRAY data type. For example, you can specify MAP<VARCHAR(10), DOUBLE> instead of DOUBLE ARRAY [VARCHAR(10)].
- You can use the STRUCT syntax that Hive uses to define a ROW data type. For example, you can specify STRUCT<F1:INTEGER, F2:DOUBLE> instead of ROW(F1 INTEGER, F2 DOUBLE).

Restrictions

You can run the ALTER TABLE (HADOOP/HBASE) statement only as a top-level statement. It cannot be run from routines, compound blocks, or prepared statements.
Constraints that are defined on Hadoop tables cannot be enforced (SQLSTATE 42858).
You cannot alter a partitioning column in a Hadoop table (SQLSTATE 42858).

Examples

Create a table and then change the data type of one of its columns.

CREATE HADOOP TABLE orders (
    orderkey INT NOT NULL,
    orderdate TIMESTAMP,
    o_comment STRING COMMENT 'comment column' NOT NULL
  );

LOAD HADOOP
  USING FILE URL '/tmp/somedata/orders.txt'
  WITH SOURCE PROPERTIES (
    'ignore.extra.fields'='true',
    'field.delimiter'='|'
  )
  INTO TABLE orders
  OVERWRITE;

SELECT orderkey
  FROM orders 
  WHERE orderdate < '1992-07-01'  
  FETCH FIRST 3 ROWS ONLY;

ORDERKEY
---------
     6
    37
   128

ALTER TABLE orders
  ALTER COLUMN orderkey
  SET DATA TYPE DOUBLE;

SELECT orderkey
  FROM orders
  WHERE orderdate < '1992-07-01'
  FETCH FIRST 3 ROWS ONLY;

     ORDERKEY
---------------------
+6.00000000000000E+000
+3.70000000000000E+001  
+1.28000000000000E+002

Create a table named COMP and then alter it to add an SQL column (C3), mapping that column to an existing HBase column.

CREATE HBASE TABLE comp (
    k0 VARCHAR(10),
    k1 INT,
    k2 VARCHAR(10),
    k3 INT,
    c0 VARCHAR(10),
    c1 INT,
    c2 INT
  )
  COLUMN MAPPING (
    KEY MAPPED BY (k0, k1, k2, k3),
    cf:cq MAPPED BY (c0, c1, c2)
  );

ALTER TABLE comp 
  ADD c3 INT
  UPDATE HBASE COLUMN MAPPING APPEND cf:cq;

Alter the COMP table to add an SQL column (C3), mapping that column to a new HBase column in an existing HBase column family.
```
ALTER TABLE comp
  ADD c3 INT
  UPDATE HBASE COLUMN MAPPING ADD cf:cq2;
```
Alter the COMP table to add an SQL column (C3), mapping that column to a new HBase column family.
```
ALTER TABLE comp
  ADD c3 INT
  UPDATE HBASE COLUMN MAPPING ADD cf2:cq;
```

Create a table named COMP1 and then alter it to change the properties of multiple column families.

CREATE HBASE TABLE comp (
    k0 VARCHAR(3),
    k1 INT,
    k2 VARCHAR(2),
    k3 INT,
    c0 VARCHAR(10),
    c1 INT,
    c2 INT
  )
  COLUMN MAPPING (
    KEY MAPPED BY (k0, k1, k2, k3),
    f1:q1 MAPPED BY (c0, c1),
    f2:q2 MAPPED BY (c2)
  );

ALTER TABLE comp
  SET COLUMN FAMILY OPTIONS FOR (
    f1 TTL(10) REPLICATION_SCOPE(1),
    f2 MIN_VERSIONS(1)
  );

Alter an HBase table to change its table options.

ALTER TABLE hb1
  SET TABLE OPTIONS (
    "MEMSTORE_FLUSHSIZE"="1000",
    "READONLY"="true"
  );

Alter a table to set its table properties.

ALTER TABLE sample.t1
  SET TBLPROPERTIES (
    'numFiles'='1',
    'rawDataSize'='4', 
    'totalSize'='200',
    'numPartitions'='0'
  );

Alter a table to add or drop unique constraints.

ALTER TABLE t1
  ADD CONSTRAINT uq1 UNIQUE (i) NOT ENFORCED;
ALTER TABLE t1
  ADD UNIQUE (j) NOT ENFORCED;
ALTER TABLE t1
  DROP UNIQUE uq1;

Alter a table to add or drop primary key constraints.

ALTER TABLE t1
  ADD CONSTRAINT pk1 PRIMARY KEY (i) NOT ENFORCED;
    or
ALTER TABLE t1
  ADD PRIMARY KEY (i) NOT ENFORCED;

ALTER TABLE t1
  DROP PRIMARY KEY;

Alter a table to add or drop foreign key constraints.

ALTER TABLE t1
  ADD CONSTRAINT fk1 FOREIGN KEY (i) REFERENCES t2 NOT ENFORCED;
ALTER TABLE t1
  ADD FOREIGN KEY (j) REFERENCES t2 (k) NOT ENFORCED;
ALTER TABLE t1
  DROP FOREIGN KEY fk1;

You can dynamically change how files in a table are split for scanning by setting the scheduler.minSplitSize table property. By default, files are split at block boundaries, but you can set a size value that is larger than the block size, or you can specify that the split should follow the file boundaries.
- You can set the minimum split size to a specific value. In this example, 256 MB splits (268435456 bytes) are specified:
```
ALTER TABLE t3
  SET TBLPROPERTIES (
    'scheduler.minSplitSize'='268435456'
  );
```
- You can specify that the split should follow the file boundaries. In this example, each file is assigned to one reader:
```
ALTER TABLE t3
  SET TBLPROPERTIES (
    'scheduler.minSplitSize'='file'
  );
```
- The following example shows you how to reset the scheduler.minSplitSize table property to its default value:
```
ALTER TABLE t3
  SET TBLPROPERTIES (
    'scheduler.minSplitSize'=''
  );
```
Trigger a major compaction operation on a transactional Hive table named T1. Specify that the statement is to run until the compaction operation has completed. Update the compactor.mapreduce.map.memory.mb compaction map job property to have a value of 4096 MB.
```
ALTER TABLE t1
  COMPACT MAJOR AND WAIT WITH OVERWRITE TBLPROPERTIES(
    'compactor.mapreduce.map.memory.mb'='4096'
  );
```

You can use the EXCHANGE PARTITION clause to move a partition from a source table to the target table and alter each table's metadata.Db2 Big SQL 7.2 and later service This feature requires the Db2 Big SQL 7.2 and later service on IBM Cloud Pak for Data 4.0 and later.

Create two tables, each of which is partitioned by DS. Add a partition to the first table T1, and then move that partition to the second table T2.

CREATE TABLE t1 (
    a VARCHAR(100),
    b VARCHAR(100)
  ) PARTITIONED BY (ds VARCHAR(100));

CREATE TABLE t2 (
    a VARCHAR(100),
    b VARCHAR(100)
  ) PARTITIONED BY (ds VARCHAR(100));

ALTER TABLE t1
  ADD PARTITION (ds='1');

ALTER TABLE t2
  EXCHANGE PARTITION (ds='1') WITH TABLE t1;

Create two tables with multiple partitioning columns. Add three partitions to the first table T1, and then move all three partitions to the second table T2.

CREATE TABLE t1 (
    a VARCHAR(100),
    b VARCHAR(100)
  ) PARTITIONED BY (ds VARCHAR(100), hr VARCHAR(100));

CREATE TABLE t2 (
    a VARCHAR(100),
    b VARCHAR(100)
  ) PARTITIONED BY (ds VARCHAR(100), hr VARCHAR(100));

ALTER TABLE t1
  ADD PARTITION (ds = '1', hr = '00');

ALTER TABLE t1
  ADD PARTITION (ds = '1', hr = '01');

ALTER TABLE t1
  ADD PARTITION (ds = '1', hr = '03');

ALTER TABLE t2
  EXCHANGE PARTITION (ds='1') WITH TABLE t1;

Create two tables with multiple partitioning columns. Add a partition to the first table T1, and then move partitioned data (d1=1, d2=2) to the second table T2.

CREATE TABLE t1 (
    a INT
  ) PARTITIONED BY (d1 INT, d2 INT);

CREATE TABLE t2 (
    a INT
  ) PARTITIONED BY (d1 INT, d2 INT);

ALTER TABLE t1
  ADD PARTITION (d1=1, d2=2);

ALTER TABLE t2
  EXCHANGE PARTITION (d1 = 1, d2 = 2) WITH TABLE t1;