CREATE INDEX statement

The CREATE INDEX statement is used to define an index on a database table.

An index can be defined on XML data, or on relational data. The CREATE INDEX statement is also used to create an index specification (metadata that indicates to the optimizer that a data source table has an index).

Invocation

This statement can be embedded in an application program or issued through the use of dynamic SQL statements. It is an executable statement that can be dynamically prepared only if DYNAMICRULES run behavior is in effect for the package (SQLSTATE 42509).

Authorization

The privileges held by the authorization ID of the statement must include at least one of the following authorities:

One of:
- CONTROL privilege on the table or nickname on which the index is defined
- INDEX privilege on the table or nickname on which the index is defined
- SCHEMAADM authority on the schema containing the table or nickname on which the index is defined
and one of:
- IMPLICIT_SCHEMA authority on the database, if the implicit or explicit schema name of the index does not exist
- CREATEIN privilege on the schema, if the schema name of the index refers to an existing schema
- SCHEMAADM authority on the schema, if the schema name of the index refers to an existing schema
DBADM authority

No explicit privilege is required to create an index on a declared temporary table.

Syntax

Notes:

¹ In a federated system, table-name must identify a table in the federated database. It cannot identify a data source table.
² If nickname is specified, the CREATE INDEX statement creates an index specification. In this case, INCLUDE, xml-index-specification, CLUSTER, EXTEND USING, PCTFREE, MINPCTUSED, DISALLOW REVERSE SCANS, ALLOW REVERSE SCANS, PAGE SPLIT, or COLLECT STATISTICS cannot be specified.
³ The BUSINESS_TIME WITHOUT OVERLAPS clause can be specified only if UNIQUE is specified.
⁴ The IN tablespace-name clause can be specified only for a nonpartitioned index on a partitioned table.
⁵ The INCLUDE clause can be specified only if UNIQUE is specified.
⁶ If xml-index-specification is specified, column-name DESC, INCLUDE, or CLUSTER cannot be specified.

xml-index-specification

Notes:

¹ The alternative syntax GENERATE KEYS USING XMLPATTERN can be used.

xmlpattern-clause

namespace-declaration

pattern-expression

forward-axis

xmlname-test

xml-wildcard

xmlkind-test

function-step

xmltype-clause

data-type

sql-data-type

Notes:

¹ If you specify a function name, such as fn:upper-case, at the end of the XML pattern, the supported index data types might be a subset of the index data types shown here. You can check for valid index data types in the description for xmlpattern-clause.

Description

UNIQUE

If ON table-name is specified, UNIQUE prevents the table from containing two or more rows with the same value of the index key. The uniqueness is enforced at the end of the SQL statement that updates rows or inserts new rows.

The uniqueness is also checked during the execution of the CREATE INDEX statement. If the table already contains rows with duplicate key values, the index is not created.

If the index is on an XML column (the index is an index over XML data), the uniqueness applies to values with the specified pattern-expression for all rows of the table. Uniqueness is enforced on each value after the value has been converted to the specified sql-data-type. Because converting to the specified sql-data-type might result in a loss of precision or range, or different values might be hashed to the same key value, multiple values that appear to be unique in the XML document might result in duplicate key errors. The uniqueness of character strings depends on XQuery semantics where trailing blanks are significant. Therefore, values that would be duplicates in SQL but differ in trailing blanks are considered unique values in an index over XML data.

When UNIQUE is used, null values are treated as any other values. For example, if the key is a single column that may contain null values, that column may contain no more than one null value.

If the UNIQUE option is specified, and the table has a distribution key, the columns in the index key must be a superset of the distribution key. That is, the columns specified for a unique index key must include all the columns of the distribution key (SQLSTATE 42997).

Primary or unique keys cannot be subsets of dimensions (SQLSTATE 429BE).

If ON nickname is specified, UNIQUE should be specified only if the data for the index key contains unique values for every row of the data source table. The uniqueness will not be checked.

For an index over XML data, UNIQUE can be included only if the context step of the pattern-expression specifies a single complete path and does not contain a descendant or descendant-or-self axis, "//", an xml-wildcard, node(), or processing-instruction() (SQLSTATE 429BS).

In a partitioned database environment, the following rules apply to a table with one or more XML columns:

A distributed table cannot have a unique index over XML data.
A unique index over XML data is supported only on a table that does not have a distribution key and that is on a single node multi-partition database.
If a unique index over XML data exists on a table, the table cannot be altered to add a distribution key.

INDEX index-name

Names the index or index specification. The name, including the implicit or explicit qualifier, must not identify an index or index specification that is described in the catalog, or an existing index on a declared temporary table (SQLSTATE 42704). The qualifier must not be SYSIBM, SYSCAT, SYSFUN, or SYSSTAT (SQLSTATE 42939).

ON table-name or nickname

The table-name identifies a table on which an index is to be created. The table must be a base table (not a view), a created temporary table, a declared temporary table, a materialized query table that exists at the current server, or a declared temporary table. The name of a declared temporary table must be qualified with SESSION.

The table-name must not identify a catalog table (SQLSTATE 42832) or a column-organized table (SQLSTATE 42858).

If UNIQUE is specified and table-name is a typed table, it must not be a subtable (SQLSTATE 429B3).

nickname is the nickname on which an index specification is to be created. The nickname references either a data source table whose index is described by the index specification, or a data source view that is based on such a table. The nickname must be listed in the catalog.

If the index key contains at least one key-expression, the table-name cannot be any of the following objects:

A materialized query table (MQT) (SQLSTATE 429BX)
A staging table (SQLSTATE 429BX)
A typed table (SQLSTATE 429BX)
A declared or created user temporary table (SQLSTATE 42995)
A table that is an event monitor target (SQLSTATE 429BX)
A nickname (SQLSTATE 42601)

column-name

For an index, column-name identifies a column that is to be part of the index key. For an index specification, column-name is the name by which the federated server references a column of a data source table.

The number of columns plus twice the number of identified periods cannot exceed 64 (SQLSTATE 54008). If table-name is a typed table, the number of columns cannot exceed 63 (SQLSTATE 54008). If table-name is a subtable, at least one column-name must be introduced in the subtable; that is, not inherited from a supertable (SQLSTATE 428DS). No column-name can be repeated (SQLSTATE 42711). The maximum number of columns in an index with random ordering is reduced by one for each column that is specified with random ordering

The sum of the stored lengths of the specified columns must not be greater than the index key length limit for the page size. For key length limits, see SQL limits. If table-name is a typed table, the index key length limit is further reduced by 4 bytes. If the index has random ordering, the index key length is further reduced by 2 bytes.

Note that this length can be reduced by system overhead, which varies according to the data type of the column and whether it is nullable. For more information on overhead affecting this limit, see Byte Counts in CREATE TABLE.

No LOB column or distinct type column based on a LOB can be used as part of an index, even if the length attribute of the column is small enough to fit within the index key length limit for the page size (SQLSTATE 54008). A structured type column can only be specified if the EXTEND USING clause is also specified (SQLSTATE 42962). If the EXTEND USING clause is specified, only one column can be specified, and the type of the column must be a structured type or a distinct type that is not based on a LOB (SQLSTATE 42997).

If an index has only one column, and that column has the XML data type, and the GENERATE KEY USING XMLPATTERN clause is also specified, the index is an index over XML data. A column with the XML data type can be specified only if the GENERATE KEY USING XMLPATTERN clause is also specified (SQLSTATE 42962). If the GENERATE KEY USING XMLPATTERN clause is specified, only one column can be specified, and the type of the column must be XML.

key-expression

Specifies an expression that must evaluate to a scalar value with the following restrictions:

The expression must return a scalar value which can be indexed (no LOBs, XMLs, LONG VARCHAR or LONG VARGRAPHIC) (SQLSTATE 429BX)
The following data types are not supported as input to the expression-based index key:
- LONG VARCHAR and LONG VARGRAPHIC (deprecated data types)
- XML
- User defined distinct types on any of the types listed previously
- User-defined weakly typed distinct types that include a data type constraint
- User-defined structured types and reference types
- Array, cursor, and row types
The expression must contain at least one column reference (SQLSTATE 429BX)
The expression cannot contain any of the following (SQLSTATE 429BX):
- Subqueries
- Aggregate functions
- Non-deterministic functions
- Functions with external actions
- User-defined functions
- Text search functions, such as SCORE, CONTAINS
- Partitioning scalar functions, such as HASHEDVALUE
- Dynamic data type scalar functions, such TYPE_ID, TYPE_NAME, TYPE_SCHEMA
- Host Variables
- Parameter markers
- Sequence references
- Special registers and built-in functions that depend on the value of a special register
- Global variables and built-in functions that depend on the value of a global variable
- A TYPE predicate
- Regular expression functions or the REGEXP_LIKE predicate
- A LIKE predicate
- String scalar functions INSTR, INSTRB, LOCATE, LOCATE_IN_STRING, POSITION or POSSTR
- OLAP specifications
- Dereference operations or DEREF functions where the scoped reference argument is other than the object identifier (OID) column
- CAST specifications with a SCOPE clause
- Error tolerant nested-table-expressions

If an index key includes at least one key-expression, the index key is referred to as an expression-based index key.

ASC

Specifies that index entries are to be kept in ascending order of the column values; this is the default setting. ASC cannot be specified for indexes that are defined with EXTEND USING (SQLSTATE 42601).

DESC

Specifies that index entries are to be kept in descending order of the column values. DESC cannot be specified for indexes that are defined with EXTEND USING, or if the index is an index over XML data (SQLSTATE 42601).

RANDOM

Specifies that index entries are to be kept in random order of the column values. RANDOM cannot be specified in the following cases:

With the EXTENDED USING clause (SQLSTATE 42613).
With the SPECIFICATION ONLY clause (SQLSTATE 42613).
For an index that is created on a declared or created globally temporary table (DGTT or CGTT) (SQLSTATE 42995).
For an index that is created on a column-organized table (SQLSTATE 42858).
If the CLUSTER option is specified (SQLSTATE 42613).
On an indexed column that is of type CHAR or VARCHAR with ICU collations, except when the columns are declared as FOR BIT DATA (SQLSTATE 42997).
On an indexed column that is of type GRAPHIC or VARGRAPHIC with ICU collations (SQLSTATE 42997).
On an indexed column that is of type XML (SQLSTATE 42613).
On an index which includes a key-expression (SQLSTATE 42997).

BUSINESS_TIME WITHOUT OVERLAPS

BUSINESS_TIME WITHOUT OVERLAPS can only be specified for an index defined as UNIQUE (SQLSTATE 428HW) to indicate that for the rest of the specified keys, the values are unique with respect to any period of time. BUSINESS_TIME WITHOUT OVERLAPS can only be specified as the last item in the list. When BUSINESS_TIME WITHOUT OVERLAPS is specified, the end column and begin column of the period BUSINESS_TIME are automatically added to the index key in ascending order and enforce that there are no overlaps in time. When BUSINESS_TIME WITHOUT OVERLAPS is specified, the columns of the BUSINESS_TIME period must not be specified as key columns, as columns in the partitioning key, or as columns in the distribution key (SQLSTATE 428HW).

PARTITIONED

Indicates that a partitioned index should be created. The table-name must identify a table defined with data partitions (SQLSTATE 42601).

If the table is partitioned and neither PARTITIONED nor NOT PARTITIONED is specified, the index is created as partitioned (with a few exceptions). A nonpartitioned index is created instead of a partitioned index if any of the following situations apply:

UNIQUE is specified and the index key does not include all the table partitioning key columns.
A spatial index is created.

A partitioned index with a definition that duplicates the definition of a nonpartitioned index is not considered to be a duplicate index. For more details, see the Rules section in this topic.

The PARTITIONED keyword cannot be specified for the following indexes:

An index on a nonpartitioned table (SQLSTATE 42601)
A unique index where the index key does not explicitly include all the table partitioning key columns (SQLSTATE 42990)
A spatial index (SQLSTATE 42997)

A partitioned index cannot be created on a partitioned table that has detached dependent tables, for example, MQTs (SQLSTATE 55019).

The table space placement for an index partition of the partitioned index is determined by the following rules:

If the table being indexed was created using the partition-tablespace-options INDEX IN clause of the CREATE TABLE statement, the index partition is created in the table space specified in that INDEX IN clause.
If the CREATE TABLE statement for the table being indexed did not specify the partition-tablespace-options INDEX IN clause, the index partition partitioned index is created in the same table space as the corresponding data partition that it indexes.

The IN clause of the CREATE INDEX statement is not supported for partitioned indexes (SQLSTATE 42601). The tablespace-clauses INDEX IN clause of the CREATE TABLE statement is ignored for partitioned indexes. If BUSINESS_TIME WITHOUT OVERLAPS is specified for the index key, the partitioning key columns must not include the begin or end column of the BUSINESS_TIME period (SQLSTATE 428HW).

NOT PARTITIONED

Indicates that a nonpartitioned index should be created that spans all of the data partitions defined for the table. The table-name must identify a table defined with data partitions (SQLSTATE 42601).

A nonpartitioned index with a definition that duplicates the definition of a partitioned index is not considered to be a duplicate index. For more details, see the Rules section in this topic.

The table space placement for a the nonpartitioned index is determined by the following rules:

If you specify the IN clause of the CREATE INDEX statement, the nonpartitioned index is placed in the table space specified in that IN clause.
If you do not specify the IN clause of the CREATE INDEX statement, the following rules determine the table space placement of the nonpartitioned index:
- If the table being indexed was created using the tablespace-clauses INDEX IN clause of the CREATE TABLE statement, the nonpartitioned index is placed in the table space specified in that INDEX IN clause.
- If the table being indexed was created without using the tablespace-clauses INDEX IN clause of the CREATE TABLE statement, the nonpartitioned index is created in the table space of the first visible or attached data partition of the table. The first visible or attached data partition of the table is the first partition in the list of data partitions that are sorted on the basis of range specifications. Also, the authorization ID of the statement is not required to have the USE privilege on the default table space.

IN tablespace-name

Specifies the table space in which the nonpartitioned index on a partitioned table is created. This clause cannot be specified for a partitioned index or an index on a nonpartitioned table (SQLSTATE 42601). The specification of a table space specifically for the index overrides a specification made using the INDEX IN clause when the table was created.

The table space specified by tablespace-name must be in the same database partition group as the data table spaces for the table and manage space in the same way as the other table spaces of the partitioned table (SQLSTATE 42838); it must be a table space on which the authorization ID of the statement holds the USE privilege.

If the IN clause is not specified, the index is created in the table space that was specified by the INDEX IN clause on the CREATE TABLE statement. If no INDEX IN clause was specified, the table space of the first visible or attached data partition of the table is used. This is the first partition in the list of data partitions that are sorted on the basis of range specifications. If the IN clause is not specified, the authorization ID of the statement is not required to have the USE privilege on the default table space.

SPECIFICATION ONLY

Indicates that this statement will be used to create an index specification that applies to the data source table referenced by nickname. SPECIFICATION ONLY must be specified if nickname is specified (SQLSTATE 42601). It cannot be specified if table-name is specified (SQLSTATE 42601).

If the index specification applies to an index that is unique, the database manager does not verify that the column values in the remote table are unique. If the remote column values are not unique, queries against the nickname that include the index column might return incorrect data or errors.

This clause cannot be used when creating an index on a created temporary table or declared temporary table (SQLSTATE 42995).

INCLUDE

This keyword introduces a clause that specifies additional columns to be appended to the set of index key columns. Any columns included with this clause are not used to enforce uniqueness. These included columns might improve the performance of some queries through index only access. The columns must be distinct from the columns used to enforce uniqueness (SQLSTATE 42711). UNIQUE must be specified when INCLUDE is specified (SQLSTATE 42613). The limits for the number of columns and sum of the length attributes apply to all of the columns in the unique key and in the index.

This clause cannot be used with created temporary tables or declared temporary tables (SQLSTATE 42995).

column-name: Identifies a column that is included in the index but not part of the unique index key. The same rules apply as defined for columns of the unique index key. The keywords ASC, DESC, or RANDOM can be specified following column-name but have no effect on the order.
key-expression: Specifies an expression that is included in the index but not part of the unique index key. The same rules apply as defined for expressions of the unique index key. The keywords ASC, DESC, or RANDOM can be specified after key-expression but have no effect on the order.

INCLUDE cannot be specified for indexes that are defined with EXTEND USING, if nickname is specified, or if the index is defined on an XML column (SQLSTATE 42601).

xml-index-specification

Specifies how index keys are generated from XML documents that are stored in an XML column. xml-index-specification cannot be specified if there is more than one index column, or if the column does not have the XML data type.

This clause only applies to XML columns (SQLSTATE 429BS).

GENERATE KEY USING XMLPATTERN xmlpattern-clause

Specifies the parts of an XML document that are to be indexed. XML pattern values are the indexed values generated by the xmlpattern-clause. List data type nodes are not supported in the index. If a node is qualified by the xmlpattern-clause and an XML schema exists that specifies that the node is a list data type, then the list data type node cannot be indexed (SQLSTATE 23526 for CREATE INDEX statements, or SQLSTATE 23525 for INSERT and UPDATE statements).

xmlpattern-clause

Contains a pattern expression that identifies the nodes that are to be indexed. It consists of an optional namespace-declaration and a required pattern-expression.

namespace-declaration

If the pattern expression contains qualified names, a namespace-declaration must be specified to define namespace prefixes. A default namespace can be defined for unqualified names.

DECLARE NAMESPACE namespace-prefix=namespace-uri: Maps namespace-prefix, which is an NCName, to namespace-uri, which is a string literal. The namespace-declaration can contain multiple namespace-prefix-to-namespace-uri mappings. The namespace-prefix must be unique within the list of namespace-declaration (SQLSTATE 10503).
DECLARE DEFAULT ELEMENT NAMESPACE namespace-uri: Declares the default namespace URI for unqualified element names or types. If no default namespace is declared, unqualified names of elements and types are in no namespace. Only one default namespace can be declared (SQLSTATE 10502).

pattern-expression

Specifies the nodes in an XML document that are indexed. The pattern-expression can contain pattern-matching characters (*).

/ (forward slash)

Separates path expression steps.

// (double forward slash)

This is the abbreviated syntax for /descendant-or-self::node()/. You cannot use // (double forward slash) if you also specify UNIQUE.

forward-axis

child::: Specifies children of the context node. This is the default, if no other forward axis is specified.
@: Specifies attributes of the context node. This is the abbreviated syntax for attribute::.
attribute::: Specifies attributes of the context node.
descendant::: Specifies the descendants of the context node. You cannot use descendant:: if you also specify UNIQUE.
self::: Specifies just the context node itself.
descendant-or-self::: Specifies the context node and the descendants of the context node. You cannot use descendant-or-self:: if you also specify UNIQUE.

xmlname-test

Specifies the node name for the step in the path using a qualified XML name (xml-qname) or a wildcard (xml-wildcard).

xml-ncname

An XML name as defined by XML 1.0. It cannot include a colon character.

xml-qname

Specifies a qualified XML name (also known as a QName) that can have two possible forms:

xml-nsprefix:xml-ncname, where the xml-nsprefix is an xml-ncname that identifies an in-scope namespace
xml-ncname, which indicates that the default namespace should be applied as the implicit xml-nsprefix

xml-wildcard

Specifies an xml-qname as a wildcard that can have three possible forms:

* (a single asterisk character) indicates any xml-qname
xml-nsprefix:* indicates any xml-ncname within the specified namespace
*:xml-ncname indicates a specific XML name in any in-scope namespace

You cannot use xml-wildcard in the context step of a pattern expression if you also specify UNIQUE.

xmlkind-test

Use these options to specify what types of nodes you pattern match. The following options are available to you:

node(): Matches any node. You cannot use node() if you also specify UNIQUE.
text(): Matches any text node.
comment(): Matches any comment node.
processing-instruction(): Matches any processing instruction node. You cannot use processing-instruction() if you also specify UNIQUE.

function-step

Use these function calls to specify indexes with special properties, such as case insensitivity. Only one function step is allowed per XMLPATTERN clause. Function steps can be applied only on elements or attributes. No xmlkind-test option can be placed immediately before the function step. The function cannot be used in the middle of the XMLPATTERN, and must appear only in the final step. Currently, only the fn:upper-case and fn:exists functions are supported.

Note that instead of specifying the prefix fn: for the function name, you can specify another valid namespace, or you can omit fn: entirely.

fn:upper-case: Force the index values to be stored in the uppercase form. The first parameter of fn:upper-case is mandatory, and must be a context item expression (' . '); the second parameter is optional, and is the locale. If fn:upper-case appears in the pattern, VARCHAR and VARCHAR HASHED are the only index types supported.
fn:exists: Check for the existence of an element or attribute item in the XML document. If the item exists, this predicate returns true. The parameter of fn:exists is mandatory, and must be an element or attribute. If this function is used in the index path, the index type must be defined as VARCHAR(1).

xmltype-clause

AS data-type

Specifies the data type to which indexed values are converted before they are stored. Values are converted to the index XML data type that corresponds to the specified index SQL data type.

Table 1. Corresponding index data types
Index XML data type	Index SQL data type
xs:string	VARCHAR(integer), VARCHAR HASHED
xs:double	DOUBLE
xs:int	INTEGER
xs:decimal	DECIMAL
xs:date	DATE
xs:dateTime	TIMESTAMP

For VARCHAR(integer) and VARCHAR HASHED, the value is converted to an xs:string value using the XQuery function fn:string. The length attribute of VARCHAR(integer) is applied as a constraint to the resulting xs:string value. An index SQL data type of VARCHAR HASHED applies a hash algorithm to the resulting xs:string value to generate a hash code that is inserted into the index.

For indexes using the data types DOUBLE, DATE, INTEGER, DECIMAL, and TIMESTAMP, the value is converted to the index XML data type using the XQuery cast expression.

If the index is unique, the uniqueness of the value is enforced after the value is converted to the indexed type.

data-type

The following data type is supported:

sql-data-type

Supported SQL data types are:

VARCHAR(integer[OCTETS])

If this form of VARCHAR is specified, integer is used as a constraint. If document nodes that are to be indexed have values that are longer than integer, the documents are not inserted into the table if the index already exists. If the index does not exist, the index is not created. integer is a value between 1 and a page size-dependent maximum. Table 2 shows the maximum value for each page size.

Table 2. Maximum length of document nodes by page size
Page size	Maximum length of document node (bytes)
4KB	817
8KB	1841
16KB	3889
32KB	7985

XQuery semantics are used for string comparisons, where trailing blanks are significant. This differs from SQL semantics, where trailing blanks are insignificant during comparisons.

OCTETS: Specifies that the units of the length attribute is bytes.

When no string units are specified for a character string data type in a Unicode database, the string units are implicit and determined by the value of the NLS_STRING_UNITS global variable or string_units database configuration parameter. When the implicit string units are CODEUNITS32, the OCTETS keyword must be specified (SQLSTATE 42601).

In a non-Unicode database, the string units for character string data types are OCTETS.

VARCHAR HASHED

Specify VARCHAR HASHED to handle indexing of arbitrary length character strings. The length of an indexed string has no limit. An eight-byte hash code is generated over the entire string. Indexes that use these hashed character strings can be used only for equality lookups. XQuery semantics are used for string equality comparisons, where trailing blanks are significant. This differs from SQL semantics, where trailing blanks are insignificant during comparisons. The hash on the string preserves XQuery semantics for equality and not SQL semantics.

DOUBLE

Specifies that the data type DOUBLE is used for indexing numeric values. Unbounded decimal types and 64 bit integers may lose precision when they are stored as a DOUBLE value. The values for DOUBLE may include the special numeric values NaN, INF, -INF, +0, and -0, even though the SQL data type DOUBLE itself does not support these values.

INTEGER

Specifies that the data type INTEGER is used for indexing XML values. Note that the XML schema data type xs:integer allows a greater range of values than does the integer SQL data type. If an out-of-range value is encountered, an error is returned. If a value conforms to the lexical format of xs:double but does not conform to the lexical format of xs:int, such as 3.5, 3.0, or 3E1, an error is also returned.

DECIMAL(integer, integer)

Specifies that the data type DECIMAL is used for indexing XML values. The DECIMAL type takes two parameters, precision and scale. The first parameter, precision, is an integer constant with a value in the range of 1 to 31 that specifies the total number of digits. The second parameter, scale, is an integer constant that is greater than or equal to zero, and less than or equal to precision. The scale specifies the number of digits to the right of the decimal point.

Digits are not truncated from the end of a decimal number. An error is returned if the number of digits to the right of the decimal separator character is greater than the scale. Also, an error is returned if the number of significant digits to the left of the decimal character (the whole part of the number) is greater than precision.

DATE

Specifies that the data type DATE is used for indexing XML values. Note that the XML schema data type for xs:date allows greater range of values than the pureXML® xs:date data type that corresponds to the SQL data type. If an out-of-range value is encountered, an error is returned.

TIMESTAMP

Specifies that the data type TIMESTAMP is used for indexing XML values. Note that the XML schema data type for xs:dateTime allows greater range of values and fractional seconds precision than the pureXML xs:dateTime data type that corresponds to the SQL data type. If an out-of range value is encountered, an error is returned.

IGNORE INVALID VALUES

Specifies that XML pattern values that are invalid lexical forms for the target index XML data type are ignored and that the corresponding values in the stored XML documents are not indexed by the CREATE INDEX statement. By default, invalid values are ignored. During insert and update operations, the invalid XML pattern values are not indexed, but XML documents are still inserted into the table. No error or warning is raised, because specifying these data types is not a constraint on the XML pattern values (XQuery expressions that search for the specific XML index data type will not consider these values).

The rules for what XML pattern values can be ignored are determined by the specified SQL data type.

If the SQL data type is VARCHAR(integer) or VARCHAR HASHED, XML pattern values are never ignored since any sequence of characters is valid.
If the SQL data type is DOUBLE, DECIMAL, or INTEGER, any XML pattern value that does not conform to the lexical format of the XML data type xs:double is ignored. If the SQL data type is DECIMAL or INTEGER and the XML pattern value conforms to the lexical format of the XML data type xs:double but not to the lexical format of xs:decimal or xs:int, respectively, an error is returned. For example, if the SQL data type is INTEGER, the XML pattern values of 3.5, 3.0, and 3e0 conform to the lexical format of xs:double but return an error (SQLSTATE 23525) because they do not conform to the lexical format of xs:int. XML pattern values such as 'A123' or 'hello' are ignored for the same index.
If the SQL data type is a datetime data type, any XML pattern value that does not conform to the lexical format of the corresponding XML data type (xs:date or xs:dateTime) is ignored.

If an XML pattern value does conform to the appropriate lexical format, an error is returned if the value is outside the value space for the data type or exceeds the maximum length or precision and scale of the specified SQL data type. If the index does not exist, the index is not created (SQLSTATE 23526).

REJECT INVALID VALUES

All XML pattern values must be valid in the context of the lexical definition of the index XML data type. In addition the value must be in the range of the value space of the index XML data type. See the Related reference section, later, for links to details on the lexical definition and value space for each data type. For example, when you specify the REJECT INVALID VALUES clause, if you create an index of INTEGER type, XML pattern values such as 3.5, 3.0, 3e0, 'A123' and 'hello' will return an error (SQLSTATE 23525). XML data is not inserted or updated in the table if the index already exists (SQLSTATE 23525). If the index does not exist, the index is not created (SQLSTATE 23526).

CLUSTER

Specifies that the index is the clustering index of the table. The cluster factor of a clustering index is maintained or improved dynamically as data is inserted into the associated table, by attempting to insert new rows physically close to the rows for which the key values of this index are in the same range. Only one clustering index may exist for a table so CLUSTER may not be specified if it was used in the definition of any existing index on the table (SQLSTATE 55012). A clustering index may not be created on a table that is defined to use append mode (SQLSTATE 428D8).

CLUSTER is disallowed if nickname is specified, or if the index is an index over XML data (SQLSTATE 42601).

This clause cannot be used with the following types of tables:

Created temporary tables or declared temporary tables (SQLSTATE 42995)
Range-clustered tables (SQLSTATE 429BG)

EXTEND USING index-extension-name

Names the index-extension used to manage this index. If this clause is specified, then there must be only one column-name specified and that column must be a structured type or a distinct type (SQLSTATE 42997). The index-extension-name must name an index extension described in the catalog (SQLSTATE 42704). For a distinct type, the column must exactly match the type of the corresponding source key parameter in the index extension. For a structured type column, the type of the corresponding source key parameter must be the same type or a supertype of the column type (SQLSTATE 428E0).

This clause cannot be used with created temporary tables or declared temporary tables (SQLSTATE 42995).

This clause cannot be used if the index key contains at least one key-expression (SQLSTATE 42601).

constant-expression: Identifies values for any required arguments for the index extension. Each expression must be a constant value with a data type that exactly matches the defined data type of the corresponding index extension parameters, including length or precision, and scale (SQLSTATE 428E0). This clause must not exceed 32␠768 bytes in length in the database code page (SQLSTATE 22001).

PCTFREE integer

Specifies what percentage of each index page to leave as free space when building the index. The first entry in a page is added without restriction. When additional entries are placed in an index page at least integer percent of free space is left on each page. The value of integer can range from 0 to 99. If a value greater than 10 is specified, only 10 percent free space will be left in non-leaf pages.

If an explicit value for PCTFREE is not provided, and if DB2_INDEX_PCTFREE_DEFAULT is not set, then PCTFREE will have a default value of 10.

PCTFREE is disallowed if nickname is specified (SQLSTATE 42601). This clause cannot be used with created temporary tables or declared temporary tables (SQLSTATE 42995).

LEVEL2 PCTFREE integer

Specifies what percentage of each index level 2 page to leave as free space when building the index. The value of integer can range from 0 to 99. If LEVEL2 PCTFREE is not set, a minimum of 10 or PCTFREE percent of free space is left on all non-leaf pages. If LEVEL2 PCTFREE is set, integer percent of free space is left on level 2 intermediate pages, and a minimum of 10 or integer percent of free space is left on level 3 and higher intermediate pages.

LEVEL2 PCTFREE is disallowed if nickname is specified (SQLSTATE 42601). This clause cannot be used with created temporary tables or declared temporary tables (SQLSTATE 42995).

MINPCTUSED integer

Indicates whether index leaf pages are merged online, and the threshold for the minimum percentage of space used on an index leaf page. If, after a key is removed from an index leaf page, the percentage of space used on the page is at or below integer percent, an attempt is made to merge the remaining keys on this page with those of a neighboring page. If there is sufficient space on one of these pages, the merge is performed and one of the pages is deleted. The value of integer can be from 0 to 99. A value of 50 or below is recommended for performance reasons. Specifying this option will have an impact on update and delete performance. Merging is only done during update and delete operations when an exclusive table lock is held. If an exclusive table lock does not exist, keys are marked as pseudo deleted during update and delete operations, and no merging is done. Consider using the CLEANUP ONLY ALL option of REORG INDEXES to merge leaf pages instead of using the MINPCTUSED option of CREATE INDEX.

MINPCTUSED is disallowed if nickname is specified (SQLSTATE 42601). This clause cannot be used with created temporary tables or declared temporary tables (SQLSTATE 42995).

DISALLOW REVERSE SCANS

Specifies that an index only supports forward scans or scanning of the index in the order that was defined at index creation time.

DISALLOW REVERSE SCANS cannot be specified together with nickname (SQLSTATE 42601).

ALLOW REVERSE SCANS

Specifies that an index can support both forward and reverse scans; that is, scanning of the index in the order that was defined at index creation time, and scanning in the opposite order.

ALLOW REVERSE SCANS cannot be specified together with nickname (SQLSTATE 42601).

PAGE SPLIT

Specifies the page split behavior when values are inserted into an index. The default is SYMMETRIC.

SYMMETRIC

Specifies that pages are to be split roughly in the middle. Use this option in the following situations:

When the insertion into an index is random
When the insertion into an index does not follow the patterns that are addressed by the PAGE SPLIT HIGH and PAGE SPLIT LOW options

HIGH

Specifies an index page split behavior that uses the space on index pages efficiently when there are ever-increasing values in the index. Increasing values in the index might occur when the following conditions are met:

There is an index with multiple key parts and there are multiple index pages of values where all except the last key part has the same value.
All insert operations into the table consist of a new value, which has the same value as existing keys for all but the last key part.
The last key part of the inserted value is larger than the values of the existing keys.

For example, assume that an index has the following key values:

   (1,1),(1,2),(1,3), ... (1,n),
   (2,1),(2,2),(2,3), ... (2,n),
   ...
   (m,1),(m,2),(m,3), ... (m,n)

The next key to be inserted would have the value (x,y) where

1 <=
x <= m

and y > n.

In such cases, use the PAGE SPLIT HIGH clause so that page splits do not result in many pages that are 50 percent empty.

LOW

Specifies an index page split behavior that uses the space on index pages efficiently when there are ever-decreasing values in the index. Decreasing values in the index might occur when the following conditions are met:

There is an index with multiple key parts and there are multiple index pages of values where all except the last key part has the same value.
All insert operations into the table consist of a new value, which has the same value as existing keys for all but the last key part.
The last key part of the inserted value is smaller than the values of the existing keys.

In such cases, use the PAGE SPLIT LOW clause so that page splits do not result in many pages that are 50 percent empty.

COLLECT STATISTICS

Specifies that basic index statistics are to be collected during index creation.

SAMPLED: Specifies that a sampling technique is to be used when processing index entries to collect extended index statistics. This option is used to balance performance considerations with the need for accuracy of the statistics. This option is the default when DETAILED is specified immediately following the keyword COLLECT.
UNSAMPLED: Specifies that sampling is not to be used when processing index entries to collect extended index statistics. Instead, each index entry is examined individually. This option can significantly increase CPU and memory consumption.
DETAILED: Specifies that extended index statistics (CLUSTERFACTOR and PAGE_FETCH_PAIRS) are also to be collected during index creation.

COMPRESS

Specifies whether index compression is enabled. By default, index compression will be enabled if data row compression is enabled or if the table is a declared global temporary table (DGTT) or created global temporary table (CGTT); index compression will be disabled if data row compression is disabled. This option can be used to override the default behavior. COMPRESS is disallowed if nickname is specified (SQLSTATE 42601).

YES: Specifies that index compression is enabled. Insert and update operations on the index will be subject to compression.
NO: Specifies that index compression is disabled.

INCLUDE NULL KEYS

Specifies that an index entry is created when all parts of the index key contain the null value. This is the default setting.

EXCLUDE NULL KEYS

Specifies that an index entry is not created when all parts of the index key contain the null value. When any part of the index key is not a null value, an index entry is created. You cannot specify EXCLUDE NULL KEYS with the following syntax elements:

A nickname
The GENERATE KEY USING XMLPATTERN clause
The EXTEND USING clause.

If an index is defined as unique, rows with null keys are not considered when enforcing uniqueness.

Rules

The CREATE INDEX statement fails (SQLSTATE 01550) when attempting to create an index that matches an existing index.
A number of factors are used to determine if two indexes match. These factors are combined in various different ways into the rules that determine if two indexes match. The following factors are used to determine if two indexes match:
1. The sets of index columns and key expressions, including any INCLUDE columns and key expressions, are the same in both indexes.
2. The ordering of index key columns and key expressions, including any INCLUDE columns, is the same in both indexes.
3. The key columns and key expressions of the new index are the same or a superset of the key columns and key expressions in the existing index.
4. The ordering attributes of the columns and key expressions are the same in both indexes.
5. The existing index is unique.
6. Both indexes are non-unique.
The following combinations of these factors form the rules that determine when two indexes are considered duplicates:
- 1 + 2 + 4 + 5
- 1 + 2 + 4 + 6
- 1 + 2 + 3 + 5
Exceptions:
- If one of the compared indexes is partitioned and the other of the compared indexes is nonpartitioned, the indexes are not considered duplicates if the indexes have different names, even if other matching index conditions are met.
- For indexes over XML data, the index descriptions are not considered duplicates if the index names are different, even if the indexed XML column, the XML patterns, and the data type, including its options, are identical.
Unique indexes on system-maintained MQTs are not supported (SQLSTATE 42809).
The COLLECT STATISTICS options are not supported if a nickname is specified (SQLSTATE 42601).
The creation of an index with an expression-based key in a partitioned database environment is supported only from the catalog database partition (SQLSTATE 42997).

Indexes cannot be created on column-organized temporary tables.

Notes

Concurrent read/write access during index creation, and default index creation behavior differs for indexes on nonpartitioned tables, nonpartitioned indexes, partitioned indexes:
- For nonpartitioned indexes, concurrent read/write access to the table is permitted while an index is being created, except when the EXTEND USING clause is specified. Once the index has been built, changes that were made to the table during index creation time are forward-fitted to the new index. Write access to the table is then blocked while index creation completes, after which the new index becomes available.
- For partitioned indexes, concurrent read/write access to the table is permitted while an index is being created, except when the EXTEND USING clause is specified. Once the index partition has been built, changes that were made to the partition during creation time of that index partition are forward-fitted to the new index partition. Write access to the data partition is then blocked while index creation completes on the remaining data partitions. After the index partition for the last data partition is built and the transaction is committed, all data partitions are available for read and write.
CREATE INDEX tries to forward-fit the concurrent changes before blocking writers. However, if there is a lot of concurrent database activity such that CREATE INDEX is not able to keep up with the changes coming in, it will block new writers sooner so it can complete the forward-fit. CREATE INDEX does as much of the create completion work as it can before writers are blocked, to keep the period of unavailability as short as possible. The size of the index and the amount of current activity impact this period of time.

To circumvent this default behavior, use the LOCK TABLE statement to explicitly lock the table before issuing a CREATE INDEX statement. (The table can be locked in either SHARE or EXCLUSIVE mode, depending on whether read access is to be allowed.)
If the named table already contains data, CREATE INDEX creates the index entries for it. If the table does not yet contain data, CREATE INDEX creates a description of the index; the index entries are created when data is inserted into the table.
Once the index is created and data is loaded into the table, it is advisable to issue the RUNSTATS command. The RUNSTATS command updates statistics collected on the database tables, columns, and indexes. These statistics are used to determine the optimal access path to the tables. By issuing the RUNSTATS command, the database manager can determine the characteristics of the new index. If data has been loaded before the CREATE INDEX statement is issued, it is recommended that the COLLECT STATISTICS option on the CREATE INDEX statement be used as an alternative to the RUNSTATS command.
If you collect statistics during index creation, the resulting statistics might be inconsistent. If the table has been modified since you last collected statistics on the table and its existing indexes, you should subsequently run the RUNSTATS command to provide a set of consistent statistics across the table and all of its indexes.
Creating an index with a schema name that does not already exist will result in the implicit creation of that schema provided the authorization ID of the statement has IMPLICIT_SCHEMA authority. The schema owner is SYSIBM. The CREATEIN privilege on the schema is granted to PUBLIC.
The optimizer can recommend indexes before creating the actual index.
If an index specification is being defined for a data source table that has an index, the name of the index specification does not have to match the name of the index.
The explain facility, the Design Advisor, or the EXPLAIN option in Data Server Manager can be used to recommend indexes before creating the actual index. However, none of these methods will recommend indexes containing expression-based keys.
Collecting index statistics: The UNSAMPLED DETAILED option is available to change the way index statistics are collected. However, it should be used only in cases where it is clear that DETAILED does not yield accurate statistics.
Generated Objects: If an index is created with expression-based keys, a system-generated statistical view and a system-generated package will also be created and associated with the index.
Syntax alternatives: The following syntax is tolerated and ignored:
- CLOSE
- DEFINE
- FREEPAGE
- GBPCACHE
- PIECESIZE
- TYPE 2
- using-block
The following syntax is accepted as the default behavior:
- COPY NO
- DEFER NO

Examples

Example 1: Create an index named UNIQUE_NAM on the PROJECT table. The purpose of the index is to ensure that there are not two entries in the table with the same value for project name (PROJNAME). The index entries are to be in ascending order.
```
   CREATE UNIQUE INDEX UNIQUE_NAM
      ON PROJECT(PROJNAME)
```
Example 2: Create an index named JOB_BY_DPT on the EMPLOYEE table. Arrange the index entries in ascending order by job title (JOB) within each department (WORKDEPT).
```
   CREATE INDEX JOB_BY_DPT
      ON EMPLOYEE (WORKDEPT, JOB)
```
Example 3: The nickname EMPLOYEE references a data source table called CURRENT_EMP. After this nickname was created, an index was defined on CURRENT_EMP. The columns chosen for the index key were WORKDEBT and JOB. Create an index specification that describes this index. Through this specification, the optimizer will know that the index exists and what its key is. With this information, the optimizer can improve its strategy to access the table.
```
   CREATE UNIQUE INDEX JOB_BY_DEPT
     ON EMPLOYEE (WORKDEPT, JOB)
     SPECIFICATION ONLY        
```
Example 4: Create an extended index type named SPATIAL_INDEX on a structured type column location. The description in index extension GRID_EXTENSION is used to maintain SPATIAL_INDEX. The literal is given to GRID_EXTENSION to create the index grid size.
```
   CREATE INDEX SPATIAL_INDEX ON CUSTOMER (LOCATION)
     EXTEND USING (GRID_EXTENSION (x'000100100010001000400010'))
```
Example 5: Create an index named IDX1 on a table named TAB1, and collect basic index statistics on index IDX1.
```
   CREATE INDEX IDX1 ON TAB1 (col1) COLLECT STATISTICS
```
Example 6: Create an index named IDX2 on a table named TAB1, and collect detailed index statistics on index IDX2.
```
   CREATE INDEX IDX2 ON TAB1 (col2) COLLECT DETAILED STATISTICS
```
Example 7: Create an index named IDX3 on a table named TAB1, and collect detailed index statistics on index IDX3 using sampling.
```
   CREATE INDEX IDX3 ON TAB1 (col3) COLLECT SAMPLED DETAILED STATISTICS
```
Example 8: Create a unique index named A_IDX on a partitioned table named MYNUMBERDATA in table space IDX_TBSP.
```
   CREATE UNIQUE INDEX A_IDX ON MYNUMBERDATA (A) IN IDX_TBSP
```
Example 9: Create a non-unique index named B_IDX on a partitioned table named MYNUMBERDATA in table space IDX_TBSP.
```
   CREATE INDEX B_IDX ON MYNUMBERDATA (B)
     NOT PARTITIONED IN IDX_TBSP
```
Example 10: Create an index over XML data on a table named COMPANYINFO, which contains an XML column named COMPANYDOCS. The XML column COMPANYDOCS contains a large number of XML documents similar to the one below:
```
<company name="Company1">
  <emp id="31201" salary="60000" gender="Female">
    <name>
      <first>Laura</first>
      <last>Brown</last>
    </name>
    <dept id="M25">
      Finance
    </dept>
  </emp>
</company>
```
Users of the COMPANYINFO table often need to retrieve employee information using the employee ID. An index like the following one can make that retrieval more efficient.
```
   CREATE INDEX EMPINDEX ON COMPANYINFO(COMPANYDOCS)
     GENERATE KEY USING XMLPATTERN '/company/emp/@id'
       AS SQL DOUBLE
```

Example 11: The following index is logically equivalent to the index created in the previous example, except that it uses unabbreviated syntax.

   CREATE INDEX EMPINDEX ON COMPANYINFO(COMPANYDOCS)
     GENERATE KEY USING XMLPATTERN '/child::company/child::emp/attribute::id'
       AS SQL DOUBLE

Example 12: Create an index on a column named DOC, indexing only the book title as a VARCHAR(100). Because the book title should be unique across all books, the index must be unique.
```
   CREATE UNIQUE INDEX MYDOCSIDX ON MYDOCS(DOC)
     GENERATE KEY USING XMLPATTERN '/book/title'
       AS SQL VARCHAR(100)
```

Example 13: Create an index on a column named DOC, indexing the chapter number as a DOUBLE. This example includes namespace declarations.

   CREATE INDEX MYDOCSIDX ON MYDOCS(DOC)
     GENERATE KEY USING XMLPATTERN
       'declare namespace b="http://www.example.com/book/";
         declare namespace c="http://acme.org/chapters";
           /b:book/c:chapter/@number'
       AS SQL DOUBLE

Example 14: Create a unique index named IDXPROJEST on table PROJECT and include column PRSTAFF to allow index-only access of the estimated mean staffing information.
```
   CREATE UNIQUE INDEX IDXPROJEST ON PROJECT (PROJNO) INCLUDE (PRSTAFF)
```
Example 15: Create a unique index on a column named USER_ID and exclude null keys from that index.
```
   CREATE UNIQUE INDEX IDXUSERID ON CUSTOMER (USER_ID) EXCLUDE NULL KEYS
```
Example 16: Create an index with an expression-based key using upper case of employee's name and ID:
```
   CREATE INDEX EMP_UPPERNAME ON EMPLOYEE (UPPER(NAME), ID)
```