Index XML pattern expressions

Only those parts of an XML document stored in an XML column that satisfy an XML pattern expression are indexed. To index on an XML pattern, you provide an index specification clause together with the CREATE INDEX statement.

The index specification clause begins with GENERATE KEY USING XMLPATTERN, followed by an XML pattern and a data type for the index over XML data. Alternatively, you can specify the clause GENERATE KEYS USING XMLPATTERN.

Only one index specification clause is allowed per CREATE INDEX statement. Multiple XML indexes can be created on an XML column.

XML pattern expressions

To identify those parts of the document that will be indexed, an XML pattern is used to specify a set of nodes within the XML document. This pattern expression is similar to the path expression defined in the XQuery language, but it differs in that only a subset of the XQuery language is supported.

Path expression steps are separated by the forward slash (/). The double forward slash (//) which is the abbreviated syntax for /descendant-or-self::node()/ can also be specified. In each step, a forward axis (child::, @, attribute::, descendant::, self::, and descendant-or-self::) is chosen, followed by an XML name test or XML kind test. If no forward axis is specified, the child axis is used as the default.

If the XML name test is used, a qualified XML name or a wildcard is used to specify the node name to match for the step in the path. Instead of matching node names, the XML kind test can also be used to specify what kind of nodes to match in the pattern: text nodes, comment nodes, processing instruction nodes, or any other type of node.

Pattern expressions can include calls to supported functions to create indexes with special properties, such as case insensitivity. Only one function step is allowed per XMLPATTERN clause.

The following examples show statements that are logically equivalent with different pattern expressions.

Statements 1 and 2 are logically equivalent. Statement 1 uses the abbreviated syntax.
Figure 1. Statement 1
  CREATE INDEX empindex on company(companydocs)
         GENERATE KEY USING XMLPATTERN '/company/emp/@id' AS SQL DOUBLE 
Figure 2. Statement 2
  CREATE INDEX empindex on company(companydocs)
         GENERATE KEY USING XMLPATTERN '/child::company/child::emp/attribute::id'
         AS SQL DOUBLE
Statement 3 and Statement 4 are logically equivalent. Statement 3 uses the abbreviated syntax.
Figure 3. Statement 3
  CREATE INDEX idindex on company(companydocs)
         GENERATE KEY USING XMLPATTERN '//@id' AS SQL DOUBLE
Figure 4. Statement 4
  CREATE INDEX idindex on company(companydocs)
         GENERATE KEY USING XMLPATTERN '/descendant-or-self::node()/attribute::id'
         AS SQL DOUBLE
Statement 5 uses the XML kind test to match text type nodes in the specified pattern:
Figure 5. Statement 5
  CREATE INDEX empindex on company(companydocs)
         GENERATE KEY USING XMLPATTERN '/company/emp/name/last/text()' AS SQL 
         VARCHAR(25)
Statements 6 and 7 create a case-insensitive index. Statement 7 specifies in which locale the values stored in the index should be converted.
Figure 6. Statement 6
  CREATE INDEX empindex on company(companydocs)
         GENERATE KEY USING XMLPATTERN '/company/emp/name/last/fn:upper-case(.)' 
         AS SQL VARCHAR(25)
Figure 7. Statement 7
 CREATE INDEX empindex on company(companydocs)
         GENERATE KEY USING XMLPATTERN '/company/emp/name/last/fn:upper-case(., 
          "en_US")' AS SQL VARCHAR(25)  
Statement 8 creates an index that indicates the existence of the middle name of an employee in the XML document structure.
Figure 8. Statement 8
CREATE INDEX empindex on company(companydocs)
       GENERATE KEY USING XMLPATTERN 
       '/company/emp/name/fn:exists(middle)' AS SQL VARCHAR(1)
Statement 9 creates a VARCHAR index.
Figure 9. Statement 9
CREATE INDEX varcharidx on company(companydocs)
       GENERATE KEY USING XMLPATTERN '/company/emp/name/last'
       AS SQL VARCHAR(30)
The optimizer can choose to use VARCHAR-type indexes for queries that have predicates containing the fn:starts-with function. The fn:starts-with function determines whether a string begins with a specific substring. No changes are necessary to existing VARCHAR indexes, and no special syntax needs to be used in the CREATE INDEX statement for new indexes.

Qualifying Paths and Nodes

Consider a table named company with XML documents stored in an XML column, companydocs. The XML documents have a hierarchy with the two paths: '/company/emp/dept/@id' and '/company/emp/@id'. If the XML pattern specifies a single path, then a set of nodes in the document might qualify.

For example, if you want to search for a specific employee id attribute (@id) on the employee elements, they could create an index on the XML pattern '/company/emp/@id'. Then queries with predicates of the form '/company/emp[@id=42366]' could utilize the index on an XML column. In this case, the XMLPATTERN '/company/emp/@id' in the CREATE INDEX statement specifies a single path that refers to many different nodes in the document, since every employee element in the document might have an employee id attribute.

CREATE INDEX empindex on company(companydocs)
         GENERATE KEY USING XMLPATTERN '/company/emp/@id' AS SQL DOUBLE 

If the XML pattern uses wildcard expressions, the descendant axis, or the descendant-or-self axis, then a set of paths and nodes might qualify. In the following example, the descendant-or-self axis is specified, so that the XML pattern '//@id' references paths for both department id attributes and employee id attributes since they both contain @id.

CREATE INDEX idindex on company(companydocs)
         GENERATE KEYS USING XMLPATTERN '//@id' AS SQL DOUBLE 
If you want to search for a specific employee last name (<last>) on the name elements in a case-insensitive manner, for example, you can create an index on the XML pattern '/company/emp/name/last/fn:upper-case(.)'. Then queries with predicates of the form '/company/emp/name/last[fn:upper-case(.)="SMITH"]" could utilize the index on an XML column. In this case, the context step of the XML pattern specifies a single path, '/company/emp/name/last'. For this path, many different nodes in the document might qualify, since each name element in the document might have a <last> element. All employee last names are indexed in the uppercase form.
CREATE INDEX empindex on company(companydocs)
         GENERATE KEY USING XMLPATTERN '/company/emp/name/last/fn:upper-case(.)' 
         AS SQL VARCHAR(25)
If the XML pattern contains the fn:exists function, the index value is stored as a single character T or F, to indicate whether it is true or false that the item to be indexed exists in the XML document structure. In the following example, all employee middle names are indexed in a Boolean fashion.
CREATE INDEX midnameidx on company(companydocs)
         GENERATE KEY USING XMLPATTERN '/company/emp/name/fn:exists(middle)' 
         AS SQL VARCHAR(1)