XML text search
XML text search is based on a subset of the XPath language, with extensions for text search. It lets you index and search XML documents so that queries can use structural elements on their own or combine them with free text.
Structural elements are tag names, attribute names, and attribute values.
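To make the three kinds of structural element concrete, the following minimal Python sketch lists them for a small document. It uses only the standard library XML parser and is not part of the XML text search product. The sample document and the function name structural_elements are hypothetical and chosen for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, not taken from the product documentation.
SAMPLE = '<order id="42" date="2024-03-01"><item sku="A1">blue widget</item></order>'


def structural_elements(xml_text):
    """Collect the tag names, attribute names, and attribute values in a document."""
    root = ET.fromstring(xml_text)
    tags, attr_names, attr_values = set(), set(), set()
    for element in root.iter():  # visits the root and every descendant element
        tags.add(element.tag)
        for name, value in element.attrib.items():
            attr_names.add(name)
            attr_values.add(value)
    return tags, attr_names, attr_values


tags, attr_names, attr_values = structural_elements(SAMPLE)
```

In this sample, "blue widget" is free text rather than a structural element. It is the kind of content that a query can scope to the item element.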
The following list highlights the key features of XML search:
- XML structural search
- By including special opaque terms in queries, you can search XML documents for structural elements (tag names, attribute names, and attribute values) and text that is scoped by those elements.
- XML query tokenization
- Free text in XML query terms is tokenized the same way as text in non-XML query terms, except that nested opaque terms (an opaque term inside another opaque term) are not supported. Synonyms, wildcard characters, phrases, and lemmatization are supported.
- Numeric values
- Predicates that compare attribute values to number, date, or dateTime data types are supported.
- Complete match
- The = (equal sign) operator with a string argument in a predicate calls for an exact match of all tokens in the string with all tokens in the identified text span. The order is significant.
- No UIMA access
- Unstructured Information Management Architecture (UIMA) is used for tokenization in XML search, but user-written annotators are not supported.
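
The structural search and numeric value features above can be illustrated with a rough Python emulation. This is a sketch of the semantics only, under stated assumptions: it does not use the product's query syntax or tokenizer. The sample catalog, the regular-expression tokenizer, and the helper names tokenize and books are all hypothetical.

```python
import re
import xml.etree.ElementTree as ET
from datetime import date

# Hypothetical sample data for illustration only.
DOC = """<catalog>
  <book price="9.5" published="2021-06-15">
    <title>Snow Crash</title><note>desert heat</note>
  </book>
  <book price="30" published="2023-01-10">
    <title>Desert Solitaire</title><note>first snow</note>
  </book>
</catalog>"""


def tokenize(text):
    # Crude stand-in for a real tokenizer: lowercase alphanumeric runs.
    return re.findall(r"[a-z0-9]+", text.lower())


def books(root, predicate):
    """Return the titles of all <book> elements that satisfy the predicate."""
    return [b.findtext("title") for b in root.iter("book") if predicate(b)]


root = ET.fromstring(DOC)

# Structural scoping: "snow" must occur inside <title>, not anywhere in the book.
snow_in_title = books(root, lambda b: "snow" in tokenize(b.findtext("title")))
snow_anywhere = books(root, lambda b: "snow" in tokenize(" ".join(b.itertext())))

# Numeric predicate: the attribute value is compared as a number.
# As strings, "9.5" would sort after "20", so the data type matters.
under_20 = books(root, lambda b: float(b.get("price")) < 20)

# Date predicate: the attribute value is compared as a date.
since_2022 = books(
    root, lambda b: date.fromisoformat(b.get("published")) >= date(2022, 1, 1)
)
```

Scoping the search to the title element returns only the first book, while the unscoped search returns both, because the second book mentions snow in its note element.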