Semantics of locator (XPath) expressions

An XPath expression is interpreted with respect to a context node and denotes a set of nodes. When XPath expressions are used as Net Search Extender selector patterns, the context node is free; that is, a relative path pattern p is interpreted as //p.

According to the XML data model, XML documents are viewed as trees containing these kinds of nodes:
  • The root node
  • Element nodes
  • Text nodes
  • Attribute nodes
  • Namespace nodes
  • Processing instruction nodes
  • Comment nodes
The links between those nodes, that is, the tree-forming relationship, reflect the immediate containment relationship in the XML document.

The root node can appear only at the root and nowhere else in the tree. It contains, as its children, the document element and optional comments and processing instructions.

Element nodes can contain any kind of node except for the root node. The other kinds of nodes are only allowed as leaf nodes of the tree.

There are three kinds of containment links: 'child', 'attribute', and 'namespace'. The 'attribute' and 'namespace' containment links must lead to attribute and namespace nodes. Consequently, to reach everything that an element node directly contains (its children in the graph-theoretic sense), follow 'attribute' links to find its attributes, 'namespace' links to find its namespace declarations, and 'child' links to find its contained elements, text nodes, processing instructions, and comments.
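The three containment links can be pictured with a small data structure. The following is a minimal sketch only: the `Node` class and the `contained_nodes` function are hypothetical names introduced for illustration and are not part of any Net Search Extender interface.

```python
from dataclasses import dataclass, field
from typing import List, Optional


# Hypothetical node model for illustration only; not a Net Search Extender API.
@dataclass
class Node:
    kind: str                      # 'root', 'element', 'text', 'attribute', 'namespace', 'pi', 'comment'
    name: Optional[str] = None
    value: Optional[str] = None
    children: List["Node"] = field(default_factory=list)    # 'child' links
    attributes: List["Node"] = field(default_factory=list)  # 'attribute' links
    namespaces: List["Node"] = field(default_factory=list)  # 'namespace' links


def contained_nodes(element: Node) -> List[Node]:
    """Follow all three kinds of containment links from an element node."""
    return element.attributes + element.namespaces + element.children


# Models: <item xmlns:x="urn:example" id="7">text<!-- note --></item>
item = Node(
    "element", "item",
    children=[Node("text", value="text"), Node("comment", value=" note ")],
    attributes=[Node("attribute", "id", "7")],
    namespaces=[Node("namespace", "x", "urn:example")],
)

kinds = [n.kind for n in contained_nodes(item)]
print(kinds)  # ['attribute', 'namespace', 'text', 'comment']
```

Following only the 'child' links of `item` would return the text and comment nodes but miss the attribute and the namespace declaration.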

These are the Net Search Extender XPath selector patterns:
  • Pattern '|' LocationPathPattern in context N denotes the union of the nodes matched by Pattern and LocationPathPattern, both in context N.
  • '/'RelativePathPattern in context N denotes whatever this RelativePathPattern denotes in the context of the root.
  • '//'RelativePathPattern in context N denotes the union of the denotations of this RelativePathPattern interpreted in the context of the root or of any descendant (on the child axis) of the root.
  • RelativePathPattern '/' StepPattern matches a node in context N, if and only if that node is matched by StepPattern in the context of its parent, and its parent node is matched by RelativePathPattern in context N.
  • RelativePathPattern '//' StepPattern matches a node in context N, if and only if that node is matched by StepPattern in the context of its parent, and it has an ancestor node that is matched by RelativePathPattern in context N.
  • 'child'::NodeTest (abbreviated syntax: NodeTest) in context N matches a node that is a child of N (on the child axis) and that satisfies NodeTest.
  • 'attribute'::NodeTest (abbreviated syntax: @NodeTest) in context N matches a node that is an attribute of N and that satisfies NodeTest.
  • NodeType '(' ')' is satisfied for a node if and only if it is of the specified type.
  • 'processing-instruction' '(' Literal ')' is satisfied for any processing-instruction-type node that has Literal as its name.
  • '*' is satisfied for any element node on the child axis and for any attribute node on the attribute axis (a wildcard for the node name).
  • NCName ':' '*' is satisfied for any element node that has NCName as its name prefix.
  • QName is satisfied for any node with the specified name.
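The rules for '/', '//', and the two axes can be read as a recursive test that works from the last step of a pattern back toward the first. The sketch below illustrates that reading under simplifying assumptions: it handles only NameTest steps (no comment() or processing-instruction() tests and no '|'), it takes the pattern as an already parsed list of steps, and the `Node` class and function names are hypothetical.

```python
class Node:
    """Hypothetical tree node with a parent pointer (illustration only)."""
    def __init__(self, kind, name=None, parent=None):
        self.kind = kind      # 'root', 'element', or 'attribute'
        self.name = name
        self.parent = parent


def step_matches(node, axis, name):
    """StepPattern with a NameTest, evaluated in the context of the node's parent."""
    principal = "element" if axis == "child" else "attribute"
    return (node.parent is not None
            and node.kind == principal
            and name in ("*", node.name))


def matches(node, steps, absolute=False):
    """steps is a list of (separator, axis, name) tuples.

    The separator is the '/' or '//' that precedes the step ('' for the first step).
    absolute=True models a leading '/'; otherwise the context is free (p is read as //p).
    """
    sep, axis, name = steps[-1]
    if not step_matches(node, axis, name):
        return False
    rest = steps[:-1]
    if not rest:
        return node.parent.kind == "root" if absolute else True
    if sep == "/":
        # The parent must match the remaining RelativePathPattern.
        return matches(node.parent, rest, absolute)
    # '//': some ancestor (the parent included) must match the remaining pattern.
    ancestor = node.parent
    while ancestor is not None:
        if matches(ancestor, rest, absolute):
            return True
        ancestor = ancestor.parent
    return False


# root > book > (appendix > section > subsection, ulist > item[@id])
root = Node("root")
book = Node("element", "book", root)
appendix = Node("element", "appendix", book)
section = Node("element", "section", appendix)
subsection = Node("element", "subsection", section)
ulist = Node("element", "ulist", book)
item = Node("element", "item", ulist)
item_id = Node("attribute", "id", item)

# appendix//subsection matches, appendix/subsection does not (section is in between).
print(matches(subsection, [("", "child", "appendix"), ("//", "child", "subsection")]))  # True
print(matches(subsection, [("", "child", "appendix"), ("/", "child", "subsection")]))   # False
```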

Note

A NodeTest of the form NameTest assumes the node to be of the principal type on the selected axis, which is attribute type on the attribute axis and element type on the child axis. Consequently, NameTest cannot be used to choose comment or processing instruction nodes, but only element and attribute nodes. Moreover, the patterns allow for the selection of any kind of node, except for namespace nodes, because the axis specifier 'namespace' is not allowed.

Examples of patterns:
  • chapter | appendix denotes all chapter elements and appendix elements
  • table denotes all table elements
  • * denotes all elements (note that this is the abbreviation of child::*)
  • ulist/item denotes all item elements that have a ulist parent
  • appendix//subsection denotes all subsection elements with an appendix ancestor
  • / denotes the singleton set containing just the root node
  • comment() denotes all comment nodes
  • processing-instruction() denotes all processing instructions
  • attribute::* (or @*) denotes all attribute nodes
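Some of these patterns can be tried out with the limited XPath support in Python's standard `xml.etree.ElementTree` module. This is only an approximation for experimenting, not the Net Search Extender implementation: ElementTree does not support '|' or comment(), and the `select` helper and the sample document are assumptions made for this sketch. Prefixing each pattern with `.//` mimics the free context node, although ElementTree then searches only below the document element.

```python
import xml.etree.ElementTree as ET

# Sample document invented for this illustration.
doc = ET.fromstring("""
<book>
  <chapter><table/><ulist><item>a</item><item>b</item></ulist></chapter>
  <appendix><section><subsection/></section><olist><item>c</item></olist></appendix>
</book>
""")


def select(pattern):
    """Evaluate a relative pattern p as .//p, searching below the document element."""
    return doc.findall(".//" + pattern)


tables = select("table")                       # all table elements
ulist_items = select("ulist/item")             # item elements with a ulist parent
subsections = select("appendix//subsection")   # subsection elements with an appendix ancestor
# ElementTree has no '|' operator, so the union is built by hand.
chapters_and_appendixes = select("chapter") + select("appendix")

print([e.text for e in ulist_items])               # ['a', 'b']
print([e.tag for e in chapters_and_appendixes])    # ['chapter', 'appendix']
```

Note that `select("item")` returns all three item elements, whereas `select("ulist/item")` leaves out the one whose parent is olist.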

This is the syntax of the locator element:

Locator                        ::= LocationPathPattern
                                 | Locator '|' LocationPathPattern
LocationPathPattern            ::= '/' RelativePathPattern?
                                 | '//'? RelativePathPattern
RelativePathPattern            ::= StepPattern
                                 | RelativePathPattern '/' StepPattern
                                 | RelativePathPattern '//' StepPattern
StepPattern                    ::= ChildOrAttributeAxisSpecifier NodeTest
ChildOrAttributeAxisSpecifier  ::= ('child' | 'attribute') '::'
                                 | '@'?
NodeTest                       ::= NameTest
                                 | NodeType '(' ')'
                                 | 'processing-instruction' '(' Literal ')'
NameTest                       ::= '*' | NCName ':' '*' | QName
NodeType                       ::= 'comment' | 'processing-instruction'
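
Because the left recursion in this grammar only expresses repetition, the locator syntax can be checked with a regular expression. The following sketch makes several assumptions that go beyond the grammar: NCName is restricted to ASCII characters, white space is accepted only around '|', and `is_locator` is a hypothetical helper name. It checks syntax only and says nothing about what a pattern selects.

```python
import re

# Simplified, ASCII-only NCName (assumption; the XML Names Recommendation allows more characters).
NCNAME = r"[A-Za-z_][A-Za-z0-9_.-]*"
LITERAL = r"""(?:"[^"]*"|'[^']*')"""

NAME_TEST = rf"(?:\*|{NCNAME}:\*|{NCNAME}(?::{NCNAME})?)"
NODE_TEST = (
    rf"(?:(?:comment|processing-instruction)\(\)"
    rf"|processing-instruction\({LITERAL}\)"
    rf"|{NAME_TEST})"
)
AXIS = r"(?:(?:child|attribute)::|@)?"
STEP = rf"{AXIS}{NODE_TEST}"
RELATIVE = rf"{STEP}(?://?{STEP})*"
PATH = rf"(?:/(?:{RELATIVE})?|(?://)?{RELATIVE})"
LOCATOR = re.compile(rf"\s*{PATH}\s*(?:\|\s*{PATH}\s*)*")


def is_locator(text: str) -> bool:
    """Return True if text is derivable from the Locator production (simplified)."""
    return LOCATOR.fullmatch(text) is not None


for pattern in ["chapter | appendix", "appendix//subsection", "/", "@*", "ancestor::x", "chapter/"]:
    print(pattern, is_locator(pattern))
```

The first four patterns in the loop are accepted. The last two are rejected: 'ancestor' is not an allowed axis specifier, and a path cannot end with '/'.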

NCName and QName are as defined in the XML Names Recommendation:

NCName
  An XML name containing no colons.
QName
  An NCName that can be preceded by another NCName (the prefix) followed by a colon. For example: NCName:NCName