fn:tokenize function
The fn:tokenize function breaks a string into a sequence of substrings.
Syntax
fn:tokenize(source-string, pattern[, flags])
- source-string
- A string that is to be broken into a sequence of substrings.
source-string is a literal string, or an XQuery expression that resolves to an xs:string value or the empty sequence.
- pattern
- The delimiter between substrings in source-string.
pattern is a string literal that contains a regular expression. A regular expression is a set of characters, pattern-matching characters, and operators that define a string or group of strings in a search pattern.
- flags
- A string literal that can contain any of the following values that control matching of pattern to source-string:
- s
- Indicates that the dot (.) matches any character.
If the s flag is not specified, the dot (.) matches any character except the new line character (#x0A).
- m
- Indicates that the caret (^) matches the start of any line (the position after a new line character), and the dollar sign ($) matches the end of any line (the position before a new line character).
If the m flag is not specified, the caret (^) matches the start of the entire string, and the dollar sign ($) matches the end of the entire string.
- i
- Indicates that matching is case-insensitive for the letters "a" through "z" and "A" through "Z".
If the i flag is not specified, case-sensitive matching is done.
- x
- Indicates that whitespace characters (#x09, #x0A, #x0D, and #x20) within pattern are ignored, unless they are within a character class. Whitespace characters in a character class are never ignored.
If the x flag is not specified, whitespace characters in pattern are treated as part of the pattern and are used for matching.
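The effect of the flags can be sketched with a few expressions. The input strings and patterns below are made up for illustration, each expression is meant to be evaluated on its own, and the results in the comments follow from the flag rules described above. The sequence &#10; is the XQuery character reference for the new line character (#x0A).

```xquery
(: i flag: the pattern "<br>" also matches "<BR>" :)
fn:tokenize("Some unparsed <br> HTML <BR> text", "\s*<br>\s*", "i")
(: result: ("Some unparsed", "HTML", "text") :)

(: x flag: the spaces in the pattern are ignored, so it behaves like ",\s*" :)
fn:tokenize("1, 15, 24, 50", " , \s* ", "x")
(: result: ("1", "15", "24", "50") :)

(: s flag: the dot also matches the new line character :)
fn:tokenize("A-&#10;-B", "-.-", "s")
(: result: ("A", "B") :)
```

Without the i flag, the first expression would split only at the lowercase <br>. Without the s flag, the pattern in the third expression would not match at all, and the whole source string would be returned as a single item.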
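Returned value
If source-string is not the empty sequence or a zero-length string, the returned value is the sequence that results from applying the following rules to source-string: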
- source-string is searched for characters that match pattern.
- If pattern contains two or more alternative sets of characters, and the alternative sets of characters match characters that start at the same position in source-string, the first set of characters in pattern that matches characters in source-string is considered to match pattern.
- Each set of characters that does not match pattern becomes an item in the result sequence.
- If pattern matches characters at the beginning of source-string, the first item in the returned sequence is a string of length 0.
- If two successive matches for pattern are found within source-string, a string of length 0 is added to the sequence.
- If pattern matches characters at the end of source-string, the last item in the returned sequence is a string of length 0.
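The rules above can be illustrated with the following sketch. The strings and patterns are hypothetical, and each expression is meant to be evaluated separately. The first two expressions show that when alternatives match at the same position, the order of the alternatives in pattern decides which one is used. The third shows the zero-length item produced by two successive matches.

```xquery
(: "ab" is listed first, so it wins wherever both "ab" and "a" could match :)
fn:tokenize("abracadabra", "ab|a")
(: result: ("", "r", "c", "d", "r", "") :)

(: "a" is listed first, so "ab" is never used and the "b" stays in the tokens :)
fn:tokenize("abracadabra", "a|ab")
(: result: ("", "br", "c", "d", "br", "") :)

(: two successive delimiters produce a string of length 0 between them :)
fn:tokenize("A,,B", ",")
(: result: ("A", "", "B") :)
```

In the first two results, the leading and trailing zero-length strings appear because pattern matches at the beginning and at the end of the source string.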
If pattern is not found in source-string, source-string is returned.
If pattern matches a string of length zero, an error is returned.
If source-string is the empty sequence, or is a zero-length string, the result is the empty sequence.
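These special cases can be sketched as follows. The inputs are hypothetical, each expression is meant to be evaluated separately, and the last one is shown only to illustrate the error case.

```xquery
(: the delimiter does not occur, so the source string is returned unchanged :)
fn:tokenize("ABC", ",")
(: result: ("ABC") :)

(: a zero-length source string yields the empty sequence :)
fn:tokenize("", ",")
(: result: () :)

(: the empty sequence as source string also yields the empty sequence :)
fn:tokenize((), ",")
(: result: () :)

(: "x*" can match a string of length zero, so this expression returns an error :)
fn:tokenize("ABC", "x*")
```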
Example
fn:tokenize("?A?B?C?D?","\?")
The returned value is the sequence ("", "A", "B", "C", "D", "").
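If the zero-length strings at the start and end of the result are not wanted, one possible approach is to filter the result with a predicate. This is a sketch of one option, not the only way to do it:

```xquery
(: keep only the items that are not zero-length strings :)
fn:tokenize("?A?B?C?D?", "\?")[. ne ""]
(: result: ("A", "B", "C", "D") :)
```

The predicate is applied to each item of the sequence that fn:tokenize returns, so it also removes any zero-length strings produced by successive delimiters in the middle of the source string.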
