Start of change

REGEXP_INSTR

The REGEXP_INSTR returns the starting position or the position after the end of the matched substring, depending on the value of the return_option argument.

Read syntax diagramSkip visual syntax diagramREGEXP_INSTR(source-string ,pattern-expression ,start,occurrence,return-option,flags,group )
source-string
An expression that specifies the string in which the search is to take place. The expression must return a value that is a built-in character string, graphic string, numeric, or datetime data type. If the value is not a UTF-16 DBCLOB, it is implicitly cast to a UTF-16 DBCLOB before searching for the regular expression pattern. A character string with the FOR BIT DATA attribute or a binary string is not supported. The length of a string must not be greater than 1 gigabyte.
pattern-expression
An expression that specifies the regular expression string that is the pattern for the search. The expression must return a value that is a built-in character string, graphic string, numeric, or datetime data type. If the value is not a UTF-16 DBCLOB, it is implicitly cast to a UTF-16 DBCLOB before searching for the regular expression pattern. A character string with the FOR BIT DATA attribute or a binary string is not supported. The length of the string must not be greater than 32K.

A valid pattern-expression consists of a set of characters and control characters that describe the pattern of the search. For a description of the valid control characters, see Regular expression control characters.

start
An expression that specifies the position within source-string at which the search is to start. The expression must return a value of any built-in numeric, character-string, or graphic-string data type. The argument is cast to INTEGER before evaluating the function. For more information about converting to INTEGER, see INTEGER or INT. The value of the integer must be greater than or equal to 1. If the value of the integer is greater than the actual length of the source-string, the result is 0.
occurrence
An expression that specifies which occurrence of the pattern-expression within source-string to search for. The expression must return a value of any built-in numeric, character-string, or graphic-string data type. The argument is cast to INTEGER before evaluating the function. For more information about converting to INTEGER, see INTEGER or INT. The value of the integer must be greater than or equal to 1. If occurrence is not specified, the default value is 1 which indicates that only the first occurrence of pattern-expression is considered.
return-option
An expression that specifies whether to return the starting position or the position after the end of the string that matches the pattern. The expression must return a value of any built-in numeric, character-string, or graphic-string data type. The argument is cast to INTEGER before evaluating the function. For more information about converting to INTEGER, see INTEGER or INT. The value of the integer must be equal to 0 or 1. A value of 0 returns the starting position of the occurrence. A value of 1 returns the ending position of the occurrence. If return-option is not specified, the default value is 0.
flags
An expression that specifies flags that control aspects of the pattern matching. The expression must return a value that is a built-in character string or graphic string data type. A character string with the FOR BIT DATA attribute or a binary string is not supported. The string can include one or more valid flag values and the combination of flag values must be valid. An empty string is the same as the value 'c'.

For a description of the valid flag characters, see Regular expression flag values.

group
An expression that specifies which capture group of the pattern-expression is used to determine the position within source-string to return. The expression must return a value of any built-in numeric, character-string, or graphic-string data type. The argument is cast to INTEGER before evaluating the function. For more information about converting to INTEGER, see INTEGER or INT. The value of the integer must be greater than or equal to 0 and must not be greater than the number of capture groups in the pattern-expression. If group is not specified, the default is 0 which indicates the entire string that matches the entire pattern is returned.

The result of the function is a large integer. If the pattern-expression is found, the result is a number from 1 to n, where n is the actual length of the source-string plus 1. The result value represents the position used to process the function. If the pattern-expression is not found and no argument is null, the result is 0.

If any argument of the REGEXP_INSTR function can be null, the result can be null. If any argument is null, the result is the null value.

Notes

Prerequisites: In order to use the REGEXP_INSTR function, the International Components for Unicode (ICU) option must be installed

Processing: The regular expression processing is done using the International Components for Unicode (ICU) regular expression interface. For more information see, http://userguide.icu-project.org/strings/regexp.

Examples

  • Example 1: Find the first occurrence of a 'o' which has a character preceding it.
    SELECT REGEXP_INSTR('hello to you', '.o',1,1) 
    FROM sysibm.sysdummy1
    The result is 4, which is the position of the second 'l' character.
  • Example 2: Find the second occurrence of a 'o' which has a character preceding it.
    SELECT REGEXP_INSTR('hello to you', '.o',1,2) 
    FROM sysibm.sysdummy1
    The result is 7, which is the position of the character 't'.
  • Example 3: Find the position after the third occurrence of the first capture group of the regular expression '(.o).' using case insensitive matching.
    SELECT REGEXP_INSTR('hello to you', '(.o).', 1,3,1,'i',1) 
    FROM sysibm.sysdummy1
    The result is 12, which is the position of the character 'u' at the end of the string.
End of change