SUBSTR4 scalar function

The SUBSTR4 function returns a substring from a string. The resulting substring starts at a specified position in the string and continues for a specified or default length. The start and length arguments are expressed in 32-bit UTF-32 string units (CODEUNITS32).
Read syntax diagramSkip visual syntax diagramSUBSTR4(string, start,length)

The schema is SYSIBM.

string
An expression that specifies the string from which the resulting substring is derived. The expression must return a value that is a built-in character string, graphic string, numeric, or datetime data type. A character string cannot have the FOR BIT DATA attribute (SQLSTATE 428GC). If the value is not a string data type, it is implicitly cast to VARCHAR before evaluating the function. A substring of string is zero or more contiguous string units of string.
start
An expression that specifies the starting position in string for the beginning of the result substring in 32-bit UTF-32 string units. The expression must return a value that is a built-in numeric, CHAR, VARCHAR, GRAPHIC, or VARGRAPHIC data type. If the value is not of type INTEGER, it is implicitly cast to INTEGER before evaluating the function.

If start is positive, then the starting position is calculated from the beginning of the string. If start is greater than the length of string in 32-bit UTF-32 string units, then a zero length string is returned.

If start is negative, then the starting position is calculated from the end of the string by counting backwards. If the absolute value of start is greater than the length of string in 32-bit UTF-32 string units, then a zero length string is returned.

If start is 0, then a starting position of 1 is used.

length
An expression that specifies the length of the resulting substring in 32-bit UTF-32 string units. If length is specified, the expression must return a value that is a built-in numeric, CHAR, VARCHAR, GRAPHIC, or VARGRAPHIC data type. If the value is not of type INTEGER, it is implicitly cast to INTEGER before evaluating the function.

If the value for length is greater than the number of 32-bit UTF-32 string units from the starting position to the end of the string, then the length of the resulting substring is the length of the first argument in 32-bit UTF-32 string units minus the starting position plus one.

If the value for length is less than or equal to zero, the result is a zero length string.

The default value for length is the number of CODEUNITS32 from the position specified by start to the last string unit of string.

If string is a CHAR or VARCHAR data type, the result of the function is a VARCHAR data type. If string is a CLOB, the result of the function is a CLOB. If string is a GRAPHIC or VARGRAPHIC data type, the result of the function is a VARGRAPHIC data type. If string is a DBCLOB, the result of the function is a DBCLOB. If the first argument is a host variable, the code page of the result is the section code page; otherwise, it is the code page of the first argument.
The length attribute of the result is the same as the length attribute of the first argument, unless the start or length arguments are specified as constants. When a constant is specified, the length attribute of the result is based on the first applicable row in the following table. The string unit of the result is the same as the string unit of the first argument.
Table 1. Length attribute of SUBSTR4 result when arguments include constants
String Argument Start Argument1 Length Argument Length Attribute of Result2
character type with length attribute A any valid argument constant value L<=0 0
character type with length attribute A constant value S and |(S)|>A not specified or any valid argument 0
character type with length attribute A not a constant constant value L>0 MIN(A, L×4) if the string units of string is OCTETS
MIN(A, L) if the string units of string is CODEUNITS32
character type with length attribute A constant value S>0 not specified or not a constant A-S+1
character type with length attribute A constant value S<0 not specified or not a constant MIN(A, |(S)×4|) if the string units of string is OCTETS
MIN(A, |S|) if the string units of string is CODEUNITS32
character type with length attribute A constant value S>0 constant value L>0 MIN(A-S+1, L×4) if the string units of string is OCTETS
MIN(A-S+1, L) if the string units of string is CODEUNITS32
character type with length attribute A constant value S<0 constant value L>0 MIN(A,|(S)×4|, L×4) if the string units of string is OCTETS
MIN(A, |S|, L) if the string units of string is CODEUNITS32
graphic type with length attribute A any valid argument constant value L<=0 0
graphic type with length attribute A constant value S and |(S)|>A not specified or any valid argument 0
graphic type with length attribute A not a constant constant value L>0 MIN(A, L*2) if the string units of string is double-bytes or CODEUNITS16
MIN(A, L) if the string units of string is CODEUNITS32
graphic type with length attribute A constant value S>0 not specified or not a constant A-S+1
graphic type with length attribute A constant value S<0 not specified or not a constant MIN(A, |(S) * 2|) if the string units of string is double-bytes or CODEUNITS16
MIN(A, |S|) if the string units of string is CODEUNITS32
graphic type with length attribute A constant value S>0 constant value L>0 MIN(A-S+1, L*2) if the string units of string is double-bytes or CODEUNITS16
MIN(A-S+1, L) if the string units of string is CODEUNITS32
graphic type with length attribute A constant value S<0 constant value L>0 MIN(A, |(S)*2|, L*2) if the string units of string is double-bytes or CODEUNITS16
MIN(A, |S|, L) if the string units of string is CODEUNITS32
Notes:

1 If a start argument value of 0 is specified, then use a value of 1 for S when referencing this table.

2 The length attribute of the result for some of the character result types involves a constant that is multiplied by a factor of 4. This multiplier covers the worst case expansion derived from multiplying by 4 to switch from counting in 32-bit UTF-32 string units to counting in bytes used for the length attributes of a character data type with string units of OCTETS.
The length attribute of the result for some of the graphic result types involves a constant that is multiplied by a factor of 2. This multiplier covers the worst case expansion derived from multiplying by 2 because a 4-byte character in UTF-32 could be represented by up to 2 double-byte characters in a graphic string.

If any argument of the SUBSTR4 function can be null, the result can be null. If any argument is null, the result is the null value.

Notes

  • If string contains combining characters, the result might contain base characters without their combining characters or combining characters without their base characters depending on the values of start and length.
  • SUBSTR4 is similar to the SUBSTR function, with the following exceptions:
    • SUBSTR4 supports a negative start value, which indicates that processing is to start from the end of the string.
    • SUBSTR4 supports a length value that is greater than the calculated result length. In such cases, a shorter string is returned, rather than an error.
    • SUBSTR4 returns a result data type of VARCHAR if the input data type is CHAR. VARGRAPHIC is the result data type returned if the input type is GRAPHIC.
    • The length attribute of the result for SUBSTR4 is either the same as the length attribute of the first argument, or it is derived based on the start or length attributes, if either of these are constants.
    • SUBSTR4 returns a result with a length attribute that is the same as the length attribute of the first argument, unless the start or length arguments are specified as constants. When a constant is specified, the length attribute of the result is derived based on the start or length attributes (see the preceding table).