USUBSTR

The USUBSTR function returns a substring of the data in a character data item argument that contains UTF-8 or UTF-16 data.

The function type is alphanumeric, national, or UTF-8, depending on the class of argument-1.

Format

FUNCTION USUBSTR(argument-1 argument-2 argument-3)
argument-1
Must be of class alphabetic, alphanumeric, national or UTF-8. argument-1 must contain valid UTF-8 or UTF-16 encoded characters:
  • If argument-1 is of class alphabetic, alphanumeric or UTF-8, it must contain valid UTF-8 data.
  • If argument-1 is of class national, it must contain valid UTF-16 data.
argument-2
Must be an integer that is greater than zero. It represents the starting position of a substring in argument-1.
argument-3
Must be an integer that is greater than or equal to zero. It represents the length of a substring in argument-1.
Note: The sum of argument-2 and argument-3 minus one must be less than or equal to ULENGTH(argument-1).

If argument-2 = n and argument-3 = m, the returned value depends on the class of argument-1:
  • If argument-1 is alphabetic or alphanumeric, the returned value is an alphanumeric item that contains m UTF-8 characters from argument-1, starting with the nth UTF-8 character.
  • If argument-1 is of class UTF-8, the returned value is a UTF-8 item that contains m UTF-8 characters from argument-1, starting with the nth UTF-8 character.
  • If argument-1 is a national data item, the returned value is a national item that contains m UTF-16 characters from argument-1, starting with the nth UTF-16 character.
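
The position and length rules above can be illustrated with a small model. The following Python sketch is not the COBOL implementation; the function name usubstr_utf8 is hypothetical, and raising ValueError when an argument rule is violated is an assumption made for illustration only. It treats each Unicode code point as one character.

```python
def usubstr_utf8(data: bytes, start: int, length: int) -> bytes:
    """Hypothetical model of USUBSTR for UTF-8 input (illustration only)."""
    # Decoding fails with UnicodeDecodeError if the input is not valid UTF-8.
    text = data.decode("utf-8")
    if start < 1:
        raise ValueError("argument-2 must be greater than zero")
    if length < 0:
        raise ValueError("argument-3 must be greater than or equal to zero")
    # Rule from the note: argument-2 + argument-3 - 1 <= ULENGTH(argument-1).
    # Assumption: ULENGTH is modeled as the number of code points.
    if start + length - 1 > len(text):
        raise ValueError("substring extends past the end of argument-1")
    # Positions are 1-based and count characters, not bytes.
    return text[start - 1:start - 1 + length].encode("utf-8")


kaefer = bytes.fromhex("4BC3A4666572")  # UTF-8 bytes of 'Käfer'
print(usubstr_utf8(kaefer, 2, 2).hex().upper())  # C3A466
```

The two-byte character ä occupies a single position, so a start position of 2 selects it even though it begins at the second byte and ends at the third.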

Example 1

If A is an alphanumeric item that contains the UTF-8 value x'4BC3A4666572' ('Käfer'), the returned values are as follows:
  • USUBSTR(A 1 2) returns x'4BC3A4' ('Kä')
  • USUBSTR(A 2 1) returns x'C3A4' ('ä')
  • USUBSTR(A 2 2) returns x'C3A466' ('äf')
  • USUBSTR(A 3 2) returns x'6665' ('fe')

Example 2

If B is a national item that contains the UTF-16 value nx'005400F6006200750072D858DC6B0073' ('Töbur𦁫s'), the returned values are as follows:

  • USUBSTR(B 1 2) returns x'005400F6' ('Tö')
  • USUBSTR(B 2 1) returns x'00F6' ('ö')
  • USUBSTR(B 2 2) returns x'00F60062' ('öb')
  • USUBSTR(B 3 2) returns x'00620075' ('bu')
  • USUBSTR(B 5 2) returns x'0072D858DC6B' ('r𦁫')
  • USUBSTR(B 6 2) returns x'D858DC6B0073' ('𦁫s')
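
The last two results show that a surrogate pair occupies one character position. A minimal Python sketch of that behavior follows, assuming big-endian UTF-16 (which matches the hex values in this example). The function name usubstr_utf16 is hypothetical, and the sketch models only the counting, not the COBOL implementation or its argument checking.

```python
def usubstr_utf16(data: bytes, start: int, length: int) -> bytes:
    """Hypothetical model of USUBSTR for UTF-16 input (illustration only)."""
    # Assumption: the national item holds big-endian UTF-16.
    text = data.decode("utf-16-be")
    # Python strings index by code point, so a surrogate pair such as
    # D858 DC6B counts as one character here.
    return text[start - 1:start - 1 + length].encode("utf-16-be")


b_item = bytes.fromhex("005400F6006200750072D858DC6B0073")
print(usubstr_utf16(b_item, 5, 2).hex().upper())  # 0072D858DC6B
print(usubstr_utf16(b_item, 6, 2).hex().upper())  # D858DC6B0073
```

The item is 16 bytes long but holds only 7 characters in this model, because the pair x'D858DC6B' encodes a single supplementary character.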

Example 3

If argument-1 is a UTF-8 encoded item that contains combining character sequences, each combining character is counted individually. For example, the Unicode character ä can be encoded in UTF-8 as x'C3A4' (precomposed form) or as x'61CC88' (the letter a followed by a combining diaeresis). The values that USUBSTR returns differ depending on which of the two encodings argument-1 contains. See the following table for details.

Table 1. Returned values of the USUBSTR function
| argument-1 | Unicode encoding | UTF-8 encoding | Returned values of the USUBSTR function |
|---|---|---|---|
| C = äK | U+00E4 + U+004B (precomposed form: latin small letter a with diaeresis + latin capital letter K) | x'C3A44B' (äK) | USUBSTR(C 1 1) returns x'C3A4' (ä); USUBSTR(C 2 1) returns x'4B' (K); USUBSTR(C 1 2) returns x'C3A44B' (äK) |
| C = äK | U+0061 + U+0308 + U+004B (canonical decomposition: latin small letter a + combining diaeresis + latin capital letter K) | x'61CC884B' (äK) | USUBSTR(C 1 1) returns x'61' (a); USUBSTR(C 2 1) returns x'CC88' (¨); USUBSTR(C 1 2) returns x'61CC88' (ä); USUBSTR(C 1 3) returns x'61CC884B' (äK) |
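
The difference between the two rows comes from the number of code points in each form. The following Python sketch uses the standard unicodedata module to produce both forms and count them. It only illustrates the counting; whether and how an application normalizes its data before calling USUBSTR is outside what this section describes.

```python
import unicodedata

precomposed = "\u00E4K"  # U+00E4 + U+004B
decomposed = unicodedata.normalize("NFD", precomposed)  # U+0061 + U+0308 + U+004B

print(precomposed.encode("utf-8").hex().upper())  # C3A44B
print(decomposed.encode("utf-8").hex().upper())   # 61CC884B

# The same visible text has a different character count in each form.
print(len(precomposed), len(decomposed))  # 2 3

# Taking the first character of each form gives different bytes.
print(precomposed[0:1].encode("utf-8").hex().upper())  # C3A4
print(decomposed[0:1].encode("utf-8").hex().upper())   # 61
```

Because the decomposed form has three characters, a length of 2 is needed to extract the complete ä from it, as the second table row shows.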