ULENGTH

The ULENGTH function returns an integer value that is equal to the number of UTF-8 or UTF-16 characters in a character data item argument that contains UTF-8 or UTF-16 data.

The function type is integer.

Format

FUNCTION ULENGTH(argument-1)
argument-1
Must be of class alphabetic, alphanumeric, national or UTF-8. argument-1 must contain valid UTF-8 or UTF-16 encoded characters:
  • If argument-1 is of class alphabetic, alphanumeric or UTF-8, it must contain valid UTF-8 data.
  • If argument-1 is of class national, it must contain valid UTF-16 data.

The returned value is the number of UTF-8 or UTF-16 characters in argument-1. If LP(32) is in effect, the returned value is a 9-digit integer; if LP(64) is in effect, the returned value is an 18-digit integer.

Example 1

If argument-1 is a UTF-8 encoded item that contains combining characters, each combining character is counted individually in determining the length. For example, the Unicode character ä can be encoded in UTF-8 as x'C3A4' (precomposed form) or as x'61CC88' (base letter followed by a combining diaeresis). The ULENGTH function returns a different value for each of these encodings. See the following table for details.
Table 1. ULENGTH function of character ä
Character | Unicode encoding | UTF-8 encoding | Returned value of the ULENGTH function
ä | U+00E4 (precomposed form, latin small letter a with diaeresis) | x'C3A4' | 1
ä | U+0061 + U+0308 (canonical decomposition, latin small letter a + combining diaeresis) | x'61CC88' | 2
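
The counting rule in Table 1 can be modeled outside COBOL. The following Python sketch is not the ULENGTH function itself; it assumes that the rule amounts to counting Unicode code points in valid UTF-8 data, which matches both rows of the table. The helper name count_utf8_characters is hypothetical.

```python
# Model of the counting rule only; this is not the COBOL intrinsic itself.
def count_utf8_characters(data: bytes) -> int:
    """Count Unicode code points in valid UTF-8 data (hypothetical helper)."""
    # Strict decoding raises UnicodeDecodeError if the bytes are not valid UTF-8.
    return len(data.decode("utf-8"))


precomposed = bytes.fromhex("C3A4")    # U+00E4
decomposed = bytes.fromhex("61CC88")   # U+0061 followed by U+0308

print(count_utf8_characters(precomposed))  # 1
print(count_utf8_characters(decomposed))   # 2
```

The two byte strings render as the same character, but the decomposed form contains two code points, so the count differs.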

Example 2

If argument-1 is a national data item that contains UTF-16 data and argument-1 contains surrogate pairs, each pair of high and low surrogates is counted as one UTF-16 character. For example, if B is a national item that contains the UTF-16 value nx'005400F6006200750072D858DC6B0073' ('Töbur𦁫s'), the returned value from ULENGTH(B) is 7. The character 𦁫, encoded as the surrogate pair x'D858DC6B', is counted as one UTF-16 character.
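
The surrogate-pair rule can be checked with a short Python sketch. It is a model of the counting rule, not the ULENGTH function, and it assumes that the hexadecimal value above is big-endian UTF-16 and that counting code points reproduces the result. The helper name count_utf16_characters is hypothetical.

```python
# Model of the counting rule only; this is not the COBOL intrinsic itself.
def count_utf16_characters(data: bytes) -> int:
    """Count code points in big-endian UTF-16 data (hypothetical helper)."""
    # A high/low surrogate pair decodes to a single code point.
    return len(data.decode("utf-16-be"))


national_value = bytes.fromhex("005400F6006200750072D858DC6B0073")

print(count_utf16_characters(national_value))  # 7 characters
print(len(national_value) // 2)                # 8 UTF-16 code units
```

The value holds eight 16-bit code units, but the pair x'D858' and x'DC6B' forms one character, so the character count is 7.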