UWIDTH

The UWIDTH function returns an integer value equal to the width in bytes of the nth character in a character data item argument that is encoded in UTF-8 or UTF-16.

The function type is integer.

Format

FUNCTION UWIDTH(argument-1 argument-2)
argument-1
Must be of class alphabetic, alphanumeric, or national. argument-1 must contain valid UTF-8 or UTF-16 encoded characters:
  • If argument-1 is of class alphabetic or alphanumeric, it must contain valid UTF-8 data.
  • If argument-1 is of class national, it must contain valid UTF-16 data.
argument-2
Must be an integer.

The returned value is an integer.

If argument-2 is not positive or is larger than ULENGTH(argument-1), zero is returned. Otherwise, if argument-2 = n, the returned value is the width in bytes of the nth UTF-8 or UTF-16 character in argument-1.
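The rule above can be modeled outside COBOL to check expected results by hand. The following Python sketch is only an illustration of the documented behavior for a UTF-8 argument, not the compiler's implementation; the helper name `uwidth_utf8` is hypothetical, and the character count is assumed to match what ULENGTH would report for the same data.

```python
def uwidth_utf8(data: bytes, n: int) -> int:
    """Illustrative model of UWIDTH for a UTF-8 argument-1 (hypothetical helper).

    Returns 0 when n is not positive or exceeds the number of characters;
    otherwise returns the byte width of the nth character (1-based).
    """
    # decode() raises UnicodeDecodeError if the data is not valid UTF-8.
    text = data.decode("utf-8")
    widths = [len(ch.encode("utf-8")) for ch in text]
    if n <= 0 or n > len(widths):
        return 0
    return widths[n - 1]


kaefer = bytes.fromhex("4BC3A4666572")  # the UTF-8 value used in Example 1
print([uwidth_utf8(kaefer, n) for n in range(0, 7)])
# prints [0, 1, 2, 1, 1, 1, 0]
```

The first and last entries are zero because positions 0 and 6 fall outside the five characters in the data.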

Example 1

If A is an alphanumeric item that contains the UTF-8 value x'4BC3A4666572' ('Käfer'), the returned values are as follows:

  • UWIDTH(A 1) returns 1
  • UWIDTH(A 2) returns 2
  • UWIDTH(A 3) returns 1
  • UWIDTH(A 4) returns 1
  • UWIDTH(A 5) returns 1

Example 2

If B is a national item that contains the UTF-16 value nx'005400F6006200750072D858DC6B0073' ('Töbur𦁫s'), the returned values are as follows:

  • UWIDTH (B 1) returns 2
  • UWIDTH (B 2) returns 2
  • UWIDTH (B 3) returns 2
  • UWIDTH (B 4) returns 2
  • UWIDTH (B 5) returns 2
  • UWIDTH (B 6) returns 4
  • UWIDTH (B 7) returns 2
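The 4-byte result for the sixth character comes from a surrogate pair (x'D858DC6B'), which encodes a single character outside the Basic Multilingual Plane. The Python sketch below reproduces the widths for the hexadecimal value in Example 2. It is an illustrative model only: the helper name `uwidth_utf16` is hypothetical, and big-endian UTF-16 is assumed because that is how the hexadecimal literal is written.

```python
def uwidth_utf16(data: bytes, n: int) -> int:
    """Illustrative model of UWIDTH for a UTF-16 argument-1 (hypothetical helper)."""
    # Assumption: the data is big-endian UTF-16, matching the nx'...' literal.
    text = data.decode("utf-16-be")
    # A character encoded as a surrogate pair occupies 4 bytes; all others 2.
    widths = [len(ch.encode("utf-16-be")) for ch in text]
    if n <= 0 or n > len(widths):
        return 0
    return widths[n - 1]


b_value = bytes.fromhex("005400F6006200750072D858DC6B0073")
print([uwidth_utf16(b_value, n) for n in range(1, 8)])
# prints [2, 2, 2, 2, 2, 4, 2]
```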

Example 3

If argument-1 is a UTF-8 encoded item that contains combining characters, each combining character is counted as a separate character. For example, the character ä can be encoded in UTF-8 as x'C3A4' (precomposed form) or as x'61CC88' (base letter followed by a combining diaeresis). The values that UWIDTH returns differ depending on which form argument-1 contains. See the following table for details.

Table 1. Returned values of the UWIDTH function
| argument-1 | Unicode encoding | UTF-8 encoding | Returned values of the UWIDTH function |
| --- | --- | --- | --- |
| C = äK | U+00E4 + U+004B (precomposed form, latin small letter a with diaeresis + latin capital letter K) | x'C3A44B' (äK) | UWIDTH (C 1) returns 2; UWIDTH (C 2) returns 1; UWIDTH (C 3) returns 0 |
| C = äK | U+0061 + U+0308 + U+004B (canonical decomposition, latin small letter a + combining diaeresis + latin capital letter K) | x'61CC884B' (äK) | UWIDTH (C 1) returns 1; UWIDTH (C 2) returns 2; UWIDTH (C 3) returns 1 |
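The two encodings in Table 1 correspond to the precomposed form and the canonical decomposition of the same text. As a minimal sketch, the following Python code derives both byte sequences and their per-character widths with the standard `unicodedata` module. The helper name `utf8_char_widths` is hypothetical, and the code only models the counting rule; it does not call UWIDTH.

```python
import unicodedata


def utf8_char_widths(text: str) -> list[int]:
    """Byte width of each character when the text is encoded in UTF-8."""
    return [len(ch.encode("utf-8")) for ch in text]


precomposed = "\u00E4K"  # U+00E4 + U+004B
decomposed = unicodedata.normalize("NFD", precomposed)  # U+0061 + U+0308 + U+004B

print(precomposed.encode("utf-8").hex().upper(), utf8_char_widths(precomposed))
# prints C3A44B [2, 1]
print(decomposed.encode("utf-8").hex().upper(), utf8_char_widths(decomposed))
# prints 61CC884B [1, 2, 1]
```

The decomposed form has three characters, so a third position exists and has width 1, whereas the precomposed form has only two characters and a request for the third returns zero.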