UWIDTH
The UWIDTH function returns an integer value that is equal to the width in bytes of the nth character in a character data item that is encoded in UTF-8 or UTF-16.
The function type is integer.
- argument-1
  - Must be of class alphabetic, alphanumeric, national, or UTF-8. argument-1 must contain valid UTF-8 or UTF-16 encoded characters:
    - If argument-1 is of class alphabetic, alphanumeric, or UTF-8, it must contain valid UTF-8 data.
    - If argument-1 is of class national, it must contain valid UTF-16 data.
- argument-2
  - Must be an integer.
The returned value is an integer.
If argument-2 is not positive or if argument-2 is larger than ULENGTH(argument-1), zero is returned. Otherwise, if argument-2=n, the returned value is the width in bytes of the nth UTF-8 or UTF-16 character in argument-1.
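The rule above can be modeled outside COBOL to check expected results. The following Python sketch is only an illustration of the documented behavior, not the COBOL implementation. The function name `uwidth` and its `encoding` parameter are hypothetical: `encoding` stands in for the class of argument-1 ("utf-8" for alphabetic, alphanumeric, or UTF-8 items, and "utf-16-be" for national items, assuming big-endian storage).

```python
def uwidth(data: bytes, n: int, encoding: str = "utf-8") -> int:
    """Model of UWIDTH: byte width of the nth character, or 0 if n is out of range.

    Assumption: `data` holds valid text in `encoding`; decoding raises
    UnicodeDecodeError otherwise.
    """
    text = data.decode(encoding)
    # len(text) counts characters (code points), playing the role of ULENGTH(argument-1).
    if n < 1 or n > len(text):
        return 0
    # Re-encode only the nth character and count its bytes.
    return len(text[n - 1].encode(encoding))


# The UTF-8 value x'4BC3A4666572' ('Käfer') used in Example 1.
sample = bytes.fromhex("4BC3A4666572")
widths = [uwidth(sample, n) for n in range(0, 7)]
print(widths)
```

Positions 0 and 6 fall outside the five-character value, so the model returns zero for them, matching the out-of-range rule.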
Example 1
If A is an alphanumeric item that contains the UTF-8 value x'4BC3A4666572' ('Käfer'), the returned values are as follows:
- UWIDTH(A 1) returns 1
- UWIDTH(A 2) returns 2
- UWIDTH(A 3) returns 1
- UWIDTH(A 4) returns 1
- UWIDTH(A 5) returns 1
Example 2
If B is a national item that contains the UTF-16 value nx'005400F6006200750072D858DC6B0073' ('Töbur𦁫s'), the returned values are as follows:
- UWIDTH(B 1) returns 2
- UWIDTH(B 2) returns 2
- UWIDTH(B 3) returns 2
- UWIDTH(B 4) returns 2
- UWIDTH(B 5) returns 2
- UWIDTH(B 6) returns 4
- UWIDTH(B 7) returns 2
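The sixth character is 4 bytes wide because it lies outside the Basic Multilingual Plane, so UTF-16 encodes it as a surrogate pair (x'D858' followed by x'DC6B'). The Python sketch below shows one way to derive the widths directly from the code units. It is an illustration under stated assumptions (valid, big-endian UTF-16 input), and the helper name `utf16_widths` is hypothetical.

```python
def utf16_widths(data: bytes) -> list[int]:
    """Return the byte width of each character in valid big-endian UTF-16 data."""
    widths = []
    i = 0
    while i < len(data):
        unit = int.from_bytes(data[i:i + 2], "big")
        # A high surrogate (D800-DBFF) starts a two-unit, 4-byte character.
        width = 4 if 0xD800 <= unit <= 0xDBFF else 2
        widths.append(width)
        i += width
    return widths


B = bytes.fromhex("005400F6006200750072D858DC6B0073")
print(utf16_widths(B))  # [2, 2, 2, 2, 2, 4, 2]
```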
Example 3
If argument-1 is encoded in UTF-8 and contains a decomposed character (a base character followed by one or more combining characters), the base character and each combining character are counted individually. For example, the Unicode character ä can be encoded in UTF-8 as x'C3A4' (precomposed form) or as x'61CC88' (the letter a followed by a combining diaeresis). The values returned by the UWIDTH function differ depending on which form argument-1 contains. See the following table for details.
argument-1 | Unicode encoding | UTF-8 encoding | Returned values of the UWIDTH function |
---|---|---|---|
C = äK | U+00E4 + U+004B (precomposed form: Latin small letter a with diaeresis + Latin capital letter K) | x'C3A44B' (äK) | UWIDTH(C 1) returns 2; UWIDTH(C 2) returns 1; UWIDTH(C 3) returns 0 |
C = äK | U+0061 + U+0308 + U+004B (canonical decomposition: Latin small letter a + combining diaeresis + Latin capital letter K) | x'61CC884B' (äK) | UWIDTH(C 1) returns 1; UWIDTH(C 2) returns 2; UWIDTH(C 3) returns 1 |
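The two forms in the table can be reproduced with a short Python sketch that uses the standard `unicodedata` module to produce the canonical decomposition (NFD) of the precomposed string. This only models the per-character byte widths; the helper name `utf8_widths` is hypothetical and is not part of COBOL.

```python
import unicodedata


def utf8_widths(text: str) -> list[int]:
    """Return the UTF-8 byte width of each code point in text."""
    return [len(ch.encode("utf-8")) for ch in text]


precomposed = "\u00e4K"                                  # U+00E4 + U+004B
decomposed = unicodedata.normalize("NFD", precomposed)   # U+0061 + U+0308 + U+004B

print(precomposed.encode("utf-8").hex().upper(), utf8_widths(precomposed))  # C3A44B [2, 1]
print(decomposed.encode("utf-8").hex().upper(), utf8_widths(decomposed))    # 61CC884B [1, 2, 1]
```

Both strings display as äK, but the decomposed form holds three characters, which is why a third position yields a nonzero width only in that case.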