UTF-8 comparisons
A UTF-8 comparison is a comparison of two operands of class UTF-8. When the relation condition specifies an operand that is not class UTF-8, that operand is converted to a data item of category UTF-8 before the comparison.
- Alphabetic, alphanumeric, alphanumeric-edited, and numeric-edited with usage DISPLAY; national, national-edited, and numeric-edited with usage NATIONAL
- The operand is treated as though it were moved to a temporary
data item of category UTF-8 of the length needed to represent the
number of character positions in that operand. Alphanumeric characters
are converted to the corresponding UTF-8 characters. The source code
page used for the conversion is the one in effect for the CODEPAGE
compiler option when the source code was compiled.
The implicit moves for the conversions are carried out in accordance with the rules of the MOVE statement.
The resulting category UTF-8 data item is used in the comparison of two UTF-8 operands.
- Comparison of two UTF-8 operands
- If the operands are of unequal length, the comparison proceeds
as though the shorter operand were padded on the right with the default
UTF-8 space character (UX'20') to make the operands of equal length.
The comparison then proceeds according to the rules for the comparison
of operands of equal length.
If the operands are of equal length, the comparison proceeds by comparing corresponding UTF-8 character positions in the two operands, starting from the leftmost position, until either unequal UTF-8 characters are encountered or the rightmost UTF-8 character position is reached, whichever comes first. The operands are determined to be equal if all corresponding UTF-8 characters are equal.
The first pair of unequal UTF-8 characters encountered determines the relation of the operands. The operand that contains the UTF-8 character with the higher collating value is the greater operand.
Note: The higher collating value is determined using the hexadecimal values of the characters. The PROGRAM COLLATING SEQUENCE clause has no effect on comparisons of UTF-8 operands.
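The rules above can be modeled outside COBOL to make the padding and ordering behavior concrete. The following Python sketch is an illustration only, not compiler code. The function names `to_utf8_operand` and `compare_utf8` are hypothetical, operands are represented as Python strings (one element per character position), and the `cp1140` codec is an assumed stand-in for whatever code page the CODEPAGE compiler option specifies.

```python
def to_utf8_operand(raw: bytes, codepage: str = "cp1140") -> str:
    """Model the implicit conversion of an alphanumeric operand.

    Assumption: 'cp1140' stands in for the code page named by the
    CODEPAGE compiler option; substitute the one actually in effect.
    """
    return raw.decode(codepage)


def compare_utf8(left: str, right: str) -> int:
    """Return -1, 0, or 1 for left < right, left = right, left > right."""
    # Unequal lengths: pad the shorter operand on the right with the
    # UTF-8 space character (UX'20') so both have the same length.
    width = max(len(left), len(right))
    left = left.ljust(width, "\x20")
    right = right.ljust(width, "\x20")

    # Equal lengths: compare corresponding character positions,
    # starting from the leftmost position.
    for a, b in zip(left, right):
        if a != b:
            # The first unequal pair decides the result. Ordering uses
            # the hexadecimal value of each character's UTF-8 encoding,
            # never a program collating sequence.
            return -1 if a.encode("utf-8") < b.encode("utf-8") else 1

    # Every corresponding character matched: the operands are equal.
    return 0


if __name__ == "__main__":
    print(compare_utf8("AB", "AB   "))   # 0: trailing spaces match the padding
    print(compare_utf8("a", "B"))        # 1: X'61' is higher than X'42'
    print(compare_utf8("abc", "abd"))    # -1: decided at the third position
```

In this model, an operand that differs from another only by trailing spaces compares equal, and lowercase letters collate higher than uppercase letters because only the encoded values matter.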