Once a collating sequence is established for a database with the SYSTEM, NLSCHAR, or COMPATIBILITY collation option, or with a user-defined collation option, characters are compared by their weights in the collating sequence instead of directly by their code point values.
If the collating sequence contains 256 unique weights, only the first step (the weight comparison) is performed. If the collating sequence is the identity sequence, only the second step (the code point comparison) is performed. In either case, performance improves because one step is skipped. For Unicode databases, if the collation option is SYSTEM or IDENTITY, the collating sequence is IDENTITY and only the second step is performed.
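As a minimal sketch, assuming the first step compares strings by weight and the second step compares them by code point value, the shortcut logic could look like the following Python. The function name `compare`, the 256-entry `weights` table, and the case-folding example are hypothetical illustrations, not part of any database API.

```python
from typing import Sequence


def _cmp(x, y) -> int:
    """Three-way comparison: -1, 0, or 1."""
    return (x > y) - (x < y)


def compare(a: bytes, b: bytes, weights: Sequence[int]) -> int:
    """Compare two single-byte strings under a 256-entry weight table.

    Assumed model: step 1 compares weights, step 2 compares code points.
    """
    table = list(weights)
    is_identity = table == list(range(256))
    all_unique = len(set(table)) == 256

    if not is_identity:
        # Step 1: compare the weights of the characters.
        result = _cmp([table[c] for c in a], [table[c] for c in b])
        # With 256 unique weights, equal weights imply identical strings,
        # so step 2 can never change the answer and is skipped.
        if result != 0 or all_unique:
            return result

    # Step 2: compare the raw code point values.
    return _cmp(a, b)


# Hypothetical weight table that gives 'a'..'z' the weights of 'A'..'Z'.
IDENTITY = list(range(256))
FOLD_CASE = list(range(256))
for code in range(ord("a"), ord("z") + 1):
    FOLD_CASE[code] = code - 32

print(compare(b"abc", b"ABD", FOLD_CASE))  # -1: decided by weights
print(compare(b"abc", b"ABD", IDENTITY))   # 1: decided by code points
```

With the case-folding table, `b"abc"` and `b"ABC"` tie on weight, so in this sketch the code point comparison decides the result.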
A Unicode database with the IDENTITY_16BIT collation option collates the CHAR or VARCHAR data in the database according to its CESU-8 binary order instead of its UTF-8 binary order. The two collation orders are identical for non-supplementary characters. However, a supplementary character is represented by one 4-byte sequence in UTF-8 encoding, whereas the same character in CESU-8 encoding requires two 3-byte sequences, which results in different collation orders.
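The difference can be illustrated with a short Python sketch. Python has no built-in CESU-8 codec, so the helper `cesu8` below is a hypothetical hand-written encoder: it splits each supplementary character into a UTF-16 surrogate pair and encodes each surrogate as a 3-byte sequence. It assumes well-formed input with no lone surrogates.

```python
def cesu8(text: str) -> bytes:
    """Encode text as CESU-8 (illustrative helper, not a standard codec)."""
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if cp > 0xFFFF:
            # Supplementary character: split into a surrogate pair,
            # then encode each surrogate as its own 3-byte sequence.
            cp -= 0x10000
            high = 0xD800 + (cp >> 10)
            low = 0xDC00 + (cp & 0x3FF)
            out += chr(high).encode("utf-8", "surrogatepass")
            out += chr(low).encode("utf-8", "surrogatepass")
        else:
            # Non-supplementary character: same bytes as UTF-8.
            out += ch.encode("utf-8")
    return bytes(out)


bmp = "\uFF5E"          # U+FF5E, a non-supplementary character
supp = "\U0001F600"     # U+1F600, a supplementary character

print(bmp.encode("utf-8").hex())   # efbd9e
print(supp.encode("utf-8").hex())  # f09f9880 (one 4-byte sequence)
print(cesu8(supp).hex())           # eda0bdedb880 (two 3-byte sequences)

# UTF-8 binary order places U+FF5E before U+1F600;
# CESU-8 binary order places U+1F600 before U+FF5E.
by_utf8 = sorted([supp, bmp], key=lambda s: s.encode("utf-8"))
by_cesu8 = sorted([supp, bmp], key=cesu8)
```

In this example the lead byte of the supplementary character is 0xF0 in UTF-8 but 0xED in CESU-8, so it sorts after U+FF5E (lead byte 0xEF) in one encoding and before it in the other.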