Multibyte character code data representation
A multibyte character code is an external representation of data, regardless of whether it is character input from a keyboard or a file on a disk. Within the same code set, the number of bytes that represent the multibyte code of a character can vary. You must use multicultural support functions for character processing to ensure code set independence.
For example, a code set may specify the following character encodings:
C = 0x43
* = 0x81 0x43
*C = 0x81 0x43 0x43
A program searching for C, not accounting for multibyte characters, finds the second byte of the *C string and assumes it found C when in fact it found the second byte of the * (asterisk) character.
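The difference between a byte-wise search and a character-wise search can be sketched in C as follows. This is a minimal illustration, not a production implementation: the function names toy_char_len, naive_find, and mb_find are hypothetical, and the rule that 0x81 introduces a two-byte character is an assumption taken only from the example code set above. A real program would normally obtain character lengths from a locale-aware function such as mblen instead of hard-coding the rule.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical code set from the example above (an assumption, not a real
   locale): 0x81 is a lead byte that starts a two-byte character, and every
   other byte is a complete single-byte character. */
size_t toy_char_len(const unsigned char *p)
{
    if (p[0] == 0x81 && p[1] != '\0')
        return 2;
    return 1;
}

/* Byte-wise search: returns the offset of the first byte equal to target,
   or -1 if there is none. It knows nothing about character boundaries. */
long naive_find(const char *s, char target)
{
    const char *hit = strchr(s, target);
    return hit ? (long)(hit - s) : -1;
}

/* Character-wise search: advances one whole character at a time and only
   compares target against single-byte characters, so the trailing byte of
   a two-byte character can never match. Returns a byte offset or -1. */
long mb_find(const char *s, char target)
{
    assert(s != NULL);
    const unsigned char *p = (const unsigned char *)s;
    size_t i = 0;

    while (p[i] != '\0') {
        size_t len = toy_char_len(p + i);
        if (len == 1 && p[i] == (unsigned char)target)
            return (long)i;
        i += len;
    }
    return -1;
}
```

For the *C string (bytes 0x81 0x43 0x43), naive_find reports offset 1, which is the second byte of the * character, while mb_find reports offset 2, which is the actual C. For the * character alone (bytes 0x81 0x43), naive_find still reports offset 1, whereas mb_find reports that C is not present.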