ISO10646 UCS-2 (Unicode)

Universal Coded Character Set (UCS) is the name of the ISO10646 standard that defines a single code for the representation, interchange, processing, storage, entry, and presentation of the written form of all the major languages of the world.

The character code values of UCS-2 are identical to those of the Unicode character encoding standard published by the Unicode Consortium. UCS-2 defines codes for characters used in all major written languages. In addition to a set of scientific, mathematical, and publishing symbols, UCS-2 covers the following scripts:

  • Arabic
  • Armenian
  • Azerbaijani
  • Bengali
  • Bopomofo
  • Cyrillic
  • Devanagari
  • Georgian
  • Greek
  • Gujarati
  • Gurmukhi
  • Hangul
  • Chinese Hanzi
  • Hebrew
  • Hiragana
  • International Phonetic Alphabet (IPA)
  • Katakana
  • Japanese Kanji
  • Kannada
  • Korean Hanja
  • Laotian
  • Latin
  • Malayalam
  • Maltese
  • Oriya
  • Tamil
  • Telugu
  • Thai
  • Tibetan
  • Urdu
  • Welsh
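
Because UCS-2 code values match Unicode code values, a character in any of the scripts above can be written as a single 16-bit unit. The following Python sketch illustrates this. It is not an AIX interface: the helper name to_ucs2_be is hypothetical, big-endian byte order is an assumption, and Python's utf-16-be codec is used only because it produces the same two-byte units as UCS-2 for characters in the range U+0000 to U+FFFF.

```python
def to_ucs2_be(text: str) -> bytes:
    """Encode text as big-endian 16-bit units, one unit per character.

    Hypothetical helper for illustration. Characters that do not fit in
    a single 16-bit unit are rejected instead of being encoded as
    surrogate pairs, which UCS-2 does not define.
    """
    for ch in text:
        cp = ord(ch)
        if cp > 0xFFFF:
            raise ValueError(f"U+{cp:04X} does not fit in one 16-bit unit")
        if 0xD800 <= cp <= 0xDFFF:
            raise ValueError(f"U+{cp:04X} is a surrogate code value")
    # For the range checked above, utf-16-be emits exactly one
    # two-byte unit per character, equal to the UCS-2 code value.
    return text.encode("utf-16-be")


# One character each from the Latin, Greek, and Hebrew scripts.
sample = "A\u03a9\u05d0"
encoded = to_ucs2_be(sample)

# Print each character's code value next to its two encoded bytes.
for i, ch in enumerate(sample):
    unit = encoded[2 * i:2 * i + 2]
    print(f"U+{ord(ch):04X} -> {unit.hex()}")
```

Each character occupies exactly two bytes, and those bytes hold the character's code value.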

The ability of AIX® to display characters in the scripts listed above is limited by the availability of fonts. AIX provides bitmap fonts for most of the major languages of the world, as well as a Unicode-based scalable TrueType font.

UCS-2 encodes a number of combining characters, also known as non-spacing marks or floating diacritics. These characters are required in several scripts, including Indic, Thai, Arabic, and Hebrew. Combining characters can also be used to compose characters in the Latin, Cyrillic, and Greek scripts. As a result, the same text can have more than one coding. Although each coding is unambiguous and data integrity is preserved, processing text that contains combining characters is more complex. So that applications that choose not to handle combining characters can still conform, ISO10646 defines the following implementation levels:

Level 1
Does not allow combining characters.
Level 2
Allows combining marks from Thai, Indic, Hebrew, and Arabic scripts.
Level 3
Allows combining marks, including ones for Latin, Cyrillic, and Greek.
Note: On the AIX operating system, the ISO10646-1 label refers to UCS-2 encoding. This label can be used as an alias for UCS-2.
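
The alternative codings that combining characters make possible, and the kind of check a Level 1 application might apply, can be sketched in Python. This is an illustration under stated assumptions, not the ISO10646 conformance test: the helper name has_combining is hypothetical, and it approximates "combining character" by the Unicode general categories that begin with M (marks), which may not match the exact list that ISO10646 uses for each level.

```python
import unicodedata

# Two codings of the same text, "e" with an acute accent:
precomposed = "\u00e9"   # one code value: LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # base letter followed by COMBINING ACUTE ACCENT


def has_combining(text: str) -> bool:
    """Return True if any character is a mark (general category M*).

    Hypothetical helper: an approximation of "contains combining
    characters", not the normative ISO10646 definition.
    """
    return any(unicodedata.category(ch).startswith("M") for ch in text)


# The two codings are different sequences of code values ...
print(precomposed == decomposed)
# ... but composing the second one yields the first.
print(unicodedata.normalize("NFC", decomposed) == precomposed)

# Only the decomposed form contains a combining character, so only
# the precomposed form would pass a Level 1 style check.
print(has_combining(precomposed), has_combining(decomposed))
```

A simple comparison treats the two codings as different strings, which is the extra processing complexity that the implementation levels let an application avoid.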