IBM-943 and IBM-932
Each of the Japanese IBM® PC code sets is an encoding that consists of single-byte and multibyte coded characters. The encoding is based on the IBM PC code set and places the JIS characters in shifted positions. This arrangement is referred to as Shift-JIS or SJIS.
IBM-943 is a newer code set for the Japanese locale than IBM-932. IBM-943 is compatible with the code set used in the Japanese Microsoft Windows environment and is known as 1983 ordered Shift-JIS. The differences between IBM-932 and IBM-943 are as follows:
- IBM-932 uses the earlier JIS sequence (1978 ordered), while IBM-943 uses the newer JIS sequence (1983 ordered).
- NEC selected characters are added to IBM-943.
- NEC's IBM selected characters are added to IBM-943.
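The effect of the added NEC selected characters can be illustrated with a small sketch. Python ships no codec named IBM-943 or IBM-932, so this example rests on an assumption: the built-in `cp932` codec (the Microsoft Windows Japanese code page) stands in for a Windows-compatible Shift-JIS with the NEC additions, and the built-in `shift_jis` codec stands in for a Shift-JIS without them. Neither codec is IBM's own conversion table, so the result is indicative only.

```python
# Assumption: "cp932" and "shift_jis" are stand-ins, not the IBM code sets.
text = "\u2460"  # CIRCLED DIGIT ONE, one of the NEC selected characters

# With the NEC additions, the character encodes to a double-byte value
# whose first byte is 0x87.
encoded = text.encode("cp932")
print(encoded.hex())  # 8740

# Without the additions, the character has no encoding at all.
try:
    text.encode("shift_jis")
    in_plain_sjis = True
except UnicodeEncodeError:
    in_plain_sjis = False
print(in_plain_sjis)  # False
```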
The IBM-932 code set consists of the following character sets:
Character set | Description |
---|---|
JISCII | JISX0201 Graphic Left character set |
JISX0201.1976 | Katakana/Hiragana Graphic Right character set |
JISX0208.1983 | Kanji level 1 and 2 character sets |
IBM-udcJP | IBM user-definable characters |
The IBM-943 code set consists of the following character sets:
Character set | Description |
---|---|
JISCII | JISX0201 Graphic Left character set |
JISX0201.1976 | Katakana/Hiragana Graphic Right character set |
JISX0208.1990 | Kanji level 1 and 2 character sets |
IBM-udcJP | IBM user-definable characters, NEC's IBM selected characters, and NEC selected characters |
The first byte of each character determines the number of bytes in that character. The values 0x20-0x7e and 0xa1-0xdf are used to encode JISX0201 characters, with exceptions. The positions 0x81-0x9f and 0xe0-0xfc are reserved for use as the first byte of a multibyte character. The JISX0208 characters are mapped to the multibyte values starting at 0x8140. The second byte of a multibyte character can have any value in the ranges 0x40-0x7e and 0x80-0xfc. The Shift-JIS table shows where these characters are located in the code set.
Character Encoding | Code Point | Description | Count |
---|---|---|---|
000xxxxx | 00–1F | Controls | 32 |
00100000 | 20 | Space | 1 |
0xxxxxxx | 21–7E | 7-bit ASCII | 94 |
01111111 | 7F | Delete | 1 |
10000000 | 80 | Undefined | 1 |
100xxxxx 01xxxxxx | [81–9F] [40–7E] | Double byte | 1953 |
100xxxxx 1xxxxxxx | [81–9F] [80–FC] | Double byte | 3875 |
10100000 | A0 | Undefined | 1 |
1xxxxxxx | A1–DF | 8-bit single byte | 63 |
111xxxxx 01xxxxxx | [E0–FC] [40–7E] | Double byte | 1827 |
111xxxxx 1xxxxxxx | [E0–FC] [80–FC] | Double byte | 3625 |
11111101 | FD | Undefined | 1 |
11111110 | FE | Undefined | 1 |
11111111 | FF | Undefined | 1 |
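The first-byte rule in the Shift-JIS table can be sketched as a small splitter that cuts a byte string into individual characters. This is a minimal illustration, not IBM's implementation: the names `char_length` and `split_chars` are hypothetical, and the sketch does not validate the second byte of a double-byte character.

```python
from typing import List


def char_length(lead: int) -> int:
    """Return the byte length of the character that starts with `lead`."""
    if 0x81 <= lead <= 0x9F or 0xE0 <= lead <= 0xFC:
        return 2  # first byte of a double-byte character
    if lead in (0x80, 0xA0, 0xFD, 0xFE, 0xFF):
        raise ValueError(f"undefined first byte 0x{lead:02X}")
    return 1  # controls, space, ASCII, delete, or A1-DF single byte


def split_chars(data: bytes) -> List[bytes]:
    """Split Shift-JIS encoded bytes into one bytes object per character."""
    chars = []
    i = 0
    while i < len(data):
        n = char_length(data[i])
        if i + n > len(data):
            raise ValueError("truncated double-byte character")
        chars.append(data[i:i + n])
        i += n
    return chars


# One ASCII byte, one A1-DF single byte, and one double-byte character.
print(split_chars(b"A\xb1\x88\x9f"))  # [b'A', b'\xb1', b'\x88\x9f']
```

Because the length depends only on the first byte, a scanner must always start at a character boundary. A second byte in the range 0x40-0x7e is indistinguishable from an ASCII byte when examined in isolation.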
The following table shows the DBCS portion of IBM-943.
Code Point | Description |
---|---|
[81–84] [40–7E] and [81–84] [80–F0] | JIS X 0208 (Non-Kanji) |
[87] [40–7E] and [87] [80–F0] | NEC selected characters |
[89–98] [40–7E] and [88] [9F–F0], [89–97] [80–F0], [98] [80–9F] | JIS X 0208 (Level-1 Kanji) |
[99–9F] [40–7E] and [98] [9F–F0], [99–9F] [80–F0] | JIS X 0208 (Level-2 Kanji) |
[E0–EA] [40–7E] and [E0–EA] [80–F0] | JIS X 0208 (Level-2 Kanji) |
[ED–EE] [40–7E] and [ED–EE] [80–F0] | NEC IBM selected characters |
[F0–F9] [40–7E] and [F0–F9] [80–F0] | User-defined characters |
[FA] [40–5B] | IBM selected characters (non-Kanji) |
[FA] [5C–7E], [FB–FC] [40–7E] and [FA–FC] [80–F0] | IBM selected characters (Kanji) |
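As a rough illustration of how the IBM-943 table can be used, the following sketch maps the first byte of a double-byte character to its region. It is deliberately coarse: it looks only at the first byte, merges Level-1 and Level-2 Kanji into one label, and does not separate the Kanji and non-Kanji parts of the IBM selected characters. The names `IBM943_LEAD_REGIONS` and `region_of` are hypothetical.

```python
from typing import Optional

# (lowest first byte, highest first byte, description),
# transcribed from the first-byte ranges in the IBM-943 table above.
IBM943_LEAD_REGIONS = [
    (0x81, 0x84, "JIS X 0208 (Non-Kanji)"),
    (0x87, 0x87, "NEC selected characters"),
    (0x88, 0x9F, "JIS X 0208 (Kanji)"),
    (0xE0, 0xEA, "JIS X 0208 (Kanji)"),
    (0xED, 0xEE, "NEC IBM selected characters"),
    (0xF0, 0xF9, "User-defined characters"),
    (0xFA, 0xFC, "IBM selected characters"),
]


def region_of(lead: int) -> Optional[str]:
    """Return the table region for a first byte, or None if not listed."""
    for lo, hi, label in IBM943_LEAD_REGIONS:
        if lo <= lead <= hi:
            return label
    return None


print(region_of(0x87))  # NEC selected characters
print(region_of(0x85))  # None
```

A first byte such as 0x85 returns `None` because the table lists no characters under it, even though it falls inside the range reserved for first bytes.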
The following table shows the DBCS portion of IBM-932.
Code Point | Description |
---|---|
[81–98] [40–7E] and [81–97] [80–FC], [98] [80–9F] | JIS X 0208 (Level-1 Kanji) |
[99–9F] [40–7E] and [98] [9F–FC], [99–9F] [80–FC] | JIS X 0208 (Level-2 Kanji) |
[E0–EF] [40–7E] and [E0–EF] [80–FC] | JIS X 0208 (Level-2 Kanji) |
[F0–F9] [40–7E] and [F0–F9] [80–FC] | User-defined characters |
[FA–FC] [40–7E] and [FA–FC] [80–FC] | IBM selected characters |