IBM-943 and IBM-932

Each of the Japanese IBM® PC code sets is an encoding consisting of single-byte and multibyte coded characters. The encoding is based on the IBM PC code set and places the JIS characters in shifted positions, which is why it is referred to as Shift-JIS or SJIS.

IBM-943 is a newer code set for the Japanese locale than IBM-932, and it is compatible with the Japanese Microsoft Windows environment. IBM-943 is known as 1983 ordered Shift-JIS. The differences between IBM-932 and IBM-943 are as follows:

IBM-932 uses the previous JIS sequence (1978 ordered), while IBM-943 uses the newer JIS sequence (1983 ordered).
NEC selected characters are added to IBM-943.
NEC's IBM selected characters are added to IBM-943.

The IBM-932 code set consists of the following character sets:

Character set	Description
JISCII	JISX0201 Graphic Left character set
JISX0201.1976	Katakana/Hiragana Graphic Right character set
JISX0208.1983	Kanji level 1 and 2 character sets
IBM-udcJP	IBM user-definable characters

The IBM-943 code set consists of the following character sets:

Character set	Description
JISCII	JISX0201 Graphic Left character set
JISX0201.1976	Katakana/Hiragana Graphic Right character set
JISX0208.1990	Kanji level 1 and 2 character sets
IBM-udcJP	IBM user-definable characters and NEC's IBM selected characters and NEC selected characters

The first byte of each character determines the number of bytes in that character. The values 0x20-0x7e and 0xa1-0xdf are used to encode JISX0201 characters, with exceptions. The positions 0x81-0x9f and 0xe0-0xfc are reserved for use as the first byte of a multibyte character. The JISX0208 characters are mapped to the multibyte values starting at 0x8140. The second byte of a multibyte character can take values that also occur as single-byte codes; in the tables in this section it lies in 0x40-0x7e or 0x80-0xfc. The following Shift-JIS table shows where these characters are located in the code set.
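
The first-byte rule described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not IBM's implementation: the names `char_length` and `split_chars` are hypothetical, and treating the undefined values (0x80, 0xa0, 0xfd-0xff) as single bytes is an assumption made only so that the scan always advances.

```python
def char_length(first_byte: int) -> int:
    """Return the number of bytes in a character, judged by its first byte."""
    if 0x81 <= first_byte <= 0x9F or 0xE0 <= first_byte <= 0xFC:
        return 2  # reserved as the first byte of a double-byte character
    return 1      # controls, 7-bit range, JISX0201 right half, undefined values


def split_chars(data: bytes) -> list[bytes]:
    """Split a Shift-JIS byte string into one bytes object per character."""
    chars = []
    i = 0
    while i < len(data):
        n = char_length(data[i])
        if i + n > len(data):
            # A first byte at the very end has no second byte to pair with.
            raise ValueError(f"truncated double-byte character at offset {i}")
        chars.append(data[i:i + n])
        i += n
    return chars
```

For example, `split_chars(b"A\x82\xa0\xb1")` yields three items: the single byte 0x41, the double-byte sequence 0x82 0xa0, and the single byte 0xb1. Because a second byte may fall in the 7-bit range, a byte-wise search for a 7-bit character can match the second half of a double-byte character; scanning from the start with the first-byte rule avoids that.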

Character Encoding	Code Point	Description	Count
000xxxxx	00–1F	Controls	32
00100000	20	Space	1
0xxxxxxx	21–7E	7-bit ASCII	94
01111111	7F	Delete	1
10000000	80	Undefined	1
100xxxxx 01xxxxxx	[81–9F] [40–7E]	Double byte	1953
100xxxxx 1xxxxxxx	[81–9F] [80–FC]	Double byte	3875
10100000	A0	Undefined	1
1xxxxxxx	A1–DF	8-bit single byte	63
111xxxxx 01xxxxxx	[E0–FC] [40–7E]	Double byte	1827
111xxxxx 1xxxxxxx	[E0–FC] [80–FC]	Double byte	3625
11111101	FD	Undefined	1
11111110	FE	Undefined	1
11111111	FF	Undefined	1
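
The Count column follows from the sizes of the byte ranges: a single-byte row holds one position per value in its range, and a double-byte row holds (number of first bytes) × (number of second bytes) positions. The short Python check below reproduces that arithmetic; the helper name `span` and the dictionary `counts` are hypothetical and exist only for this illustration.

```python
def span(lo: int, hi: int) -> int:
    """Number of byte values in the inclusive range lo..hi."""
    return hi - lo + 1


# Second-byte ranges used by the double-byte rows.
low_trail = span(0x40, 0x7E)    # 63 values
high_trail = span(0x80, 0xFC)   # 125 values

counts = {
    "00-1F controls": span(0x00, 0x1F),
    "21-7E 7-bit ASCII": span(0x21, 0x7E),
    "A1-DF single byte": span(0xA1, 0xDF),
    "[81-9F] [40-7E]": span(0x81, 0x9F) * low_trail,
    "[81-9F] [80-FC]": span(0x81, 0x9F) * high_trail,
    "[E0-FC] [40-7E]": span(0xE0, 0xFC) * low_trail,
    "[E0-FC] [80-FC]": span(0xE0, 0xFC) * high_trail,
}

for row, count in counts.items():
    print(f"{row}: {count}")
```

The four double-byte rows together cover 60 first bytes × 188 second bytes, or 11280 positions.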

The following table shows the DBCS portion of IBM-943.

Code Point	Description
[81–84] [40–7E] and [81–84] [80–F0]	JIS X 0208 (Non-Kanji)
[87] [40–7E] and [87] [80–F0]	NEC selected characters
[89–98] [40–7E] and [88] [9F–F0], [89–97] [80–F0], [98] [80–9F]	JIS X 0208 (Level-1 Kanji)
[99–9F] [40–7E] and [98] [9F–F0], [99–9F] [80–F0]	JIS X 0208 (Level-2 Kanji)
[E0–EA] [40–7E] and [E0–EA] [80–F0]	JIS X 0208 (Level-2 Kanji)
[ED–EE] [40–7E] and [ED–EE] [80–F0]	NEC IBM selected characters
[F0–F9] [40–7E] and [F0–F9] [80–F0]	User-defined characters
[FA] [40–5B]	IBM selected characters (non-Kanji)
[FA] [5C–7E], [FB–FC] [40–7E] and [FA–FC] [80–F0]	IBM selected characters (Kanji)
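
As an illustration of how the IBM-943 DBCS table might be used, the sketch below classifies a double-byte character by its first byte alone. It is deliberately coarse and rests on assumptions: the names `IBM943_REGIONS` and `ibm943_region` are hypothetical, second-byte boundaries are ignored (so the Level-1 and Level-2 Kanji rows and the Kanji and non-Kanji IBM selected rows are merged), and first bytes that do not appear in the table return `None`.

```python
# (first byte low, first byte high, description), taken from the
# first-byte ranges of the IBM-943 DBCS table. Second bytes are ignored.
IBM943_REGIONS = [
    (0x81, 0x84, "JIS X 0208 (Non-Kanji)"),
    (0x87, 0x87, "NEC selected characters"),
    (0x88, 0x9F, "JIS X 0208 (Kanji)"),
    (0xE0, 0xEA, "JIS X 0208 (Kanji)"),
    (0xED, 0xEE, "NEC IBM selected characters"),
    (0xF0, 0xF9, "User-defined characters"),
    (0xFA, 0xFC, "IBM selected characters"),
]


def ibm943_region(first_byte: int):
    """Return the table description for a first byte, or None if not listed."""
    for lo, hi, description in IBM943_REGIONS:
        if lo <= first_byte <= hi:
            return description
    return None
```

A call such as `ibm943_region(0xED)` returns the NEC IBM selected characters row, while a first byte such as 0x85, which the table does not list, returns `None`. A finer classifier would also have to test the second byte against the ranges in each row.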

The following table shows the DBCS portion of IBM-932.

Code Point	Description
[81–98] [40–7E] and [81–97] [80–FC], [98] [80–9F]	JIS X 0208 (Level-1 Kanji)
[99–9F] [40–7E] and [98] [9F–FC], [99–9F] [80–FC]	JIS X 0208 (Level-2 Kanji)
[E0–EF] [40–7E] and [E0–EF] [80–FC]	JIS X 0208 (Level-2 Kanji)
[F0–F9] [40–7E] and [F0–F9] [80–FC]	User-defined characters
[FA–FC] [40–7E] and [FA–FC] [80–FC]	IBM selected characters
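
The user-defined row of the IBM-932 table spans 10 first bytes (F0–F9) and 188 second bytes (63 in 40–7E plus 125 in 80–FC), which gives 1880 positions. The sketch below numbers those positions from 0 to 1879 in code point order. The function name `udc_index` and the linear numbering itself are assumptions made for illustration; they are not a mapping defined by IBM.

```python
def udc_index(first: int, second: int) -> int:
    """Zero-based position of a code point within [F0-F9] [40-7E, 80-FC]."""
    if not 0xF0 <= first <= 0xF9:
        raise ValueError("first byte outside F0-F9")
    if 0x40 <= second <= 0x7E:
        column = second - 0x40
    elif 0x80 <= second <= 0xFC:
        column = second - 0x80 + 63   # skip the 63 positions of 40-7E
    else:
        raise ValueError("second byte outside 40-7E and 80-FC")
    return (first - 0xF0) * 188 + column
```

With this numbering, the first user-defined code point F040 has index 0 and the last one, F9FC, has index 1879.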