The CHARMAP section
The CHARMAP section defines the values for the symbolic names representing characters in the coded character set. Each charmap file must define at least the portable character set. The character symbolic names or alternate symbolic names (or both) must be used to define the portable character set. These are shown in Table 1.
Additional characters can be defined by the user with symbolic character names.
The CHARMAP section starts with the line containing the keyword CHARMAP, and ends with the line containing the keywords END CHARMAP. CHARMAP and END CHARMAP must both start in column one.
The formats of the character set mappings for the CHARMAP section are as follows:
"%s %s %s\n", <symbolic-name>, <encoding>, <comments>
"%s...%s %s %s\n", <symbolic-name>, <symbolic-name>, <encoding>, <comments>
The first format defines a single symbolic name and a corresponding encoding. A symbolic name is one or more characters with visible glyphs, enclosed between angle brackets.
For reasons of portability, a symbolic name should include only the characters from the invariant part of the portable character set. If you use variant characters or decimal or hexadecimal notation in a symbolic name, the symbolic name will not be portable. A character following an escape character is interpreted as itself; for example, the sequence <\\\>> represents the symbolic name \> enclosed within angle brackets, where the backslash \ is the escape character. If / is the escape character, the sequence <///>> represents the symbolic name />. In the supplied charmap files, the escape character has been redefined to the forward slash /.
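The escape rule above can be illustrated with a short Python sketch. It is not part of any charmap tool; the function name parse_symbolic_name and its escape_char parameter are hypothetical and exist only for this illustration.

```python
def parse_symbolic_name(text, escape_char="\\"):
    """Return the symbolic name at the start of text, without its angle brackets.

    Hypothetical helper: a character that follows the escape character
    is taken literally, so an escaped '>' does not close the name.
    """
    if not text.startswith("<"):
        raise ValueError("a symbolic name must start with '<'")
    name = []
    i = 1
    while i < len(text):
        ch = text[i]
        if ch == escape_char:
            if i + 1 >= len(text):
                raise ValueError("escape character at end of input")
            name.append(text[i + 1])  # keep the escaped character as itself
            i += 2
        elif ch == ">":
            return "".join(name)      # unescaped '>' ends the name
        else:
            name.append(ch)
            i += 1
    raise ValueError("missing closing '>'")


print(parse_symbolic_name(r"<\\\>>"))                  # \>
print(parse_symbolic_name("<///>>", escape_char="/"))  # />
```

The sketch does not check that the name uses only invariant characters; it only shows how the escape character changes where the name ends.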
The second format defines a group of symbolic names associated with a range of values. Each of the two symbolic names consists of two parts, a prefix and a suffix. The prefix consists of zero or more non-numeric invariant visible glyph characters and is the same for both symbolic names. The suffix consists of a positive decimal integer. The suffix of the first symbolic name must be less than or equal to the suffix of the second symbolic name. As an example, <j0101>...<j0104> is interpreted as the symbolic names <j0101>, <j0102>, <j0103>, <j0104>. The common prefix is 'j' and the suffixes are '0101' and '0104'.
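As a rough illustration of the range rule, the following Python sketch expands two symbolic names into the full list. The function name expand_symbolic_range is hypothetical, and keeping the zero-padded width of the first suffix is an assumption inferred from the <j0101>...<j0104> example rather than a stated rule.

```python
import re

# Prefix: zero or more non-digit characters; suffix: a decimal integer.
_NAME = re.compile(r"<(\D*)(\d+)>")


def expand_symbolic_range(first, last):
    """Expand a range such as <j0101>...<j0104> into its symbolic names."""
    m_first, m_last = _NAME.fullmatch(first), _NAME.fullmatch(last)
    if m_first is None or m_last is None:
        raise ValueError("names must look like <prefix + decimal suffix>")
    prefix, suffix_first = m_first.groups()
    prefix_last, suffix_last = m_last.groups()
    if prefix != prefix_last:
        raise ValueError("both names must share the same prefix")
    start, end = int(suffix_first), int(suffix_last)
    if start > end:
        raise ValueError("first suffix must not exceed the second suffix")
    width = len(suffix_first)  # assumption: preserve zero padding
    return [f"<{prefix}{n:0{width}d}>" for n in range(start, end + 1)]


print(expand_symbolic_range("<j0101>", "<j0104>"))
# ['<j0101>', '<j0102>', '<j0103>', '<j0104>']
```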
The encoding is written in one of the following forms:
<escape-char><number> (single byte value)
<escape-char><number><escape-char><number> (double byte value)
The number can be written using octal, decimal, or hexadecimal notation. Decimal numbers are written as a 'd' followed by 2 or 3 decimal digits. Hexadecimal numbers are written as an 'x' followed by 2 hexadecimal digits. An octal number is written with 2 or 3 octal digits. As an example, the single byte value 0x1F could be written as '\37', '\x1F', or '\d31'. The double byte value 0x1A1F could be written as '\32\37', '\x1A\x1F', or '\d26\d31'.
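The three notations can be checked against each other with a small Python sketch. The function parse_encoding is hypothetical and is not a charmap compiler API; it deliberately does not enforce the digit counts described above, so that it also accepts short forms such as /d0.

```python
def parse_encoding(text, escape_char="\\"):
    """Convert an encoding such as \\x1A\\x1F or /d26/d31 into bytes.

    Hypothetical helper: each constant is introduced by the escape
    character and is decimal (d...), hexadecimal (x...), or octal.
    """
    if not text.startswith(escape_char):
        raise ValueError("an encoding must start with the escape character")
    values = []
    for const in text.split(escape_char)[1:]:
        if const[:1] == "d":
            value = int(const[1:], 10)   # decimal constant
        elif const[:1] == "x":
            value = int(const[1:], 16)   # hexadecimal constant
        else:
            value = int(const, 8)        # octal constant
        if not 0 <= value <= 255:
            raise ValueError(f"constant {const!r} does not fit in one byte")
        values.append(value)
    return bytes(values)


# All three notations for the single byte value 0x1F agree.
print(parse_encoding(r"\37"), parse_encoding(r"\x1F"), parse_encoding(r"\d31"))
# The same holds for the double byte value 0x1A1F, here with '/' as escape.
print(parse_encoding("/x1A/x1F", escape_char="/").hex())  # 1a1f
```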
In lines defining ranges of symbolic names, the encoded value is the value for the first symbolic name in the range (the symbolic name preceding the ellipsis). Subsequent names defined by the range have encoding values in increasing order.
A double byte value is prepended with the byte value of <shift_out> and appended with the byte value of <shift_in>. Such a string represents one EBCDIC multibyte character, as the following example shows:
<escape_char> /
<comment_char> %
<mb_cur_max> 4
<mb_cur_min> 1
<shift_out> /x0e
<shift_in> /x0f
CHARMAP
% many definition lines
<j0101>...<j0104> /d129/d254
% many definition lines
END CHARMAP
The range line in the preceding example is interpreted as:
<j0101> /d129/d254
<j0102> /d129/d255
<j0103> /d130/d0
<j0104> /d130/d1
With the shift-out and shift-in byte values added, the resulting EBCDIC multibyte characters are:
<j0101> x0Ex81xFEx0F
<j0102> x0Ex81xFFx0F
<j0103> x0Ex82x00x0F
<j0104> x0Ex82x01x0F
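The arithmetic behind this example can be sketched in Python. The sketch assumes that "increasing order" means treating the encoding as one big-endian integer, so that /d129/d255 is followed by /d130/d0; this is inferred from the values shown and is not stated as a rule. The function name expand_range_encodings and the constants SHIFT_OUT and SHIFT_IN are hypothetical.

```python
SHIFT_OUT = b"\x0e"  # value of <shift_out> in the example
SHIFT_IN = b"\x0f"   # value of <shift_in> in the example


def expand_range_encodings(names, first_encoding):
    """Assign consecutive encodings to names, starting at first_encoding.

    Assumption: the bytes form a big-endian integer that is incremented
    by one per name, with carry into the leading byte.
    """
    width = len(first_encoding)
    start = int.from_bytes(first_encoding, "big")
    return {
        name: (start + offset).to_bytes(width, "big")
        for offset, name in enumerate(names)
    }


names = ["<j0101>", "<j0102>", "<j0103>", "<j0104>"]
encodings = expand_range_encodings(names, bytes([129, 254]))  # /d129/d254

for name, encoding in encodings.items():
    # Wrap each double byte value in the shift-out and shift-in bytes.
    print(name, (SHIFT_OUT + encoding + SHIFT_IN).hex())
# <j0101> 0e81fe0f
# <j0102> 0e81ff0f
# <j0103> 0e82000f
# <j0104> 0e82010f
```

A range that would overflow the encoding width raises OverflowError from to_bytes, which is one simple way to flag a range that does not fit.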