A GLS locale groups the characters of a code set into character
classes. Each class contains characters that have a related
purpose.
The contents of a character class can be language specific. For
example, the lower class contains all alphabetic lowercase characters
in a code set. In the default locale, the default code set places
the English characters a through z in the lower class, but the class
also includes non-English lowercase characters such as á, õ, and ü.
The default code set on UNIX platforms
is ISO8859-1. The default code set for Windows environments is Microsoft 1252.
For more information about the default locale and the default code
set, see the IBM
Informix GLS User's Guide.
The LC_CTYPE category of a GLS locale file defines the following
character classes.
| Character class | Contains |
|---|---|
| alpha | Alphabetic characters: single-byte alphabetic characters a through z and A through Z; any single-byte non-English characters that the locale defines; any multibyte alphabetic or digit characters that the locale defines. This class includes characters in the lower and upper classes. |
| lower | Lowercase alphabetic characters: single-byte alphabetic characters a through z; any single-byte non-English lowercase characters that the locale defines; any multibyte lowercase characters that the locale defines. No characters in this class are in the upper class. |
| upper | Uppercase alphabetic characters: single-byte alphabetic characters A through Z; any single-byte non-English uppercase characters that the locale defines; any multibyte uppercase alphabetic characters that the locale defines. No characters in this class are in the lower class. |
| digit | Single-byte decimal digits 0 through 9. |
| xdigit | Hexadecimal digits: single-byte numeric digits 0 through 9; single-byte representations of hexadecimal digits a through f and A through F. This class includes characters in the digit class. |
| alnum | All characters in both the alpha and digit classes. |
| blank | Horizontal white space: the single-byte horizontal-space characters space (ASCII 0x20) and tab (ASCII 0x09); any multibyte horizontal-space characters that the locale defines. |
| space | Horizontal and vertical white space: single-byte horizontal-space characters as defined in the blank class; single-byte vertical-space characters (new line, vertical tab, form feed, carriage return); any multibyte vertical-space characters that the locale defines. This class includes characters in the blank class. |
| cntrl | Control characters: single-byte control characters ASCII 0x00 to 0x1F; any other control characters that the locale defines. |
| graph | Graphical characters: all characters that have a visual representation. This class includes characters in the alpha, lower, upper, digit, xdigit, and punct classes. |
| punct | Punctuation characters. |
| print | All printable characters. This class includes characters in the alpha, lower, upper, digit, xdigit, graph, and punct classes. |
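The inclusion rules in the table above (for example, that alnum is alpha plus digit, and that blank is part of space) can be checked mechanically. The following is a minimal sketch that uses the standard C `<ctype.h>` functions as an analog; these are not Informix GLS functions, and the sketch assumes the program runs in the default "C" locale. The name `count_class_violations` is hypothetical.

```c
#include <ctype.h>
#include <limits.h>

/* Standard C analogs of the character classes; these are <ctype.h>
 * functions, NOT Informix GLS functions.
 *
 * Returns the number of single-byte values that break one of the
 * inclusion rules described in the table (expected to be 0). */
int count_class_violations(void)
{
    int violations = 0;

    for (int c = 0; c <= UCHAR_MAX; c++) {
        /* lower and upper are both part of alpha */
        if ((islower(c) || isupper(c)) && !isalpha(c))
            violations++;
        /* no character is in both lower and upper */
        if (islower(c) && isupper(c))
            violations++;
        /* alnum is exactly alpha plus digit */
        if ((isalnum(c) != 0) != (isalpha(c) || isdigit(c)))
            violations++;
        /* digit is part of xdigit */
        if (isdigit(c) && !isxdigit(c))
            violations++;
        /* blank is part of space */
        if (isblank(c) && !isspace(c))
            violations++;
        /* punct is part of graph, and graph is part of print */
        if (ispunct(c) && !isgraph(c))
            violations++;
        if (isgraph(c) && !isprint(c))
            violations++;
    }
    return violations;
}
```

Calling `count_class_violations()` from a test program is one way to confirm that a set of classification functions respects these relationships before application code relies on them.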
Your application must not assume which characters belong in a particular
character class. For example, it must not contain code such as the
following example to determine whether a character is lowercase:
if ( one_char >= 'a' && one_char <= 'z' )
Instead, use functions in the
IBM®
Informix® GLS library
to identify the class of a particular character. The following table
lists the GLS character classes and the
Informix GLS functions
that test for these classes for both multibyte and wide characters.
Table 1. Informix GLS character-class functions

| Character class | Multibyte-character function | Wide-character function |
|---|---|---|
| alnum (alpha or digit) | ifx_gl_ismalnum() | ifx_gl_iswalnum() |
| alpha | ifx_gl_ismalpha() | ifx_gl_iswalpha() |
| lower | ifx_gl_ismlower() | ifx_gl_iswlower() |
| upper | ifx_gl_ismupper() | ifx_gl_iswupper() |
| blank | ifx_gl_ismblank() | ifx_gl_iswblank() |
| space | ifx_gl_ismspace() | ifx_gl_iswspace() |
| digit | ifx_gl_ismdigit() | ifx_gl_iswdigit() |
| xdigit | ifx_gl_ismxdigit() | ifx_gl_iswxdigit() |
| cntrl | ifx_gl_ismcntrl() | ifx_gl_iswcntrl() |
| graph | ifx_gl_ismgraph() | ifx_gl_iswgraph() |
| punct | ifx_gl_ismpunct() | ifx_gl_iswpunct() |
| print | ifx_gl_ismprint() | ifx_gl_iswprint() |
These
Informix GLS functions
check the LC_CTYPE category of the current locale to determine whether
a specified character belongs to the respective character classification.
The following code fragment uses the
ifx_gl_ismlower() function
to determine whether a multibyte character is lowercase:
if ( ifx_gl_ismlower(one_char, char_size) )
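To see why the hard-coded range test shown earlier is unsafe, it can help to compare it against a table of lowercase letters for a specific code set. The sketch below is illustrative only: `demo_is_lower_latin1` is a hypothetical, hand-written approximation of the lowercase letters in ISO 8859-1, not the contents of any GLS locale file, and it assumes an ASCII-compatible execution character set. In real code, the GLS function would take the place of this stand-in.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a locale's lower class: a hand-written
 * approximation of the lowercase letters in ISO 8859-1. It is NOT
 * read from a GLS locale file. */
bool demo_is_lower_latin1(unsigned char c)
{
    return (c >= 'a' && c <= 'z')      /* a through z */
        || (c >= 0xDF && c <= 0xF6)    /* sharp s, then a-grave through o-umlaut */
        || (c >= 0xF8);                /* o-slash through y-umlaut */
}

/* The hard-coded test that the text warns against. */
bool range_is_lower(unsigned char c)
{
    return c >= 'a' && c <= 'z';
}

/* Counts the byte values that the table treats as lowercase but
 * that the hard-coded range test rejects. */
int count_missed_by_range_test(void)
{
    int missed = 0;

    for (int c = 0; c <= 0xFF; c++) {
        unsigned char byte = (unsigned char)c;
        if (demo_is_lower_latin1(byte) && !range_is_lower(byte))
            missed++;
    }
    return missed;
}
```

Under these assumptions, the byte 0xE1 (á in ISO 8859-1) is accepted by the table-driven check but rejected by the range test, which is the kind of silent misclassification that a locale-aware function avoids.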
The
Informix GLS functions
in
Table 1 do
not return a unique value if they encounter an error. To detect an
error, initialize the
ifx_gl_lc_errno() error number
to 0 before you call one of these functions, and then call
ifx_gl_lc_errno() immediately
after you call the function. For example, the following code fragment
performs error checking for the
ifx_gl_ismlower() function:
/* Initialize the error number */
ifx_gl_lc_errno() = 0;
/* Determine whether the 'mb' character is lowercase */
value = ifx_gl_ismlower(mb, mb_size);
/* If the error number has changed, ifx_gl_ismlower() has
 * set it to indicate the cause of an error */
if ( ifx_gl_lc_errno() != 0 )
{
    /* Handle error */
}
else if ( value != 0 )
{
    /* Character 'mb' is in lower class */
}
else
{
    /* Character 'mb' is NOT in lower class */
}
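The reset-then-check pattern above can be tried without the GLS library by using small stand-ins. The sketch below is a mock under stated assumptions: every `demo_` name is hypothetical and none belongs to the GLS API. It also shows one way that an assignment such as `name() = 0` can be valid C, namely when the name is a function-like macro that expands to an lvalue; whether the GLS header defines `ifx_gl_lc_errno()` this way is an assumption here.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these names belong to the GLS library. */
static int demo_errno_value;

/* A function-like macro that expands to an lvalue, so that
 * `demo_errno() = 0` is a valid assignment. */
#define demo_errno() (demo_errno_value)

#define DEMO_EINVAL 1   /* hypothetical error code */

/* Returns nonzero if the single-byte character at mb is lowercase.
 * On bad arguments it returns 0 and sets demo_errno(), so the return
 * value alone cannot distinguish "not lowercase" from "error". */
int demo_ismlower(const unsigned char *mb, int mb_size)
{
    if (mb == NULL || mb_size != 1) {
        demo_errno() = DEMO_EINVAL;
        return 0;
    }
    return islower((int)*mb) != 0;
}

/* Classifies one character: -1 on error, 1 if lowercase, 0 otherwise. */
int demo_classify(const unsigned char *mb, int mb_size)
{
    int value;

    demo_errno() = 0;                    /* reset before the call */
    value = demo_ismlower(mb, mb_size);
    if (demo_errno() != 0)               /* check right after the call */
        return -1;
    return value != 0;
}
```

Resetting the error number immediately before the call matters because a stale nonzero value from an earlier failure would otherwise be mistaken for an error in the current call.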