Processing Chinese GB 18030 data

GB 18030 is a national-character standard specified by the government of the People's Republic of China.

About this task

COBOL for Linux® supports GB 18030. If the code page of the locale in effect is GB18030 (a code page that supports the GB 18030 standard), a program can process USAGE DISPLAY data items that contain GB 18030 characters encoded in that code page. Each GB 18030 character occupies 1, 2, or 4 bytes, so a character position does not correspond to a fixed byte offset. Therefore, the program logic must be sensitive to the multibyte nature of the data.

You can process GB 18030 characters in these ways:

  • Use national data items to define and process GB 18030 characters that are represented in UTF-16, CCSID 01200.
  • Process data in any code page (including GB18030, which has CCSID 1392) by converting the data to UTF-16, processing the UTF-16 data, and then converting the data back to the original code-page representation.

When you need to process Chinese GB 18030 data that requires conversion, first convert the input data to UTF-16 in a national data item. After you process the national data item, convert it back to Chinese GB 18030 for output. For the conversions, use the intrinsic functions NATIONAL-OF and DISPLAY-OF, respectively, and specify GB18030 or 1392 as the second argument of each function.

The following example illustrates these conversions:

(Image: sample code for converting between Chinese GB 18030 data and Unicode.)
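
Because the sample is available here only as an image, the conversions described above can be sketched as follows. This is a minimal illustration, not the original sample: the program name, the data names Chinese-GB18030-String and National-String, the field sizes, and the placeholder input value are all assumptions made for this sketch.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. GBCONV.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical names and sizes, chosen for this sketch only.
      * GB 18030 data held as USAGE DISPLAY bytes.
       01  Chinese-GB18030-String  PIC X(16).
      * UTF-16 work area; one code unit per input byte at most.
       01  National-String         PIC N(16) USAGE NATIONAL.
       PROCEDURE DIVISION.
      * Placeholder input; real data would come from a file.
           MOVE "ABC" TO Chinese-GB18030-String
      * Convert GB 18030 (CCSID 1392) to UTF-16.
           MOVE FUNCTION NATIONAL-OF(Chinese-GB18030-String, 1392)
             TO National-String
      * ... process National-String as UTF-16 data here ...
      * Convert the UTF-16 data back to GB 18030.
           MOVE FUNCTION DISPLAY-OF(National-String, 1392)
             TO Chinese-GB18030-String
           DISPLAY Chinese-GB18030-String
           GOBACK.
```

The national item is sized with as many character positions as the alphanumeric item has bytes, on the assumption that no GB 18030 byte sequence converts to more UTF-16 code units than it has bytes. On the return conversion, any excess is trailing padding, which the MOVE truncates.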

Related references  
Storage of character data