Processing UTF-8 data using UTF-16 (national) data types

To process UTF-8 data, first convert the UTF-8 data to UTF-16 in a national data item. After processing the national data, convert it back to UTF-8 for output. For the conversions, use the intrinsic functions NATIONAL-OF and DISPLAY-OF, respectively. Use code page 1208 for UTF-8 data.

About this task

As an alternative to the recommended method of processing UTF-8 data using USAGE UTF-8 data items, you can process UTF-8 data by storing it in alphanumeric data items and then converting it to UTF-16 in a national data item.
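
The round trip described here might look like the following minimal sketch. The program name, data names, and item sizes are hypothetical, and the sketch assumes that UTF8-IN has already been filled with valid UTF-8 data (for example, from a file read). The output item is sized at three bytes per national character position because a single UTF-16 code unit can expand to as many as three bytes in UTF-8.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. UTF8RT.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical names; sizes are illustrative only.
      * UTF-8 bytes held in an alphanumeric item.
       01  UTF8-IN      PIC X(60).
      * UTF-16 working copy, one code unit per position.
       01  UTF16-WORK   PIC N(60) USAGE NATIONAL.
      * Up to 3 UTF-8 bytes per UTF-16 code unit.
       01  UTF8-OUT     PIC X(180).
       PROCEDURE DIVISION.
      * Assumption: UTF8-IN already holds valid UTF-8 data.
           MOVE FUNCTION NATIONAL-OF(UTF8-IN, 1208) TO UTF16-WORK
      * ... process UTF16-WORK as national data here ...
           MOVE FUNCTION DISPLAY-OF(UTF16-WORK, 1208) TO UTF8-OUT
           GOBACK.
```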

To convert ASCII or EBCDIC data to UTF-8, take the following steps. If the code page of the locale in effect is UTF-8, no conversion is needed, because the native alphanumeric data is already encoded in UTF-8.

Procedure

  1. Use the function NATIONAL-OF to convert the ASCII or EBCDIC string to a national (UTF-16) string.
  2. Use the function DISPLAY-OF to convert the national string to UTF-8.

Results

The following example converts Greek EBCDIC data to UTF-8:

[Image: sample code for converting Greek EBCDIC data to UTF-8]
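
Because the sample code is available here only as an image, the following is a hedged sketch of what such a conversion could look like, not a transcription of the original sample. It assumes that the Greek EBCDIC data is encoded in CCSID 00875. The program name, data names, and sizes are hypothetical.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. GRK2UTF8.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical data names and sizes.
       01  GREEK-EBCDIC  PIC X(10).
       01  UTF16-STR     PIC N(10) USAGE NATIONAL.
      * Greek letters need 2 bytes each in UTF-8; some other
      * characters need 3, so allow 3 bytes per character.
       01  UTF8-STR      PIC X(30).
       PROCEDURE DIVISION.
      * Assumption: GREEK-EBCDIC holds text in CCSID 00875.
      * Step 1: Greek EBCDIC to national (UTF-16).
           MOVE FUNCTION NATIONAL-OF(GREEK-EBCDIC, 00875)
             TO UTF16-STR
      * Step 2: national (UTF-16) to UTF-8.
           MOVE FUNCTION DISPLAY-OF(UTF16-STR, 01208)
             TO UTF8-STR
           GOBACK.
```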

Usage note: Use care if you use reference modification to refer to data encoded in UTF-8. UTF-8 characters are encoded with a varying number of bytes per character. Avoid operations that might split a multibyte character.
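
One way to avoid splitting a multibyte character is to apply reference modification to the national copy of the data instead of to the UTF-8 bytes. The following sketch, which uses hypothetical names and sizes, extracts the first five character positions of a UTF-8 string this way. It assumes that UTF8-IN already holds valid UTF-8 data and that the text contains no characters outside the Basic Multilingual Plane. Such characters occupy two national character positions (a surrogate pair), so reference modification could still split them.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. UTF8SUB.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical names; sizes are illustrative only.
       01  UTF8-IN      PIC X(60).
       01  UTF16-WORK   PIC N(60) USAGE NATIONAL.
      * Room for 5 characters at up to 3 UTF-8 bytes each.
       01  UTF8-FIRST5  PIC X(15).
       PROCEDURE DIVISION.
      * Assumption: UTF8-IN already holds valid UTF-8 data.
           MOVE FUNCTION NATIONAL-OF(UTF8-IN, 1208) TO UTF16-WORK
      * (1:5) counts national character positions, not bytes.
           MOVE FUNCTION DISPLAY-OF(UTF16-WORK(1:5), 1208)
             TO UTF8-FIRST5
           GOBACK.
```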