Customizing support for Unicode

You must customize z/OS® Unicode support to use Unicode data in Db2.

Before you begin

Ensure that the conversion environment is active. The steps for this process can be found in z/OS Support for Unicode: Using Unicode Services. Db2 can use the conversion services of z/OS support for Unicode only when the conversion environment is active. The infrastructure provides tools to create a conversion image. When the image is loaded into a common data space, the conversion environment is activated, and the conversion services are ready to be used by Db2.

Procedure

To customize support for Unicode:

  1. Customize the job card.
    The jobs in hlq.SCUNJCL, as shipped by IBM®, have placeholders for values on the JOB statement, such as in the following example:
    //$JOBPREF$$JOBNAME$ JOB ($ACCOUNT$), '$USER$', 
    //     NOTIFY=$NOTIFY$,MSGCLASS=$MC$,MSGLEVEL=$ML$, 
    //     TIME=$TI$,CLASS=$CL$,REGION=$REGION0M$ 
    Use the REXX EXEC CUNRUCST in hlq.SCUNREXX to customize these values. When you run CUNRUCST, the values that you specify are supplied on all of the JCL images. You can choose to customize your system by displaying Japanese messages or displaying English messages with modified date and time formats. You can set up the z/OS Message Service to specify how you want messages to be displayed.
  2. Set up the conversion image.
    The following example is the sample JCL member in hlq.SCUNJCL (CUNJIUTL):
    //CUNMIUTL EXEC PGM=CUNMIUTL 
    //SYSPRINT DD   SYSOUT=* 
    //TABIN    DD   DISP=SHR,DSN=hlq.SCUNTBL 
    //SYSIMG   DD   DSN=hlq.IMAGES(CUNIMG00),DISP=SHR 
    //SYSIN    DD   *   
       /********************************************    
        * INPUT STATEMENTS FOR THE IMAGE GENERATOR *    
        ********************************************/   
          CASE NORMAL;          /* ENABLE TOUPPER AND TOLOWER */ 
          CONVERSION 1047,850;  /* EBCDIC -> ASCII */ 
          CONVERSION 850,1047;  /* ASCII -> EBCDIC */ 
    /* 

    In the preceding example, the two CONVERSION statements provide conversion between the EBCDIC code page 1047 and the ASCII code page 850 in both directions.

    The DD names that are passed to CUNMIUTL are described as follows:
    SYSPRINT
    A listing that shows the processed setups and error messages, if applicable.
    TABIN
    Conversion tables for character conversion and case conversion. They are supplied by IBM; in this example, they are in data set hlq.SCUNTBL. The image generator transforms the conversion tables into an internal format and stores them in the conversion image.
    SYSIMG
    Output is a single image of the entire conversion environment. The conversion image is built according to the specification in the SYSIN DD name. The conversion image resides in either a sequential data set or a member of a partitioned data set with a fixed-block 80-byte format. In this example, the image resides in the partitioned data set member hlq.IMAGES(CUNIMG00).
    SYSIN
    Two types of statements are recognized in this DD statement: case conversion, which is identified by the CASE control statement, and character conversion, which is identified by the CONVERSION control statement.
    CASE control statement:
    Case conversion is defined as converting Unicode characters (for example, UTF-8) to their uppercase equivalent or their lowercase equivalent. In the preceding example, CASE NORMAL is specified, which means basic case conversion is provided. This basic case conversion is based on the UnicodeData.txt file that is provided by the Unicode consortium. It does not include special casing as described in the SpecialCasing.txt file that is provided by the Unicode consortium. Special casing typically includes characters that have significant differences in the case-based appearance. For example, the German "sharp S" (ß), which resembles a flat B, appears as "SS" in uppercase German text. Because Db2 does not use the case conversion service, you do not need to specify a case conversion for Db2.
    CONVERSION control statement:
    Character conversion is also referred to as conversion between specified CCSIDs. An application such as Db2 invokes the CUNLCNV function to convert characters between the specified code pages. You must identify the conversions that are possible on the CONVERSION control statement.
    Important: Specify CONVERSION statements for Db2 as follows:
    CONVERSION xxx,yyy,ER;
    CONVERSION yyy,xxx,ER;

    Many code page conversions are possible. However, when you identify the conversions that Db2 is to use, you need to be concerned only with the code pages for the national languages that you use and with conversion between these code pages and all Unicode CCSIDs, in both directions.

    Example: If you use an EBCDIC CCSID of 37 and an ASCII CCSID of 819, you need to use the following conversions:
    CONVERSION 37,367,ER;
    CONVERSION 37,1208,ER;
    CONVERSION 37,1200,ER;
    CONVERSION 367,37,ER;
    CONVERSION 1208,37,ER;
    CONVERSION 1200,37,ER;
    
    CONVERSION 819,367,ER;
    CONVERSION 819,1208,ER;
    CONVERSION 819,1200,ER;
    CONVERSION 367,819,ER;
    CONVERSION 1208,819,ER;
    CONVERSION 1200,819,ER;
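
The pattern in the preceding example can be sketched as a small script. This is an illustrative sketch only, not an IBM-supplied tool: the function name conversion_statements and the constant UNICODE_CCSIDS are hypothetical, and the script assumes that the Unicode side always consists of the three CCSIDs that appear in the example (367, 1208, and 1200).

```python
# CCSIDs used on the Unicode side in the example above.
UNICODE_CCSIDS = (367, 1208, 1200)


def conversion_statements(local_ccsids, technique="ER"):
    """Return CONVERSION control statements that pair each local CCSID
    with every Unicode CCSID, in both directions.

    Hypothetical helper for illustration; it only builds text.
    """
    lines = []
    for local in local_ccsids:
        # Local code page -> Unicode CCSIDs
        for uni in UNICODE_CCSIDS:
            lines.append(f"CONVERSION {local},{uni},{technique};")
        # Unicode CCSIDs -> local code page
        for uni in UNICODE_CCSIDS:
            lines.append(f"CONVERSION {uni},{local},{technique};")
    return lines


if __name__ == "__main__":
    # EBCDIC CCSID 37 and ASCII CCSID 819, as in the example.
    print("\n".join(conversion_statements([37, 819])))
```

Run with the CCSIDs 37 and 819, the sketch produces the same twelve statements as the example, which you could then paste into the SYSIN DD of the CUNJIUTL member.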

    Multiple conversion tables might be available for converting one CCSID to another. You can use a technique search order to specify which table is to be used. The technique search order consists of up to eight technique characters. If you specify more than one technique character, the image generator first tries to find a matching table for the leftmost technique character. If one is not found, the search continues with the second character, and so on. For mixed conversion in particular, use more than one technique character, because one of the subconversions might exist only in round-trip mode and another might exist only in enforced subset mode. In this case, a technique search order of 'RE' or 'ER' is required. The technique search order is optional. If you do not specify one, RECLM is used.
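
The left-to-right lookup can be modeled roughly as follows. This is a minimal sketch of the search as described in this topic, not the image generator's actual implementation. The function pick_table and its available_techniques argument are hypothetical names for illustration.

```python
def pick_table(available_techniques, search_order="RECLM"):
    """Return the first technique character in search_order for which a
    conversion table exists, or None if no character matches.

    available_techniques: set of technique characters for which a table
    exists for one CCSID pair (hypothetical input for this sketch).
    search_order: defaults to RECLM, the documented default.
    """
    if not 1 <= len(search_order) <= 8:
        raise ValueError("technique search order must have 1 to 8 characters")
    for technique in search_order:  # leftmost character is tried first
        if technique in available_techniques:
            return technique
    return None


if __name__ == "__main__":
    # One subconversion exists only as round-trip, another only as
    # enforced subset; 'ER' resolves both, 'E' alone would not.
    print(pick_table({"R"}, "ER"))
    print(pick_table({"E"}, "ER"))
    print(pick_table({"R"}, "E"))
```

In the usage lines, a search order of ER finds a table in both cases, while a search order of E alone finds nothing for the pair that has only a round-trip table.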

    Language products such as Enterprise COBOL might use the RECLM technique search order, while Db2 uses the ER technique search order. Therefore, you might also need to add the RECLM conversions, such as these:
    CONVERSION 1047,850,RECLM;  /* EBCDIC -> ASCII */
    CONVERSION 850,1047,RECLM;  /* ASCII -> EBCDIC */ 

    The important technique characters for Db2 are E (enforced subset) and R (round-trip). Enforced subset conversions map only those characters from one CCSID to another that have a corresponding character in the second CCSID. All other characters are replaced by a substitution character. Round-trip conversions ensure that code points that are converted from CCSID A to CCSID B and back to CCSID A are preserved, even if CCSID B cannot represent them.
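
The difference between the two techniques can be illustrated with toy tables. The sketch below is not based on any real CCSID: the code points, the substitution value SUB, and the way the round-trip table parks the unmatched code point on an unused target value are all assumptions chosen for illustration.

```python
SUB = 0x1A  # substitution code point in the target (hypothetical choice)

# Toy tables: source code point -> target code point.
# 0x41 and 0x42 have real counterparts in the target; 0xE9 does not.
ENFORCED_SUBSET = {0x41: 0x61, 0x42: 0x62}

# The round-trip table also assigns 0xE9 to an otherwise unused target
# code point, so the mapping stays one-to-one and can be reversed.
ROUND_TRIP = {0x41: 0x61, 0x42: 0x62, 0xE9: 0xFF}


def convert_enforced(data):
    """Enforced subset: unmatched code points become the substitution."""
    return [ENFORCED_SUBSET.get(cp, SUB) for cp in data]


def convert_round_trip(data):
    """Round-trip: every source code point gets a distinct target."""
    return [ROUND_TRIP[cp] for cp in data]


def convert_back(data):
    """Reverse the round-trip table to return to the source code page."""
    reverse = {target: source for source, target in ROUND_TRIP.items()}
    return [reverse[cp] for cp in data]


if __name__ == "__main__":
    source = [0x41, 0xE9, 0x42]
    print(convert_enforced(source))
    print(convert_back(convert_round_trip(source)) == source)
```

With enforced subset, the unmatched code point is lost to the substitution character. With round-trip, the original sequence is recovered after converting there and back.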

    After performing these steps, you should now have an updated CUNJIUTL JCL member.
  3. Submit the batch job in the CUNJIUTL member.
    At completion, the batch job writes its output to the SYSPRINT DD (that is, SYSOUT in this example). Expect a return code of zero from the CUNMIUTL program. If you receive anything other than return code zero, refer to Return code meanings. This information helps you correct environmental, syntactical, and semantic errors that might occur.
  4. After generating the conversion image, copy it to SYS1.PARMLIB or any other data set in the logical PARMLIB concatenation.
    In this example, you copy hlq.IMAGES(CUNIMG00) to SYS1.PARMLIB(CUNIMG00).
  5. Calculate the storage that is needed for a conversion image.
    When the conversion image is created on disk, you need to determine the amount of virtual storage that the image is to occupy. You specify this number as the number of pages on the REALSTORAGE parameter in the CUNUNIxx PARMLIB member that you create in the next step. The REALSTORAGE parameter protects the system from a shortage of main storage caused by loading a conversion image that exceeds the amount of available storage. The minimum value for the REALSTORAGE parameter depends on how the image is activated.
    • If the image is activated during IPL, the needed storage is the size of the image plus one page.
    • If the image is activated using the SET UNI command (PARMLIB member with keyword IMAGE), the needed storage is the size of the currently active image plus the size of the new conversion image.
    If you previously set up a conversion environment, or if you are not activating a conversion environment during IPL, you must determine the amount of storage that the currently active image occupies. To do this, issue the following command:
    D UNI,STORAGE
    The system displays the number of active pages.
    To determine the storage that the new conversion image occupies, find the CUN1017I message in the SYSPRINT log that was created in the previous step. This message indicates the number of pages that are required for the new conversion image. For example:
    CUN1017I GENERATED IMAGE SIZE 291 PAGES........

    As an alternative, specify a REALSTORAGE value of zero, which indicates that unlimited storage is available. In this case, a value of 524287 pages is used.
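
The arithmetic in this step can be sketched as follows. The function names are hypothetical, and the message pattern is taken only from the CUN1017I example shown in this step, so treat the parsing as an assumption about the SYSPRINT layout rather than a documented format.

```python
import re


def image_pages_from_sysprint(text):
    """Extract the page count from a CUN1017I message in SYSPRINT text."""
    match = re.search(r"CUN1017I GENERATED IMAGE SIZE\s+(\d+)\s+PAGES", text)
    if match is None:
        raise ValueError("CUN1017I message not found")
    return int(match.group(1))


def min_realstorage_pages(new_image_pages, active_image_pages=0, at_ipl=True):
    """Return the minimum REALSTORAGE value, in pages.

    At IPL: size of the new image plus one page.
    With SET UNI: size of the active image (as shown by D UNI,STORAGE)
    plus the size of the new image.
    """
    if at_ipl:
        return new_image_pages + 1
    return active_image_pages + new_image_pages


if __name__ == "__main__":
    pages = image_pages_from_sysprint(
        "CUN1017I GENERATED IMAGE SIZE 291 PAGES........"
    )
    print(min_realstorage_pages(pages))
    # 150 active pages is an illustrative value, not taken from this topic.
    print(min_realstorage_pages(pages, active_image_pages=150, at_ipl=False))
```

For the 291-page image in the example message, activation during IPL needs at least 292 pages.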

  6. Create the PARMLIB member CUNUNIxx (PARMLIB member for activating a conversion environment).
    Normally the member is created in SYS1.PARMLIB, but you can create it in another data set in the logical PARMLIB concatenation. This example uses SYS1.PARMLIB.
    The xx can be any two alphanumeric characters, or the special characters @, #, or $. This example uses 00. Here is the sample PARMLIB member in SYS1.PARMLIB(CUNUNI00):
    REALSTORAGE 292;
    IMAGE CUNIMG00;

    Because the conversion image that CUNMIUTL generated requires 291 pages, and an additional page is required during IPL, the REALSTORAGE statement indicates that 292 pages of real storage are required.

    The IMAGE parameter indicates that the system searches SYS1.PARMLIB (or another data set in the logical PARMLIB concatenation) for member CUNIMG00, which contains the conversion image.
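
As a rough illustration of the naming rule and the two statements, the following sketch builds the member name and contents as text. The helper cununi_member is hypothetical, and the sketch assumes that alphabetic suffix characters are uppercase.

```python
import re

# Two characters: alphanumeric (uppercase assumed here) or @, #, $.
SUFFIX_RE = re.compile(r"[A-Z0-9@#$]{2}")


def cununi_member(suffix, realstorage_pages, image_member):
    """Return (member_name, member_text) for a CUNUNIxx PARMLIB member.

    Hypothetical helper for illustration; it only builds text and does
    not write to any data set.
    """
    if not SUFFIX_RE.fullmatch(suffix):
        raise ValueError("suffix must be two characters from A-Z, 0-9, @, #, $")
    name = f"CUNUNI{suffix}"
    text = f"REALSTORAGE {realstorage_pages};\nIMAGE {image_member};\n"
    return name, text


if __name__ == "__main__":
    name, text = cununi_member("00", 292, "CUNIMG00")
    print(name)
    print(text, end="")
```

Called with the values from this example, the sketch yields member name CUNUNI00 and the same two statements as the sample member.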

    You can create a PARMLIB member to delete a current conversion environment.

  7. Take one of the following actions:
    • Edit IEASYSxx to specify the UNI parameter.

      This parameter specifies one or more CUNUNIxx PARMLIB members that contain the keywords that configure the conversion environment. Each suffix xx identifies one CUNUNIxx member in the PARMLIB concatenation. If several PARMLIB members are specified, they are concatenated in the specified sequence. The concatenated content is handled internally as a single member. This means that the lines are numbered consecutively, and error messages about syntax errors refer to the concatenated text. Restrictions for keywords apply to the entire concatenated text.

    • Check the MAXCAD parameter in IEASYSxx. It limits the number of common data spaces in a system. If MAXCAD is specified, consider that z/OS support for Unicode creates up to two common data spaces.
  8. Initialize the conversion environment with an IPL.
  9. After the system is initialized, you can use the DISPLAY UNI system command to show the current z/OS Unicode status or use the SET UNI system command to change the conversion environment.