Example: deriving initials from UTF-8 names

The following program uses the Unicode functions to derive composers’ initials from a table of names in Czech. It is intended to illustrate these functions, and is not necessarily the most efficient way of doing the task. Although the program processes the composer names in UTF-8, the data begins and ends in EBCDIC in order to permit a meaningful display of the program source and output. The compiler option CODEPAGE(1153) ensures that the names are interpreted correctly when translated to and from Unicode.

Program initials

Process codepage(1153)
    *----------------------------------------------------------------*
    * For a table of Czech composer names represented in UTF-8,      *
    * determine and print out the initials of each name.             *
    *----------------------------------------------------------------*
     Identification division.
       Program-id. initials.
     Data division.
      Working-storage section.
       1 utilityVariables.
        2 UTF-8-space pic x value x'20'.
        2 UTF-8-hyphen pic x value x'2D'.
        2 UTF-8-ch pic xxx.
        2 i comp pic 9.
        2 j comp pic 99.
        2 hex pic x(160).
       1 EBCDICnameData.
        2 pic x(40) value 'Antonín Leopold Dvořák'.
        2 pic x(40) value 'Leoš Janáček'.
        2 pic x(40) value 'Rafael Jeroným Kubelík'.
        2 pic x(40) value 'Pavel Křížkovský'.
        2 pic x(40) value 'Jan Václav Hugo Voříšek'.
       1 redefines EBCDICnameData.
        2 EBCDICname pic x(40) occurs 5 times.
       1 UTF-8-nameData.
        2 composer pic x(40) occurs 5 times.
       1 composerInitials.
        2 occurs 5.
         3 cInitSize comp pic 99.
         3 cInit pic x(8).
       1 state pic 9.
        88 seekingInitial value 0.   *> Skip space and hyphen
        88 seekingSeparator value 1. *> Skip all but space and hyphen

Program initials, continued

       Procedure division.
             main.
               Display 'Compute composer initials...'
               Initialize composerInitials
               Perform test before varying i from 1 by 1 until i > 5
          * Start by translating each composer name from EBCDIC to UTF-8.
                 Move function display-of
                     (function national-of(EBCDICname(i)) 1208)
                     to composer(i)
          * Test each character of the name; skip leading spaces, etc.
                 Set seekingInitial to true
                 Move 1 to cInitSize(i)
                 Perform varying j from 1 by 1
                     until j > function ULENGTH(composer(i))
                   Move function USUBSTR(composer(i) j 1) to UTF-8-ch
          * Initial found. Save in buffer, then skip to next space/hyphen.
                   If seekingInitial and
                       UTF-8-ch not = UTF-8-Space and UTF-8-Hyphen
                     String function USUBSTR(composer(i) j 1)
                         delimited by size
                         into cInit(i) with pointer cInitSize(i)
                     Set seekingSeparator to true
                   End-if
          * Space/hyphen found; skip spaces or hyphens to next initial.
                   If seekingSeparator and
                       (UTF-8-ch = UTF-8-Space or UTF-8-Hyphen)
                     Set seekingInitial to true
                   End-if
                 End-perform
          * Adjust string pointer to number of initials found.
                 Subtract 1 from cInitSize(i)
               End-perform
          * Print out the UTF-8 initials, translated to EBCDIC, and
          * also in hexadecimal, using program ToHex (listed later).
               Perform test before varying i from 1 by 1 until i > 5
                 Call 'toHex' using hex cInit(i) value cInitSize(i)
                 Display '  #' i ': ' function display-of(
                     function national-of(cInit(i)(1:cInitSize(i)) 1208))
                     ' (x''' hex(1:2 * cInitSize(i)) ''')'
               End-perform.
               Goback.
           End program initials.

Output from program initials

Compute composer initials...
  #1: ALD (x'414C44')
  #2: LJ (x'4C4A')
  #3: RJK (x'524A4B')
  #4: PK (x'504B')
  #5: JVHV (x'4A564856')

Program toHex

 Identification division.
         Program-id. toHex.
       Data division.
        Working-storage section.
         1 hexv.
          2 pic x(32) value '000102030405060708090A0B0C0D0E0F'.
          2 pic x(32) value '101112131415161718191A1B1C1D1E1F'.
          2 pic x(32) value '202122232425262728292A2B2C2D2E2F'.
          2 pic x(32) value '303132333435363738393A3B3C3D3E3F'.
          2 pic x(32) value '404142434445464748494A4B4C4D4E4F'.
          2 pic x(32) value '505152535455565758595A5B5C5D5E5F'.
          2 pic x(32) value '606162636465666768696A6B6C6D6E6F'.
          2 pic x(32) value '707172737475767778797A7B7C7D7E7F'.
          2 pic x(32) value '808182838485868788898A8B8C8D8E8F'.
          2 pic x(32) value '909192939495969798999A9B9C9D9E9F'.
          2 pic x(32) value 'A0A1A2A3A4A5A6A7A8A9AAABACADAEAF'.
          2 pic x(32) value 'B0B1B2B3B4B5B6B7B8B9BABBBCBDBEBF'.
          2 pic x(32) value 'C0C1C2C3C4C5C6C7C8C9CACBCCCDCECF'.
          2 pic x(32) value 'D0D1D2D3D4D5D6D7D8D9DADBDCDDDEDF'.
          2 pic x(32) value 'E0E1E2E3E4E5E6E7E8E9EAEBECEDEEEF'.
          2 pic x(32) value 'F0F1F2F3F4F5F6F7F8F9FAFBFCFDFEFF'.
         1 redefines hexv.
          2 hex pic xx occurs 256 times.
        Local-storage section.
         1 i pic 9(4) binary.
         1 j pic 9(4) binary value 0.
         1 jx redefines j.
          2 pic x.
          2 jxd pic x.
        Linkage section.
         1 ostr.
          2 ostrv pic xx occurs 1024 times.
         1 istr.
          2 istrv pic x occurs 1024 times.
         1 len pic 9(9) binary.
       Procedure division using ostr istr value len.
           If len > 1024
             Display '>>  Error: length ' len ' greater than toHex '
                 'supported maximum of 1024.'
             Stop run
           End-if
           Perform with test before varying i from 1 by 1 until i > len
             Move 0 to j
             Move istrv(i) to jxd
             Add 1 to j
             Move hex(j) to ostrv(i)
           End-perform
           Goback
           .
       End program toHex.