Example: deriving initials from UTF-8 names
The following program uses the Unicode functions to derive
composers’ initials from a table of names in Czech. It is intended
to illustrate these functions, and is not necessarily the most efficient
way of doing the task. Although the program processes the composer
names in UTF-8, the data begins and ends in EBCDIC in order to permit
a meaningful display of the program source and output. The compiler
option CODEPAGE(1153)
ensures that the names are
interpreted correctly when translated to and from Unicode.
Program initials
Process codepage(1153)
*----------------------------------------------------------------*
* For a table of Czech composer names represented in UTF-8, *
* determine and print out the initials of each name. *
*----------------------------------------------------------------*
Identification division.
Program-id. initials.
Data division.
Working-storage section.
1 utilityVariables.
2 UTF-8-space pic x value x'20'.
2 UTF-8-hyphen pic x value x'2D'.
2 UTF-8-ch pic xxx.
2 i comp pic 9.
2 j comp pic 99.
2 hex pic x(160).
1 EBCDICnameData.
2 pic x(40) value 'Antonín Leopold Dvořák'.
2 pic x(40) value 'Leoš Janáček'.
2 pic x(40) value 'Rafael Jeroným Kubelík'.
2 pic x(40) value 'Pavel Křížkovský'.
2 pic x(40) value 'Jan Václav Hugo Voříšek'.
1 redefines EBCDICnameData.
2 EBCDICname pic x(40) occurs 5 times.
1 UTF-8-nameData.
2 composer pic x(40) occurs 5 times.
1 composerInitials.
2 occurs 5.
3 cInitSize comp pic 99.
3 cInit pic x(8).
1 state pic 9.
88 seekingInitial value 0. *> Skip space and hyphen
88 seekingSeparator value 1. *> Skip all but space and hyphen
Program initials, continued
Procedure division.
main.
Display 'Compute composer initials...'
Initialize composerInitials
Perform test before varying i from 1 by 1 until i > 5
* Start by translating each composer name from EBCDIC to UTF-8.
Move function display-of
(function national-of(EBCDICname(i)) 1208)
to composer(i)
* Test each character of the name; skip leading spaces, etc.
Set seekingInitial to true
Move 1 to cInitSize(i)
Perform varying j from 1 by 1
until j > function ULENGTH(composer(i))
Move function USUBSTR(composer(i) j 1) to UTF-8-ch
* Initial found. Save in buffer, then skip to next space/hyphen.
If seekingInitial and
UTF-8-ch not = UTF-8-Space and UTF-8-Hyphen
String function USUBSTR(composer(i) j 1)
delimited by size
into cInit(i) with pointer cInitSize(i)
Set seekingSeparator to true
End-if
* Space/hyphen found; skip spaces or hyphens to next initial.
If seekingSeparator and
(UTF-8-ch = UTF-8-Space or UTF-8-Hyphen)
Set seekingInitial to true
End-if
End-perform
* Adjust string pointer to number of initials found.
Subtract 1 from cInitSize(i)
End-perform
* Print out the UTF-8 initials, translated to EBCDIC, and
* also in hexadecimal, using program ToHex (listed later).
Perform test before varying i from 1 by 1 until i > 5
Call 'toHex' using hex cInit(i) value cInitSize(i)
Display ' #' i ': ' function display-of(
function national-of(cInit(i)(1:cInitSize(i)) 1208))
' (x''' hex(1:2 * cInitSize(i)) ''')'
End-perform.
Goback.
End program initials.
Output from program initials
Compute composer initials...
#1: ALD (x'414C44')
#2: LJ (x'4C4A')
#3: RJK (x'524A4B')
#4: PK (x'504B')
#5: JVHV (x'4A564856')
Program toHex
Identification division.
Program-id. toHex.
Data division.
Working-storage section.
1 hexv.
2 pic x(32) value '000102030405060708090A0B0C0D0E0F'.
2 pic x(32) value '101112131415161718191A1B1C1D1E1F'.
2 pic x(32) value '202122232425262728292A2B2C2D2E2F'.
2 pic x(32) value '303132333435363738393A3B3C3D3E3F'.
2 pic x(32) value '404142434445464748494A4B4C4D4E4F'.
2 pic x(32) value '505152535455565758595A5B5C5D5E5F'.
2 pic x(32) value '606162636465666768696A6B6C6D6E6F'.
2 pic x(32) value '707172737475767778797A7B7C7D7E7F'.
2 pic x(32) value '808182838485868788898A8B8C8D8E8F'.
2 pic x(32) value '909192939495969798999A9B9C9D9E9F'.
2 pic x(32) value 'A0A1A2A3A4A5A6A7A8A9AAABACADAEAF'.
2 pic x(32) value 'B0B1B2B3B4B5B6B7B8B9BABBBCBDBEBF'.
2 pic x(32) value 'C0C1C2C3C4C5C6C7C8C9CACBCCCDCECF'.
2 pic x(32) value 'D0D1D2D3D4D5D6D7D8D9DADBDCDDDEDF'.
2 pic x(32) value 'E0E1E2E3E4E5E6E7E8E9EAEBECEDEEEF'.
2 pic x(32) value 'F0F1F2F3F4F5F6F7F8F9FAFBFCFDFEFF'.
1 redefines hexv.
2 hex pic xx occurs 256 times.
Local-storage section.
1 i pic 9(4) binary.
1 j pic 9(4) binary value 0.
1 jx redefines j.
2 pic x.
2 jxd pic x.
Linkage section.
1 ostr.
2 ostrv pic xx occurs 1024 times.
1 istr.
2 istrv pic x occurs 1024 times.
1 len pic 9(9) binary.
Procedure division using ostr istr value len.
If len > 1024
Display '>> Error: length ' len ' greater than toHex '
'supported maximum of 1024.'
Stop run
End-if
Perform with test before varying i from 1 by 1 until i > len
Move 0 to j
Move istrv(i) to jxd
Add 1 to j
Move hex(j) to ostrv(i)
End-perform
Goback
.
End program toHex.