Using %nnn, %nn and %n Parsed Fields with BUILD and OVERLAY

There are many types of variable position/length fields such as delimited fields, comma separated values (CSV), tab separated values, blank separated values, keyword separated fields, and so on. For example, you might have four records with comma separated values as follows:

Wayne,M,-53,-1732,Gotham
Summers,F,+7258,-273,Sunnydale
Kent,M,+213,-158,Metropolis
Prince,F,-164,+1289,Gateway

Note that each record has five variable fields separated by commas. The fields do not start and end in the same position in every record and have different lengths in different records, so you could not just specify the starting position and length (p,m) for any of these fields in a BUILD (or FIELDS or OUTREC) or OVERLAY operand of the INREC, OUTREC or OUTFIL statement. But you can use the PARSE operand of an INREC, OUTREC or OUTFIL statement to define rules that tell DFSORT how to extract the relevant data from each variable input field into a fixed parsed field, and then use the fixed parsed fields in a BUILD or OVERLAY operand as you would use fixed input fields.

You define a parsed field for converting a variable field to a fixed parsed field using a %nnn name where nnn can be 000 to 999, a %nn name where nn can be 00 to 99, or a %n name where n can be 0 to 9. You can define and use up to 1000 parsed fields (%0-%999) per run. Each %nnn, %nn or %n parsed field must be defined only once. %n, %0n and %00n (for example, %1, %01 and %001) are treated as the same parsed field. Likewise, %nn and %0nn (for example, %22 and %022) are treated as the same parsed field. A %nnn, %nn or %n parsed field must be defined in a PARSE operand before it is used in a BUILD or OVERLAY operand.
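The naming rules above can be modeled in a few lines of Python. This is only an illustrative sketch, not DFSORT code: the function names parsed_field_id and check_usage are hypothetical, and the bare % used to skip a field is deliberately left out because it is not a %nnn, %nn or %n name.

```python
import re


def parsed_field_id(name: str) -> int:
    """Map a %n, %nn or %nnn name to its numeric identity (0-999)."""
    match = re.fullmatch(r"%([0-9]{1,3})", name)
    if match is None:
        raise ValueError(f"not a %n, %nn or %nnn name: {name!r}")
    return int(match.group(1))


def check_usage(defined: list, used: list) -> None:
    """Raise if a parsed field is defined twice or used without a definition."""
    seen = set()
    for name in defined:
        field_id = parsed_field_id(name)
        if field_id in seen:
            raise ValueError(f"{name} is defined more than once")
        seen.add(field_id)
    for name in used:
        if parsed_field_id(name) not in seen:
            raise ValueError(f"{name} is used but never defined")
```

In this model, %1, %01 and %001 all map to the identity 1, so defining %1 and then %001 is reported as a duplicate definition, while defining %01 and using %1 is accepted.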

Suppose you wanted to reformat the CSV records to produce these output records:

Wayne        -178.5     Gotham
Summers       698.5     Sunnydale
Kent            5.5     Metropolis
Prince        112.5     Gateway

You can use the following OUTREC statement to parse and reformat the variable fields:

  OUTREC PARSE=(%01=(ENDBEFR=C',',FIXLEN=8),
                %=(ENDBEFR=C','),
                %03=(ENDBEFR=C',',FIXLEN=5),
                %04=(ENDBEFR=C',',FIXLEN=5),
                %05=(FIXLEN=10)),
     BUILD=(%01,14:%03,SFF,ADD,%04,SFF,EDIT=(SIIT.T),SIGNS=(,-),
            25:%05)

The PARSE operand defines how each variable field is to be extracted to a fixed parsed field as follows:

The %01 parsed field is used to extract the first variable field into an 8-byte fixed parsed field. ENDBEFR=C',' tells DFSORT to stop extracting data at the byte before the next comma (the comma after the first variable field). FIXLEN=8 tells DFSORT that the %01 parsed field is 8 bytes long. Thus, for the first record, DFSORT extracts Wayne into the 8-byte %01 parsed field. Since Wayne is only 5 characters, but the %01 parsed field is 8 bytes long, DFSORT pads the %01 parsed field on the right with 3 blanks. ENDBEFR=C',' also tells DFSORT to skip over the comma after the first variable field before it parses the second variable field.
The % parsed field is used to skip the second variable field without extracting anything for it. Since we don't want this field in the output record, we can use % to ignore it. Thus, for the first record, we ignore M. ENDBEFR=C',' tells DFSORT to skip over the comma after the second variable field before it parses the third variable field.
The %03 parsed field is used to extract the third variable field into a 5-byte fixed parsed field. ENDBEFR=C',' tells DFSORT to stop extracting data before the next comma (the comma after the third variable field). FIXLEN=5 tells DFSORT that the %03 parsed field is 5 bytes long. Thus, for the first record, DFSORT extracts -53 into the 5-byte %03 parsed field. Since -53 is only 3 characters, but the %03 parsed field is 5 bytes long, DFSORT pads the %03 parsed field on the right with 2 blanks. ENDBEFR=C',' also tells DFSORT to skip over the comma after the third variable field before it parses the fourth variable field.
The %04 parsed field is used to extract the fourth variable field into a 5-byte fixed parsed field. ENDBEFR=C',' tells DFSORT to stop extracting data before the next comma (the comma after the fourth variable field). FIXLEN=5 tells DFSORT that the %04 parsed field is 5 bytes long. Thus, for the first record, DFSORT extracts -1732 into the 5-byte %04 parsed field. Since -1732 is 5 characters, it fills up the 5-byte %04 parsed field and padding is not needed. ENDBEFR=C',' also tells DFSORT to skip over the comma after the fourth variable field before it parses the fifth variable field.
The %05 parsed field is used to extract the fifth variable field into a 10-byte fixed parsed field. FIXLEN=10 tells DFSORT that the %05 parsed field is 10 bytes long. Thus, for the first record, DFSORT extracts Gotham and 4 blanks into the 10-byte %05 parsed field.
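The extraction steps described for the five fields can be traced with a small Python model. It is a sketch under stated assumptions rather than a description of how DFSORT is implemented: the RULES table and parse_record function are hypothetical names, the input is an unpadded string (so right-padding with blanks stands in for the blanks a fixed-length input record would already contain), and data longer than FIXLEN is assumed to be cut off on the right.

```python
# (name, delimiter, fixlen); name None models the bare % that skips a field.
RULES = [
    ("%01", ",", 8),
    (None, ",", None),
    ("%03", ",", 5),
    ("%04", ",", 5),
    ("%05", None, 10),
]


def parse_record(record: str) -> dict:
    """Extract the variable fields of one CSV record into fixed-length fields."""
    fields = {}
    pos = 0
    for name, delimiter, fixlen in RULES:
        if delimiter is None:
            # No ENDBEFR: take FIXLEN bytes from the current position.
            end = pos + fixlen
            next_pos = end
        else:
            # ENDBEFR: stop before the delimiter, then step over it.
            found = record.find(delimiter, pos)
            end = found if found != -1 else len(record)
            next_pos = end + 1 if found != -1 else end
        raw = record[pos:end]
        pos = next_pos
        if name is not None:
            # Assumed: truncate on the right, then pad with blanks to FIXLEN.
            fields[name] = raw[:fixlen].ljust(fixlen)
    return fields
```

For the first record, parse_record("Wayne,M,-53,-1732,Gotham") yields an 8-byte %01 holding Wayne and 3 blanks, a 5-byte %03 holding -53 and 2 blanks, a 5-byte %04 holding -1732, and a 10-byte %05 holding Gotham and 4 blanks. The M is consumed but not stored.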

The BUILD operand uses the previously extracted fixed parsed fields to build the output record as follows:

%01 copies the 8-byte fixed-length data extracted from the first variable field to positions 1-8 of the output record. For the first record, positions 1-8 contain 'Wayne   '.
14:%03,SFF,ADD,%04,SFF,EDIT=(SIIT.T),SIGNS=(,-) adds the 5-byte fixed-length data extracted from the third variable field to the 5-byte fixed-length data extracted from the fourth variable field and places the 6-byte edited result in positions 14-19 of the output record. For the first record, positions 14-19 contain -178.5 (-53 + -1732 = -1785 edited to -178.5). Note that since the %03 and %04 parsed fields may be padded on the right with blanks, we must use the SFF format to handle the sign and digits correctly.
25:%05 copies the 10-byte fixed-length data extracted from the fifth variable field to positions 25-34 of the output record. For the first record, positions 25-34 contain 'Gotham    '.
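The arithmetic and column placement performed by BUILD can be checked with the following Python sketch. It is a simplified model and not DFSORT itself: sff_to_int, edit_siit_t and build_record are hypothetical names; the SFF model keeps only the digits and treats the value as negative if a minus sign appears anywhere; the EDIT model assumes the leading sign is placed directly before the first displayed digit; and a sum that needs more than the four digits in the SIIT.T mask raises an error here, which is this sketch's choice and not a statement about what DFSORT does in that case.

```python
def sff_to_int(text: str) -> int:
    """Simplified SFF: keep the digits, negative if '-' appears anywhere."""
    digits = "".join(ch for ch in text if ch in "0123456789")
    value = int(digits) if digits else 0
    return -value if "-" in text else value


def edit_siit_t(value: int) -> str:
    """Model EDIT=(SIIT.T),SIGNS=(,-): 6 bytes, one implied decimal place."""
    magnitude = abs(value)
    if magnitude > 9999:
        raise ValueError("value does not fit the 4-digit SIIT.T mask")
    whole, tenth = divmod(magnitude, 10)
    sign = "-" if value < 0 else ""
    return f"{sign}{whole}.{tenth}".rjust(6)


def build_record(f01: str, f03: str, f04: str, f05: str) -> str:
    """Lay out the output record from the four fixed parsed fields."""
    total = sff_to_int(f03) + sff_to_int(f04)
    out = f01.ljust(8)[:8]                      # positions 1-8
    out = out.ljust(13) + edit_siit_t(total)    # positions 14-19
    out = out.ljust(24) + f05.ljust(10)[:10]    # positions 25-34
    return out
```

Calling build_record with the parsed fields of the first record ('Wayne   ', '-53  ', '-1732', 'Gotham    ') produces a 34-byte record with -178.5 in positions 14-19, and the Kent record (+213 and -158) produces 5.5 right-aligned in the same columns.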