There are many types of variable position/length fields such as
delimited fields, comma separated values (CSV), tab separated values,
blank separated values, keyword separated fields, and so on. For
example, you might have four records with comma separated values as
follows:

Wayne,M,-53,-1732,Gotham
Summers,F,+7258,-273,Sunnydale
Kent,M,+213,-158,Metropolis
Prince,F,-164,+1289,Gateway

Note that each record has five variable
fields separated by commas. The fields do not start and end in the
same position in every record and have different lengths in different
records, so you could not just specify the starting position and length
(p,m) for any of these fields in a BUILD (or FIELDS or OUTREC) or
OVERLAY operand of the INREC, OUTREC or OUTFIL statement. But you
can use the PARSE operand of an INREC, OUTREC or OUTFIL statement
to define rules that tell DFSORT how to extract the relevant data
from each variable input field into a fixed parsed field, and then
use the fixed parsed fields in a BUILD or OVERLAY operand as you would
use fixed input fields.
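
To make the problem concrete, the following Python sketch (an illustration only, not DFSORT syntax) computes the 1-based starting position and length of each comma separated field in the four sample records. The helper name field_spans is hypothetical.

```python
# Illustration only: shows why a single (p,m) cannot describe these fields.
records = [
    "Wayne,M,-53,-1732,Gotham",
    "Summers,F,+7258,-273,Sunnydale",
    "Kent,M,+213,-158,Metropolis",
    "Prince,F,-164,+1289,Gateway",
]


def field_spans(record, delimiter=","):
    """Return a (start_position, length) pair for each field, 1-based."""
    spans = []
    position = 1
    for field in record.split(delimiter):
        spans.append((position, len(field)))
        position += len(field) + 1  # step over the field and its delimiter
    return spans


for record in records:
    print(field_spans(record))
```

The third field, for example, starts at position 9, 11, 8 and 10 in the four records, so no single starting position and length fits all of them.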
You define a parsed field for converting a variable
field to a fixed parsed field using a %nnn name where nnn can be 000
to 999, a %nn name where nn can be 00 to 99, or a %n name where n
can be 0 to 9. You can define and use up to 1000 parsed fields (%0-%999)
per run. Each %nnn, %nn or %n parsed field must be defined only once.
%n, %0n and %00n (for example, %1, %01 and %001) are treated as
the same parsed field. Likewise, %nn and %0nn (for example, %22 and
%022) are treated as the same parsed field. A %nnn, %nn or %n parsed
field must be defined in a PARSE operand before it
is used in a BUILD or OVERLAY operand.
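
The naming rules above can be sketched in Python as follows. This is only an illustration of the equivalence and define-once rules as described here, not DFSORT's actual implementation; the function names are hypothetical, and the sketch does not cover the % form used to ignore a field.

```python
import re


def parsed_field_number(name):
    """Map a %n, %nn or %nnn name to its parsed field number (0-999)."""
    match = re.fullmatch(r"%([0-9]{1,3})", name)
    if match is None:
        raise ValueError(f"not a %n, %nn or %nnn name: {name!r}")
    return int(match.group(1))


def check_definitions(names):
    """Raise if two names define the same parsed field; else return the numbers."""
    seen = {}
    for name in names:
        number = parsed_field_number(name)
        if number in seen:
            # %1, %01 and %001 all map to field 1, so this counts as a redefinition
            raise ValueError(f"{name} defines the same parsed field as {seen[number]}")
        seen[number] = name
    return sorted(seen)


print(check_definitions(["%01", "%03", "%04", "%05"]))
```

Under these rules, defining both %1 and %001 in the same run would be rejected, because both names refer to parsed field 1.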
Suppose you wanted to reformat the CSV records to produce these
output records:

Wayne        -178.5     Gotham
Summers       698.5     Sunnydale
Kent            5.5     Metropolis
Prince        112.5     Gateway

You can use the following OUTREC statement to parse and reformat
the variable fields:

  OUTREC PARSE=(%01=(ENDBEFR=C',',FIXLEN=8),
                %=(ENDBEFR=C','),
                %03=(ENDBEFR=C',',FIXLEN=5),
                %04=(ENDBEFR=C',',FIXLEN=5),
                %05=(FIXLEN=10)),
         BUILD=(%01,14:%03,SFF,ADD,%04,SFF,EDIT=(SIIT.T),SIGNS=(,-),
                25:%05)

The PARSE operand defines how each variable field is to
be extracted to a fixed parsed field as follows:
- The %01 parsed field is used to extract the first
variable field into an 8-byte fixed parsed field. ENDBEFR=C',' tells
DFSORT to stop extracting data at the byte before the next comma (the
comma after the first variable field). FIXLEN=8 tells DFSORT that
the %01 parsed field is 8 bytes long. Thus, for the first
record, DFSORT extracts Wayne into the 8-byte %01 parsed field.
Since Wayne is only 5 characters, but the %01 parsed field
is 8 bytes long, DFSORT pads the %01 parsed field on the right
with 3 blanks. ENDBEFR=C',' also tells DFSORT to skip over the comma
after the first variable field before it parses the second variable
field.
- The % parsed field is used to skip the
second variable field without extracting anything for it. Since we
don't want this field in the output record, we can use % to
ignore it. Thus, for the first record, we ignore M. ENDBEFR=C','
tells DFSORT to skip over the comma after the second variable field
before it parses the third variable field.
- The %03 parsed field is used to extract the third
variable field into a 5-byte fixed parsed field. ENDBEFR=C',' tells
DFSORT to stop extracting data before the next comma (the comma after
the third variable field). FIXLEN=5 tells DFSORT that the %03
parsed field is 5 bytes long. Thus, for the first record, DFSORT
extracts -53 into the 5-byte %03 parsed field. Since -53
is only 3 characters, but the %03 parsed field is 5 bytes
long, DFSORT pads the %03 parsed field on the right with 2
blanks. ENDBEFR=C',' also tells DFSORT to skip over the comma after
the third variable field before it parses the fourth variable field.
- The %04 parsed field is used to extract the fourth
variable field into a 5-byte fixed parsed field. ENDBEFR=C',' tells
DFSORT to stop extracting data before the next comma (the comma after
the fourth variable field). FIXLEN=5 tells DFSORT that the %04
parsed field is 5 bytes long. Thus, for the first record, DFSORT
extracts -1732 into the 5-byte %04 parsed field. Since -1732
is 5 characters, it fills up the 5-byte %04 parsed field and
padding is not needed. ENDBEFR=C',' also tells DFSORT to skip over
the comma after the fourth variable field before it parses the fifth
variable field.
- The %05 parsed field is used to extract the fifth
variable field into a 10-byte fixed parsed field. FIXLEN=10 tells
DFSORT that the %05 parsed field is 10 bytes long. Thus,
for the first record, DFSORT extracts Gotham and 4 blanks into the
10-byte %05 parsed field.
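
The extraction steps described in the list above can be emulated with a short Python sketch. It is only an approximation of the ENDBEFR and FIXLEN behavior described here, not DFSORT itself. The function name parse_record and the rule tuples are hypothetical, and truncating data longer than FIXLEN on the right is an assumption that none of the sample records exercises.

```python
def parse_record(record, rules):
    """Extract fixed-length parsed fields from one record.

    rules is a list of (name, endbefr, fixlen) tuples applied left to right.
    A name of None plays the role of % (skip the field without keeping it).
    An endbefr of None means the field runs to the end of the record.
    """
    parsed = {}
    position = 0
    for name, endbefr, fixlen in rules:
        found = -1 if endbefr is None else record.find(endbefr, position)
        if found == -1:
            end = next_position = len(record)
        else:
            end = found                          # stop at the byte before the delimiter
            next_position = found + len(endbefr)  # skip over the delimiter
        if name is not None:
            data = record[position:end]
            # Pad on the right with blanks; truncation is an assumption
            parsed[name] = data[:fixlen].ljust(fixlen)
        position = next_position
    return parsed


RULES = [
    ("%01", ",", 8),
    (None, ",", None),   # % : ignore the second field
    ("%03", ",", 5),
    ("%04", ",", 5),
    ("%05", None, 10),
]

print(parse_record("Wayne,M,-53,-1732,Gotham", RULES))
```

For the first record this yields 'Wayne' padded to 8 bytes, '-53' padded to 5 bytes, '-1732' unpadded, and 'Gotham' padded to 10 bytes, matching the walk-through above.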
The BUILD operand uses the previously extracted fixed
parsed fields to build the output record as follows:
- %01 copies the 8-byte fixed-length data extracted
from the first variable field to positions 1-8 of the output record.
For the first record, positions 1-8 contain 'Wayne   '.
- 14:%03,SFF,ADD,%04,SFF,EDIT=(SIIT.T),SIGNS=(,-) adds
the 5-byte fixed-length data extracted from the third variable field
to the 5-byte fixed-length data extracted from the fourth variable
field and places the 6-byte edited result in positions 14-19 of the
output record. For the first record, positions 14-19 contain -178.5
(-53 + -1732 = -1785 edited to -178.5). Note that since the %03
and %04 parsed fields may be padded on the right with blanks,
we must use the SFF format to handle the sign and digits correctly.
- 25:%05 copies the 10-byte fixed-length data extracted
from the fifth variable field to positions 25-34 of the output record.
For the first record, positions 25-34 contain 'Gotham    '.
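
The BUILD steps above can likewise be emulated in Python to check the arithmetic and the column positions. This sketch is an approximation under stated assumptions, not DFSORT's implementation: sff_value treats any '-' in the field as a negative sign and ignores blanks, edit_siit_t handles only totals of up to four digits, and all function names are hypothetical. The parsed values are written out by hand for the first record.

```python
def sff_value(text):
    """Read a signed free-form number: keep the digits, negative if '-' is present."""
    digits = "".join(ch for ch in text if ch in "0123456789")
    value = int(digits) if digits else 0
    return -value if "-" in text else value


def edit_siit_t(value):
    """Approximate EDIT=(SIIT.T),SIGNS=(,-) for totals of up to four digits."""
    digits = f"{abs(value):04d}"
    if len(digits) > 4:
        raise ValueError("total does not fit the SIIT.T pattern")
    leading = digits[:2].lstrip("0")   # I positions: leading zeros are suppressed
    sign = "-" if value < 0 else ""    # SIGNS=(,-): blank if positive, '-' if negative
    body = sign + leading + digits[2] + "." + digits[3]
    return body.rjust(6)               # the edited result is always 6 bytes


def build_record(parsed):
    """Place the parsed fields at output positions 1, 14 and 25."""
    total = sff_value(parsed["%03"]) + sff_value(parsed["%04"])
    line = parsed["%01"].ljust(13)          # positions 1-8, blanks through 13
    line += edit_siit_t(total)              # positions 14-19
    return line.ljust(24) + parsed["%05"]   # blanks through 24, positions 25-34


# Fixed parsed fields for the first record, written out by hand
first = {"%01": "Wayne   ", "%03": "-53  ", "%04": "-1732", "%05": "Gotham    "}
print(repr(build_record(first)))
```

For the first record the total is -53 + -1732 = -1785, which is edited to -178.5 and placed in positions 14-19 of a 34-byte output record.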