Array columns
If you select an array column for output, the CFF stage passes the data to the output link in different ways, depending on the type of array that you select.
When you load array columns into a CFF stage, you must specify how to process the array data. You can pass the data as is, flatten all arrays on input to the stage, or flatten selected arrays on input. You choose one of these options from the Complex File Load Option window, which opens when you load column definitions on the Records tab.
If you choose to flatten arrays, the flattening is done at the time that the column metadata is loaded into the stage. All of the array elements appear as separate columns in the table. Each array column has a numeric suffix to make its name unique. You can select any or all of these columns for output.
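As a rough illustration of what flattening does to the column metadata, the sketch below expands one array column into one column per element. The function name `flatten_array_column` and the underscore separator are assumptions made for this example; the text above says only that a numeric suffix is added, not how it is formatted.

```python
def flatten_array_column(name, occurs, sep="_"):
    """Return one column name per array element.

    The separator is hypothetical; the stage's actual suffix format may differ.
    """
    return [f"{name}{sep}{i}" for i in range(1, occurs + 1)]


# A record with two scalar columns and CHILD ... OCCURS 5 TIMES.
columns = ["ID", "NAME"] + flatten_array_column("CHILD", 5)
print(columns)
```

After flattening, each of the generated names is an ordinary column that can be selected for output on its own.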
If you choose to pass arrays as is, the array structure is preserved. The data is presented as a single row at run time for each incoming row. If the array is normalized, the incoming single row is resolved into multiple output rows.
Simple normalized array columns
A simple array is a single, one-dimensional array. This example shows the result when you select all columns as output columns. For each record that is read from the input file, five rows are written to the output link, one for each occurrence of CHILD. The sixth row out of the link causes the second record to be read from the file, and the process starts over.
Input record:
05 ID PIC X(10)
05 NAME PIC X(30)
05 CHILD PIC X(30) OCCURS 5 TIMES.
Output rows:
Row 1: ID NAME CHILD(1)
Row 2: ID NAME CHILD(2)
Row 3: ID NAME CHILD(3)
Row 4: ID NAME CHILD(4)
Row 5: ID NAME CHILD(5)
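The row expansion shown above can be sketched in a few lines of Python. This is only a model of the behavior, not the stage's implementation; the function name `normalize_simple` and the sample values are invented for illustration.

```python
def normalize_simple(record, array_field):
    """Yield one output row per element of a one-dimensional array column."""
    # Non-array columns are repeated on every output row.
    scalars = {k: v for k, v in record.items() if k != array_field}
    for element in record[array_field]:
        yield {**scalars, array_field: element}


# Hypothetical input record: ID, NAME, and CHILD OCCURS 5 TIMES.
record = {"ID": "0001", "NAME": "PAT", "CHILD": ["A", "B", "C", "D", "E"]}
rows = list(normalize_simple(record, "CHILD"))
for number, row in enumerate(rows, start=1):
    print(f"Row {number}:", row)
```

Because the function is a generator, the next input record would only be consumed once all five rows of the current record have been produced, which mirrors the read pattern described above.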
Nested normalized array columns
This example shows the result when you select a nested array column as output. If you select FIELD-A, FIELD-C, and FIELD-D as output columns, the CFF stage multiplies the OCCURS values at each level. In this case, FIELD-B occurs 2 times and FIELD-D occurs 3 times within each FIELD-B, so 2 × 3 = 6 rows are written to the output link.
Input record:
05 FIELD-A PIC X(4)
05 FIELD-B OCCURS 2 TIMES.
10 FIELD-C PIC X(4)
10 FIELD-D PIC X(4) OCCURS 3 TIMES.
Output rows:
Row 1: FIELD-A FIELD-C(1) FIELD-D(1,1)
Row 2: FIELD-A FIELD-C(1) FIELD-D(1,2)
Row 3: FIELD-A FIELD-C(1) FIELD-D(1,3)
Row 4: FIELD-A FIELD-C(2) FIELD-D(2,1)
Row 5: FIELD-A FIELD-C(2) FIELD-D(2,2)
Row 6: FIELD-A FIELD-C(2) FIELD-D(2,3)
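The nested case amounts to one loop per OCCURS level. The following sketch models that behavior under the assumption that the record is already parsed into nested Python structures; `normalize_nested`, the `subscripts` key, and the sample values are hypothetical.

```python
def normalize_nested(record):
    """Expand FIELD-B (outer array) and FIELD-D (inner array) into rows."""
    rows = []
    for b_idx, group in enumerate(record["FIELD-B"], start=1):
        for d_idx, d_value in enumerate(group["FIELD-D"], start=1):
            rows.append({
                "FIELD-A": record["FIELD-A"],   # repeated on every row
                "FIELD-C": group["FIELD-C"],    # repeated within one FIELD-B
                "FIELD-D": d_value,
                "subscripts": (b_idx, d_idx),   # for illustration only
            })
    return rows


record = {
    "FIELD-A": "AAAA",
    "FIELD-B": [
        {"FIELD-C": "C1", "FIELD-D": ["D11", "D12", "D13"]},
        {"FIELD-C": "C2", "FIELD-D": ["D21", "D22", "D23"]},
    ],
}
rows = normalize_nested(record)
print(len(rows))
```

The row count is the product of the two OCCURS values (2 and 3), which matches the six output rows listed above.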
Parallel normalized array columns
Parallel arrays are array columns that have the same level number. The first example shows the result when you select all parallel array columns as output columns. The CFF stage determines the number of output rows by using the largest subscript. As a result, the smaller arrays are padded with default values and the non-array columns are repeated on every row. In this case, FIELD-E has the largest OCCURS value (4), so if you select all of the input fields as output columns, four rows are written to the output link.
Input record:
05 FIELD-A PIC X(4)
05 FIELD-B PIC X(4) OCCURS 2 TIMES.
05 FIELD-C PIC X(4)
05 FIELD-D PIC X(4) OCCURS 3 TIMES.
05 FIELD-E PIC X(4) OCCURS 4 TIMES.
Output rows:
Row 1: FIELD-A  FIELD-B(1)  FIELD-C  FIELD-D(1)  FIELD-E(1)
Row 2: FIELD-A  FIELD-B(2)  FIELD-C  FIELD-D(2)  FIELD-E(2)
Row 3: FIELD-A              FIELD-C  FIELD-D(3)  FIELD-E(3)
Row 4: FIELD-A              FIELD-C              FIELD-E(4)
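The padding behavior for parallel arrays resembles zipping lists of unequal length. The sketch below uses `itertools.zip_longest` to model it; the function name `normalize_parallel` and the empty-string default are assumptions, since the actual default value would depend on each column's data type.

```python
from itertools import zip_longest


def normalize_parallel(scalars, arrays, default=""):
    """Produce one row per subscript of the longest array, padding the rest."""
    names = list(arrays)
    rows = []
    for values in zip_longest(*arrays.values(), fillvalue=default):
        row = dict(scalars)              # non-array columns repeat on every row
        row.update(zip(names, values))   # exhausted arrays receive the default
        rows.append(row)
    return rows


scalars = {"FIELD-A": "A", "FIELD-C": "C"}
arrays = {
    "FIELD-B": ["B1", "B2"],
    "FIELD-D": ["D1", "D2", "D3"],
    "FIELD-E": ["E1", "E2", "E3", "E4"],
}
rows = normalize_parallel(scalars, arrays)
for number, row in enumerate(rows, start=1):
    print(f"Row {number}:", row)
```

In rows 3 and 4 the padded entries hold the default value, corresponding to the positions where FIELD-B and FIELD-D have no element in the output rows above.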
In the next example, only a subset of the parallel array columns is selected for normalization (FIELD-B and FIELD-E). FIELD-D is passed as is, so all of its elements appear in every output row. The number of output rows is determined by the largest of the normalized arrays. In this case, FIELD-E has the most occurrences, so four rows are written to the output link.
Output rows:
Row 1: FIELD-A  FIELD-B(1)  FIELD-C  FIELD-D(1)  FIELD-D(2)  FIELD-D(3)  FIELD-E(1)
Row 2: FIELD-A  FIELD-B(2)  FIELD-C  FIELD-D(1)  FIELD-D(2)  FIELD-D(3)  FIELD-E(2)
Row 3: FIELD-A              FIELD-C  FIELD-D(1)  FIELD-D(2)  FIELD-D(3)  FIELD-E(3)
Row 4: FIELD-A              FIELD-C  FIELD-D(1)  FIELD-D(2)  FIELD-D(3)  FIELD-E(4)
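A possible way to model this mixed case is to expand only the chosen arrays and carry the remaining arrays whole on every row. The names `normalize_selected` and `selected`, and the empty-string default, are assumptions for this sketch rather than anything defined by the stage.

```python
from itertools import zip_longest


def normalize_selected(scalars, arrays, selected, default=""):
    """Expand only the arrays named in `selected`; pass the others as is."""
    passed_as_is = {k: v for k, v in arrays.items() if k not in selected}
    rows = []
    for values in zip_longest(*(arrays[n] for n in selected), fillvalue=default):
        row = dict(scalars)
        row.update(passed_as_is)            # whole array repeated on each row
        row.update(zip(selected, values))   # one element per expanded array
        rows.append(row)
    return rows


scalars = {"FIELD-A": "A", "FIELD-C": "C"}
arrays = {
    "FIELD-B": ["B1", "B2"],
    "FIELD-D": ["D1", "D2", "D3"],
    "FIELD-E": ["E1", "E2", "E3", "E4"],
}
rows = normalize_selected(scalars, arrays, selected=["FIELD-B", "FIELD-E"])
print(len(rows))
```

Here FIELD-D does not influence the row count: only FIELD-B (2 elements) and FIELD-E (4 elements) are compared, which gives the four rows shown above.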
Nested parallel denormalized array columns
This complex example shows the result when you select both parallel array fields and nested array fields as output. If you select FIELD-A, FIELD-C, and FIELD-E as output columns in this example, the CFF stage determines the number of output rows by using the largest OCCURS value at each level and multiplying them. In this case, 3 is the largest OCCURS value at the outer (05) level, and 5 is the largest OCCURS value at the inner (10) level. Therefore, 15 rows are written to the output link. Some of the subscripts repeat. In particular, subscripts that are smaller than the largest OCCURS value at each level start over, including the second subscript of FIELD-C and the first subscript of FIELD-E.
Input record:
05 FIELD-A PIC X(10)
05 FIELD-B OCCURS 3 TIMES.
10 FIELD-C PIC X(2) OCCURS 4 TIMES.
05 FIELD-D OCCURS 2 TIMES.
10 FIELD-E PIC 9(3) OCCURS 5 TIMES.
Output rows:
Row 1:  FIELD-A  FIELD-C(1,1)  FIELD-E(1,1)
Row 2:  FIELD-A  FIELD-C(1,2)  FIELD-E(1,2)
Row 3:  FIELD-A  FIELD-C(1,3)  FIELD-E(1,3)
Row 4:  FIELD-A  FIELD-C(1,4)  FIELD-E(1,4)
Row 5:  FIELD-A                FIELD-E(1,5)
Row 6:  FIELD-A  FIELD-C(2,1)  FIELD-E(2,1)
Row 7:  FIELD-A  FIELD-C(2,2)  FIELD-E(2,2)
Row 8:  FIELD-A  FIELD-C(2,3)  FIELD-E(2,3)
Row 9:  FIELD-A  FIELD-C(2,4)  FIELD-E(2,4)
Row 10: FIELD-A                FIELD-E(2,5)
Row 11: FIELD-A  FIELD-C(3,1)
Row 12: FIELD-A  FIELD-C(3,2)
Row 13: FIELD-A  FIELD-C(3,3)
Row 14: FIELD-A  FIELD-C(3,4)
Row 15: FIELD-A
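The subscript pattern in the rows above can be reproduced with a short sketch that takes the largest OCCURS value at each level and leaves a gap wherever a field has no element for the current pair of subscripts. The function name `nested_parallel_subscripts` and the use of `None` for a padded position are assumptions for illustration.

```python
def nested_parallel_subscripts(groups):
    """groups maps a field name to its (outer OCCURS, inner OCCURS) pair."""
    outer_max = max(outer for outer, _ in groups.values())
    inner_max = max(inner for _, inner in groups.values())
    rows = []
    for i in range(1, outer_max + 1):
        for j in range(1, inner_max + 1):
            rows.append({
                # None marks a position that would be padded with a default.
                name: ((i, j) if i <= outer and j <= inner else None)
                for name, (outer, inner) in groups.items()
            })
    return rows


# FIELD-C: FIELD-B OCCURS 3, FIELD-C OCCURS 4; FIELD-E: FIELD-D OCCURS 2, FIELD-E OCCURS 5.
rows = nested_parallel_subscripts({"FIELD-C": (3, 4), "FIELD-E": (2, 5)})
for number, row in enumerate(rows, start=1):
    print(f"Row {number}:", row)
```

With these inputs the outer maximum is 3 and the inner maximum is 5, so 3 × 5 = 15 rows are generated, and the gaps fall in the same rows as in the listing above.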