Transformer stage: looping
You can use the looping mechanism in the Transformer stage to create multiple output rows from a single input row.
The following scenarios give examples of how you can use the looping facility.
Input data with multiple repeating columns
When the input data contains rows with multiple columns containing repeating data, you can use the Transformer stage to produce multiple output rows: one for each of the repeating columns.
For example,
if the input row contained the following data.
You can use the Transformer stage to flatten the input
data and create multiple output rows for each input row. The data
now comprises the following columns.
Col1 | Col2 | Name1 | Name2 | Name3 |
---|---|---|---|---|
abc | def | Jim | Bob | Tom |
Col1 | Col2 | Name |
---|---|---|
abc | def | Jim |
abc | def | Bob |
abc | def | Tom |
Input data with multiple repeating values in a single field
When you have data where a single column contains multiple repeating values that are separated by a delimiter, you can flatten the data to produce multiple output columns: one for each of the delimited values. You can also specify that certain values are filtered out, and not have a new row created.For example, the
input row contains the following data.
You want to flatten the name field so a new row is created
for every new name indicated by the backslash (/) character. You also
want to filter out the name Jim and drop the column named Col2, so
that the resulting output data for the example column produces two
rows with two columns.
Col1 | Col2 | Names |
---|---|---|
abc | def | Jim/Bob/Tom |
Col1 | Name |
---|---|
abc | Bob |
abc | Tom |
Value in an input row column used to generate new output rows
You can use the Transformer stage to generate new rows, based on values held in an input column.For example, you have an input column that contains a count, and
want to generate output rows based on the value of the count. The
following example column has a count value of 5.
You can generate five output rows for this one input
row based on the value in the Count column.
Col1 | Col2 | MaxCount |
---|---|---|
abc | def | 5 |
Col1 | Col2 | EntryNumber |
---|---|---|
abc | def | 1 |
abc | def | 2 |
abc | def | 3 |
abc | def | 4 |
abc | def | 5 |
Input row group aggregation included with input row data
You can save input rows to a cache area, so that you can process this data in a loop.For example, you have input data that has a column holding a price value. You want to add a column to the output rows. The new column indicates what percentage the price value is of the total value for prices in all rows in that group. The value for the new Percentage column is calculated by the following expression.
(price * 100)/sum of all prices in group
In the example,
the data is sorted and is grouped on the value in Col1.
The percentage for each row in the group where Col1
= 1000 is calculated by the following expression.
Col1 | Col2 | Price |
---|---|---|
1000 | abc | 100.00 |
1000 | def | 20.00 |
1000 | ghi | 60.00 |
1000 | jkl | 20.00 |
2000 | zyx | 120.00 |
2000 | wvu | 110.00 |
2000 | tsr | 170.00 |
(price * 100)/200
The percentage for each row
in the group where Col1 = 2000 is calculated by the following expression.(price * 100)/400
The output is shown
in the following table.Col1 | Col2 | Price | Percentage |
---|---|---|---|
1000 | abc | 100.00 | 50.00 |
1000 | def | 20.00 | 10.00 |
1000 | ghi | 60.00 | 30.00 |
1000 | jkl | 20.00 | 10.00 |
2000 | zyx | 120.00 | 30.00 |
2000 | wvu | 110.00 | 27.50 |
2000 | tsr | 170.00 | 42.50 |