Passing multiple rows from an XML or JSON file to the Hierarchical stage (DataStage)
You can pass multiple rows from an XML or JSON file to the Hierarchical stage as a single record. This method is needed when the XML or JSON file is located in a file storage connector.
When you create a DataStage® flow to parse an XML or JSON file that is located in file storage connector (such as IBM Cloud® Object Storage, Amazon S3, or Google Cloud Storage) to the Hierarchical Data stage, you must pass an entire XML or JSON string as one record. Normally an XML or JSON file contains multiple rows.
Specifying file format properties for the XML or JSON file
Set the file format of the connector to delimited and set the values for field delimiter and row delimiter so that the connector reads the file format as one row and one column.Different connectors have different settings, so be sure to refer to the connector documentation for the right values.
Setting | Value |
---|---|
File format | Delimited |
Record definition | None |
Field delimiter | ~ Note: The field delimiter can be ~ or any other character that does not already exist in the
XML or JSON file, which sets the whole row as on field delimiter.
|
Row delimiter | <CRLF> |
All other settings | Leave as default. |
Using the Transformer stage to merge all the rows into one row
- Open the connector Stage tab, then open the Advanced section. Next, set the execution mode to "Sequential" and click Save.
- Configure the Transformer stage to merge all the rows into one row.
In the following example, you set up a DataStage flow to have an Amazon S3 connector for a source, which links to a Transformer stage, which links to a Hierarchical stage, which links to a sequential file as a target.
On the Transformer stage, click the Output tab, open the Columns section, then click Edit. Set the column list to contain only one column.
- Specify the following settings:
Table 2. Setting Value File format Delimited Record definition None Field delimiter ~ Row delimiter <CR> - Click the Output tab, open the Columns section,
then click Edit. Then, set the column list to contain only one column.
You might set the column name to something like COLUMN_1 and the type to VARCHAR. Set the length value to be larger than the maximum length of the rows in the XML file.
- Click Apply and return.
- Open the Transformer stage and then click Add stage variable +. Name the new variable "combined."
- Click the calculator icon to open the expression builder. Next, set the derivation to
if combined="" then Link_1.COLUMN_1 else combined:Link_1.COLUMN_1
, whereLink_1
is the name of the link that connects the connector and the Transformer stage. Then, click Apply and return. - Open the Transformer Stage tab, then open the Advanced section. Next, set the execution mode to "Sequential" and click Save.
- Open the Output tab, then add a constraint setting of
LastRow()
. - In the Output section set the derivation to
combined
, the stage variable name that you created earlier. Change the column name tooutputString
and set the length to a large number greater than the whole file string length. - Click Save and return.
- Specify the following settings: