XML-INTO (Parse an XML Document into a Variable)

Free-Form Syntax
  XML-INTO{(EH)} receiver %XML(xmlDoc {: options });
  XML-INTO{(EH)} %HANDLER(handlerProc : commArea) %XML(xmlDoc {: options });

Code        Factor 1    Extended Factor 2
XML-INTO                receiver %XML(xmlDoc {: options })
XML-INTO                %HANDLER(handlerProc : commArea) %XML(xmlDoc {: options })

Tip: If you are not familiar with the basic concepts of XML and of processing XML documents, you may find it helpful to read the "Processing XML Documents" section in Rational Development Studio for i: ILE RPG Programmer's Guide before reading further in this section.

XML-INTO can operate in two different ways:
  • Reading XML data directly into an RPG variable
  • Reading XML data gradually into an array parameter that it passes to the procedure specified by %HANDLER(handlerProc).

The first operand specifies the target of the parsed data. It can contain a variable name or the %HANDLER built-in function.

The second operand must be the %XML built-in function, identifying the XML document to be parsed and the options controlling the way the parsing is done. See %XML (xmlDocument {:options}) for more information on %XML.

If the first operand is a variable name:
  • Parsing will be done directly into the variable.
  • The name of the variable will be used to establish the name of the XML element to parse; this can be overridden using the path option.
  • If the variable is a data structure, some subfields may be set by the operation even if the operation ends in error.
  • If the variable is an array, the parsing will only search for as much data as will fit in the array. The "Number of XML Elements" subfield in positions 372 - 379 of the PSDS will be set to the number of elements successfully set by the operation. For an array of data structures, this value will not include the element being set if a parsing error occurs while parsing the data for the subfields of the element; however, this array element may have some of its subfields set by the operation.
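The variable-receiver form can be sketched as follows. This is a minimal free-form example, not a definitive implementation; the names pgmStatus, xmlElements, name, and xmlDoc, and the sample XML, are assumptions chosen for illustration. It parses into an array with more elements than the document supplies, then reads the "Number of XML Elements" subfield of the PSDS to find out how many elements were set.

```rpgle
**free
// Hypothetical names: pgmStatus, xmlElements, name, xmlDoc.

// PSDS subfield "Number of XML Elements", positions 372 - 379
dcl-ds pgmStatus psds qualified;
  xmlElements int(20) pos(372);
end-ds;

// The variable name "name" establishes the XML element name to parse
dcl-s name char(10) dim(5);
dcl-s xmlDoc varchar(200);

xmlDoc = '<names><name>Ann</name><name>Bob</name></names>';

// Parse directly into the array; only as much data as fits is searched for
xml-into name %xml(xmlDoc);

// Only the first pgmStatus.xmlElements elements of "name" hold XML data
dsply ('Elements set: ' + %char(pgmStatus.xmlElements));

*inlr = *on;
return;
```

Code that later loops over the array would normally stop at pgmStatus.xmlElements rather than at %ELEM(name).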
If the first operand is the %HANDLER built-in function:
  • The procedure specified as the first operand of %HANDLER will be called when the parser has parsed enough XML data to fill the specified number of RPG array elements handled by the procedure. When the handler returns, the parser will continue to parse the XML data until it has parsed enough XML data to again fill the specified number of array elements to call the handling procedure. This continues until the document is completely parsed, or until the procedure returns a return code indicating that the parsing should halt.

    The final call to the handling procedure may have fewer RPG array elements than the handling procedure can handle. The handling procedure should always refer to the "number of elements" parameter to ensure it does not access array elements that do not have any XML data.

    The communication-area variable specified as the second operand of %HANDLER will be passed by the parser as the first parameter to the handling procedure, allowing the procedure coding the XML-INTO operation to communicate with the handling procedure, and allowing the handling procedure to save information from one call to the next.

  • Each element of the temporary variable used to hold the array parameter for the procedure will be cleared to its default value before it is loaded from the XML data.
  • The path option must be used to specify the name of the XML element to search for. See %DATA and %XML options for DATA-GEN, DATA-INTO, and XML-INTO and Expected format of XML data for information about the path option.
  • The array-handling procedure may be called several times during the XML-INTO operation. The procedure is called each time the parser has found the number of elements specified by the DIM keyword on the second parameter. The final call may pass fewer elements than specified by the DIM keyword. If no elements are found, the procedure is not called.
    The handling procedure must have the following parameters and return type:

    Return value
        Data type and passing mode: 4-byte integer (10I 0)
        Description: Returning a value of zero indicates that parsing should continue; returning any other value indicates that parsing should end.
    Parameter 1
        Data type and passing mode: Any type, passed by reference
        Description: Used to communicate between the XML-INTO operation and the handler, and between successive calls to the handler.
    Parameter 2
        Data type and passing mode: Array, or array of data structures, passed by read-only reference (CONST keyword)
        Description: The array elements contain the data from the XML elements specified by the path option.
    Parameter 3
        Data type and passing mode: 4-byte unsigned (10U 0), passed by value
        Description: The number of array elements in the second parameter that represent XML data.
  • See %HANDLER (handlingProcedure : communicationArea ) for more information on %HANDLER.
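The %HANDLER form described above might look like the following sketch. The names part_t, partHandler, commArea, totalNum, and xmlDoc are hypothetical, and the sample document is an assumption; the parameter types follow the return type and three parameters listed for the handling procedure. The handler sums the num attribute of each part, using the communication area to carry the running total between calls.

```rpgle
**free
ctl-opt dftactgrp(*no);

// Hypothetical names: part_t, partHandler, commArea, totalNum, xmlDoc.
dcl-ds part_t qualified template;
  type char(10);
  size int(10);
  num  int(10);
end-ds;

dcl-pr partHandler int(10);
  commArea int(10);                        // parameter 1: communication area
  parts    likeds(part_t) dim(2) const;    // parameter 2: array, read-only
  numElems uns(10) value;                  // parameter 3: elements with data
end-pr;

dcl-s totalNum int(10) inz(0);
dcl-s xmlDoc varchar(500);

xmlDoc = '<transaction><parts>'
       + '<part type="bracket" size="15" num="100"/>'
       + '<part type="frame" size="2" num="500"/>'
       + '</parts></transaction>';

// The path option is required when %HANDLER is used
xml-into %handler(partHandler : totalNum)
         %xml(xmlDoc : 'path=transaction/parts/part');

dsply ('Total num: ' + %char(totalNum));

*inlr = *on;
return;

dcl-proc partHandler;
  dcl-pi *n int(10);
    commArea int(10);
    parts    likeds(part_t) dim(2) const;
    numElems uns(10) value;
  end-pi;
  dcl-s i int(10);

  // Touch only the elements that actually hold XML data
  for i = 1 to numElems;
    commArea += parts(i).num;
  endfor;

  return 0;   // zero: continue parsing; any other value: halt
end-proc;
```

Looping to numElems rather than to the DIM value matters on the final call, which may carry fewer elements than the array can hold.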

Subfields of a data structure will be set in the order they appear in the XML document; the order could be important if subfields overlap within the data structure.

%NULLIND is not updated for any field or subfield during an XML-INTO operation.

Operation extender H can be specified to cause numeric data to be assigned half-adjusted. Operation extender E can be specified to handle the following status codes:
00351
Error in XML parsing
00352
Invalid XML option
00353
XML document does not match RPG variable
00354
Error preparing for XML parsing
Note: Operation extenders can be specified only when Free-form syntax is used.

For status 00351, the return code from the parser will be placed in the subfield "External return code" in positions 368-371 of the PSDS. This subfield will be set to zero at the beginning of the operation and set to the value returned by the parser at the end of the operation.

If an unknown, invalid or unrelated option is found in the options parameter of the %XML built-in function, the operation will fail with status code 00352 (Invalid XML option). The External return code subfield in the PSDS will not be updated from the initial value of zero, set when the operation begins.
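The status codes and the "External return code" subfield can be checked as in the following sketch. The names pgmStatus, xmlRc, order, and xmlDoc are hypothetical, and the deliberately unterminated document is an assumption intended to provoke a parsing error.

```rpgle
**free
// Hypothetical names: pgmStatus, xmlRc, order, xmlDoc.

// PSDS subfield "External return code", positions 368-371
dcl-ds pgmStatus psds qualified;
  xmlRc int(10) pos(368);
end-ds;

dcl-ds order qualified inz;
  id int(10);
end-ds;

// Assumed sample: the end tag for <order> is missing on purpose
dcl-s xmlDoc varchar(100) inz('<order><id>12</id>');

// Extender E lets the program handle statuses 00351 to 00354 itself
xml-into(e) order %xml(xmlDoc);

if %error;
  select;
    when %status = 351;
      // The parser's return code is available only for status 00351
      dsply ('XML parse error, rc=' + %char(pgmStatus.xmlRc));
    when %status = 352;
      dsply 'Invalid XML option';
    when %status = 353;
      dsply 'XML does not match RPG variable';
    when %status = 354;
      dsply 'Error preparing for XML parsing';
    other;
      dsply ('Unexpected status ' + %char(%status));
  endsl;
endif;

*inlr = *on;
return;
```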

The XML document is expected to match the RPG variable with respect to the names of the XML elements or attributes.
  • The XML data for an RPG data structure is expected to have an XML element with the same name as the data structure and child elements and/or attributes with the same names as the RPG subfields.
  • The XML data for an RPG array is expected to have a series of elements with the same name as the RPG array.
The path option can be used to set the name of the XML element matching the name of the specified variable, but it cannot be used to set the names of the XML elements and/or attributes matching a specified variable's subfields. For example, if variable DS1 has a subfield SF1, the XML element for DS1 can have any name, but the XML element or attribute for SF1 must have the name "sf1" (or "SF1", "Sf1", etc., depending on the case option).
When the RPG variable is an array or array of data structures, or when the %HANDLER built-in function is specified, the XML elements corresponding to the array elements are expected to be contained in another XML element. By default, the XML elements will be expected to be child elements of the outermost XML element in the document. The path option can be used to specify the exact path to the XML elements corresponding to the array elements. For example, if the outermost XML element is named "transaction", and it has a child element named "parts" which itself has several child elements named "part", then to read the "part" XML elements into an array, you would specify the option 'path=transaction/parts/part'.
  <transaction>
    <parts>
       <part type = "bracket" size="15" num="100"/>
       <part type="frame" size="2" num="500"/>
    </parts>
  </transaction>
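An XML-INTO operation for the document above could be coded along these lines. This is a sketch under stated assumptions: the names part and xmlDoc are hypothetical, the subfield types are guesses based on the sample attribute values, and the document is held in a variable rather than a file.

```rpgle
**free
// Hypothetical names: part, xmlDoc. Subfield types are assumptions.

// Subfield names match the XML attribute names type, size, and num
dcl-ds part qualified dim(2);
  type char(10);
  size int(10);
  num  int(10);
end-ds;

dcl-s xmlDoc varchar(500);

xmlDoc = '<transaction><parts>'
       + '<part type="bracket" size="15" num="100"/>'
       + '<part type="frame" size="2" num="500"/>'
       + '</parts></transaction>';

// The "part" elements are not children of the outermost element,
// so the exact path to them is given with the path option
xml-into part %xml(xmlDoc : 'path=transaction/parts/part');

dsply ('First part type: ' + %trim(part(1).type));

*inlr = *on;
return;
```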

When the XML document does not match the RPG variable, for example if the XML document does not contain the default or specified path, or if it is missing some XML elements or attributes to match the subfields of an RPG data structure, the XML-INTO operation will fail with status 00353. The allowextra and allowmissing options can be used to specify whether an XML element can have more or less data than is required to fully set the RPG variable.
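The effect of the allowextra and allowmissing options can be illustrated with a short sketch. The names customer and xmlDoc and the sample document are assumptions; the document lacks data for the city subfield and carries a phone element that has no matching subfield.

```rpgle
**free
// Hypothetical names: customer, xmlDoc.

dcl-ds customer qualified inz;
  id   int(10);
  name char(20);
  city char(20);     // no <city> element in the sample document
end-ds;

dcl-s xmlDoc varchar(200);

xmlDoc = '<customer><id>7</id><name>Ann</name>'
       + '<phone>555</phone></customer>';

// Without these options the mismatch would be expected to fail
// with status 00353
xml-into customer
         %xml(xmlDoc : 'allowmissing=yes allowextra=yes');

// customer.city was not set by the operation and keeps its prior value
dsply ('Name: ' + %trim(customer.name));

*inlr = *on;
return;
```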

For some RPG data types, XML attributes can be specified to control how the XML data is assigned to the RPG variable. See Rules for transferring data to RPG variables for XML-INTO and DATA-INTO for more information on these attributes.

If an XML reference other than the predefined references &amp;, &apos;, &lt;, &gt;, &quot;, or the hexadecimal unicode references &#xxxx; is found, the result will contain the reference itself, in the form "&refname;". If this value is not valid for the data type, the operation will fail. For example, if an XML element has the value <data>1&decpoint;50</data>, the string "1&decpoint;50" would be built up from the three pieces "1", "&decpoint;", and "50". This data is valid for a character or UCS-2 variable, but it would cause an error if converted to other types.

Tip: If XML data is known to contain such references, then following the completion of the XML-INTO operation, character and UCS-2 data should be inspected for the presence of references, and the correct value for the reference substituted using string operations such as %SCANRPL, or %SCAN and %REPLACE.
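The substitution suggested in this tip might be coded as in the following sketch. The names data, amount, and xmlDoc are hypothetical, and it is an assumption of this example that the &decpoint; reference stands for a period.

```rpgle
**free
// Hypothetical names: data, amount, xmlDoc.
// Assumption: &decpoint; is meant to represent a period.

dcl-s data   varchar(20);
dcl-s amount packed(7 : 2);
dcl-s xmlDoc varchar(100);

xmlDoc = '<data>1&decpoint;50</data>';

// A character receiver accepts the unresolved reference as text
xml-into data %xml(xmlDoc);

// Substitute the intended value for the reference, then convert
data = %scanrpl('&decpoint;' : '.' : data);
amount = %dec(data : 7 : 2);

dsply ('Amount: ' + %char(amount));

*inlr = *on;
return;
```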

If XML data is not valid for the type of the RPG variable it matches, the operation will fail with status 00353; the specific status code for the assignment error will appear in the replacement text for message RNX0353.

Tip: To avoid the XML-INTO operation failing because the data cannot be successfully assigned to RPG fields with types such as Date or Numeric, the receiver variable can be defined with subfields that are all of type character or UCS-2. Then the data can be converted to other data types by the RPG program using the conversion built-in functions %DATE, %INT, and so on.
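The approach in this tip can be sketched as follows. The names order, shipped, quantity, and xmlDoc are hypothetical, and the sample values and the *ISO date format are assumptions. The receiver has character subfields only, and the conversions are done afterwards inside a MONITOR group so that bad data can be handled by the program.

```rpgle
**free
// Hypothetical names: order, shipped, quantity, xmlDoc.

// All subfields are character, so assignment from XML cannot fail
// on the format of a date or number
dcl-ds order qualified inz;
  shipdate char(10);
  qty      char(10);
end-ds;

dcl-s shipped  date;
dcl-s quantity int(10);
dcl-s xmlDoc   varchar(200);

xmlDoc = '<order><shipdate>2024-03-15</shipdate>'
       + '<qty>12</qty></order>';

xml-into order %xml(xmlDoc);

// Convert under program control; invalid data lands in ON-ERROR
monitor;
  shipped  = %date(order.shipdate : *iso);
  quantity = %int(order.qty);
on-error;
  dsply 'Order data could not be converted';
endmon;

*inlr = *on;
return;
```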

The XML-INTO operation ignores the DOCTYPE declaration. The DOCTYPE declaration may contain the values of entity references that your program will have to handle manually. If you want to have access to the DOCTYPE declaration of the XML document, you can use the XML-SAX operation. Your XML-SAX handling procedure can halt the parsing as soon as it has found the DOCTYPE declaration value, or as soon as it knows that there will be no DOCTYPE declaration.

The following links provide more information on XML-INTO.