Handling large MRM messages
When an input bit stream is parsed, and a logical tree is created, the tree representation of an MRM message is typically larger, and in some cases much larger, than the corresponding bit stream.
About this task
Several factors make the tree larger than the bit stream:
- The addition of the pointers that link the objects together
- Translation of character data into Unicode that can double the original size
- The inclusion of field names that can be contained implicitly within the bit stream
- The presence of control data that is associated with the integration node's operation
Manipulation of a large message tree can, therefore, demand a great deal of storage. If you design a message flow that handles large messages that are made up of repeating structures, you can code specific ESQL statements that help to reduce the storage load on the integration node. These statements support both random and sequential access to the message, but assume that you do not need access to the whole message at one time.
These ESQL statements cause the integration node to perform limited parsing of the message, and to keep only that part of the message tree that reflects a single record in storage at a time. If your processing requires you to retain information from record to record (for example, to calculate a total price from a repeating structure of items in an order), you can either declare, initialize, and maintain ESQL variables, or you can save values in another part of the message tree, for example LocalEnvironment.
This technique reduces the memory that is used by the integration node to that needed to hold the full input and output bit streams, plus that required for one record's trees. It provides memory savings when even a few repeats are encountered in the message. The integration node makes use of partial parsing, and the ability to parse specified parts of the message tree, to and from the corresponding part of the bit stream.
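The two ways of retaining information from record to record, described above, can be sketched as follows. This is a minimal illustration and is not part of the example later in this topic: the module name, the Item, UnitPrice, and Quantity fields, and the Variables.RunningTotal path are hypothetical names chosen for this sketch. It shows only how a value is carried across records, and omits the record deletion that the large-message technique relies on.

```esql
CREATE COMPUTE MODULE RunningTotalSketch_Compute
  CREATE FUNCTION Main() RETURNS BOOLEAN
  BEGIN
    -- Option 1: an ESQL variable that lives for the duration of Main()
    DECLARE runningTotal FLOAT 0.0e0;

    -- Option 2: a field in another part of the message tree
    -- (Variables.RunningTotal is a hypothetical location)
    SET OutputLocalEnvironment.Variables.RunningTotal = 0.0e0;

    -- Hypothetical repeating structure: an order message whose Item
    -- elements each hold a UnitPrice and a Quantity
    DECLARE itemRef REFERENCE TO InputRoot.MRM.Item[1];

    WHILE LASTMOVE(itemRef) DO
      DECLARE lineCost FLOAT CAST(itemRef.UnitPrice * itemRef.Quantity AS FLOAT);

      -- Option 1: accumulate in the variable
      SET runningTotal = runningTotal + lineCost;

      -- Option 2: accumulate in the tree
      SET OutputLocalEnvironment.Variables.RunningTotal =
          OutputLocalEnvironment.Variables.RunningTotal + lineCost;

      -- Step to the next sibling that has the same name and type
      MOVE itemRef NEXTSIBLING REPEAT TYPE NAME;
    END WHILE;

    RETURN TRUE;
  END;
END MODULE;
```

An ESQL variable is the simpler choice when the value is needed only inside the node. A tree location is useful when a later node must read the value; in that case the Compute mode property of the node must include LocalEnvironment so that OutputLocalEnvironment is propagated.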
To apply this technique in a Compute node, follow these general steps:
- Copy the body of the input message as a bit stream to a special folder in the output message. This creates a modifiable copy of the input message that is not parsed and therefore uses a minimum amount of memory.
- Avoid any inspection of the input message; this avoids the need to parse the message.
- Use a loop and a reference variable to step through the message one record at a time. For each record:
  - Use normal transforms to build a corresponding output subtree in a second special folder.
  - Use the ASBITSTREAM function to generate a bit stream for the output subtree. Store the bit stream in a BitStream element, placed at the position in the tree that corresponds to its required position in the final bit stream.
  - Use the DELETE statement to delete both the current input and the output record message trees when you complete their manipulation.
- When you complete the processing of all records, detach the special folders so that they do not appear in the output bit stream.
You can vary these techniques to suit the processing that is required for your messages. The following ESQL provides an example of one implementation.
The ESQL depends on a message set called LargeMessageExample that has been created to define messages for both the Invoice input format and the Statement output format. A message called AllInvoices has been created that contains a global element called Invoice that can repeat one or more times, and a message called Data has been created that contains a global element called Statement that can repeat one or more times.
The definitions of the elements and attributes have been given the correct data types; therefore, the CAST statements used by the ESQL in the XML example are no longer required. An XML physical format named XML1 has been created in the message set, which allows an XML message corresponding to these messages to be parsed by the MRM.
When the Statement tree is serialized using the ASBITSTREAM function, the Message Set, Message Type, and Message Format are specified as parameters. The Message Type parameter contains the path from the message to the element being serialized which, in this case, is Data/Statement because the Statement element is a direct child of the Data message.
The input message has the following form:
<AllInvoices> .... </AllInvoices>
Example
CREATE COMPUTE MODULE LargeMessageExampleFlow_Compute
  CREATE FUNCTION Main() RETURNS BOOLEAN
  BEGIN
    CALL CopyMessageHeaders();
    -- Create a special folder in the output message to hold the input tree
    -- Note : SourceMessageTree is the root element of an MRM parser
    CREATE LASTCHILD OF OutputRoot.MRM DOMAIN 'MRM' NAME 'SourceMessageTree';
    -- Copy the input message to a special folder in the output message
    -- Note : This is a root to root copy which will therefore not build trees
    SET OutputRoot.MRM.SourceMessageTree = InputRoot.MRM;
    -- Create a special folder in the output message to hold the output tree
    CREATE FIELD OutputRoot.MRM.TargetMessageTree;
    -- Prepare to loop through the purchased items
    DECLARE sourceCursor REFERENCE TO OutputRoot.MRM.SourceMessageTree.Invoice;
    DECLARE targetCursor REFERENCE TO OutputRoot.MRM.TargetMessageTree;
    DECLARE resultCursor REFERENCE TO OutputRoot.MRM;
    DECLARE grandTotal FLOAT 0.0e0;
    -- Create a block so that it's easy to abandon processing
    ProcessInvoice: BEGIN
      -- If there are no Invoices in the input message, there is nothing to do
      IF NOT LASTMOVE(sourceCursor) THEN
        LEAVE ProcessInvoice;
      END IF;
      -- Loop through the invoices in the source tree
      InvoiceLoop : LOOP
        -- Inspect the current invoice and create a matching Statement
        SET targetCursor.Statement =
          THE (
            SELECT
              'Monthly' AS Type,
              'Full' AS Style,
              I.Customer.FirstName AS Customer.Name,
              I.Customer.LastName AS Customer.Surname,
              I.Customer.Title AS Customer.Title,
              (SELECT
                 FIELDVALUE(II.Title) AS Title,
                 II.UnitPrice * 1.6 AS Cost,
                 II.Quantity AS Qty
               FROM I.Purchases.Item[] AS II
               WHERE II.UnitPrice > 0.0 ) AS Purchases.Article[],
              (SELECT
                 SUM( II.UnitPrice *
                      II.Quantity *
                      1.6 )
               FROM I.Purchases.Item[] AS II ) AS Amount,
              'Dollars' AS Amount.Currency
            FROM sourceCursor AS I
            WHERE I.Customer.LastName <> 'White'
          );
        -- Turn the current Statement into a bit stream
        -- The SET parameter is set to the name of the message set
        -- containing the MRM definition
        -- The TYPE parameter contains the path from the message
        -- to the element being serialized
        -- The FORMAT parameter contains the name of the physical format
        -- defined in the message set
        DECLARE StatementBitStream BLOB
          ASBITSTREAM(targetCursor.Statement
            OPTIONS FolderBitStream
            SET 'LargeMessageExample'
            TYPE 'Data/Statement'
            FORMAT 'XML1');
        -- If the SELECT produced a result (that is, it was not filtered
        -- out by the WHERE clause), process the Statement
        IF StatementBitStream IS NOT NULL THEN
          -- Create a field to hold the bit stream in the result tree
          -- The Type of the element is set to MRM.BitStream to indicate
          -- to the MRM Parser that this is a bit stream
          CREATE LASTCHILD OF resultCursor
            Type MRM.BitStream
            NAME 'Statement'
            VALUE StatementBitStream;
          -- Add the current Statement's Amount to the grand total
          SET grandTotal = grandTotal + targetCursor.Statement.Amount;
        END IF;
        -- Delete the real Statement tree leaving only the bit stream version
        DELETE FIELD targetCursor.Statement;
        -- Step onto the next Invoice, removing the previous invoice and any
        -- text elements that might have been interspersed with the Invoices
        REPEAT
          MOVE sourceCursor NEXTSIBLING;
          DELETE PREVIOUSSIBLING OF sourceCursor;
        UNTIL (FIELDNAME(sourceCursor) = 'Invoice')
           OR (LASTMOVE(sourceCursor) = FALSE)
        END REPEAT;
        -- If there are no more invoices to process, abandon the loop
        IF NOT LASTMOVE(sourceCursor) THEN
          LEAVE InvoiceLoop;
        END IF;
      END LOOP InvoiceLoop;
    END ProcessInvoice;
    -- Remove the temporary source and target folders
    DELETE FIELD OutputRoot.MRM.SourceMessageTree;
    DELETE FIELD OutputRoot.MRM.TargetMessageTree;
    -- Finally add the grand total
    SET resultCursor.GrandTotal = grandTotal;
    -- Set the output MessageType property to be 'Data'
    SET OutputRoot.Properties.MessageType = 'Data';
    RETURN TRUE;
  END;

  CREATE PROCEDURE CopyMessageHeaders() BEGIN
    DECLARE I INTEGER 1;
    DECLARE J INTEGER CARDINALITY(InputRoot.*[]);
    -- Copy every child of InputRoot except the last one (the message body)
    WHILE I < J DO
      SET OutputRoot.*[I] = InputRoot.*[I];
      SET I = I + 1;
    END WHILE;
  END;
END MODULE;