Introduction to WebSphere Message Broker File Extender
IBM® WebSphere® Message Broker V6 (hereafter called Message Broker) is an advanced Enterprise Service Bus (ESB) that provides powerful message transformation capability. The IBM Enterprise Service Bus architectural pattern supports multiple protocols and software packages in order to drive communications in message-oriented, event-oriented, and service-oriented situations. A key issue in this architectural pattern is how to receive file inputs and send file outputs from Message Broker.
WebSphere Message Broker File Extender (hereafter called File Extender) is an IBM product, available to customers through the Passport Advantage program. Purchase of a File Extender license includes software maintenance rights.
File Extender is supported on a range of distributed platforms. The nodes are not available on z/OS®.
IBM Services provides fee-based assistance in the configuration of File Extender. The IBM WebSphere Lab Services team can assist in large-scale product deployments requiring custom technical consultancy.
Business value of File Extender
Enhancing an existing broker architecture with File Extender can simplify the manageability and control of your enterprise. File Extender’s administrative interface is the same as the WebSphere Message Broker toolkit, so developers can manipulate and deploy the product quickly, which can lower your total cost of ownership and improve developer productivity. File Extender consists of a set of broker plug-in nodes that you can wire into new or existing message flows from the Message Broker Toolkit palette. The nodes are designed to fit the administrative model of Message Broker so that file protocol support can be easily added to existing architectures that already use Message Broker's rich set of transformation and routing logic. It lets you quickly extend your previous investments in WebSphere business integration products to support legacy file-based applications.
Positioning of File Extender vs. CommerceQuest PM4Data
File Extender provides a method of parsing data directly from files into a logical tree ready for manipulation by core Message Broker functionality. It provides an excellent solution for existing Message Broker customers who want to expose an application’s existing function to file protocol inputs and outputs.
The PM4Data product from CommerceQuest is well suited for transporting large files using a WebSphere MQ backbone as a base transport. These files can exceed the WebSphere MQ message size limit of 100 MB because the data is spread into packets transported using separate messages. Splitting the data into discrete buffers can be configured using the product’s administrative interface. Unlike standard FTP, the WebSphere MQ channels used by PM4Data provide all the advantages of guaranteed, assured delivery. PM4Data also supplies a range of graphical tools and methods for tracking the file movement. PM4Data architecture is well suited to scenarios in which file movements must be centrally managed from a single platform and orchestrated between multiple nodes (where Message Broker may or may not be installed).
File Extender functions and scenarios
File Extender lets you add support for file processing to your ESB, thus extending the reach and value of the ESB in your enterprise. File Extender nodes make the rich set of broker functions available to batch file inputs and outputs, and let you combine the processing of record structures with the broker’s comprehensive set of parsing and transformation options. Message Broker provides parsing capabilities for XML, fixed-length, and tagged/delimited formats, covering proprietary formats and common industry standards such as SWIFT and ACORD. With Message Broker, you can easily handle protocol transformation involving Web services, HTTP input and output, WebSphere MQ real-time transport, and IP socket communications. For example, File Extender lets you process a flat input file by invoking a Web service to call a common processing routine, and then convert individual records in the file into XML representations and write each of them as an MQ message on a queue.
This article discusses the three most common File Extender usage scenarios:
Figure 1. File Extender usage scenarios

File Extender lets you process an entire file at once, or propagate records from the file through a message flow one at a time.
Installing File Extender gives you the full set of Message Broker functionality, plus the three additional file handling nodes shown below in the message flow palette, along with the associated broker run-time enhancements to support these build-time features:
Figure 2. File Input node

Figure 3. File Output node

Figure 4. File Proxy node

The File Input node lets you initiate message flow activity by placing an input file in a directory local to the run-time broker. The node’s default properties tab provides Message Domain, Set, Type and Format properties to control the parsing of the data contained in the incoming file. These properties are equivalent to those on a conventional MQInput node.
Figure 5. File Input node properties

The Default tab also provides CCSID and Encoding properties. Because the input source is not a WebSphere MQ message, there is no MQMD header, so these settings let you control the parsing of files that do not conform to the machine default CCSID and Encoding values. The Format properties tab contains further properties that inform the node about how to interpret the structure of the data in the file. These parsing options are provided as a native function of the node and are independent of similar techniques you may already be familiar with when using the Message Repository Manager domain. The Format properties let you receive files that contain multiple record structures delimited by carriage return line feeds. The batch processing possibilities and the relevant node properties are discussed below.
The File Output node receives as input a logical tree description (the standard serialised format used between wired nodes of a message flow) of a file record or WebSphere MQ message (or from any other WebSphere Message Broker supported input source). Typically the File Output node writes an output file to a directory local to the run-time broker. The node’s properties (viewed in the Toolkit by right-clicking the node and selecting Properties) are split into groups and displayed using the following tabs:
- Basic
- Format
- Advanced
- Troubleshooting
- Description
The node has no output terminal, and successful writing of a file results in the termination of the current message flow thread. You cannot place other message flow nodes downstream of the File Output node in any given branch of a message flow (although execution of secondary branches of a message flow after a FlowOrder node is still respected).
The File Output node’s Basic properties tab specifies an output directory location as either an absolute or relative path. The Format properties tab contains further properties that inform the node about how to write structured data within a single file. These writing options are provided as a native function of the node and are independent of similar techniques that you may be familiar with when using the Message Repository Manager domain. The Format properties let you write files that contain multiple record structures delimited by carriage return line feeds. The batch processing possibilities and the relevant node properties are discussed below. The Advanced properties tab controls the naming of the output file and options for how it is generated. You can generate a new file for every message received by the node, overwrite existing files, or append data to existing files. You can generate file names with counters at the beginning, at the end, or just before the file extension.
The File Proxy node is always located in a message flow immediately downstream of an MQInput node. Its purpose is to maintain a transactional context between the reading of WebSphere MQ messages that initiate the message flow and the writing of files that terminate the instance of the flow. The File Proxy node is therefore only used in flows that take MQ messages as input and write files as output.
Figure 6. File Proxy node properties

The node’s Advanced properties tab includes a Transaction Mode setting that determines whether the action of the File Output node later in the flow is conducted within the same unit of work as the original MQGET performed by the MQInput node. If set to Yes, the File Output node downstream in the message flow appends data to the output file within the same unit of work (under syncpoint) that was started when the MQInput node took its message from the input queue. If set to No, the File Output node downstream in the message flow appends data to the output file regardless of the current WebSphere MQ unit of work. For more information on transactional scope inside file handling message flows, see the section Flow architectural concepts: Transactionality.
Batch processing: Files containing structured records
A single input file may contain several structured records that you may want to parse and take action upon individually within a message flow.
The File Input node’s Basic property of Input Propagation Policy lets you control how the data in the input file is propagated to the rest of the message flow. The default setting of File Descriptor and Content Whole File parses the entire content of the incoming file as a single message body.
The message domain used to interpret the data is still controlled by the conventional Default properties tab, exactly the same as on a conventional MQInput node. In addition to the message body, the File Input node also generates a File Descriptor.
The File Descriptor is a set of attributes created by instances of the File Input node when reading files into the broker, in order to identify their characteristics using a message tree structure (a concept similar to the WebSphere MQ Message Descriptor or MQMD).
These properties are propagated within the LocalEnvironment tree (LocalEnvironment.Variables.MBFEProperties). A subset of these values is then editable using ESQL within a message flow’s Compute nodes. The read/write access of each property is listed in the product documentation.
If the Input Propagation Policy property of the File Input node is changed to File Descriptor and Content Record by Record, then the incoming file’s records are processed by the nodes downstream of the File Input node one-by-one. The same message flow thread is used to execute the message flow logic against each record, one at a time. Each record is propagated as a message tree that contains a File Descriptor in the LocalEnvironment. One of the File Descriptor’s attributes is the RecordNumber property, which is an integer starting at 1 and incrementing by 1 for each record.
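The record-by-record behaviour can be illustrated with a minimal Compute node sketch that reads the RecordNumber attribute and copies it into the outgoing record. The LocalEnvironment path and the RecordNumber name come from the text above; the module name, the XML domain, the `msg` root element, and the `recordNumber` element are assumptions made purely for illustration.

```esql
CREATE COMPUTE MODULE TagRecordNumber
  CREATE FUNCTION Main() RETURNS BOOLEAN
  BEGIN
    -- Pass the parsed record and its LocalEnvironment through unchanged.
    SET OutputRoot = InputRoot;
    SET OutputLocalEnvironment = InputLocalEnvironment;

    -- RecordNumber is the File Descriptor attribute described above
    -- (an integer starting at 1 for the first record in the file).
    DECLARE recNo INTEGER
      CAST(InputLocalEnvironment.Variables.MBFEProperties.RecordNumber AS INTEGER);

    -- Hypothetical: assumes the record was parsed in the XML domain with a
    -- root element called msg; recordNumber is an illustrative element name.
    SET OutputRoot.XML.msg.recordNumber = recNo;

    RETURN TRUE;
  END;
END MODULE;
```

For the copied LocalEnvironment to reach downstream nodes, the Compute node's Compute Mode property needs to include LocalEnvironment (for example, LocalEnvironment and Message).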
The final setting for the Input Propagation Policy is File Descriptor Only, which propagates the File Descriptor in the LocalEnvironment and propagates an empty message body. This option is for use in file-to-file message flow scenarios when files need to be routed from one directory to another, depending on the content of the File Descriptor. For example, you might need to place all files larger than 1 MB into a directory named C:\bigfiles and not into the default directory of C:\smallfiles. Another example might be placing a file into a directory depending on context: if the current time is earlier than 12 noon, then write the file to directory C:\morning instead of to C:\afternoon. The advantage of choosing a Propagation Policy of File Descriptor Only is that the file is not opened at all, so performance is improved.
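The time-of-day routing case could be sketched in a Compute node placed between the File Input and File Output nodes. This is a hedged illustration, not the product's documented method: it reuses the Destination.MBFE.DestinationData path that appears in the Destination List ESQL later in this article, and it assumes that the File Output node's Destination Mode is set to Output Directory List and that a single-entry list is accepted. The module name is hypothetical.

```esql
CREATE COMPUTE MODULE RouteByTimeOfDay
  CREATE FUNCTION Main() RETURNS BOOLEAN
  BEGIN
    -- With File Descriptor Only, the message body is empty; copy what exists.
    SET OutputRoot = InputRoot;
    SET OutputLocalEnvironment = InputLocalEnvironment;

    -- Choose the target directory from the broker's current local time.
    IF EXTRACT(HOUR FROM CURRENT_TIME) < 12 THEN
      SET OutputLocalEnvironment.Destination.MBFE.DestinationData[1].Directory = 'C:\morning';
    ELSE
      SET OutputLocalEnvironment.Destination.MBFE.DestinationData[1].Directory = 'C:\afternoon';
    END IF;

    RETURN TRUE;
  END;
END MODULE;
```

The size-based example would follow the same shape, testing a File Descriptor attribute instead of the clock. The attribute names are listed in the product documentation and are not reproduced here.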
In order to take advantage of the File Extender File Input node’s batch handling functionality, select an Input Propagation Policy of File Descriptor and Content Record by Record in conjunction with the properties on the Format tab of the node. Set the Data type property to Text. The Record Type property specifies whether the records in the file are all the same length, or whether the node should separate them using a carriage return line feed delimiter. Choose Fixed record length for the former or Variable record length for the latter. Then depending on your choice, specify further information for the Record Length or End of Record Type properties, the options for which are described in the node documentation.
The broker will then parse each record (or the entire file) based upon the above configuration. The method it uses depends upon your settings on the Default properties tab that specify Message Domain, Set, Type and Format. This enables users of the File Input node to take full advantage of the rich set of conventional parsers offered by Message Broker.
Flow architectural concepts: Transactionality
Message Broker lets you create coordinated message flows in which updates to internal and external resources are committed (or rolled back) together as part of the same transaction, ensuring that either all processing of the message flow is committed together at the same time, or none of the processing is committed. Message flow coordination is performed by WebSphere MQ on distributed platforms (File Extender is not provided on z/OS). This level of transactional control is often required when two resources need to be synchronised with each other, for example during the transfer of funds between two bank accounts.
It is also possible to design partially coordinated message flows, in which certain database updates, or the writing of output messages to particular queues, are not coordinated as part of the global message flow transaction.
In flows that take message inputs and provide file outputs, File Extender lets you include the action of the File Output node as part of the same MQ unit of work that began when an incoming message started the message flow processing. Whether the writing of the output file by the File Output node is included within the message flow’s global transaction is determined by the Transaction Mode property of the File Proxy node, which is placed immediately downstream of the MQInput node at the start of the message flow.
In flows which take file inputs and provide message outputs, File Extender lets you define separate transactions that have a scope spanning multiple records, or an entire input file, through the configuration of two properties. A single file may contain multiple records. A File Input node can define a Batch Size, which states how many of its individual records should be contained in each batch of processing. Having defined a batch size, you can also define the Transaction Mode of the node to be Yes, Record Batch Scope, which treats each batch of records as a separate MQ transaction. Additionally, you can configure the In Doubt Policy to be either Fail, Redo or Skip. Consider an input file containing the following data:
<msg><child>Record1</child></msg>
<msg><child>Record2</child></msg>
<msg><child>Record3</child></msg>
<msg><child>Record4</child></msg>
<msg><child>Record5</child></msg>
<msg><child>Record6<child></msg>
The XML of Record6 is badly formed. Assume a Batch Size of 3, so that Records 1-3 form the first batch and Records 4-6 the second. If the Transaction Mode is set to have a batch scope, and an In-Doubt Policy of Fail is selected, then processing of the file will fail when Record6 is reached. Successful processing of Records 1-3 will result in three messages being committed to the flow’s output queue. Records 4 and 5 will be rolled back from the output queue, due to the failure of the transaction of the second batch. This example shows how batches of records within a single input file can be included within a single transactional scope.
The syntax error in a file record used in this example only triggers the standard exception management logic of the File Input node. This example will not cause what is known as an "In-Doubt" state for File Extender, which can arise in situations such as a power outage, a DataFlowEngine failure, or a disk full error when File Extender drives the commit of a transaction that is synchronized with WebSphere MQ. In these circumstances, the behaviour of the message flow is to either fail, retry, or skip messages in the batch for which there was a problem with the commit.
The File Output node can be configured to write data from the logical tree it receives to more than one output file. This is done dynamically using a Destination List that a message flow has configured using ESQL (or Java, or a plug-in node). The Destination List concept works like the MQ Destination List, which lets an MQOutput node write output data to multiple destination queues. The File Output node’s Advanced properties sheet contains the attribute Destination Mode. Changing this to Output Directory List and creating entries in the LocalEnvironment will make the node write the same file to all of the listed directory locations. The ESQL below creates a directory list with three separate file locations:
SET OutputLocalEnvironment.Destination.MBFE.DestinationData[1].Directory = 'C:\Dir1';
SET OutputLocalEnvironment.Destination.MBFE.DestinationData[2].Directory = 'C:\Dir2';
SET OutputLocalEnvironment.Destination.MBFE.DestinationData[3].Directory = 'C:\Dir3';
The File Descriptor contains an attribute that controls when a File Output node that is appending to an output file in a message-to-file message flow closes the file and releases it to the file system. A common situation in which this function might be used is when using the PROPAGATE statement to deal with large incoming messages. A single input message can be split into multiple sections using ESQL statements in a Compute node. The PROPAGATE statement finalizes the Compute node’s output message trees and propagates them to the downstream nodes within the message flow. It then clears the output message tree and reclaims the memory for further use. This technique can help you avoid parsing a large message all at once, which could have a large memory requirement. A common requirement is to write each part of the propagated message to a different file, and you can do so using the File Action attribute, which tells the File Output node to close the output file on receiving the message tree that was finalized as a result of the PROPAGATE. The ESQL below sets the File Action attribute of the File Descriptor:
SET OutputLocalEnvironment.Variables.MBFEProperties.action = 'close';
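To show where that statement might sit, here is a minimal sketch of a Compute node that propagates one section of a large message at a time and asks the File Output node to close the file after each one. The `action` path and the 'close' value are taken from the statement above; the module name and the input shape (an XML-domain message with a `batch` root containing repeating `section` elements) are assumptions made for illustration only.

```esql
CREATE COMPUTE MODULE SplitIntoFiles
  CREATE FUNCTION Main() RETURNS BOOLEAN
  BEGIN
    -- Hypothetical input shape: <batch><section>...</section>...</batch>
    DECLARE sec REFERENCE TO InputRoot.XML.batch.section[1];

    WHILE LASTMOVE(sec) DO
      -- PROPAGATE clears the output trees, so rebuild them on every pass.
      SET OutputRoot.Properties = InputRoot.Properties;
      SET OutputRoot.XML.section = sec;

      SET OutputLocalEnvironment = InputLocalEnvironment;
      -- Ask the File Output node to close the file after this section.
      SET OutputLocalEnvironment.Variables.MBFEProperties.action = 'close';

      PROPAGATE;

      -- Advance to the next element with the same type and name.
      MOVE sec NEXTSIBLING REPEAT TYPE NAME;
    END WHILE;

    -- Every section has already been propagated explicitly.
    RETURN FALSE;
  END;
END MODULE;
```

Returning FALSE prevents the node from propagating one more (empty) message when Main() finishes. The Compute Mode property must include LocalEnvironment for the `action` setting to reach the File Output node.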
Another scenario in which this function is useful is a message-to-file message flow where multiple input messages must be transformed and aggregated into a single output file until a particular group context or content decision is met. For example, a zero-length message on input might represent "end of business day" and require the file to be closed. Alternatively, an input message arriving after "end of business day" might require the file to be closed and further messages queued ready for processing the following day.
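The "end of business day" case might be sketched as follows, under the assumption that the trigger is a zero-length message read in the BLOB domain. The module name is hypothetical, and whether the File Output node accepts an empty body together with the close action is not stated in this article, so treat this as an outline rather than a tested flow.

```esql
CREATE COMPUTE MODULE CloseAtEndOfDay
  CREATE FUNCTION Main() RETURNS BOOLEAN
  BEGIN
    SET OutputRoot = InputRoot;
    SET OutputLocalEnvironment = InputLocalEnvironment;

    -- Assumption: the input is parsed in the BLOB domain, so a zero-length
    -- message has an empty (or absent) InputRoot.BLOB.BLOB element.
    IF COALESCE(LENGTH(InputRoot.BLOB.BLOB), 0) = 0 THEN
      -- Treat the empty message as the end-of-day marker and close the file.
      SET OutputLocalEnvironment.Variables.MBFEProperties.action = 'close';
    END IF;

    RETURN TRUE;
  END;
END MODULE;
```

Messages with content pass through without the close action, so the File Output node keeps appending them to the same file until the marker arrives.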

Ben Thompson is an IT Specialist on the Software Lab Services team at the IBM Hursley Software Lab in the UK. He provides technical consultancy on WebSphere Business Integration solutions. His areas of expertise include WebSphere Message Broker, WebSphere MQ, message modeling for legacy applications, XML schemas, and Web services implementations. You can contact Ben at bthomps@uk.ibm.com.




