How multiple file nodes share access to files in the same directory

IBM® App Connect Enterprise controls access to files so that only one file node at a time can read or write to a file.

When a message flow uses the FileInput or FileOutput node, additional instances (threads) might be associated with the message flow, or file nodes in other message flows in the same or different integration servers might refer to files in the same directory. IBM App Connect Enterprise controls the way in which multiple processes read from and write to files by moving the files to the mqsitransitin directory during processing, and locking them while they are being processed. The mqsitransitin directory is a subdirectory of the input directory specified in the FileInput node.

If you are using the FileInput node and choose to include subdirectories, an mqsitransitin directory is created as needed in all places where there is a file to be processed.

The integration node locks the files that are being read by the FileInput node or written by the FileOutput node, to prevent other integration servers from reading or changing the files while they are being processed. The integration node unlocks the file:
  • When a FileInput node finishes processing the input file.
  • When a FileOutput node finishes writing the file and moves it from the transit directory to the output directory.
Note: The FTEInput node does not use a transit directory. Each integration server has its own IBM MQ File Transfer Edition agent, and a node processes only files sent to the agent for which the node is deployed. The integration server ensures that only one node in the integration server processes each file.

Reading a file

When the FileInput node reads a file, it first moves the file into the mqsitransitin directory, where it is held during processing. A prefix (containing the integration node name and integration server name) is added to the file name to indicate which integration server is processing the file. While the file is in this directory, no other integration servers can access the file. The integration node maintains a lock subdirectory in the mqsitransitin directory, to ensure that files in the input directory are accessed by only one integration server at a time.
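The move-and-prefix step described above can be pictured with a minimal Python sketch. It is an illustration of the general technique (claiming a file by renaming it into a transit directory), not the product's implementation. The function name claim_file and the prefix strings such as NODE1_SERVER1_ are hypothetical, because this topic does not show the actual prefix format.

```python
import os
import tempfile
from typing import Optional

TRANSIT_DIR = "mqsitransitin"  # directory name taken from the text above


def claim_file(input_dir: str, name: str, owner_prefix: str) -> Optional[str]:
    """Try to claim input_dir/name by renaming it into the transit directory.

    Returns the new path on success, or None if the file has already been
    moved by another worker. owner_prefix is a hypothetical stand-in for
    the prefix that identifies the integration node and integration server.
    """
    transit = os.path.join(input_dir, TRANSIT_DIR)
    os.makedirs(transit, exist_ok=True)
    target = os.path.join(transit, owner_prefix + name)
    try:
        # A rename within one file system either fully succeeds or fails,
        # so only one caller can move a given source file.
        os.rename(os.path.join(input_dir, name), target)
    except FileNotFoundError:
        return None
    return target


# Usage: two hypothetical integration servers try to claim the same file.
input_dir = tempfile.mkdtemp()
with open(os.path.join(input_dir, "orders.txt"), "w") as f:
    f.write("record 1\n")

first = claim_file(input_dir, "orders.txt", "NODE1_SERVER1_")
second = claim_file(input_dir, "orders.txt", "NODE1_SERVER2_")
print(first is not None, second)  # True None
```

In this sketch the second caller finds that the file is no longer in the input directory, which mirrors the statement that no other integration server can access a file while it is held in the mqsitransitin directory.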

If multiple message flows or instances within an integration server are reading from the same input directory, each file is allocated to only one instance of one message flow. That instance processes the records in the file serially. Other instances of the message flow, or other message flows, can simultaneously process other files whose names match the pattern specified in the File name or pattern property of the node.

While a file is being processed, the file system is used to lock the file. As a result, other programs (including other integration servers) are prevented from reading, writing, or deleting the file while it is being processed by the file nodes.

While a FileInput node is reading a file, the file remains in the mqsitransitin directory until it has been fully processed (or until an unrecoverable error occurs). If the file is to be retained, it is held in a subdirectory of the mqsitransitin directory.

When the file has been processed, it is moved from the mqsitransitin directory back to the input directory. However, if the integration server stops unexpectedly while the file is in the mqsitransitin directory, you can manually restore the input file to the input directory by removing the prefix (containing the integration node name and integration server name) from the file name, and then moving it to the input directory. The input file is then processed by the next FileInput node that scans the directory.
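The manual recovery step described above (remove the prefix, then move the file to the input directory) could be scripted along the following lines. This is a sketch under stated assumptions: the helper name restore_input_file is hypothetical, and the prefix is passed in by the caller because this topic does not show its exact format. Check the actual file names in your mqsitransitin directory before you use anything like this.

```python
import os
import shutil
import tempfile


def restore_input_file(transit_path: str, input_dir: str, prefix: str) -> str:
    """Strip prefix from a stranded transit file and move it to input_dir."""
    name = os.path.basename(transit_path)
    if not name.startswith(prefix):
        raise ValueError(f"{name!r} does not start with {prefix!r}")
    restored = os.path.join(input_dir, name[len(prefix):])
    if os.path.exists(restored):
        # Refuse to overwrite a file that is already waiting in the input directory.
        raise FileExistsError(restored)
    shutil.move(transit_path, restored)
    return restored


# Usage: simulate a file left behind in the transit directory.
input_dir = tempfile.mkdtemp()
transit_dir = os.path.join(input_dir, "mqsitransitin")
os.makedirs(transit_dir)
stranded = os.path.join(transit_dir, "NODE1_SERVER1_orders.txt")  # hypothetical prefix
with open(stranded, "w") as f:
    f.write("record 1\n")

restored_path = restore_input_file(stranded, input_dir, "NODE1_SERVER1_")
print(os.path.basename(restored_path))  # orders.txt
```

The sketch refuses to overwrite an existing file of the same name, so that a newer input file that arrived in the meantime is not lost.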

You can fine-tune how FileInput nodes process files by setting the properties for the FileNodes parameter in the server.conf.yaml file, for example allowReadOnlyInputFiles, disableLocking, and avoidWriteLockCheck. For more information, see Configuring advanced file node properties by modifying the server.conf.yaml file.

If you use an NFS server, and have file nodes in different integration servers that access the same directory on the NFS server, ensure that you are using NFS version 4 so that file locking is supported correctly.

Writing a file

Files that are created and written by a FileOutput node are put in the output directory when they are finished. While records are being added to a file, it is kept in the mqsitransit subdirectory.

Each record is written by a single message flow instance. All message flow instances that are configured to write records to a specific file can append records to that file. Because instances can run in any order, the records that they write might be interleaved, so the sequence of records in the file might differ from the sequence in which the messages arrived. If you require the sequence of records in the output file to be maintained, ensure that only one FileOutput node instance uses the file: configure the message flow that contains the node to use the additional instances pool with zero instances, and ensure that other message flows do not write to the same file.
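The interleaving effect can be illustrated with a small, self-contained Python sketch. It does not use any App Connect Enterprise API. Three threads stand in for message flow instances, and the function write_records and the record layout are hypothetical. Each record is written whole, but the order of records across instances is not fixed.

```python
import os
import tempfile
import threading

output_path = os.path.join(tempfile.mkdtemp(), "out.txt")
write_lock = threading.Lock()  # keeps each record whole; does not order records


def write_records(instance_id: int, count: int) -> None:
    """Append count records to the shared file, one record per write."""
    for seq in range(count):
        with write_lock:
            with open(output_path, "a") as f:
                f.write(f"instance={instance_id} seq={seq}\n")


# Three threads stand in for three message flow instances.
threads = [threading.Thread(target=write_records, args=(i, 5)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

with open(output_path) as f:
    records = f.read().splitlines()

# All 15 records arrive, but their order across instances can vary between runs.
print(len(records))  # 15
```

Records from one instance stay in their own relative order, but records from different instances can appear in any combination. With a single writer (the equivalent of zero additional instances), the file order matches the write order.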

While a file is being processed, the file system is used to lock the file. As a result, other programs (including other integration servers) are prevented from reading, writing, or deleting the file while it is being processed by the file nodes. This lock is retained for a short period after a FileOutput node writes to the file without finishing it, leaving it in the transit directory. If message flows that are in the same integration server use the same output file and run sufficiently quickly, the integration node does not relinquish the lock before the file is finished. However, if the message flows have longer intervals between them, the integration node relinquishes the lock and another process or integration server can acquire a lock on the file. To prevent this situation, ensure that output directories are not shared between integration servers.