Configuring the FileInput node
When you add a FileInput node to a message flow, configure it to process messages that are read from files.
About this task
- Optional: On the Description tab,
enter a Short description,
a Long description, or both.
You can also rename the node on this tab.
- On the Basic tab, enter the directories
and files to be processed by the FileInput node, together with
what to do with any duplicate files encountered.
- In Input directory,
specify the directory from which the FileInput node obtains files.
Specify the directory as either an absolute or a relative directory
path. If the directory path is relative, it is based on the directory
specified in the environment variable MQSI_FILENODES_ROOT_DIRECTORY.
An example on Windows systems
is C:\fileinput. An example on UNIX systems is /var/fileinput.
On Windows, if you are specifying a shared directory that is mapped to your local computer, specify the share name instead of the letter that represents the drive; for example \\myshare\mydirectory.
The FileInput node creates an mqsitransitin subdirectory in the specified input directory to hold and lock input files while they are being processed. If an integration server that processes files in this input directory is removed, check the mqsitransitin subdirectory for partially processed or unprocessed files. Move any such files back into the input directory (and remove the integration server UUID prefix from the file names) so that they can be processed by a different integration server. For more information about the mqsitransitin subdirectory, see How multiple file nodes share access to files in the same directory.
Select Include local subdirectories if
you want files in the directory structure under Input directory to be processed.
The integration node must have access to the top-level directory that
you specify; if the integration node does not have access to a particular
subdirectory, that subdirectory is not searched and a message is written
to user trace. Files are processed according to their age; that is,
the oldest files are processed first, regardless of where they appear
in the directory structure. Whether a file is processed or ignored
is determined first by the file name or pattern
and second by the file exclusion pattern.
If you select both the Include local subdirectories and Remote Transfer properties, only the subdirectories on the local system are searched for files to process. On the remote system, only the top-level directory that you specified is searched.
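The relative-path rule for Input directory can be sketched as follows. This is a minimal illustration that handles UNIX-style paths only, not the node's own code. The function names and the decision to raise an error when MQSI_FILENODES_ROOT_DIRECTORY is unset are assumptions made for the example.

```python
import posixpath

ROOT_VARIABLE = "MQSI_FILENODES_ROOT_DIRECTORY"


def resolve_input_directory(configured, env):
    """Return the absolute directory for a configured Input directory value."""
    if posixpath.isabs(configured):
        return posixpath.normpath(configured)
    root = env.get(ROOT_VARIABLE)
    if not root:
        # Assumption: this sketch refuses to guess when the variable is unset.
        raise ValueError(f"{ROOT_VARIABLE} is not set; cannot resolve {configured!r}")
    return posixpath.normpath(posixpath.join(root, configured))


def transit_directory(input_directory):
    """Path of the mqsitransitin subdirectory under the input directory."""
    return posixpath.join(input_directory, "mqsitransitin")


print(resolve_input_directory("/var/fileinput", {}))
print(resolve_input_directory("orders/in", {ROOT_VARIABLE: "/var/fileinput"}))
print(transit_directory("/var/fileinput"))
```

The directory name orders/in is hypothetical; substitute the relative path that you configured on the node.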
- In File name or pattern,
specify a pattern for the file name. It is either a file name or a
character sequence (a pattern) that matches a file name. A pattern
is a sequence containing at least one of the following wildcard characters:
| Wildcard character | Description | Example |
| --- | --- | --- |
| * | Any sequence of zero or more characters | *.xml matches all file names with an xml extension |
| ? | Any single character | f??????.csv matches all file names consisting of the letter f followed by six characters and then the sequence .csv |
| Regular expression (regex) | A regular expression. Enclose the regular expression in parentheses. | MY.INPUT.FILE.([AB]) matches files with the names MY.INPUT.FILE.A and MY.INPUT.FILE.B |
If you have existing file name patterns that include parentheses, you must include a second set of parentheses inside the first set. For example, amend myFile(A).txt to myFile(\(A\)).txt to retain the parentheses.
If you have existing file name patterns that include the wildcard character * or ? within parentheses, you must amend the pattern so that the * or ? character is a valid regex. For example, amend MY.INPUT.FILE(*) to MY.INPUT.FILE(\(.*\)) or (MY\.INPUT\.FILE\(.*\)) to ensure a valid regex.
- In File exclusion pattern, specify a pattern for file names that you want to exclude from processing. Specify either a file name or a character sequence (a pattern) that matches a file name. A pattern is a sequence containing at least one of the wildcard characters detailed in Wildcard characters. For a file to be excluded from processing, its name must match the pattern.
- Select Action on successful
processing to specify the action that the FileInput node takes after
successfully processing the file. The action can be to move the file
to the archive subdirectory, to augment the file name with a time
stamp and move the source file to the archive subdirectory, or to
delete the file.
- If you select Move to Archive Subdirectory, the source file is moved to the archive subdirectory of the input directory. The subdirectory name is mqsiarchive. For example, if the input directory is /var/fileinput, the absolute path of the archive subdirectory is /var/fileinput/mqsiarchive. If this directory does not exist, the integration node creates it when it first tries to move a file there.
- If you select Add Timestamp and Move to Archive Subdirectory, the current date and time are added to the file name, and the file is then moved to mqsiarchive.
- If you select Delete, the file is deleted after successful processing.
- The FileInput node writes a message to the user trace, if user tracing is in operation, whenever it processes a file.
- Select Replace duplicate archive files if you want to replace a file in the archive subdirectory with a successfully processed file of the same name. If you do not set this option, and a file with the same name exists in the archive subdirectory, the node throws an exception when it tries to move the successfully processed file.
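The selection rules on the Basic tab (match the file name or pattern, then apply the exclusion pattern, then process the oldest files first) can be approximated outside the product to preview which files a given configuration would pick up. The following sketch is an approximation under stated assumptions, not the node's implementation: it uses Python's fnmatch module for the * and ? wildcards (fnmatch also interprets square brackets, which the wildcard table does not list), it does not handle the parenthesized regex form, it treats "age" as last modification time, and it skips the mqsitransitin, mqsiarchive, and mqsibackout subdirectories. The function name select_files is hypothetical.

```python
import fnmatch
import os
import tempfile
from pathlib import Path

# Working subdirectories named in this topic; skipping them is an assumption.
WORKING_DIRS = {"mqsitransitin", "mqsiarchive", "mqsibackout"}


def select_files(input_dir, pattern, exclusion=None, include_subdirs=False):
    """Return files that match `pattern` but not `exclusion`, oldest first."""
    root = Path(input_dir)
    candidates = root.rglob("*") if include_subdirs else root.iterdir()
    selected = []
    for path in candidates:
        if not path.is_file():
            continue
        if WORKING_DIRS & set(path.relative_to(root).parts[:-1]):
            continue
        if not fnmatch.fnmatchcase(path.name, pattern):
            continue
        if exclusion and fnmatch.fnmatchcase(path.name, exclusion):
            continue
        selected.append(path)
    # Assumption: "age" means last modification time.
    return sorted(selected, key=lambda p: p.stat().st_mtime)


# Small demonstration in a temporary directory with controlled timestamps.
with tempfile.TemporaryDirectory() as tmp:
    for name, mtime in [("b.xml", 200), ("a.xml", 100), ("a.tmp.xml", 50), ("notes.txt", 10)]:
        file_path = Path(tmp, name)
        file_path.write_text("x")
        os.utime(file_path, (mtime, mtime))
    print([p.name for p in select_files(tmp, "*.xml", exclusion="*.tmp.xml")])
```

In the demonstration, a.tmp.xml matches *.xml but is dropped by the exclusion pattern, and notes.txt never matches, so only a.xml and b.xml remain, in that order.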
- On the Input Message Parsing tab,
set values for the properties that the node uses to determine how
to parse the incoming message.
- In Message domain,
select the name of the parser that you are using from the supplied
list. The default is BLOB. You can choose from the following options:
- If you are using the DFDL parser, the XMLNSC parser
in validating mode, or the MRM parser, specify the relevant Message model. For XMLNSC, if your schema files are in an application or static library, leave this property blank. If your messages are modeled in a referenced shared library or message set, select the top-level shared library for the shared library or message set that contains the schema files.
- If you are using the DFDL or MRM parsers, select the correct message from the list in Message. This list is populated with messages that are defined in the Message model that you have selected.
- If you are using the MRM parser, select the format of the message from the list in Physical format. This list includes all the physical formats that you have defined for this Message model.
- Specify the message coded character set ID in Message coded character set ID.
- Select the message encoding from the list in Message encoding or specify a numeric encoding value. For more information about encoding, see Data conversion.
- On the Parser Options subtab, set
the following properties.
- Parse timing is, by default, set to On Demand, which causes parsing of the message to be delayed. To cause the entire message to be parsed immediately, set this property to Immediate or Complete. See Parsing on demand for more details.
- If you are using the XMLNSC parser, set values for the properties that determine how the XMLNSC parser operates. For more information, see Manipulating messages in the XMLNSC domain.
- On the Polling tab, enter a value
for the Polling interval.
This property controls the frequency with which the FileInput node accesses the
file system looking for files to process.
After the initial scan of the directory when the flow is started, whenever the directory is found to contain no files that match the input pattern, the FileInput node waits for the period defined by this property. This process avoids the need for the FileInput node to be continually accessing the file system, and consuming large amounts of system resource.
The smaller the value set in this property, the more quickly the FileInput node discovers files that are in the input directory. However, a smaller value increases the use of system resources. A larger value reduces the use of system resource but at the cost of the FileInput node discovering files to process less quickly.
Do not use this property as a means to regulate work, or to schedule processing. If you want the FileInput node to monitor the input directory for selected periods only, start and stop the message flow at appropriate times.
If you select the Remote Transfer property and set the Scan delay property on the FTP tab, the value that you set for Scan delay overrides the value set for Polling interval.
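The waiting behavior described for Polling interval can be pictured as a loop that sleeps only after a scan finds nothing to process. The following sketch is illustrative only. The names run_polling_loop, scan, and process are hypothetical, and the max_scans and sleep parameters exist solely so that the example terminates and can run without real delays.

```python
import time


def run_polling_loop(scan, process, polling_interval, sleep=time.sleep, max_scans=None):
    """Scan repeatedly; wait `polling_interval` seconds only after an empty scan."""
    scans = 0
    while max_scans is None or scans < max_scans:
        files = scan()
        scans += 1
        if files:
            for name in files:
                process(name)
        else:
            # Nothing matched the input pattern, so wait before the next scan.
            sleep(polling_interval)


# Demonstration with canned scan results and a fake sleep that records waits.
results = iter([["a.xml"], [], ["b.xml"], []])
processed, waits = [], []
run_polling_loop(lambda: next(results), processed.append, 5,
                 sleep=waits.append, max_scans=4)
print(processed, waits)
```

A smaller interval shortens the gap recorded in waits, at the cost of more frequent scans, which is the trade-off described above.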
- Use the Retry tab to define how
retry processing is performed when a message flow fails.
- Retry mechanism determines
the action that occurs if the flow fails:
- Select Failure for the node to report a failure without any retry attempts.
- Select Short retry for the node to try again before reporting a failure if the condition persists. The number of times that it tries again is specified in Retry threshold.
- Select Short retry and long retry for the node to try again, first using the value in Retry threshold as the number of attempts it is to make. If the condition persists after the Retry threshold value has been reached, the node then uses the value of Long retry interval between attempts.
- Specify a value for the Retry threshold property. This value is the number of times that the node tries the flow transaction again if the Retry mechanism property is set to either Short retry or Short retry and long retry.
- Specify a value for the Short retry interval property. This value is the length of time, in seconds, to wait between short retry attempts.
- Specify a value for the Long retry interval property. This value is the length of time to wait between long retry attempts until a message is successful, the message flow is stopped, or the message flow is redeployed. The MinLongRetryInterval integration node property defines the minimum value that the Long retry interval can take. If the value is lower than the minimum, the integration node value is used.
- Specify a value for the Action
on failing file property to determine what the node does
with the input file after all attempts to process its contents
fail. Choose from the following values:
- Move to Backout Subdirectory. The file is moved to the backout subdirectory of the input directory. The name of this subdirectory is mqsibackout. If the input directory is /var/fileinput, the absolute path of the backout subdirectory is /var/fileinput/mqsibackout. If this subdirectory does not exist, the integration node creates it when it first tries to move a file there. If the file cannot be moved to this subdirectory, perhaps because a file of the same name exists there, the node adds the current date and time to the file name and makes a second attempt to move the file. If this second attempt fails, the node stops processing. Messages BIP3331 and BIP3325 are issued. Resolve the problem with the subdirectory or file before attempting to restart the message flow.
- Delete. The file is deleted after processing fails.
- Add Time Stamp and Move to Backout Subdirectory. The current date and time are added to the file name, and then the file is moved to the backout subdirectory.
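The way the Retry tab properties combine can be sketched as a sequence of waits between attempts. The following generator is a simplified model, not the node's implementation. The function name retry_waits is hypothetical, and the sketch assumes that both intervals are expressed in the same unit.

```python
from itertools import islice


def retry_waits(mechanism, retry_threshold, short_interval, long_interval,
                min_long_interval=0):
    """Yield the wait before each retry attempt for a given Retry mechanism."""
    if mechanism == "Failure":
        return  # Report the failure without any retry attempts.
    for _ in range(retry_threshold):
        yield short_interval
    if mechanism == "Short retry and long retry":
        # A value below MinLongRetryInterval is replaced by that minimum.
        effective = max(long_interval, min_long_interval)
        while True:  # Continues until success, flow stop, or redeploy.
            yield effective


print(list(retry_waits("Failure", 3, 2, 300)))
print(list(retry_waits("Short retry", 3, 2, 300)))
print(list(islice(retry_waits("Short retry and long retry", 2, 1, 10,
                              min_long_interval=60), 4)))
```

The last call uses islice because long retries have no fixed upper bound in this model; the caller decides when to stop consuming waits.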
- Use the Records and Elements tab
to specify how each file is interpreted as records.
- Use the Record detection property
to determine how the file is split into records, each of which generates
a single message. Choose from the following options:
- Whole File specifies that the whole file is a single record. A limit of 100 MB applies to the size of the files.
- Fixed Length specifies that each record is a fixed number of bytes in length. Each record contains the number of bytes specified in the Length property, except possibly a shorter final record in the file. The value specified in Length must be in the range 1 byte through 100 MB. The default is 80 bytes.
- Select Delimited if the records that you are processing are separated, or terminated, by a DOS or UNIX line end or by a sequence of user-defined delimiter bytes. Specify the delimiter and delimiter type in the Delimiter and Delimiter type properties. A limit of 100 MB applies to the length of the records.
- Select Parsed Record Sequence if the file contains a sequence of one or more records that are serially recognized by the parser specified in Message domain. The node propagates each recognized record as a separate message. If you select this option, the parser specified in Message domain must be DFDL, XMLNSC, or MRM.
- If you specify Parsed Record Sequence in Record detection, the FileInput node does not determine or limit the length of a record. Nodes that are downstream in the message flow might try to determine the record length or process a long record. If you intend to process large records in this way, ensure that your integration node has sufficient memory. You might have to apply flow techniques described in the Large Messaging sample to make best use of the available memory.
- If you specified Delimited in Record detection, use Delimiter to specify the delimiter
to be used. Choose from the following values.
- DOS or UNIX Line End, which, on UNIX systems, specifies the line feed character (<LF>, X'0A'), and, on Windows systems, specifies a carriage return character followed by a line feed character (<CR><LF>, X'0D0A'). The node treats both of these strings as delimiters, irrespective of the system on which the integration node is running. If they are both in the same file, the node recognizes both as delimiters. The node does not recognize X'15' which, on z/OS® systems, is the 'newline' byte; specify a value of Custom Delimiter in this property and a value of 15 in the Custom delimiter property if your input file is coded using EBCDIC new lines, such as EBCDIC files from a z/OS system.
- Custom Delimiter, which permits a sequence of bytes to be specified in Custom delimiter.
- In Custom delimiter, specify the delimiter byte or bytes to be used when Custom Delimiter is set in the Delimiter property. Specify this value as a string that contains an even number of hexadecimal digits. The default is X'0A' and the maximum length of the string is 16 bytes (represented by 32 hexadecimal digits).
- If you specified Delimited in Record detection, use Delimiter type to specify the type
of delimiter. Permitted values are:
- Infix. If you select this value, each delimiter separates two records. If the file ends with a delimiter, the zero length file content following the final delimiter is still propagated as a message although it contains no data.
- Postfix. If you specify this value, each delimiter terminates a record. If the file ends with a delimiter, no empty record is propagated after the delimiter. If the file does not end with a delimiter, the file is processed as if a delimiter follows the final bytes of the file. Postfix is the default value.
- The FileInput node considers each occurrence of the delimiter in the input file as either separating (Infix) or terminating (Postfix) each record. If the file begins with a delimiter, the node treats the (zero length) file contents preceding that delimiter as a record and propagates an empty record to the flow. The delimiter is never included in the propagated message.
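The Fixed Length and Delimited record rules, including the difference between Infix and Postfix, can be checked against sample data with a short sketch. The code below models only the behavior described in this topic; the function names are hypothetical, and the result for a completely empty file is an artifact of the sketch, not documented node behavior.

```python
import re

# DOS or UNIX Line End: both <CR><LF> and <LF> are treated as delimiters.
LINE_END = re.compile(rb"\r\n|\n")


def parse_custom_delimiter(hex_digits):
    """Convert a Custom delimiter value such as '0A' or '15' to bytes."""
    if not hex_digits or len(hex_digits) % 2 != 0 or len(hex_digits) > 32:
        raise ValueError("expected 2 to 32 hexadecimal digits, in pairs")
    return bytes.fromhex(hex_digits)


def split_fixed(data, length=80):
    """Fixed Length: equal-sized records, except possibly a shorter last one."""
    return [data[i:i + length] for i in range(0, len(data), length)]


def split_delimited(data, delimiter=None, delimiter_type="Postfix"):
    """Delimited: split on line ends (delimiter=None) or on a custom delimiter."""
    records = LINE_END.split(data) if delimiter is None else data.split(delimiter)
    if delimiter_type == "Postfix" and records[-1] == b"":
        # A trailing delimiter terminates the last record; no empty record follows.
        records.pop()
    return records


print(split_fixed(b"abcdefg", 3))
print(split_delimited(b"a\r\nb\n"))                          # Postfix
print(split_delimited(b"a\r\nb\n", delimiter_type="Infix"))  # Infix
print(split_delimited(b"A\x15B\x15", parse_custom_delimiter("15")))
```

With Infix, the trailing delimiter produces a final empty record; with Postfix it does not. In both cases a leading delimiter yields an initial empty record, and the delimiter itself never appears in a record.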
- Use the Validation tab to provide validation based on the message set for predefined messages.
- On the FTP tab, select the Remote Transfer property if you want the node to read files from an FTP, FTPS, or SFTP server using the following properties:
- In Transfer Protocol, specify the protocol that is to be used for remote file transfer. Possible values are FTP and SFTP.
- In Remote server and port, supply the IP address and port number of the FTP or SFTP server to be used. Use one of the following syntaxes:
123 is the port number
- In Security identity, specify the name of a security identity that has been defined using the mqsisetdbparms command. The user identifier and password that are to be used to log on to the FTP, FTPS, or SFTP server are obtained from this definition, the name of which must have an ftp:: prefix. The value of this property is overridden by the value in the securityIdentity property of the FtpServer configurable service, if it is set.
- In Server directory, specify the directory in the FTP or SFTP server from which to transfer files. The default is a period (.) which means the default directory after logon. If you specify a relative path, the directory is based on the default directory after FTP or SFTP logon. Ensure that the syntax of the path conforms to the file system standards in the FTP or SFTP server. The value in this property is overridden by the value in the remoteDirectory property of the FtpServer configurable service, if it is set.
If you select both the Remote Transfer and Include local subdirectories properties, only the subdirectories on the local system are searched for files to process. On the remote system, only the top-level directory that you specified is searched.
- In Transfer mode, specify how files are transferred. Select Binary if the file contents are not to be transformed. Select ASCII if the file is to be transmitted as ASCII. The value of this property is overridden by the value in the transferMode property of the FtpServer configurable service, if it is set.
This property is valid only when FTP or FTPS is selected as the protocol for remote transfer. If you have specified SFTP as the protocol, the Transfer mode property is ignored and binary encoding is used.
- In Scan delay, specify the delay, in seconds, between directory scans. The default is 60 seconds. The value set in this property overrides the value set for the polling interval on the Polling tab when the Remote Transfer property is selected. The value of this property is overridden by the value in the scanDelay property of the FtpServer configurable service, if it is set.
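The override order for the scan interval (the scanDelay property of the FtpServer configurable service over the node's Scan delay, and Scan delay over Polling interval when Remote Transfer is selected) can be summarized in a few lines. This is a sketch of the precedence as described in this topic, not product code; the function and parameter names are hypothetical.

```python
def effective_scan_interval(polling_interval, remote_transfer=False,
                            node_scan_delay=60, service_scan_delay=None):
    """Return the interval, in seconds, that applies between directory scans."""
    if not remote_transfer:
        return polling_interval
    if service_scan_delay is not None:
        # scanDelay on the FtpServer configurable service wins if it is set.
        return service_scan_delay
    return node_scan_delay


print(effective_scan_interval(5))
print(effective_scan_interval(5, remote_transfer=True))
print(effective_scan_interval(5, remote_transfer=True, service_scan_delay=120))
```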
- On the Transactions tab, set the transaction mode. Although all file operations are non-transactional, the transaction mode on this input node determines whether the rest of the nodes in the flow are to be run under sync point. Select Yes if you want the flow updates to be treated transactionally, if possible, or No if you do not. The default for this property is No.
- Optional: On the Instances tab, set values for the properties that control the additional instances (threads) that are available for a node. For more details, see Configurable properties in a BAR file.