Configuring the batch processor
You can configure the batch processor by modifying the properties in the Batch.properties file.
MaxExceptionsAllowed=-1
- This property defines the maximum number of exceptions allowed.
When the batch processor detects that this number has been reached,
it terminates the process. A value of -1 tells the controller to ignore
any exceptions, meaning that there is no maximum.
queueSize=100
- This property defines the size of all queues used by the batch
processor. The queueSize must be set to a value greater than zero.
Once a queue reaches the queueSize limit, any operation that attempts
to add an element to the queue will be blocked. Similarly, any operation
that attempts to take an element from an empty queue will be blocked.
Note: Batch processor queues order elements in first in, first out order.
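The queue limits described above resemble the behavior of a bounded FIFO queue. The following is a minimal sketch only, assuming Java's standard java.util.concurrent.ArrayBlockingQueue as a stand-in; the batch processor's own queue classes are not shown in this topic, and the class name BoundedQueueSketch is hypothetical. The sketch uses the non-blocking offer and poll calls so that it terminates; their blocking counterparts are put and take.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Illustrative only: a standard-library bounded queue standing in for a batch processor queue.
final class BoundedQueueSketch {

    // Creates a bounded FIFO queue; like queueSize, the capacity must be greater than zero.
    static BlockingQueue<String> newQueue(int queueSize) {
        if (queueSize <= 0) {
            throw new IllegalArgumentException("queueSize must be greater than zero");
        }
        return new ArrayBlockingQueue<>(queueSize);
    }

    public static void main(String[] args) {
        BlockingQueue<String> queue = newQueue(2);
        System.out.println(queue.offer("first"));  // accepted
        System.out.println(queue.offer("second")); // accepted, the queue is now full
        System.out.println(queue.offer("third"));  // rejected; put() would block here instead
        System.out.println(queue.poll());          // elements leave in first in, first out order
    }
}
```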
ServerConfiguration.provider_url=<PROVIDER_URL>
- This property defines the URL of the InfoSphere® MDM operational
server. For example:
corbaloc:iiop:mdm-host:2809
ServerConfiguration.context_factory=com.ibm.websphere.naming.WsnInitialContextFactory
- This property defines the WebSphere® Application Server context factory.
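The two ServerConfiguration properties look like the usual inputs to a JNDI lookup. The following sketch assumes that they map onto the standard JNDI environment keys Context.PROVIDER_URL and Context.INITIAL_CONTEXT_FACTORY, which this topic does not confirm. The class name JndiEnvironmentSketch is hypothetical, and the host and port are taken from the example above. The sketch only builds the environment table; creating an InitialContext from it would require the WebSphere client classes on the class path.

```java
import java.util.Hashtable;
import javax.naming.Context;

// Illustrative only: shows how the two configured values could be handed to JNDI.
final class JndiEnvironmentSketch {

    // Builds a JNDI environment from the provider URL and the context factory class name.
    static Hashtable<String, String> buildEnvironment(String providerUrl, String contextFactory) {
        Hashtable<String, String> env = new Hashtable<>();
        env.put(Context.PROVIDER_URL, providerUrl);
        env.put(Context.INITIAL_CONTEXT_FACTORY, contextFactory);
        return env;
    }

    public static void main(String[] args) {
        Hashtable<String, String> env = buildEnvironment(
                "corbaloc:iiop:mdm-host:2809",
                "com.ibm.websphere.naming.WsnInitialContextFactory");
        System.out.println(env);
    }
}
```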
ReaderQueue=com.dwl.batchframework.queue.FileReaderQueue
- This property defines the default reader queue used by the batch processor.
WriterQueue=com.dwl.batchframework.queue.WriterChainedQueue
- This property defines the default writer queue used by the batch processor.
Writer.ForceStopAtCriticalError=false
- This property defines whether a batch instance is able to gracefully
shut down if it encounters a critical error.
true – the batch instance terminates immediately, without waiting for the messages in the writer queue to be written to the output file.
false – the batch instance is able to shut down gracefully.
Writer.includeTimestamp=true
- This property defines whether timestamps are written to the output
files.
true – the time and date when the message was written to the output files is recorded in the output file entry.
false – no time and date is included in the output files.
Writer.DateFormat=yyyy-MM-dd HH:mm:ss,SSS
- This property defines the format used to record timestamps in
the output file, if applicable.
Note: If this property is not defined or is defined with an invalid date format, the default date format is used: yyyy-MM-dd HH:mm:ss,SSS.
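The fallback rule in the note can be pictured with the following sketch. It assumes that the pattern follows java.text.SimpleDateFormat syntax, which the default value is compatible with but which this topic does not state explicitly. The class and method names are hypothetical.

```java
import java.text.SimpleDateFormat;
import java.util.Date;

// Illustrative only: falls back to the documented default when the pattern is unusable.
final class TimestampFormatSketch {

    static final String DEFAULT_PATTERN = "yyyy-MM-dd HH:mm:ss,SSS";

    // Returns a formatter for the configured pattern, or for the default pattern when
    // the configured one is missing, blank, or rejected by SimpleDateFormat.
    static SimpleDateFormat resolve(String configuredPattern) {
        if (configuredPattern == null || configuredPattern.trim().isEmpty()) {
            return new SimpleDateFormat(DEFAULT_PATTERN);
        }
        try {
            return new SimpleDateFormat(configuredPattern);
        } catch (IllegalArgumentException invalidPattern) {
            return new SimpleDateFormat(DEFAULT_PATTERN);
        }
    }

    public static void main(String[] args) {
        System.out.println(resolve("invalid").toPattern());        // falls back to the default
        System.out.println(resolve("dd/MM/yyyy").format(new Date()));
    }
}
```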
message_generator=com.ibm.mdm.batchframework.message.PlainMessageGenerator
- This property defines the message generator that interprets an entity record or a line in an input file and generates a batch message.
home=<batch home>
- This property defines the home directory of the batch processor.
If the directory does not exist, the batch processor creates it, along
with the following subdirectories:
- input – The root directory of the default input files. If you are defining an input file with a relative path in an addTask XML request, the batch processor tries to expand the relative path from the input directory.
- stage – The directory that stores all Stage, Result, and Restart files.
- logs – The directory that stores all Activity log files.
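The directory layout above can be sketched as follows. This is not the batch processor's own code; the class BatchHomeSketch and its methods are hypothetical, and the way relative input paths are expanded is an assumption based on the description of the input directory.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Illustrative only: creates the home directory with its three subdirectories.
final class BatchHomeSketch {

    static final String[] SUBDIRECTORIES = {"input", "stage", "logs"};

    // Creates home (if missing) together with input, stage, and logs.
    static Path ensureLayout(Path home) {
        try {
            for (String name : SUBDIRECTORIES) {
                Files.createDirectories(home.resolve(name));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return home;
    }

    // Expands a relative input file path from the input directory;
    // Path.resolve returns an absolute argument unchanged.
    static Path resolveInput(Path home, String inputFile) {
        return home.resolve("input").resolve(inputFile);
    }

    public static void main(String[] args) throws IOException {
        Path home = ensureLayout(Files.createTempDirectory("batch-sketch").resolve("batch01"));
        System.out.println(resolveInput(home, "customers.csv"));
    }
}
```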
instanceName=batch01
- This property defines the name of the batch processor instance.
The default name is batch01. Only one instance of the batch processor is allowed in a single location.
Tip: To run another instance at the same time, you must copy the batch processor package to another location and give it a different instance name.
A batch job can only be restarted on the same instance where it was originally processed.
resultCategorizer=com.ibm.mdm.batchframework.message.BatchMessageCategorizer
- This property defines the categorizer that the batch processor
uses to determine if each processed message is a success or a failure.
There are two included categorizers:
com.ibm.mdm.batchframework.message.BatchMessageCategorizer is the default categorizer, and uses the message status to determine if a processed message is a success or failure.
com.ibm.mdm.batchframework.message.ResultCodeMessageCategorizer uses the value of the <ResultCode> tag to determine if a processed message is a success or failure.
resultCategorizer.options=restart
- This property defines how the batch processor is restarted:
restart is the default value, and indicates a normal batch job restart from the place where a stopped batch job halted processing.
restartWithErrors indicates that a batch job restart will also retry any failed transactions.
progressStatus.refreshTime=60
- This property defines the heartbeat interval time. For details, see Monitoring the batch processing heartbeat.
duringChangeThreshold=10
- This property defines the threshold time, in seconds, that the
batch processor instance waits to pick up a batch job after it has
been added or updated. When the batch processor instance detects that
a job has been recently added or updated, it waits for the length
of time defined in this threshold before beginning work on the job
chain.
Important: This threshold should not be adjusted.
maxCommentLength=500
- This property defines the maximum character length of the Task
Comment string. If a Task Comment exceeds this length, the comment
will be truncated.
Tip: To prevent a large comment from being cut off when it is stored in ALERT.DESCRIPTION (which defines a limit of 1000 characters), you can modify the ALERT.DESCRIPTION definition so that it matches the value of maxCommentLength.
Important: When considering the storage space for the comment string, take into account languages that use multibyte characters, if applicable to your implementation.
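The difference between character length and storage size can be illustrated with the following sketch. It is an assumption that the limit is counted in Java characters (UTF-16 code units) and that the database stores UTF-8; this topic states neither. The class CommentLengthSketch is hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Illustrative only: character-based truncation versus byte-based storage size.
final class CommentLengthSketch {

    // Cuts the comment to at most maxCommentLength Java chars. A supplementary
    // character (surrogate pair) at the cut point could be split by this simple approach.
    static String truncate(String comment, int maxCommentLength) {
        return comment.length() <= maxCommentLength
                ? comment
                : comment.substring(0, maxCommentLength);
    }

    // Number of bytes the text occupies when encoded as UTF-8.
    static int utf8Bytes(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String comment = "\u65e5\u672c\u8a9e"; // three CJK characters
        System.out.println(comment.length());   // character count
        System.out.println(utf8Bytes(comment)); // byte count is larger for multibyte text
    }
}
```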
mdm.database.uri=
- This property defines the database connection URI.
For example, when using a data source:
mdm.database.uri=jndi:jdbc/DWLCustomer
For example, when using a JDBC URL:
mdm.database.uri=jdbc:db2://localhost:50000/MDMDB;user=db2admin;password=db2admin
mdm.database.prop.sslConnection=
- This property determines whether the database connection uses
SSL.
For example:
mdm.database.prop.sslConnection=true
database.jdbc.driver=
- This property defines the fully qualified Java™ class name of the JDBC driver. When using a JDBC URL, this property is
mandatory.
For example, for IBM® DB2®:
database.jdbc.driver = com.ibm.db2.jcc.DB2Driver
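The rule that a JDBC URL requires a driver class can be checked before startup with a sketch like the following. It assumes that a JDBC URI is recognized by its jdbc: prefix, as in the examples above; the class DatabaseSettingsSketch and its methods are hypothetical and are not part of the batch processor.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Illustrative only: detects a JDBC URI that is configured without a driver class.
final class DatabaseSettingsSketch {

    // Parses properties-file text with the standard java.util.Properties loader.
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // True when mdm.database.uri is a JDBC URL but database.jdbc.driver is empty or absent.
    static boolean driverMissing(Properties props) {
        String uri = props.getProperty("mdm.database.uri", "").trim();
        String driver = props.getProperty("database.jdbc.driver", "").trim();
        return uri.startsWith("jdbc:") && driver.isEmpty();
    }

    public static void main(String[] args) {
        Properties props = parse("mdm.database.uri=jdbc:db2://localhost:50000/MDMDB\n");
        System.out.println(driverMissing(props)); // the driver property has not been set
    }
}
```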
jta.jndi=jta/usertransaction
- This property defines the Java Transaction API UserTransaction that the batch processor will use to start a transaction.
runtime.override.input.csv=ReaderQueue=com.ibm.mdm.batchframework.bulkprocessing.queue.TitledSingleLineCSVFileReaderQueue; message_generator=com.ibm.mdm.batchframework.message.CSVStringMessageGenerator
runtime.override.input.db=ReaderQueue=com.ibm.mdm.batchframework.bulkprocessing.queue.DatabaseReaderQueue; message_generator=com.ibm.mdm.batchframework.message.CSVStringMessageGenerator
runtime.override is the keyword used for characteristics that are overridden at runtime. The key and value pairs after the first equals sign are loaded as properties, and are separated by semicolons. If several properties have the same key, the following order determines which one takes effect (later sources override earlier ones):
- Batch.properties
- Batch extension properties
- runtime.override.input.<inputType>
- runtime.override.jobdef.<TaskDefinitionId>
- RuntimeOverride specified in the batch job definition comment
- input.csv defines the runtime override for batch jobs that read input from a CSV file. The batch job definition comment contains the <File> tag.
TitledSingleLineCSVFileReaderQueue is capable of reading a single line CSV formatted input file with enclosing quotation marks. Each line of the input file is treated as a record with multiple columns that are separated by commas.
TitledCSVFileReaderQueue is an alternate input definition that is capable of reading an RFC 4180-compliant CSV formatted input file. For details, see RFC 4180 at http://tools.ietf.org/html/rfc4180. If necessary, you can also replace it with the name of a customized Java class that meets your business requirements.
- input.db defines the runtime override for batch jobs that read a database as the input. The batch job definition comment contains either dynamic search SQL or a SQLOverride.
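The parsing and precedence rules for runtime.override can be sketched as follows. This is an illustration under stated assumptions, not the batch processor's implementation: the class RuntimeOverrideSketch is hypothetical, surrounding whitespace is assumed to be ignored, and each layer is assumed to replace same-named keys from earlier layers.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: parses an override value and layers property sources.
final class RuntimeOverrideSketch {

    // Splits "key1=value1; key2=value2" (the text after the first equals sign
    // of a runtime.override property) into key and value pairs.
    static Map<String, String> parsePairs(String overrideValue) {
        Map<String, String> pairs = new LinkedHashMap<>();
        for (String part : overrideValue.split(";")) {
            String pair = part.trim();
            int equalsAt = pair.indexOf('=');
            if (equalsAt > 0) {
                pairs.put(pair.substring(0, equalsAt).trim(), pair.substring(equalsAt + 1).trim());
            }
        }
        return pairs;
    }

    // Applies the layers in order, so that later layers override earlier ones.
    static Map<String, String> merge(List<Map<String, String>> layers) {
        Map<String, String> effective = new LinkedHashMap<>();
        for (Map<String, String> layer : layers) {
            effective.putAll(layer);
        }
        return effective;
    }

    public static void main(String[] args) {
        Map<String, String> base = new LinkedHashMap<>();
        base.put("ReaderQueue", "com.dwl.batchframework.queue.FileReaderQueue");
        Map<String, String> csv = parsePairs(
                "ReaderQueue=com.ibm.mdm.batchframework.bulkprocessing.queue.TitledSingleLineCSVFileReaderQueue; "
                + "message_generator=com.ibm.mdm.batchframework.message.CSVStringMessageGenerator");
        System.out.println(merge(List.of(base, csv)).get("ReaderQueue"));
    }
}
```

In this sketch, the list passed to merge would follow the order given above, starting with the values from Batch.properties and ending with the RuntimeOverride from the batch job definition comment.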
runtime.override.jobdef.10=ParseAndExecConfiguration.OperationType=All; ParseAndExecConfiguration.requesterName=cusadmin; ParseAndExecConfiguration.requesterLanguage=100; ParseAndExecConfiguration.Parser=TCRMService; ParseAndExecConfiguration.Constructor=TCRMService; ParseAndExecConfiguration.CompositeTxn=no
- This property defines other properties that use the runtime override
function for batch jobs with a task definition ID of 10.
For each out-of-the-box job definition ID (except 100, which is reserved for implicitly created batch jobs), there is a corresponding runtime override property. You can customize the job definition IDs and add new properties for newly introduced job definition IDs. This enables you to adjust the runtime characteristics of all jobs with the specific job definition ID.
resultfile.sort.max_chunk_size=1000000
- This property defines the number of results in the result file
that are sorted in memory by the batch processor as one portion, and
are stored in a single file. The larger the value of this property,
the faster the sorting will be. However, a larger value also uses
more memory.
Attention: If this value is too large, it might cause an Out of Memory error.
resultfile.sort.max_open_chunks=200
- This property defines the number of files of sorted chunks that are merged in a single process. The batch processor opens these files at the same time. The value cannot exceed the operating system’s limit on the number of open files.
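To get a feel for how the two resultfile.sort properties interact, the following sketch estimates the number of chunk files and merge rounds for a given result count. It assumes a conventional external merge sort in which up to max_open_chunks files are merged at a time and the merged outputs are merged again until one file remains; this topic does not confirm that the batch processor works in multiple rounds. The class SortChunkSketch is hypothetical.

```java
// Illustrative only: back-of-the-envelope arithmetic for chunked sorting.
final class SortChunkSketch {

    // Number of chunk files needed to hold totalResults (ceiling division).
    static long chunkCount(long totalResults, long maxChunkSize) {
        return (totalResults + maxChunkSize - 1) / maxChunkSize;
    }

    // Number of merge rounds needed when at most maxOpenChunks files are merged at once.
    static int mergePasses(long chunkFiles, int maxOpenChunks) {
        if (maxOpenChunks < 2) {
            throw new IllegalArgumentException("at least two files must be merged at a time");
        }
        int passes = 0;
        while (chunkFiles > 1) {
            chunkFiles = (chunkFiles + maxOpenChunks - 1) / maxOpenChunks;
            passes++;
        }
        return passes;
    }

    public static void main(String[] args) {
        long chunks = chunkCount(500_000_000L, 1_000_000L);
        System.out.println(chunks);                   // 500 chunk files
        System.out.println(mergePasses(chunks, 200)); // 2 rounds: 500 files, then 3, then 1
    }
}
```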
taskcategory=8
- This property defines the task category type used to create batch jobs.
job.duplicateKeyError=<ReasonCode>12</ReasonCode>
- This property defines the error phrase that indicates a duplicate key error while creating a batch job.
job.requesterName=cusadmin
- This property defines the user for task management in the batch processor. The tasks and comments will be created or updated by the user defined here.
priority_min=10001
- This property defines the minimum priority type. This is used
for task management in the batch processor. This property, along with
the related priority_max property, defines the range of priority types that can be used for task management. These values represent the code types in the CdPriorityTp table.
priority_max=10020
- This property defines the maximum priority type. This is used
for task management in the batch processor. This property, along with
the related priority_min property, defines the range of priority types that can be used for task management. These values represent the code types in the CdPriorityTp table.
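The range defined by priority_min and priority_max can be expressed as a simple check. The sketch assumes that both bounds are inclusive, which this topic does not state; the class PriorityRangeSketch is hypothetical.

```java
// Illustrative only: tests whether a priority type falls inside the configured range.
final class PriorityRangeSketch {

    // Assumes inclusive bounds on both ends.
    static boolean inRange(int priorityType, int priorityMin, int priorityMax) {
        return priorityType >= priorityMin && priorityType <= priorityMax;
    }

    public static void main(String[] args) {
        System.out.println(inRange(10005, 10001, 10020)); // inside the default range
        System.out.println(inRange(10021, 10001, 10020)); // above priority_max
    }
}
```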