Tips for operating Data Integrator

You can use the following tips to enhance how your company operates the IBM® TRIRIGA® Data Integrator tool.

Table 1. Tips for operating Data Integrator
Associate existing records
After all records are created, reuse the Header File to create an Associative upload Header File. Establishing associations after all records are created avoids errors caused by associating to records that do not yet exist.

Checking for data upload errors
Every time the Data Integrator runs and has errors, it creates a Data Upload Error record. To find a Data Upload Error record, select Tools > System Setup > System > Data Upload Error. Scroll through the records in the result page. Click the file name hyperlink.

Confirm through BO result page
After you receive a successful upload notification, review the data in the business object (BO) result page. For hierarchical objects, ensure that the child records appear under the proper parent in the hierarchy window.

Data Integrator cautions
Data Integrator does less checking of the data than a form into which you manually enter data:
  • It creates or updates records with missing values for required fields.
  • It allows any value in a list field, even if the value is not in the list.
  • It does not run requested verifications on field values.
If an upload does not explicitly set the organization and geography of the records it creates, the security access to those records might not be what is intended. You might need to include organization or geography fields in flat file records to ensure proper security access to created records.
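Because Data Integrator skips these checks, you might validate a flat file yourself before you upload it. The following Python sketch shows one possible pre-upload check; the column names, required fields, and allowed list values are hypothetical and must be adapted to your own business object.

# pre_upload_check.py: illustrative pre-upload validation of a tab-delimited
# Data Integrator file. The field names and allowed values below are hypothetical.
import csv
import sys

REQUIRED_FIELDS = ["triNameTX", "triOrganizationTX", "triGeographyTX"]  # assumed required columns
ALLOWED_LIST_VALUES = {"triStatusCL": {"Active", "Inactive"}}           # assumed list field and values

def check(path):
    problems = []
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        for line_number, row in enumerate(reader, start=2):  # line 1 is the header row
            for field in REQUIRED_FIELDS:
                if not (row.get(field) or "").strip():
                    problems.append(f"line {line_number}: required field {field} is empty")
            for field, allowed in ALLOWED_LIST_VALUES.items():
                value = (row.get(field) or "").strip()
                if value and value not in allowed:
                    problems.append(f"line {line_number}: {field} value '{value}' is not in the list")
    return problems

if __name__ == "__main__":
    for problem in check(sys.argv[1]):
        print(problem)
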
Data Upload records
Every time that you use the Data Integrator to initiate the reading of data, it creates a Data Upload record. If you initiate a schedule of batch uploads, there is still just one Data Upload record. After an upload completes, review the corresponding Data Upload record to see how the upload went. The Data Upload record has a smart section that contains Data Upload Error records; if any errors occurred, you see them there. To find a Data Upload record, select Tools > System Setup > System > Data Upload. Scroll through the records in the result page. Click the file name hyperlink.

Folder cleanup
System Administration processes for Data Integrator batch upload must include periodic cleanup of the input and log folders.

Managing files for upload

When files are uploaded in batch mode, the upload happens without the direct participation of a person. The platform looks for the files on the computer it runs on rather than on a computer that a person is using.

To identify the computer and the directory that is used for batch data uploads, you might need to consult with the person who administers the IBM TRIRIGA Application Platform environment for your organization.

To ensure that files are processed correctly, observe the following rules about when and how to put a file in the directory for processing:
  • Before you put a file in the directory for batch upload, check whether a file with the same name is already there. If it is, the last file with that name has not been processed yet; after a file is processed, it is moved to another directory.
  • To ensure that a file in the directory is processed correctly, wait for the existing file to disappear before you put a new file with the same name in the directory.
  • Do not put a file in the directory by copying it from somewhere else. A copy writes the file into the directory one piece at a time, so for a period only part of the file is present. If the IBM TRIRIGA Application Platform tries to process the file before it is complete, it might not process the file correctly.
  • Instead of copying the file into the directory, first put the file somewhere else on the same file system as the directory, and then rename or move the file into the directory. When you rename or move a file within the same file system, no file copying takes place; an entry is simply added to the directory that points to the file where it already is, so the complete file appears at once. A staging approach along these lines is sketched after this list.
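A minimal Python sketch of this staging approach follows. The directory paths are hypothetical; use the batch input folder that is configured for your environment, and keep the staging folder on the same file system.

# stage_upload.py: illustrative staging of a file into the batch upload directory.
# Both directory paths are hypothetical examples.
import os
import shutil

BATCH_DIR = "/opt/tririga/di/input"      # assumed batch input directory
STAGING_DIR = "/opt/tririga/di/staging"  # assumed staging folder on the same file system

def stage(source_path):
    name = os.path.basename(source_path)
    if os.path.exists(os.path.join(BATCH_DIR, name)):
        raise RuntimeError(f"{name} is still waiting to be processed; try again later")
    staged_path = os.path.join(STAGING_DIR, name)
    shutil.copy2(source_path, staged_path)                  # the slow copy happens outside the batch directory
    os.replace(staged_path, os.path.join(BATCH_DIR, name))  # a rename on the same file system is atomic

if __name__ == "__main__":
    stage("exports/triBuilding-upload.txt")  # hypothetical source file

Because the final rename is atomic, the platform never sees a partially written file in the batch directory.
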
Processing an entire batch

It is possible to have a workflow that processes all the records in the batch after the individual records are created or updated. When an upload of a tab-delimited file is finished, the platform runs an Apply action on the Data Upload record that is associated with the upload.

As records are processed, an association is created from the Data Upload record to the affected records. The association name is Data Load Status Of. You can use the Association Manager to define an association with this name from the Data Upload business object to the other relevant business object. If you do so, a workflow that is started by an Apply action on a Data Upload record can process all of the uploaded records.

You can use a Retrieve Workflow task to retrieve records that are sorted in the order of their value in a particular field. Therefore, you can use any field in the records to determine the order in which the workflow processes them.

Retire formula, rollup, and calculation workflows
In Production environments, all formula, rollup, and calculation workflows must be retired against the business object before the upload is run. This procedure ensures that the records are created without forcing the platform to run unnecessary workflows against partial data. It also helps eliminate looping that might occur against hierarchical objects. After the upload is successful, republish and run the workflows against your new data set.

Run all files during off-peak hours
Depending on the amount of information that is uploaded, server resources can become taxed and performance can degrade. Users might notice a performance hit, which might result in false positives and calls to your help desk.

Timing of record processing

The IBM TRIRIGA Application Platform does not necessarily process uploaded records in the same order they appear in the tab-delimited file. It reads the data from each tab-delimited record in sequence and then uses a different thread to process the data to create or update a record.

In most cases, it does not matter in what order records are processed. If you must process the batch records in the same order that they are read, you must force the Data Integrator to process all the records with the same thread.

To accomplish this task, set the number of threads that are used by the Data Import Agent to 1.

TRIRIGAWEB.properties for batch upload
The TRIRIGAWEB.properties file contains file settings for batch upload, specifically for input, processing, output, errors, and log. The files that are defined in TRIRIGAWEB.properties must exist and the file names must match.

Upload records first
Do not upload a Data Integrator file with all fields and all associations in one file. Create all records for all business objects first. After you import all records, begin the task of associating the records to one another.

Upload sets of records
Do not push massive amounts of data in a single Data Integrator file. Use sets of 5000 records per Data Integrator file. This practice reduces errors that are related to memory and system capacity.

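One way to keep to this limit is to split a large export into 5000-record files, repeating the header line in each file. A minimal Python sketch, with a hypothetical file name:

# split_upload.py: illustrative splitting of a large tab-delimited Data Integrator
# file into parts of at most 5000 data records, each with the original header line.
CHUNK_SIZE = 5000

def write_part(path, part, header, lines):
    with open(f"{path}.part{part:03d}.txt", "w", encoding="utf-8") as target:
        target.write(header)
        target.writelines(lines)

def split_file(path):
    with open(path, encoding="utf-8") as source:
        header = source.readline()
        chunk, part = [], 1
        for line in source:
            chunk.append(line)
            if len(chunk) == CHUNK_SIZE:
                write_part(path, part, header, chunk)
                chunk, part = [], part + 1
        if chunk:
            write_part(path, part, header, chunk)

if __name__ == "__main__":
    split_file("triBuilding-upload.txt")  # hypothetical file name
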
Uploading images

If you must assign images to newly uploaded records, you can use the following format. Assume that the image field is named triImageIM.

triImageIM
//Company-1/file123.jpg

The file name must start with the word "file", for example, filemyimagename.jpg or file12345.gif.

Based on the example, the file must be placed in the [TRIRIGA install directory]\userfiles\Company-1 directory before you upload the Data Integrator file.
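
If you have many images, you might stage the files and generate the matching triImageIM values in one pass. The following sketch assumes a hypothetical install path and source folder; only the //Company-1/file... value format and the userfiles\Company-1 location come from this tip.

# stage_images.py: illustrative staging of image files for a Data Integrator upload.
# The install path and source folder are hypothetical examples.
import os
import shutil

USERFILES_DIR = r"C:\tririga\userfiles\Company-1"  # assumed [TRIRIGA install directory]\userfiles\Company-1
SOURCE_DIR = r"C:\exports\images"                  # assumed folder that holds the images to upload

def stage_images():
    values = []
    for name in os.listdir(SOURCE_DIR):
        source = os.path.join(SOURCE_DIR, name)
        if not os.path.isfile(source):
            continue
        target_name = name if name.startswith("file") else "file" + name  # file names must start with "file"
        shutil.copy2(source, os.path.join(USERFILES_DIR, target_name))
        values.append("//Company-1/" + target_name)  # value to put in the triImageIM column
    return values

if __name__ == "__main__":
    for value in stage_images():
        print(value)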