Running the bulk load program

The bulk load program loads or updates large amounts of configuration item (CI) data and relationship data in the TADDM database. The input to the bulk load program is a file that contains an XML document in Identity Markup Language (IdML) format. The bulk load program can also be used to define a large number of extended attributes.

In a streaming server deployment, the bulk load program updates data in the database on the storage server. You can run the bulk load program from the primary storage server, secondary storage server, or both at the same time. In a synchronization server deployment, you can run the program from the synchronization server.

To run the bulk load program, complete the following steps:

Check the $COLLATION_HOME/etc/bulkload.properties file for accuracy.
To accept the defaults, do not change anything in the file.
Verify that the working directory and the results directory mentioned in the bulkload.properties file are valid.
The working and results directories must exist before you run the bulk load program; otherwise, the program does not run. If you want to use different directories, you must create them manually and update the properties file. The bulk load program does not create these directories automatically.

To create the directories, use the same user account that starts and stops the TADDM server. If the bulk load program does not have permission to read from and write to the working and results directories, it cannot run.
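The directory requirements above can be verified before a load is started. The following is a minimal sketch, not part of TADDM: check_bulk_dir is a hypothetical helper, and the directory paths that you pass to it are assumed to be the ones configured in the bulkload.properties file. Run it as the user account that starts and stops the TADDM server.

```shell
#!/bin/sh
# Pre-flight check for the bulk load directories (illustrative sketch).
# check_bulk_dir is a hypothetical helper, not a TADDM command.

check_bulk_dir() {
    dir=$1
    # Create the directory if it is missing; the bulk load program will not.
    if [ ! -d "$dir" ]; then
        mkdir -p "$dir" || { echo "cannot create $dir" >&2; return 1; }
    fi
    # The account that runs the bulk load program needs read and write access.
    if [ ! -r "$dir" ] || [ ! -w "$dir" ]; then
        echo "no read/write permission on $dir" >&2
        return 1
    fi
    echo "ok: $dir"
}

# Example call (replace the placeholders with your configured directories):
#   check_bulk_dir /path/to/working_dir && check_bulk_dir /path/to/results_dir
```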
Run the bulk load program.
- For Windows operating systems, the bulk load script is $COLLATION_HOME/bin/loadidml.bat.
- For all other operating systems, the bulk load script is $COLLATION_HOME/bin/loadidml.sh.
Use the following command to run the bulk load program:
```
./loadidml.sh -o -f path_to_idml_file
```
Where:

-o

Instructs the bulk load program to override the list of processed files and load the IdML files, even if they were loaded before.

-f path_to_idml_file

Specifies the fully qualified path to the input file or to a directory that contains input IdML files. The directory where the input file is placed must not be the same as the working directory of the bulk load program. If a shared directory is used to stage the input file, or if files are copied to a local directory, this directory cannot be the same as the working, results, or log directory of the bulk load program. This parameter is required.
For example,
```
./loadidml.sh -o -f /opt/IBM/taddm/dlaxmls/testfile.xml
```
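The restriction on the input directory can be checked with a short script before the load is started. The following sketch is illustrative only: input_dir_is_safe is a hypothetical helper, and the working, results, and log directory paths are assumed to be supplied by you. It compares the resolved physical paths, so a symbolic link to the working directory is also detected.

```shell
#!/bin/sh
# input_dir_is_safe is a hypothetical helper. It succeeds only when the input
# directory (first argument) differs from every directory passed after it.
input_dir_is_safe() {
    # Resolve the input directory to its physical path.
    input=$(cd "$1" && pwd -P) || return 2
    shift
    for other in "$@"; do
        resolved=$(cd "$other" && pwd -P) || return 2
        if [ "$input" = "$resolved" ]; then
            echo "input directory must differ from $other" >&2
            return 1
        fi
    done
    return 0
}

# Example call (placeholders for your own directories):
#   input_dir_is_safe /path/to/idml_dir /path/to/working_dir /path/to/results_dir "$COLLATION_HOME/log"
```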
If the bulk load program does not run, read the messages in the bulkload.log file. The log file is located in the $COLLATION_HOME/log directory.
Depending on the size of the book, the capacity of the computer, and other variables, it might take a long time to load the data. The bulk load program might not write messages to the log file when waiting for the TADDM system to store information in the database. When one or more records are stored in the database, the results file and the log file are updated with the status. You must not cancel the bulk load program while it is loading data. The bulk load program exits when data loading is complete. For information about how to determine whether the bulk load program is running, see Bulk load program problems.
After the bulk load program runs, check the results file for problems that occurred during the load.
The results file is located in the resultsdir directory configured in the bulkload.properties file.
Look for a file with a .results extension that has the same name as the IdML file. For example, if the name of the imported IdML file is test.xml, the name of the results file is test.results. If the results file is empty, check for an error in the log file. Important entries in the results file are marked with SUCCESS and FAILURE tags. If statistics are enabled, messages that report the percentage of successful entries are also recorded. A FAILURE tag applies to an individual object and does not necessarily indicate a failure of the entire file. Objects marked as failed are not stored in the database.
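For a large results file, counting the tags gives a quick overview. The following sketch assumes only that the SUCCESS and FAILURE tags appear as plain text on the relevant lines; it makes no other assumption about the results file format. summarize_results is a hypothetical helper, not a TADDM command.

```shell
#!/bin/sh
# summarize_results is a hypothetical helper. It counts the lines that
# contain the SUCCESS and FAILURE tags in a results file.
summarize_results() {
    results_file=$1
    # An empty or missing results file means the log file must be checked.
    if [ ! -s "$results_file" ]; then
        echo "results file is missing or empty; check bulkload.log" >&2
        return 2
    fi
    # grep -c prints 0 and exits nonzero when nothing matches, hence "|| true".
    ok=$(grep -c 'SUCCESS' "$results_file" || true)
    failed=$(grep -c 'FAILURE' "$results_file" || true)
    echo "success=$ok failure=$failed"
    # Return nonzero when at least one object was not stored.
    [ "$failed" -eq 0 ]
}

# Example call (placeholder path):
#   summarize_results /path/to/results_dir/test.results
```

A nonzero return value indicates only that individual objects failed, not that the whole file was rejected.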
To process the same book again after the initial load, either use the -o flag or remove the specific entry from the processedfiles.list file.
The processedfiles.list file is located in the working directory specified in the bulkload.properties file.
If the bulk load program indicates that another bulk load program is running and this is not the case, go to the working directory and delete the .bllock file. Run the bulk load program again.
The .bllock file is a hidden file on UNIX systems because it starts with a period (.). Delete this file only if you are sure that another bulk load program is not already running.
Read the information in the bulkload.log file. The log file can contain details about messages that are displayed.
You can run the bulk load program on the synchronization server; however, the following limitations exist:
- No change propagation: There is a change history similar to that on the domain server, but there is no change propagation. For example, if you change the duplex setting on an L2 interface, the duplex setting is not displayed as a change to the computer system. The duplex setting is displayed only as a change to the L2 interface.
- No change aggregation: When an attribute changes from A -> B -> A in the same discovery (bulkload), the change (A -> B) or (B -> A) is not recorded in the change history report.
- Limited advanced reconciliation: The only topology builder agent that runs on the synchronization server is the CrossDomainDependencyAgent. If the logical connection has the same IP address for the 'from' and 'to' IP addresses, or the localhost is used, the CrossDomainDependencyAgent does not create a dependency. The Discovery Library Adapter (DLA) creates the relationships between both implicit and explicit objects.

For most situations, using just the -f and -o parameters is sufficient, but other parameters are supported, if needed. The following example shows some infrequently used parameters:

```
./loadidml.sh -f path_to_idml_file -u userid -p passwd \
  -g -c -e -o -b bidirectional_format_on_or_auto -l location_tag -loadEAMeta -override -disableIdmlCertificationTool
```

Where:

-u userid

Specifies the user ID to be used to authenticate with the TADDM server.

The -u parameter is optional. Supply a user ID only if the user ID has the correct permissions (full update and read privileges) and is defined in the TADDM server as a valid user.

-p passwd

Specifies the password used to authenticate with the TADDM server.

The -p parameter is optional. Supply a password only if the user ID has the correct permissions (full update and read privileges) and is defined in the TADDM server as a valid user.

-g

Specifies to use the graph writing algorithm to persist data into the database.

This option improves loading performance and is useful for loading XML files with data that has large arrays of contained objects. Tivoli Storage Productivity Center and Tivoli Configuration Manager discovery library IdML files are examples of files with large arrays of contained objects. Other files can also benefit from this algorithm. The graph writing algorithm writes batches of objects to the database at one time. The number of objects written is influenced by the cache size setting in the bulkload.properties file. Give careful consideration to the use of this algorithm because it has limitations.

Restriction: Because of current API limitations, the IdML file must have source tokens present for each object in order to perform graph writing. Source tokens, however, are an optional value in an IdML XML file. Therefore, if the -g option is provided and no source token is available for an object, a dummy source token is automatically generated for that object using the required object ID from the XML file. The dummy source tokens are not displayed as launch in context tokens. However, the dummy source tokens are displayed for individual object attributes and in the bulk load log file. This behavior is a normal part of the algorithm.

If any single element does not satisfy the naming rules, or fails to be written to the database for any reason, the entire graph, or a subset of its elements, might fail to be persisted. Because of current limitations, error messages that indicate the specific object that caused the failure are not available. To pinpoint the problem, run the file without the -g option.

Certain IdML files reuse source token values for more than one object. While permissible in IdML, these files cannot be processed with the -g option due to current limitations. Files that reuse source tokens between objects must be loaded without the -g option.

Graph writing requires additional memory at both the client and the server. If an out of memory error occurs, reduce the cache size setting in the properties file or do not use the -g option.

Abstract resources are not supported during graph writing. Process the files that contain these characteristics without the -g option. Extended attributes are supported during graph writing.
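Because files that reuse source tokens cannot be loaded with the -g option, it can help to scan an IdML file for repeated values first. The following sketch is a plain text search, not an XML parser: it assumes that source tokens are written as double-quoted sourceToken attributes, and it relies on the widely available grep -o option. duplicate_source_tokens is a hypothetical helper.

```shell
#!/bin/sh
# duplicate_source_tokens is a hypothetical helper. It prints every
# sourceToken attribute that appears more than once in the given IdML file.
duplicate_source_tokens() {
    # Extract each sourceToken="..." attribute, then keep only repeated ones.
    grep -o 'sourceToken="[^"]*"' "$1" | sort | uniq -d
}

# Example call (placeholder path); empty output means no repeated token was found:
#   duplicate_source_tokens /path/to/file.xml
```

If the helper prints any value, load the file without the -g option.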

-c

Specifies that the IdML source files are copied to the bulk load working directory and processed there. Copying large files might cause delays.

-e

Specifies that data loading error information is made available in the program return code. By default, the bulk load program returns exit code 0 even if an error occurs when loading data. The -e parameter instructs the program to return code 5 when an error occurs when loading data. Note that the return code from the bulk load program itself takes precedence even if the -e parameter is specified. For example, if the bulk load program cannot connect to the TADDM server, the return code reflects that failure.
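A calling script can use the return code to decide what to do next. The following sketch maps the codes described above to short messages; interpret_bulkload_status is a hypothetical helper, and only codes 0 and 5 are taken from this topic. Any other nonzero code is treated generically as a failure of the bulk load program itself.

```shell
#!/bin/sh
# interpret_bulkload_status is a hypothetical helper that maps the exit
# status of a "loadidml.sh -e ..." run to a short message.
interpret_bulkload_status() {
    case "$1" in
        0) echo "load completed without data loading errors" ;;
        5) echo "data loading errors occurred; check the results file" ;;
        *) echo "bulk load program failed with code $1; check bulkload.log" ;;
    esac
}

# Typical use (commented out because it needs a TADDM installation):
#   "$COLLATION_HOME/bin/loadidml.sh" -e -o -f /opt/IBM/taddm/dlaxmls/testfile.xml
#   interpret_bulkload_status $?
```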

-b bidirectional_format_on_or_auto

Specifies whether bidirectional support is enabled, disabled, or automatically configured. Choices for the bidirectional flag are on and auto. When the bidirectional flag is on, you can configure the bidirectional parameters for each Management Software System using the predefined bidirectional profiles. When the bidirectional flag is set to auto, the bidirectional transformation is enabled and the bidirectional format is detected automatically.

If you are using SSH, do not specify on for the bidirectional flag. When you choose on for the bidirectional flag and use SSH, the bulk load bidirectional configuration window is not displayed. Without completing the fields in the bulk load bidirectional configuration window, you cannot configure the bidirectional parameters.

-l location_tag

Specifies a location tag value when loading IdML files. Every configuration item that is loaded from the IdML file has this location tag value assigned. If more than one IdML file is present in the same directory and each file requires a unique location tag, you must load the files separately. Make sure that the com.ibm.cdb.locationTaggingEnabled value in the $COLLATION_HOME/etc/collation.properties file is set to true.

For more information about location tagging, see Configuring location tagging.
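The property setting can be confirmed from the command line before a load with the -l parameter. The following sketch assumes that the property is written on a single line in key=value form; location_tagging_enabled is a hypothetical helper, not a TADDM command.

```shell
#!/bin/sh
# location_tagging_enabled is a hypothetical helper. It succeeds only when the
# property is explicitly set to true in the given properties file.
location_tagging_enabled() {
    props_file=$1
    grep -Eq '^[[:space:]]*com\.ibm\.cdb\.locationTaggingEnabled[[:space:]]*=[[:space:]]*true[[:space:]]*$' "$props_file"
}

# Example call:
#   location_tagging_enabled "$COLLATION_HOME/etc/collation.properties" \
#       || echo "location tagging is not enabled" >&2
```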

-loadEAMeta

Note: This flag is related to extended attributes metadata.

Forces the bulk load program to ignore attribute values and to store only the extended attributes metadata. Each new attribute is added to the attributes that were previously defined for the same CDM class in the metadata. The type of the new attribute in the attribute metadata is set to 'String'.

If -loadEAMeta is passed, the extended attributes metadata can be defined with the following books:

- Regular IdML books
- IdML books with metadata definitions only

If both the -loadEAMeta and -g options are passed, the -loadEAMeta option takes precedence over the -g option, and the graph writing mode is ignored.

Example

For the following part of the IdML source file, the bulk load program with the -loadEAMeta option defines the myExtAttr1, myExtAttr2 and myExtAttrInCategory extended attributes for the WindowsComputerSystem component type. The myExtAttrInCategory attribute is defined in the myExtAttrCategory category.

```
<cdm:sys.windows.WindowsComputerSystem id="9.10.10.10-WindowsSystem" sourceToken="ip_address=9.10.10.10">
    <cdm:extension>
        <cdm:extattr name="myExtAttr1">value1</cdm:extattr>
        <cdm:extattr name="myExtAttr2">value2</cdm:extattr>
        <cdm:extattr category="myExtAttrCategory" name="myExtAttrInCategory">value3</cdm:extattr>
    </cdm:extension>
    ...
</cdm:sys.windows.WindowsComputerSystem>
```

-override

Note: This flag is related to extended attributes metadata.

If this flag is passed with the -loadEAMeta flag, it forces redefinition of the attribute type when the attribute is already defined and its type is not 'String'.

Example

```
./loadidml.sh -f /opt/IBM/taddm/dlaxmls/testfile.xml \
  -u admin -p password -g -c -o -b auto -l tag
```

-disableIdmlCertificationTool

Disables the validation of IdML books that otherwise takes place before the bulk load program processes them.