Running the bulk load program

The bulk load program loads or updates large amounts of configuration item (CI) data and relationship data in the TADDM database. The input to the bulk load program is a file that contains an XML document in Identity Markup Language (IdML) format. You can also use the bulk load program to define a large number of extended attributes.

In a streaming server deployment, the bulk load program updates data in the database on the storage server. You can run the bulk load program from the primary storage server, secondary storage server, or both at the same time. In a synchronization server deployment, you can run the program from the synchronization server.

To run the bulk load program, complete the following steps:

  1. Check the $COLLATION_HOME/etc/bulkload.properties file for accuracy.
    To accept the defaults, do not change anything in the file.
  2. Verify that the working directory and the results directory specified in the bulkload.properties file are valid.

    The working and results directories must exist before you run the bulk load program; otherwise, the program does not run. The bulk load program does not create these directories automatically. If you want to use different directories, you must create them manually and update the properties file.

    To create the directories, use the same user account that starts and stops the TADDM server. If the bulk load program does not have permission to read from and write to the working and results directories, it cannot run.
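
    The directory check in this step can be scripted before each load. The following is a minimal sketch, not part of TADDM: check_bulk_dir is a hypothetical helper name, and the paths in the commented call are placeholders for the directories configured in bulkload.properties. Run it as the user account that starts and stops the TADDM server.

    ```shell
    # Hypothetical helper (not shipped with TADDM): succeeds only if the path
    # is an existing directory that the current user can read and write.
    check_bulk_dir() {
        dir=$1
        if [ ! -d "$dir" ]; then
            echo "missing directory: $dir" >&2
            return 1
        fi
        if [ ! -r "$dir" ] || [ ! -w "$dir" ]; then
            echo "no read/write permission: $dir" >&2
            return 1
        fi
        return 0
    }

    # Placeholder paths; use the directories configured in bulkload.properties.
    # check_bulk_dir /path/to/workdir && check_bulk_dir /path/to/resultsdir
    ```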

  3. Run the bulk load program.
    • For Windows operating systems, the bulk load script is $COLLATION_HOME/bin/loadidml.bat.
    • For all other operating systems, the bulk load script is $COLLATION_HOME/bin/loadidml.sh.
    Use the following command to run the bulk load program:
    ./loadidml.sh -o -f path_to_idml_file
    Where:
    -o
    Instructs the bulk load program to override the list of processed files and load the IdML files, even if they were loaded before.
    -f path_to_idml_file
    Specifies the fully qualified path to the input file or to a directory that contains input IdML files. This parameter is required. The directory that holds the input file, whether it is a shared staging directory or a local directory to which files are copied, must not be the same as the working, results, or log directory of the bulk load program.
    For example:
    ./loadidml.sh -o -f /opt/IBM/taddm/dlaxmls/testfile.xml
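
    The restriction on the input directory can be checked before the program is started. The following is a minimal sketch, assuming a POSIX shell: same_dir is a hypothetical helper, not a TADDM command, and the working directory path in the commented call is a placeholder. It compares the physical paths, so a symbolic link to the working directory is also detected.

    ```shell
    # Hypothetical helper (not shipped with TADDM): succeeds if both paths
    # resolve to the same physical directory; returns 2 if either is unusable.
    same_dir() {
        a=$(cd "$1" 2>/dev/null && pwd -P) || return 2
        b=$(cd "$2" 2>/dev/null && pwd -P) || return 2
        [ "$a" = "$b" ]
    }

    # Placeholder working directory; repeat for the results and log directories.
    # if same_dir /opt/IBM/taddm/dlaxmls /path/to/workdir; then
    #     echo "input directory must differ from the working directory" >&2
    # fi
    ```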
  4. If the bulk load program does not run, read the messages in the bulkload.log file. The log file is located in the $COLLATION_HOME/log directory.

    Depending on the size of the IdML file (the book), the capacity of the computer, and other variables, it might take a long time to load the data. The bulk load program might not write messages to the log file while it waits for the TADDM system to store information in the database. When one or more records are stored in the database, the results file and the log file are updated with the status. Do not cancel the bulk load program while it is loading data. The bulk load program exits when data loading is complete. For information about how to determine whether the bulk load program is running, see Bulk load program problems.

  5. After the bulk load program runs, check the results file for problems that occurred during the load.
    The results file is located in the resultsdir directory configured in the bulkload.properties file.

    Look for a file that has the same name as the IdML file and a .results extension. For example, if the imported IdML file is named test.xml, the results file is named test.results. If the results file is empty, check the log file for an error. Important entries in the results file are marked with SUCCESS and FAILURE tags. If statistics are enabled, messages that report the success percentage are also recorded. A FAILURE tag applies to an individual object and does not necessarily mean that the entire file failed. Objects that are marked as failed are not stored in the database.
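
    A quick tally of the tags can show whether a results file needs closer inspection. The following is a minimal sketch under stated assumptions: results_name and summarize_results are hypothetical helpers, and the count assumes that each tagged entry occupies its own line, which the results file format might not guarantee.

    ```shell
    # Derive the results file name from the IdML file name
    # (test.xml -> test.results), following the naming rule in this step.
    results_name() {
        base=$(basename "$1")
        echo "${base%.*}.results"
    }

    # Count the lines that carry each tag. Assumes one tagged entry per line.
    summarize_results() {
        file=$1
        [ -r "$file" ] || { echo "cannot read: $file" >&2; return 1; }
        # grep -c prints 0 and exits with 1 when nothing matches; ignore that status.
        ok=$(grep -c 'SUCCESS' "$file" || true)
        bad=$(grep -c 'FAILURE' "$file" || true)
        echo "SUCCESS=$ok FAILURE=$bad"
    }

    # Placeholder results directory; use the resultsdir value from bulkload.properties.
    # summarize_results "/path/to/resultsdir/$(results_name testfile.xml)"
    ```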

  6. To process the same book again after the initial load, either use the -o flag or remove the entry for that book from the processedfiles.list file.
    The processedfiles.list file is located in the working directory specified in the bulkload.properties file.
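
    Removing an entry by hand can be replaced by a small script. The following is a minimal sketch, assuming that processedfiles.list holds one entry per line and that the entry contains the IdML file name; neither detail is confirmed here, so inspect the file first. forget_processed is a hypothetical helper, and it keeps a .bak copy of the list. Run it only while no bulk load program is running.

    ```shell
    # Hypothetical helper (not shipped with TADDM): drop every line of the list
    # that contains the given file name. The match is a plain substring match,
    # so "test.xml" also removes an entry such as "mytest.xml".
    forget_processed() {
        list=$1
        name=$2
        [ -f "$list" ] || { echo "not found: $list" >&2; return 1; }
        cp "$list" "$list.bak" || return 1
        # grep exits with 1 when no lines remain; that is not an error here.
        grep -v -F -- "$name" "$list.bak" > "$list" || true
    }

    # Placeholder working directory; use the value from bulkload.properties.
    # forget_processed /path/to/workdir/processedfiles.list testfile.xml
    ```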
  7. If the bulk load program reports that another bulk load program is running, but no other instance is running, go to the working directory and delete the .bllock file. Then run the bulk load program again.
    The .bllock file is a hidden file on UNIX systems because it starts with a period (.). Delete this file only if you are sure that another bulk load program is not already running.

    Read the information in the bulkload.log file. The log file can contain details about messages that are displayed.

  8. You can run the bulk load program on the synchronization server; however, the following limitations apply:
    • No change propagation: There is a change history similar to that on the domain server, but there is no change propagation. For example, if you change the duplex setting on an L2 interface, the duplex setting is not displayed as a change to the computer system. The duplex setting is displayed only as a change to the L2 interface.
    • No change aggregation: When an attribute changes from A -> B -> A in the same discovery (bulk load), neither the A -> B change nor the B -> A change is recorded in the change history report.
    • Limited advanced reconciliation: The only topology builder agent that runs on the synchronization server is the CrossDomainDependencyAgent. If the logical connection has the same IP address for the 'from' and 'to' IP addresses, or the localhost is used, the CrossDomainDependencyAgent does not create a dependency. The Discovery Library Adapter (DLA) creates the relationships between both implicit and explicit objects.