Loading data

You can use the ARSLOAD program to load data into Content Manager OnDemand. If you are using Content Manager OnDemand for i, you can load data using ARSLOAD, or you can use the Start Monitor (STRMONOND) or Add Report (ADDRPTOND) command.

Loading data by using the ARSLOAD program

If the input data needs to be indexed, ARSLOAD calls the appropriate indexing program, chosen by the type of input data or, for the Generic indexer, by the presence of a valid parameter file. For example, ARSLOAD can invoke the Generic indexer to process the parameter file and generate the index data. ARSLOAD then adds the index information to the database and loads the input files or documents specified in the parameter file onto storage volumes.

There are two ways to run ARSLOAD on any platform, and a third way that is available only on z/OS®:

Daemon mode: The ARSLOAD program runs as a daemon (UNIX, z/OS and IBM® i servers) or service (Windows servers) to periodically check a specified directory for input files to process. When running ARSLOAD in daemon mode, a dummy file with the file type extension of .ARD or .PDF is required to initiate a load process. In addition, the Generic indexer parameter file (.ind) must be located in the specified directory. The GROUP_FILENAME: parameter in the .ind file specifies the full path name of the actual input file to be processed.
Manual mode: ARSLOAD is run from the command line (qshell on IBM i or OMVS on z/OS systems) to process a specific file. When running ARSLOAD in manual mode, specify only the name of the file to process. ARSLOAD adds the .ind file name extension to the name that you specify. For example, if you specify arsload ... po3510, where po3510 is the name of the input file, ARSLOAD processes the po3510.ind Generic indexer parameter file. The GROUP_FILENAME: parameter in the Generic indexer parameter file specifies the full path name of the actual input file to be processed.
Batch mode (z/OS only): On z/OS, JCL can be used to start the ARSLOAD program in a UNIX System Services environment. The parameters for the ARSLOAD program are provided by using the PARM keyword on the EXEC statement. See the ARSLOAD command section of the Administration Guide for more information.
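
The manual-mode name handling described above can be sketched in a few lines of Python. This is an illustration only, not part of Content Manager OnDemand: the helper names parameter_file_for and group_filenames are hypothetical, and the sketch assumes that each GROUP_FILENAME: parameter sits on its own line, spelled in uppercase, with the path immediately after the colon.

```python
def parameter_file_for(name: str) -> str:
    """Return the parameter file name for a name given on the command line.

    Mirrors the documented behavior: the .ind extension is appended
    to whatever name the user specifies.
    """
    return name + ".ind"


def group_filenames(ind_text: str) -> list[str]:
    """Collect the value of every GROUP_FILENAME: line in a parameter file.

    Assumption: one parameter per line, value directly after the colon.
    """
    prefix = "GROUP_FILENAME:"
    values = []
    for line in ind_text.splitlines():
        stripped = line.strip()
        if stripped.startswith(prefix):
            values.append(stripped[len(prefix):].strip())
    return values
```

For example, parameter_file_for("po3510") yields po3510.ind, and group_filenames applied to the text of that file returns the full path names of the input files that would be processed.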

After successfully loading the data, the system deletes the input file that is specified on the GROUP_FILENAME: parameter if its file name extension is .out and, for daemon mode processing, if the rest of its file name is the same as the .ARD file name. The system also deletes the .ind file (the Generic indexer parameter file), the .res file (the resource file, if there is one), and the .ARD file (the dummy file that is used to initiate a load process when ARSLOAD is running in daemon mode). To ensure successful processing and deletion of the files, the .ARD or .PDF extension must be part of the file name for the .ind, .out, and optional .res files, as shown in the following example.

The following list shows an example of file names in daemon processing mode when a .ARD file is used to trigger the load. The .res file is optional. The same naming requirements exist if you use a .PDF file to trigger the load, substituting .PDF for .ARD in the example.

  MVS.JOBNAME.DATASET.FORM.YYYYDDD.HHMMSST.ARD
  MVS.JOBNAME.DATASET.FORM.YYYYDDD.HHMMSST.ARD.ind
  MVS.JOBNAME.DATASET.FORM.YYYYDDD.HHMMSST.ARD.out
  MVS.JOBNAME.DATASET.FORM.YYYYDDD.HHMMSST.ARD.res

In the example, the MVS.JOBNAME.DATASET.FORM.YYYYDDD.HHMMSST.ARD file is the dummy file that triggers a load process in daemon mode. The MVS.JOBNAME.DATASET.FORM.YYYYDDD.HHMMSST.ARD.ind file is the Generic indexer parameter file, and contains a GROUP_FILENAME: parameter that specifies the input file to process, which is MVS.JOBNAME.DATASET.FORM.YYYYDDD.HHMMSST.ARD.out. The MVS.JOBNAME.DATASET.FORM.YYYYDDD.HHMMSST.ARD.res file is an optional resource file. After successfully loading the data, the system deletes all of the files.
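
The naming convention in this example can be expressed as a small Python helper. This is a minimal sketch rather than anything shipped with the product: the function name daemon_file_set is hypothetical, and accepting the .PDF extension in any case is an assumption (this section states that case is ignored only for .ARD and .ind).

```python
def daemon_file_set(trigger_name: str, with_resource: bool = False) -> dict[str, str]:
    """Derive the companion file names for a daemon-mode trigger file.

    The trigger name, including its .ARD or .PDF extension, is kept as
    part of every companion name, as the naming rule requires.
    """
    if not trigger_name.lower().endswith((".ard", ".pdf")):
        raise ValueError("trigger file must end in .ARD or .PDF")
    names = {
        "trigger": trigger_name,
        "ind": trigger_name + ".ind",  # Generic indexer parameter file
        "out": trigger_name + ".out",  # input file named on GROUP_FILENAME:
    }
    if with_resource:
        names["res"] = trigger_name + ".res"  # optional resource file
    return names
```

Calling daemon_file_set("MVS.JOBNAME.DATASET.FORM.YYYYDDD.HHMMSST.ARD", with_resource=True) produces the four names listed in the example.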

Loading data by using the STRMONOND or ADDRPTOND command (IBM i only)

There are two ways to run the STRMONOND command to load data with the Generic indexer:

STRMONOND with TYPE(*DIR) parameter specified: The STRMONOND command runs as a monitor to periodically check a specified directory for input files to process. When running the STRMONOND command with TYPE(*DIR), the Generic indexer parameter file (.ind) is required to initiate the load process. The GROUP_FILENAME: parameter in the .ind file specifies the full path name of the actual input file to be processed.
STRMONOND with TYPE(*DIR2) parameter specified: The STRMONOND command runs as a monitor to periodically check a specified directory for input files to process. When running the STRMONOND command with TYPE(*DIR2), a dummy file with the file type extension of .ARD is required to initiate the load process. In addition, the Generic indexer parameter file (.ind) must be located in the specified directory. The GROUP_FILENAME: parameter in the .ind file specifies the full path name of the actual input file to be processed. This is similar to running the ARSLOAD program in daemon mode.

There is one way to run the ADDRPTOND command:

ADDRPTOND: The ADDRPTOND command is run from the command line to process a specific file. When running the ADDRPTOND command, you specify INPUT(*STMF) and provide the name of the .ind file to process in the Stream file (STMF) parameter (omitting the .ind file extension). The ADDRPTOND command adds the .ind file name extension to the name that you specify. For example, if you specify STMF(po3510), where po3510 is the name of the input file, the ADDRPTOND command looks for and processes the po3510.ind Generic indexer parameter file. The GROUP_FILENAME: parameter in the Generic indexer parameter file specifies the full path name of the actual input file to be processed. This is similar to running the ARSLOAD program in manual mode.

When the data is successfully loaded, both STRMONOND and ADDRPTOND can optionally delete the input file that is specified on the GROUP_FILENAME: parameter if the Delete processed file (DLTSPLF) or Delete input (DLTINPUT) parameter is set to *YES. For the input file to be deleted, the input file must be located in the same directory as the file that triggered the loading of the data, the file extension must be .out, and the rest of the input file name must be the same as the .ind file name. The system also deletes the .ind file (the Generic indexer parameter file) and the .ARD file (the dummy file that is used to initiate a load process in some cases) if the DLTSPLF or DLTINPUT parameter is set to *YES.
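
The conditions for deleting the input file can be summarized as a single predicate. The following Python sketch is illustrative only: the function name and its arguments are hypothetical, the delete_requested flag stands in for DLTSPLF(*YES) or DLTINPUT(*YES), and comparing the .out extension in lowercase only is an assumption.

```python
from pathlib import PurePosixPath


def input_file_deletable(ind_path: str, input_path: str,
                         trigger_path: str, delete_requested: bool) -> bool:
    """Check the documented conditions for deleting the input file.

    ind_path      - the Generic indexer parameter file (.ind)
    input_path    - the file named on the GROUP_FILENAME: parameter
    trigger_path  - the file that triggered the load (.ind or .ARD)
    """
    ind = PurePosixPath(ind_path)
    inp = PurePosixPath(input_path)
    trigger = PurePosixPath(trigger_path)
    return bool(
        delete_requested
        and inp.parent == trigger.parent  # same directory as the trigger file
        and inp.suffix == ".out"          # extension must be .out
        and inp.stem == ind.stem          # rest of the name matches the .ind file
    )
```

With a *DIR2 file set such as po3510.ARD, po3510.ARD.ind, and po3510.ARD.out in one directory, the predicate is true; it becomes false if the .out file is in a different directory or has another extension.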

Example of file names for STRMONOND TYPE(*DIR):

  po3510.ind
  po3510.out

The po3510.ind file both triggers the load process for STRMONOND TYPE(*DIR) and serves as the Generic indexer parameter file. It contains a GROUP_FILENAME: parameter that specifies the input file to process, which is po3510.out. When the data is successfully loaded, the system deletes both files if the DLTSPLF or DLTINPUT parameter is set to *YES.

Example of file names for STRMONOND TYPE(*DIR2):

  po3510.ARD
  po3510.ARD.ind
  po3510.ARD.out

The po3510.ARD file is the dummy file that triggers a load process for STRMONOND TYPE(*DIR2). The po3510.ARD.ind file is the Generic indexer parameter file, and contains a GROUP_FILENAME: parameter that specifies the input file to process, which is po3510.ARD.out. When the data is successfully loaded, the system deletes all three files if the DLTSPLF or DLTINPUT parameter is set to *YES.

If you plan to automate the data loading process on your IBM i system by using STRMONOND or ARSLOAD, one of the following must be used to identify the application group and application to load:

the input file name
specific parameters on the command used to load the data
a monitor user exit program

The .ind file name extension (for STRMONOND *DIR processing) or the .ARD file name extension (for STRMONOND *DIR2 or ARSLOAD daemon processing) is required to initiate a load process. The case (uppercase or lowercase) of the extension (.ARD or .ind) is ignored. Application group and application names are case sensitive. Application group and application names might include special characters such as the blank character when using ADDRPTOND or ARSLOAD with a specific application group and application name provided. If a blank or other special character is included in the application group name or application name when used in this manner, the full name must be enclosed in single quotes. Mixed-case or lowercase names must also be enclosed in single quotes.
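
The quoting rule for application group and application names can be sketched as follows. This Python helper is hypothetical and is not part of any command interface. It assumes that a name made up only of uppercase letters, digits, and underscores needs no quotes; the exact set of special characters is not defined in this section, so treat that pattern as a placeholder. Names that themselves contain a single quote are rejected because their handling is not described here.

```python
import re

# Assumption: only these characters are safe without quotes.
_UNQUOTED_OK = re.compile(r"[A-Z0-9_]+")


def quote_if_needed(name: str) -> str:
    """Enclose a name in single quotes when the documented rule requires it.

    Quotes are added for names that contain a blank or other special
    character, and for mixed-case or lowercase names.
    """
    if "'" in name:
        raise ValueError("names containing a single quote are not handled by this sketch")
    if _UNQUOTED_OK.fullmatch(name):
        return name
    return "'" + name + "'"
```

Under these assumptions, INVOICES is passed through unchanged, while Monthly Invoices and invoices are both returned enclosed in single quotes.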

If you plan to automate the data loading process for PDF files, you might choose to use the STRMONOND command with monitor type *DIR. If you choose this approach, it is important to carefully consider which directories will receive the .ind and .pdf files to be loaded. By design, the Content Manager OnDemand directory monitor (with monitor type *DIR) processes files with a .ind or .pdf extension. At the time that the STRMONOND monitor selects a file to process, Content Manager OnDemand does not know which indexer (PDF indexer or Generic indexer) will be used to process the file. It is only after the application group and application are identified from the file name, and the indexer type is determined based on the application definition, that Content Manager OnDemand knows which indexer to use.
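
Which files each monitor type picks up can be modeled with a short Python sketch. It encodes only what this section states: a *DIR monitor processes files with a .ind or .pdf extension, and a *DIR2 monitor is triggered by a .ARD file. The table and function names are hypothetical, and ignoring case for .pdf is an assumption (the section states that case is ignored for .ARD and .ind).

```python
# Extensions that start processing, per monitor type (lowercase for comparison).
TRIGGER_EXTENSIONS = {
    "*DIR": (".ind", ".pdf"),
    "*DIR2": (".ard",),
}


def is_trigger_file(file_name: str, monitor_type: str) -> bool:
    """Return True if the monitor would select this file for processing."""
    return file_name.lower().endswith(TRIGGER_EXTENSIONS[monitor_type])
```

Note that is_trigger_file("po3510.pdf", "*DIR") is true, which is why a .pdf input file placed in a *DIR monitored directory can be selected before its .ind file.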

Relying on the arrival sequence of the files into the monitored directory does not ensure that they will be processed correctly. If the input .pdf file is placed in the same directory as the index (.ind) file, and the input .pdf file is found first or the .ind file does not yet exist in the directory, one of two things might happen:

The monitor will not find an application group name and application name to use, based on what was specified at the time the monitor was started, or
The monitor will find an application group name and application name to use, but because the Generic indexer is specified as the indexer type in the application, it will not be able to process the file because the Generic indexer is expecting a .ind file to point to the associated .pdf file to process.

If either of these situations occurs, processing of the .pdf file fails and the file is renamed to include a .ERR extension. Content Manager OnDemand then picks up the next file to process, which might be the .ind file that points to the .pdf file that it just attempted to process. Because the .pdf file has been renamed with the .ERR extension, Content Manager OnDemand cannot find the data file specified in the .ind file. Processing of the .ind file therefore also fails, and that file is renamed to include the .ERR extension as well.

In this scenario, if the input file to be processed must have a .pdf extension, place the PDF file in a directory other than the directory being monitored, and make sure that it arrives before the .ind file is placed in the monitored directory. The .ind file must contain the correct path to the .pdf file (which is in a different directory than the one in which the .ind file is placed). Alternatively, rename the .pdf file to use a different file extension, such as .out. In this case, Content Manager OnDemand skips over the .out file when looking for files to process, picks up the .ind file, and successfully indexes and loads the data.
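
The first approach (PDF file outside the monitored directory, .ind file delivered last) might be scripted along the following lines. This is a minimal sketch under stated assumptions, not a supported tool: the function name stage_pdf_load and all of its parameters are hypothetical, the remaining Generic indexer parameter lines are supplied by the caller and are not validated, UTF-8 is an assumed encoding for the parameter file, and os.replace assumes that the staging and monitored directories are on the same file system.

```python
import os
import shutil
from pathlib import Path


def stage_pdf_load(pdf_source, staging_dir, monitored_dir, base_name, index_lines):
    """Deliver a PDF and its .ind file so the monitor never sees the PDF first.

    1. Copy the PDF into a staging directory that is NOT monitored.
    2. Build the .ind text with GROUP_FILENAME: pointing at the staged PDF.
    3. Move the finished .ind file into the monitored directory last.
    """
    staging_dir = Path(staging_dir)
    monitored_dir = Path(monitored_dir)
    if staging_dir.resolve() == monitored_dir.resolve():
        raise ValueError("the staging directory must differ from the monitored directory")

    # Step 1: the data file arrives first, outside the monitored directory.
    staged_pdf = staging_dir / (base_name + ".pdf")
    shutil.copyfile(pdf_source, staged_pdf)

    # Step 2: caller-supplied index lines, then the full path to the PDF.
    lines = list(index_lines) + ["GROUP_FILENAME:" + str(staged_pdf.resolve())]
    tmp_ind = staging_dir / (base_name + ".ind.tmp")
    tmp_ind.write_text("\n".join(lines) + "\n", encoding="utf-8")

    # Step 3: publish the complete .ind file in one rename.
    final_ind = monitored_dir / (base_name + ".ind")
    os.replace(tmp_ind, final_ind)
    return final_ind
```

Writing the .ind file under a temporary name and renaming it afterward is a design choice of this sketch, intended to keep the monitor from selecting a partially written parameter file. The base name must still allow the application group and application to be identified in whichever way the monitor was configured.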

If you plan to automate the Generic indexer data loading process for PDF files by using the STRMONOND command with monitor type *DIR2, there are no special considerations. The .pdf file is processed like any other input file for a *DIR2 monitor type.

See the IBM Content Manager OnDemand for i: Administration Guide for more information about using the STRMONOND and ADDRPTOND commands and the ARSLOAD API to load data.