Batch loading historic log data with the Data Collector client

Use the Data Collector client to ingest data in batch mode and review historic log data. This is the easiest method for loading large log files for historic analysis.

Before you begin

If you want to use the Data Collector client to load data from remote sources, you must configure the data collector on the remote host before you can configure the local data collector as described here. For more information, see Configuring the Data Collector client to ingest data from remote hosts.

About this task

If you want to load a log file that does not include time stamp information, ensure that values for the timestamp and timestampFormat properties are configured in javaDatacollector.properties. IBM® Operations Analytics Log Analysis cannot index log files without a time stamp; if no time stamp information is found in a log file, the value that is configured in javaDatacollector.properties is used instead.
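For example, a fallback time stamp might be configured as follows. This is a minimal sketch: the timestampFormat pattern shown here is an assumption (a SimpleDateFormat-style pattern that matches the sample time stamp value used later in this topic), so adjust both values to match your environment.

    #Time stamp to apply to every record when none is found in the log file
    timestamp = 01/16/2013 17:27:23:964 GMT+05:30
    #Assumed format pattern for the fallback time stamp
    timestampFormat = MM/dd/yyyy HH:mm:ss:SSS z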

Procedure

To use the Data Collector client to load log file information, complete the following steps:

  1. On the Administrative Settings page, define an appropriate log file source.
  2. At the command line, navigate to the <HOME>/utilities/datacollector-client directory.
  3. Update the configuration file that is used by the Data Collector client, javaDatacollector.properties.
    Set the following properties, as appropriate:
    logFile
    The full path of the file you want to ingest.
    servletURL
    The URL of the Data Collector service.
    userid
    The user ID for the Data Collector service.
    password
    The password for the Data Collector service.
    datasource
    The datasource that you want to use to load data.
    timestamp
    The time stamp to use if a time stamp is not found in the log file.
    batchsize
    The number of bytes of log data that is sent in one batch. The default value is 500,000.
    keystore
    The full path to the keystore file.
    inputType
    The valid input types are: LOGS, CONFIGFILES, SUPPORTDOCS. The default value is LOGS.
    flushflag
    If set to the default value of true, the client sends a flush signal to the Generic Receiver for the last batch of the file. If set to false, no flush signal is sent when the end of the file is reached.
    The following sample javaDatacollector.properties file displays the configuration for loading the SystemOut.log log file.
    #Full path of the file you want to read and upload to Unity
    logFile = SystemOut.log
    #The URL of the REST service. Update the host/port information if required
    servletURL = https://hostname:9987/Unity/DataCollector
    #The user ID to use to access the Unity REST service
    userid=unityuser
    #The password to use to access the Unity REST service
    password=password
    datasource=Systemout
    #Time stamp to use if no time stamp can be found in a log record.
    #The same time stamp is used for all records
    timestamp = 01/16/2013 17:27:23:964 GMT+05:30
    #The number of BYTES of logs sent in one batch to Unity
    batchsize = 500000
    #The full path to the keystore file
    keystore = /home/unity/IBM/LogAnalysisTest/wlp/usr/servers/Unity/keystore/unity.ks
    #input data type - LOGS, CONFIGFILES, SUPPORTDOCS
    inputType = LOGS
    #flush flag:
    #true : (default) the client sends a flush signal to the Generic Receiver
    #for the last batch of this file
    #false : no flush signal is sent upon reaching end-of-file
    flushflag = true
    #Other properties (name/value pairs, for example middleware = WAS) that you
    #want to add to all JSON records
    #These properties must also be added to the index configuration
  4. Ensure that the Data Collector client JAR file, datacollector-client.jar, has execute permissions.
  5. Use the following command to run the Data Collector client with the correct inputs:
    <HOME>/ibm-java/bin/java -jar datacollector-client.jar
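For example, the complete sequence from steps 2, 4, and 5 might look like the following sketch. The paths are based on the defaults that are described in this procedure; substitute your own installation directory for <HOME>.

    # Change to the Data Collector client directory (step 2)
    cd <HOME>/utilities/datacollector-client
    # Ensure that the client JAR file has execute permissions (step 4)
    chmod +x datacollector-client.jar
    # Run the client with the bundled IBM Java runtime (step 5); the client
    # uses the javaDatacollector.properties file that you updated in step 3
    <HOME>/ibm-java/bin/java -jar datacollector-client.jar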

Results

After the task completes, the log file is indexed and can be searched in the Search workspace.