Batch loading historic log data with the Data Collector client

Use the Data Collector client to ingest data in batch mode so that you can review historic log data. This is the easiest method if you want to ingest large log files for historic analysis.

Before you begin

If you want to use the Data Collector client to load data from remote sources, you must configure the data collector on the remote host before you can configure the local data collector as described here. For more information, see Configuring the Data Collector client to ingest data from remote hosts.

About this task

If you want to load a log file that does not include time stamp information, ensure that the values for timestamp and timestampFormat are configured in javaDatacollector.properties. IBM® Operations Analytics Log Analysis cannot index log files without a time stamp, but if no time stamp information is found in a log file, the value that is configured in javaDatacollector.properties is used.

Procedure

To use the Data Collector client to load log file information, complete the following steps:

In the Administrative Settings page, define an appropriate log file source.
At the command line, navigate to the <HOME>/utilities/datacollector-client directory.

Update the configuration file that is used by the Data Collector client, javaDatacollector.properties.

Set the following properties, as appropriate:

logFile: The full path of the file you want to ingest.
servletURL: The URL of the Data Collector service.
userid: The user ID for the Data Collector service.
password: The password for the Data Collector service.
datasource: The datasource that you want to use to load data.
timestamp: The time stamp to use if a time stamp is not found in the log file.
batchsize: The number of BYTES of logs that are sent in one batch. The default value is 500,000.
keystore: The full path to the keystore file.
inputType: The valid input types are: LOGS, CONFIGFILES, SUPPORTDOCS. The default value is LOGS.
flushflag: If set to true (the default), the client sends a flush signal to the Generic Receiver for the last batch of the file. If set to false, no flush signal is sent when the end of the file is reached.
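
To make the interaction between batchsize and flushflag more concrete, the following Python sketch splits a byte stream into batches of at most batchsize bytes and marks only the final batch for flushing. It is an illustration under assumptions, not the client's actual implementation: the function name iter_batches is hypothetical, and the real client may split batches differently, for example at record boundaries.

```python
import io


def iter_batches(stream, batchsize=500000, flushflag=True):
    """Yield (payload, flush) pairs of at most batchsize bytes each.

    flush is True only for the final batch, and only when flushflag is True.
    Hypothetical helper for illustration; not part of the Data Collector client.
    """
    current = stream.read(batchsize)
    while current:
        # Read one batch ahead so we know whether the current one is the last.
        upcoming = stream.read(batchsize)
        is_last = not upcoming
        yield current, (flushflag and is_last)
        current = upcoming


if __name__ == "__main__":
    data = io.BytesIO(b"abcdefghij")
    for payload, flush in iter_batches(data, batchsize=4):
        print(len(payload), flush)
```

With a 10-byte input and a batch size of 4, the sketch yields three batches, and only the third carries the flush marker. Setting flushflag to False suppresses the marker on every batch.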

The following sample javaDatacollector.properties file displays the configuration for loading the SystemOut.log log file.

#Full path of the file you want to read and upload to Unity
logFile = SystemOut.log
#The URL of the REST service. Update the host/port information if required
servletURL = https://hostname:9987/Unity/DataCollector
#The user ID to use to access the unity rest service
userid=unityuser
#The password to use to access the unity rest service
password=password
datasource=Systemout
#Time stamp to use if your content can not find a time stamp in log record.
#The same time stamp would be used for all records
timestamp = 01/16/2013 17:27:23:964 GMT+05:30
#The number of BYTES of logs sent in one batch to Unity
batchsize = 500000
#The full path to the keystore file 
keystore = /home/unity/IBM/LogAnalysisTest/wlp/usr/servers/Unity/keystore/unity.ks
#input data type - LOGS, CONFIGFILES, SUPPORTDOCS
inputType = LOGS
#flush flag:
#true : (default) if the client should send a flush signal to the Generic
# Receiver for the last batch of this file
#false : if no flush signal to be sent upon reaching end-of-file
flushflag = true
#Other properties (name/value pairs, e.g. middleware = WAS) that you want
# to add to all json records
#These properties need to be appropriately added to the index configuration
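
Before you run the client, it can help to check the properties file for obvious mistakes. The following Python sketch parses simple key = value lines like those in the sample and reports missing or invalid values. It is a hypothetical pre-flight helper, not part of Log Analysis: the function names are invented for illustration, the choice of which properties to treat as required is an assumption, and the parser handles only the simple syntax used in the sample rather than the full Java properties format.

```python
# Property names come from the list in this topic. Treating these five as
# required is an assumption made for this sketch.
REQUIRED_KEYS = ["logFile", "servletURL", "userid", "password", "datasource"]
VALID_INPUT_TYPES = {"LOGS", "CONFIGFILES", "SUPPORTDOCS"}


def parse_properties(text):
    """Parse simple key = value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        if "=" not in line:
            continue  # this sketch ignores lines without a separator
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def check_properties(props):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    for key in REQUIRED_KEYS:
        if not props.get(key):
            problems.append(f"missing property: {key}")
    input_type = props.get("inputType", "LOGS")
    if input_type not in VALID_INPUT_TYPES:
        problems.append(f"invalid inputType: {input_type}")
    batchsize = props.get("batchsize", "500000")
    if not batchsize.isdigit() or int(batchsize) <= 0:
        problems.append(f"invalid batchsize: {batchsize}")
    flushflag = props.get("flushflag", "true")
    if flushflag not in ("true", "false"):
        problems.append(f"invalid flushflag: {flushflag}")
    return problems


if __name__ == "__main__":
    with open("javaDatacollector.properties", encoding="utf-8") as handle:
        for problem in check_properties(parse_properties(handle.read())):
            print(problem)
```

Run the script from the directory that contains javaDatacollector.properties. No output means that the sketch found nothing to report; it does not confirm that the service URL, credentials, or data source are correct.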

Ensure that the Data Collector client JAR file, datacollector-client.jar, has execute permissions.
Use the following command to run the Data Collector client with the correct inputs:
```
<HOME>/ibm-java/bin/java -jar datacollector-client.jar
```

Results

After the task completes, the log file is indexed and can be searched in the Search workspace.