Configuring the Data Collector to collect data in batch mode

You can optionally configure the Data Collector to collect z/OS® log data and System Management Facilities (SMF) data in batch mode. In batch mode, the Data Collector reads its input from a file or a data set.

About this task

You need to customize the following files to configure the Data Collector:
  • Configuration files
    • <policy>.collection-config.json, which is generated through the Configuration Tool
    • application.properties
  • Batch JCL
    • HBODCBAT

Procedure

  1. Generate the <policy>.collection-config.json file through the Configuration Tool. For more detailed instructions, see Managing policies for the Data Collector.
    1. From the Z Common Data Provider Configuration Tool, click the Create a policy box under the section Policy for streaming data to Apache Kafka through Data Collector.
    2. In the resulting Policy Profile Edit window, type or update the required policy name and, optionally, a policy description.
    3. Click the Configure data resources box to open the Configure Data Resources window.
    4. Click ADD SYSTEM to open the 'Add system' dialog. Specify System name and Bootstrap servers, and then click OK.
      System name
      The name of the system in the sysplex.
      Bootstrap servers
      A comma-delimited list of host:port pairs to use for establishing the initial connections to the Kafka cluster.
    5. In the system section, click the Add data resource icon to add a data resource.
      • To collect SYSLOG,
        1. Select LOG from the Data source list.
        2. Specify a value for History topic name, which is the topic name for history SYSLOG data when the Data Collector is running in batch mode. The default value is IBM-CDP-zOS-SYSLOG-Console-Historical.
        3. Specify other parameters as needed.
        4. Click OK to save your settings.
      • To collect SMF data,
        1. Select SMF from the Data source list.
        2. Specify other parameters as needed.
        3. Click OK to save your settings.
      • To collect TMC data,
        1. Select TMC from the Data source list.
        2. Specify a value for Topic name. The default value is IBM-CDP-zOS-{plexName}-TMC.
        3. Click OK to save your settings.
      • To collect BVIR data,
        1. Select BVIR from the Data source list.
        2. Specify a value for Topic name. The default value is IBM-CDP-zOS-{plexName}-BVIR.
        3. Click OK to save your settings.
      • To collect DCOLLECT data,
        1. Select DCOLLECT from the Data source list.
        2. Specify a value for Topic name. The default value is IBM-CDP-zOS-{plexName}-DCOLLECT.
        3. Click OK to save your settings.
    6. When you finish the configuration, click FINISH to return to the Policy Profile Edit window.
    7. To save the policy, click Save.
    Note: You can also copy the sample $INSTDIR/DC/samples/batch/collection-config.json file to a directory of your choice by using the following command, where $INSTDIR is the directory where the Data Collector is installed.
    cp -R $INSTDIR/DC/samples/batch <destination directory>
    Sample configuration for SYSLOG:
    {
        "lpars": [
            {
                "name": "lpar-name",
                "bootstrapServers": "localhost:9092",
                "log": {
                    "historyTopicName": "IBM-CDP-zOS-SYSLOG-Console-Historical"
                }
            }
        ]
    }
    Sample configuration for SMF data:
    {
        "lpars": [
            {
                "name": "lpar-name",
                "bootstrapServers": "localhost:9092",
                "smf": {
                    "topicName": "IBM-CDP-zOS-SMF-{dataSourceType}"
                }
            }
        ]
    }
    Sample configuration for TMC data:
    {
        "lpars": [
            {
                "name": "lpar-name",
                "bootstrapServers": "localhost:9092",
                "tmc": {
                    "topicName": "IBM-CDP-zOS-{plexName}-TMC"
                }
            }
        ]
    }
    Sample configuration for BVIR data:
    {
        "lpars": [
            {
                "name": "lpar-name",
                "bootstrapServers": "localhost:9092",
                "bvir": {
                    "topicName": "IBM-CDP-zOS-{plexName}-BVIR"
                }
            }
        ]
    }
    Sample configuration for DCOLLECT data:
    {
        "lpars": [
            {
                "name": "lpar-name",
                "bootstrapServers": "localhost:9092",
                "dcollect": {
                    "topicName": "IBM-CDP-zOS-{plexName}-DCOLLECT"
                }
            }
        ]
    }
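Before you submit the batch job, it can help to sanity-check the policy file. The following Python sketch is an illustration only and is not part of Z Common Data Provider: the function names check_policy and parse_bootstrap_servers are hypothetical, and the rules it applies (each lpars entry has a name, a comma-delimited host:port list, and at least one of the data source sections shown in the samples above) are assumptions drawn from those samples, not a published schema.

```python
import json

# Data-source keys that appear in the sample configurations above.
# Assumption: an entry is only useful if it contains at least one of them.
KNOWN_SOURCES = {"log", "smf", "tmc", "bvir", "dcollect"}


def parse_bootstrap_servers(value):
    """Split a comma-delimited host:port list into (host, port) tuples."""
    pairs = []
    for item in value.split(","):
        host, sep, port = item.strip().rpartition(":")
        if not sep or not host or not port.isdigit():
            raise ValueError(f"not a host:port pair: {item!r}")
        pairs.append((host, int(port)))
    return pairs


def check_policy(text):
    """Return a list of problems found in a collection-config.json document."""
    problems = []
    config = json.loads(text)
    lpars = config.get("lpars", [])
    if not lpars:
        problems.append("no entries under 'lpars'")
    for index, lpar in enumerate(lpars):
        label = lpar.get("name") or f"lpars[{index}]"
        if not lpar.get("name"):
            problems.append(f"{label}: missing 'name'")
        try:
            parse_bootstrap_servers(lpar.get("bootstrapServers", ""))
        except ValueError as err:
            problems.append(f"{label}: bad 'bootstrapServers' ({err})")
        if not KNOWN_SOURCES.intersection(lpar):
            problems.append(f"{label}: no data source section")
    return problems
```

For example, passing the contents of the SYSLOG sample above to check_policy returns an empty list, while an entry whose bootstrapServers value has no port is reported as a problem.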
  2. Optional: If you have the following requirements, copy and customize the application.properties file:
    1. To enable Transport Layer Security (TLS) communications with the Apache Kafka server, update the TLS-related parameters that are described in Updating the configuration file of the Data Collector.
    2. Specify other parameters as needed. For detailed instructions, see Configuration reference for the application.properties file.
    The sample application.properties file for batch mode is located in $INSTDIR/DC/samples/batch, where $INSTDIR is the directory where the Data Collector is installed. If you need to configure the application.properties file, copy it and place it in the same directory as the <policy>.collection-config.json file.
  3. Create and customize the job HBODCBAT.
    1. From the data set hlq.SHBOSAMP, copy the job HBODCBAT to your JCL library.
    2. In the copied HBODCBAT, set the following environment variables for your environment:
      JAVAHOME
      Use this parameter to specify the Java™ installation directory. Replace '/usr/lpp/java/J8.0_64' with your Java installation directory in the following SET statement:
      // SET JAVAHOME='/usr/lpp/java/J8.0_64'
      CDPINST
      Use this parameter to specify the location where the Data Collector is installed. Replace /usr/lpp/IBM/zcdp/v5r1m0 with the directory where the Data Collector is installed in the following SET statement:
      // SET CDPINST='/usr/lpp/IBM/zcdp/v5r1m0'
      CDPWORK
      Use this parameter to specify the Data Collector working directory, which contains files that are created and used during the operation of the Data Collector. If needed, replace the value in the following SET statement:
      // SET CDPWORK='/var/zcdp/dc'
      Important: Do not update, delete, or move the files in the CDP_HOME directory. The value of CDP_HOME must not be a symlink.
      POLICY
      Use this parameter to specify the full file path of the Data Collector configuration file myPolicy.collection-config.json. If needed, replace the value in the following SET statement:
      // SET POLICY='/etc/cdpConfig/myPolicy.collection-config.json'
      RESTYPE
      Use this parameter to specify which data type to collect. The allowed values are SYSLOG, SMF, BVIR, TMC, and DCOLLECT. In this case, set the value to SYSLOG.
      //  SET RESTYPE='SYSLOG'
      BATCHSDR
      Use this optional parameter to specify the rate limit at which the Data Collector collects SYSLOG. The value must be an integer from 1 to 1000. In this case, set the value to 500, which indicates that the Data Collector collects SYSLOG with a rate limit of 500M per minute.
      //  SET BATCHSDR='500'
      The parameters must be set correctly; otherwise, the Data Collector considers the parameters invalid and ignores them.
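Because invalid parameter values are ignored rather than rejected, you might want to check RESTYPE and BATCHSDR before you submit the job. The following Python sketch is a hypothetical helper, not a product utility. It encodes only the constraints stated above (the five allowed RESTYPE values and the 1 to 1000 range for BATCHSDR) and assumes that RESTYPE must be written in uppercase exactly as listed.

```python
import re

# Allowed RESTYPE values as listed in this procedure.
ALLOWED_RESTYPES = {"SYSLOG", "SMF", "BVIR", "TMC", "DCOLLECT"}


def check_batch_parameters(restype, batchsdr=None):
    """Return a list of problems with the RESTYPE and BATCHSDR values.

    Both values are passed as strings, as they appear between the quotes
    of the JCL SET statements. BATCHSDR is optional, so None is accepted.
    """
    problems = []
    # Assumption: the comparison is case-sensitive.
    if restype not in ALLOWED_RESTYPES:
        problems.append(f"RESTYPE {restype!r} is not an allowed value")
    if batchsdr is not None:
        is_integer = re.fullmatch(r"[0-9]+", batchsdr) is not None
        if not (is_integer and 1 <= int(batchsdr) <= 1000):
            problems.append(
                f"BATCHSDR {batchsdr!r} is not an integer from 1 to 1000"
            )
    return problems
```

Calling check_batch_parameters("SYSLOG", "500") with the values from the SET statements above returns an empty list.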
    3. Update the statements:
      • To collect SYSLOG,
        1. Comment out the HBOLOG statement.
          //*HBOLOG    DD   DISP=SHR,DSN=<SMF, BVIR, TMC or DCOLLECT DateSet Name>
        2. Customize the HBOIN statement.
          1. Replace <Plex Name> with the name of the sysplex from which the SYSLOG data sets come.
          2. Replace <DD Name> with the DDNAME that you want to use in this JCL. You can collect logs from multiple sysplexes at one time.
          3. Replace <Time Zone> with the time zone that you want to set for the sysplex. You can set different time zones for multiple sysplexes. For example:
            PLEXNAME:TST1,DDNAME:DD1,TZ:+0800
            PLEXNAME:TST2,DDNAME:DD2,TZ:+0100

            The value of the TZ parameter must be in the format plus_or_minusHHMM, where plus_or_minus represents the + or - sign, HH represents two digits for the hour, and MM represents two digits for the minute.

        3. Define the actual DD statements.
          Replace <Syslog DateSet Name> with the name of the data set from which you want to collect logs in the batch processor. You can collect logs from multiple sources for one DDNAME, for example:
          //DD1  DD DISP=SHR,DSN=ZCDP.SYSLOG1
          //     DD DISP=SHR,DSN=ZCDP.SYSLOG2
          //DD2  DD DISP=SHR,DSN=ZCDP.SYSLOG3
          
          Note: Each DDNAME that is specified in the HBOIN statement must be defined as an actual DD statement.
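The HBOIN format and the rule that every DDNAME needs a matching DD statement can be checked mechanically. The following Python sketch is a hypothetical pre-submission check, not something the product provides: parse_tz validates the plus_or_minusHHMM format, parse_hboin_line splits one PLEXNAME/DDNAME/TZ line, and undefined_ddnames reports DDNAMEs that have no DD statement. The regular expression used to find named DD statements is a simplification of JCL syntax and is an assumption.

```python
import re
from datetime import timedelta, timezone

TZ_PATTERN = re.compile(r"([+-])([0-9]{2})([0-9]{2})")
# Simplified match for a named DD statement such as "//DD1  DD DISP=SHR,...".
# Comment lines ("//*...") and unnamed concatenation lines do not match.
DD_PATTERN = re.compile(r"//([A-Z0-9@#$]+)\s+DD\s")


def parse_tz(value):
    """Convert a TZ value such as '+0800' into a datetime.timezone."""
    match = TZ_PATTERN.fullmatch(value)
    if match is None:
        raise ValueError(f"TZ must look like +HHMM or -HHMM: {value!r}")
    sign, hours, minutes = match.groups()
    offset = timedelta(hours=int(hours), minutes=int(minutes))
    return timezone(-offset if sign == "-" else offset)


def parse_hboin_line(line):
    """Split 'PLEXNAME:TST1,DDNAME:DD1,TZ:+0800' into a dictionary."""
    fields = {}
    for part in line.strip().split(","):
        key, _, value = part.partition(":")
        fields[key.strip()] = value.strip()
    return fields


def undefined_ddnames(hboin_lines, jcl_lines):
    """Return DDNAMEs used in HBOIN that have no named DD statement."""
    defined = set()
    for line in jcl_lines:
        match = DD_PATTERN.match(line)
        if match:
            defined.add(match.group(1))
    wanted = [
        parse_hboin_line(line)["DDNAME"]
        for line in hboin_lines
        if line.strip()
    ]
    return [name for name in wanted if name not in defined]
```

With the two HBOIN lines and the three DD statements from the examples above, undefined_ddnames returns an empty list. If the //DD2 statement is removed, it returns a list that contains DD2.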
      • To collect the SMF, BVIR, TMC, or DCOLLECT data,
        1. You can specify a customized sysplex name for this data. Specify the value in the HBOIN statement in the following format:
          //HBOIN    DD  *
          PLEXNAME:<sysplexname>

          Alternatively, you can comment out the <DD Name> DD statement and the HBOIN statement. In that case, the Data Collector uses the system's sysplex name as the default sysplex name.

        2. Customize the HBOLOG statement by replacing <SMF, BVIR, TMC, or DCOLLECT DateSet Name> with the name of the data set from which you want to collect data in the batch processor.