The COLLECT statement

The values of some parameters of the System Data Engine started task are specified by using the COLLECT statement. You can find information for these parameters in this section.

Syntax

Syntax #1

COLLECT log-name-1 [WITH STATISTICS | WITHOUT STATISTICS]

Syntax #2

COLLECT log-name-2 FROM resource
  [EVERY 1 MINUTES | EVERY integer MINUTES | EVERY integer SECONDS]
  [WITHOUT STATISTICS | WITH STATISTICS]
  [OFFTIME 0 SECONDS | OFFTIME integer MINUTES | OFFTIME integer SECONDS]

Use the COLLECT statement to collect data from the specified source. The COLLECT statement(s) must be specified last in the HBOIN DD statement.
  • Use Syntax #1 for batch collection of log data from the log data sets that are specified in the HBOLOG DD statements. Only one COLLECT statement is allowed.
  • Use Syntax #2 for real-time data collection from the specified source. Only one COLLECT statement is allowed, with one exception: one COLLECT SMF and one COLLECT LOGREC can be specified together to collect both SMF and LOGREC data.
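
The rule about how many COLLECT statements are allowed can be expressed as a small check. The following Python sketch is illustrative only and is not part of the System Data Engine; the function name check_collect_combination and its arguments are hypothetical. It takes the log names of the COLLECT statements in one HBOIN input and a flag that says whether Syntax #2 (real-time collection) is in use.

```python
def check_collect_combination(log_names, real_time):
    """Return True if this set of COLLECT statements follows the rules above.

    log_names: log names of the COLLECT statements, for example ["SMF", "LOGREC"].
    real_time: True for Syntax #2 (real-time), False for Syntax #1 (batch).
    """
    names = sorted(name.upper() for name in log_names)
    if len(names) == 1:
        # A single COLLECT statement is always within the limit.
        return True
    # The only documented exception: real-time collection with exactly
    # one COLLECT SMF and one COLLECT LOGREC.
    return real_time and names == ["LOGREC", "SMF"]
```

Under these assumptions, check_collect_combination(["SMF", "LOGREC"], real_time=True) is accepted, while the same pair in batch mode, or any other pair, is rejected.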

Parameters

log-name-1
Specifies the name of a log definition. It identifies the type of log to be collected.
  • SMF: process SMF records from SMF dump data sets.
  • IMS: process IMS log records from IMS system log data sets.
  • DCOLLECT: process output from DCOLLECT utility.
  • LOGREC: process LOGREC records from history LOGREC data set.
log-name-2
Specifies the name of a log definition. It identifies the type of log to be collected.
  • SMF: real-time SMF data collection from SMF logstream, in-memory resource, SMF user exit, or Kafka.
  • IMS: real-time IMS log collection from IMS online log data sets (OLDS).
  • LOGREC: real-time LOGREC data collection from z/OS® event notification facility (ENF).
resource
The data source from which the specified type of log is collected. Depending on the type of log, it can be specified as follows:
  • When log-name-2 is SMF:
    • An SMF log stream name. The System Data Engine collects SMF records from the SMF log stream that has been set up in the SMFPRMxx PARMLIB member.
    • An SMF in-memory resource. The System Data Engine collects SMF records from the SMF in-memory resource that has been set up in the SMFPRMxx PARMLIB member.
    • EXIT. The System Data Engine collects SMF records from the SMF user exit.
    • KAFKA. The System Data Engine collects SMF records from Kafka brokers.
  • When log-name-2 is IMS:
    • OLDS. The System Data Engine collects IMS log records from IMS online log data sets (OLDS).
  • When log-name-2 is LOGREC:
    • ENF36. The System Data Engine collects LOGREC records from the z/OS event notification facility (ENF) event code 36.
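
The pairing of log-name-2 and resource values described above can be summarized as a lookup. The following Python sketch is illustrative only; the names FIXED_RESOURCES, SMF_KEYWORDS, and describe_resource are hypothetical, and the sketch assumes that any SMF resource other than EXIT or KAFKA is a log stream or in-memory resource name from the SMFPRMxx PARMLIB member.

```python
# Keyword resources taken from the parameter descriptions above.
FIXED_RESOURCES = {
    "IMS": {"OLDS": "IMS online log data sets"},
    "LOGREC": {"ENF36": "z/OS ENF event code 36"},
}
SMF_KEYWORDS = {"EXIT": "SMF user exit", "KAFKA": "Kafka brokers"}


def describe_resource(log_name, resource):
    """Describe the data source for a real-time COLLECT statement."""
    log_name = log_name.upper()
    keyword = resource.upper()
    if log_name == "SMF":
        # Assumption: anything other than EXIT or KAFKA is a name defined
        # in SMFPRMxx. This sketch cannot tell a log stream from an
        # in-memory resource by name alone.
        return SMF_KEYWORDS.get(
            keyword, "SMF log stream or in-memory resource " + resource
        )
    allowed = FIXED_RESOURCES.get(log_name)
    if allowed is None:
        raise ValueError("no real-time collection listed for " + log_name)
    if keyword not in allowed:
        raise ValueError(resource + " is not a resource for " + log_name)
    return allowed[keyword]
```

For example, describe_resource("LOGREC", "ENF36") returns the ENF description, and describe_resource("DCOLLECT", "OLDS") raises an error because DCOLLECT is listed only for batch collection.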
WITH STATISTICS
Collected statistics are written at the end of each collection interval and at the end of the job.
WITHOUT STATISTICS
Collected statistics are written only at the end of the job.
EVERY integer MINUTES | SECONDS
Controls how often (in minutes or seconds) the System Data Engine processes data.
At regular intervals, the System Data Engine queries the appropriate sources for new data. For example, it queries one of the following sources:
  • SMF in-memory resource
  • Shared storage to which the SMF user exit writes
  • SMF log stream
  • Apache Kafka
The default interval for this querying is 1 minute, and the minimum interval is 1 second. After each interval, the System Data Engine sends the new SMF records to the Data Streamer.

This collection processing interval is set on the EVERY clause of the COLLECT statement.

Guidelines for determining the interval value: Changing the interval value can affect the resource consumption of Z Common Data Provider. The value 1 MINUTES has usually shown the best CPU performance. With very high throughput, however, this value can result in too much data being buffered at once. If you want to collect data more frequently, an interval of 30 SECONDS can alleviate this issue without significantly increasing CPU use. If you change this parameter to another value, use the following guidelines to help you determine an appropriate interval value:
  • Use a large interval value to minimize overhead.
  • Use a small interval value to minimize memory.
  • The interval value must be small enough to produce data as often as it is required by the subscriber.
  • Use an interval value that is a factor of the total time in one day.
  • The value for EVERY must be a positive integer and is limited to a duration that does not exceed one day. Exceeding the following values will cause an error:
    EVERY 86400 SECONDS
    EVERY 1440  MINUTES
  • If you want to use an interval value that is equal to or greater than 60 seconds, specify that value as a whole number, and specify the time unit in minutes. For example, if you want to set the interval value to 120 seconds, instead set it to 2 minutes.
Table 1. Example System Data Engine interval values that are a factor of the total time in one day
Time unit   Example values
Seconds     1, 2, 3, 4, 5, 6, 8, 9, 10, 12, 15, 16, 18, 20, 24, 25, 27, 30, 32, 36, 40, 45, 48, 50, 54
Minutes     1, 2, 3, 4, 5, 6, 8, 9, 10, 12, 15
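
The interval guidelines can be checked mechanically. The following Python sketch is illustrative only and is not part of Z Common Data Provider; the function names every_to_seconds, check_every, and day_factors are hypothetical. It converts an EVERY value to seconds, rejects values that are not positive integers or that exceed one day, and returns advisory notes for the guidelines about day factors and whole minutes.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400


def every_to_seconds(value, unit):
    """Convert an EVERY value to seconds, enforcing the documented limits."""
    if unit not in ("SECONDS", "MINUTES"):
        raise ValueError("unit must be SECONDS or MINUTES")
    if not isinstance(value, int) or value < 1:
        raise ValueError("EVERY must be a positive integer")
    seconds = value * (60 if unit == "MINUTES" else 1)
    if seconds > SECONDS_PER_DAY:
        raise ValueError("EVERY must not exceed one day")
    return seconds


def check_every(value, unit):
    """Return (interval in seconds, list of advisory notes)."""
    seconds = every_to_seconds(value, unit)
    notes = []
    if SECONDS_PER_DAY % seconds != 0:
        notes.append("interval is not a factor of the total time in one day")
    if unit == "SECONDS" and value >= 60:
        notes.append("specify intervals of 60 seconds or more in minutes")
    return seconds, notes


def day_factors(limit, unit_seconds):
    """List values up to limit (in the given unit) that divide one day evenly."""
    units_per_day = SECONDS_PER_DAY // unit_seconds
    return [v for v in range(1, limit + 1) if units_per_day % v == 0]
```

For example, day_factors(59, 1) reproduces the Seconds row of Table 1 and day_factors(15, 60) reproduces the Minutes row, while check_every(120, "SECONDS") returns a note that suggests specifying the interval in minutes.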
OFFTIME integer MINUTES | SECONDS
Specifies how long to defer System Data Engine data collection, so that each System Data Engine can start SMF data collection at a different time.

To reduce the CPU (MIPS) peak at the beginning of every minute when you run multiple System Data Engine instances in a single LPAR, or on shared physical CPUs across a sysplex, specify a different OFFTIME value (in minutes or seconds) for each System Data Engine.

The collection processing offset time is set on the OFFTIME clause of the COLLECT statement. For example:
//HBOIN    DD *
SET IBM_SDE_OFFTIME = '4 SECONDS';                            
SET IBM_SDE_INTERVAL = '60 SECONDS';                          
//          DD *                                                   
COLLECT SMF FROM &IBM_RESOURCE                           
  EVERY &IBM_SDE_INTERVAL                                    
  OFFTIME &IBM_SDE_OFFTIME;
In the example, the offset time is set to 4 seconds, so the System Data Engine starts collecting SMF data at 4 seconds past each whole minute.
The setup of OFFTIME must meet the following rules:
  • OFFTIME is optional. The default value is 0 seconds.
  • OFFTIME is available only when the interval is 30 seconds or an integral multiple of one minute.
  • The value of OFFTIME must be a positive integer and must not exceed 5 minutes (300 seconds); otherwise, an error occurs.
  • OFFTIME takes effect only when its value does not exceed half of the interval; otherwise, a warning message is issued and OFFTIME is ignored.
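
The OFFTIME rules can be sketched as a single check. The following Python code is illustrative only and does not reproduce the System Data Engine implementation; the names MAX_OFFTIME_SECONDS and effective_offtime are hypothetical. Both arguments are in seconds. The sketch assumes that an OFFTIME value specified with an unsupported interval is ignored with a warning, because the rules above state only that OFFTIME is not available in that case.

```python
MAX_OFFTIME_SECONDS = 5 * 60  # 300 seconds, the documented upper limit


def effective_offtime(interval_seconds, offtime_seconds=0):
    """Return (offset in seconds that takes effect, warning text or None)."""
    if not isinstance(offtime_seconds, int) or offtime_seconds < 0:
        raise ValueError("OFFTIME must be a non-negative whole number")
    if offtime_seconds > MAX_OFFTIME_SECONDS:
        raise ValueError("OFFTIME must not exceed 5 minutes (300 seconds)")
    if offtime_seconds == 0:
        # OFFTIME omitted or left at its default: no deferral.
        return 0, None
    if interval_seconds != 30 and interval_seconds % 60 != 0:
        # Assumption: the source does not say what happens here, so this
        # sketch treats the value as ignored.
        return 0, "OFFTIME is available only for 30 seconds or whole minutes"
    if offtime_seconds * 2 > interval_seconds:
        return 0, "OFFTIME exceeds half of the interval and is ignored"
    return offtime_seconds, None
```

With the values from the example above, effective_offtime(60, 4) returns (4, None). A value such as effective_offtime(60, 31) returns an offset of 0 with a warning, because 31 seconds is more than half of the 60-second interval.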