Filtering a System Data Engine data stream by using a data stream definition
For a System Data Engine data stream, you can use the WHERE clause in
the custom update definition to filter the records to be processed, and use the custom template
definition that is associated with the update definition to filter the fields to be streamed by
Z Common Data Provider.
About this task
Procedure
- If you do not already have one, create a partitioned data set (PDS) that is used as the user concatenation library for the custom definitions.
For more information about how to create the data set, see step 1a in Creating a System Data Engine data stream definition.
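As an illustration only, a job like the following can allocate such a library. The data set name USERID.LOCAL.DEFS matches the later examples in this topic, and the unit, space, and DCB values are assumptions that you should adjust to your site standards; the library needs to hold 80-byte, fixed-length source members like those in the SHBODEFS data set.

      //ALLOCDEF JOB (),'ALLOC DEFS',MSGCLASS=X,CLASS=A,NOTIFY=&SYSUID
      //* Allocate a PDS to hold custom update and template definitions.
      //* Data set name, unit, and space values are examples only.
      //ALLOC    EXEC PGM=IEFBR14
      //USERDEFS DD  DSN=USERID.LOCAL.DEFS,DISP=(NEW,CATLG,DELETE),
      //             UNIT=SYSDA,SPACE=(TRK,(15,15,10)),
      //             DCB=(DSORG=PO,RECFM=FB,LRECL=80,BLKSIZE=27920)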
- Create a custom update definition in a new or an existing data set member of the user concatenation library. If you want to filter the records to be processed, add a WHERE clause to the definition. The name of the member for the custom update definition cannot be the same as the name of any existing member in the SHBODEFS data set.
  - You can create a new custom update definition by using the DEFINE UPDATE statement. For the language reference of the DEFINE UPDATE statement, see DEFINE UPDATE statement.
  - To filter an existing data stream, copy the update definition that is used by the data stream to a new or an existing data set member of the user concatenation library, and update the definition based on your requirements.
    - Locate the update definition for the existing data stream by using one of the following methods. The data set members for update definitions are named HBOUxxxx. (An example search command appears at the end of this step.)
      - Because the Z Common Data Provider names the data stream with the name of the associated update definition, in your z/OS® environment, use ISPF option 3.14 or the SRCHFOR ISPF command to search for the data stream name.
      - Review the data stream definition in the sde.streams.json file or the ims.streams.json file in the Configuration Tool directory /usr/lpp/IBM/zcdp/v5r1m0/UI/LIB/. Check the hboin parameter that specifies all data set members for required definitions. Usually the last one is for the update definition.
    - Copy the update definition to a new or an existing data set member of the user concatenation library. The following code sample shows the update definition SMF_101_1_PACKAGE in the member HBOUS101 of the data set SHBODEFS:

          SET IBM_FILE = 'SMF1011K';
          DEFINE UPDATE SMF_101_1_PACKAGE
            VERSION 'CDP.510'
            FROM SMF_101_1
            SECTION PACKAGE
            TO &IBM_UPDATE_TARGET
            &IBM_CORRELATION
            AS &IBM_FILE_FORMAT SET(ALL);

      Copy the update definition SMF_101_1_PACKAGE to the data set member USRUS101 in the user concatenation library USERID.LOCAL.DEFS with the following changes:
      SET
        The SET statement is needed only when the target of the update definition is a file, that is, when the variable IBM_UPDATE_TARGET is set to FILE &IBM_FILE. You can change the file name to USR1011K:

          SET IBM_FILE = 'USR1011K';

      DEFINE UPDATE
        The data streams must have unique names, so you must rename the update definition to avoid a conflict with the existing data stream. You can change the name to USR_101_1_PACKAGE.
      The USERID.LOCAL.DEFS(USRUS101) member now has the following content:

          SET IBM_FILE = 'USR1011K';
          DEFINE UPDATE USR_101_1_PACKAGE
            VERSION 'CDP.510'
            FROM SMF_101_1
            SECTION PACKAGE
            TO &IBM_UPDATE_TARGET
            &IBM_CORRELATION
            AS &IBM_FILE_FORMAT SET(ALL);

    - If you want to filter the records to be processed, add a WHERE clause to the custom update definition. For example, if you want to collect only Db2® package accounting records whose transaction name starts with MG, or whose authorization ID is U@MUPJ2, add the following WHERE clause:

          WHERE (SUBSTR(QWHCEUTX,1,2) = 'MG') OR (QWHCAID = 'U@MUPJ2')

      The updated USERID.LOCAL.DEFS(USRUS101) member has the following content:

          SET IBM_FILE = 'USR1011K';
          DEFINE UPDATE USR_101_1_PACKAGE
            VERSION 'CDP.510'
            FROM SMF_101_1
            SECTION PACKAGE
            WHERE (SUBSTR(QWHCEUTX,1,2) = 'MG') OR
                  (QWHCAID = 'U@MUPJ2 ')
            TO &IBM_UPDATE_TARGET
            &IBM_CORRELATION
            AS &IBM_FILE_FORMAT SET(ALL);

      For more information about the WHERE clause, see WHERE.
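  As an illustration of the search method, you can display the member list of the definitions data set in ISPF option 3.4 and enter the following primary command to find the member that contains the data stream name; the data set name hlq.SHBODEFS is an assumption based on the examples in this topic:

      SRCHFOR SMF_101_1_PACKAGE

  ISPF flags the members that contain the string; in this example, the member is HBOUS101.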
- If you want to filter the fields to be streamed, add a DEFINE TEMPLATE statement for the update definition in the same data set member as that update definition. Verify that the template definition is placed after the update definition. The following example shows a template definition in the member USERID.LOCAL.DEFS(USRUS101) for the update definition USR_101_1_PACKAGE that streams only a few fields in the PACKAGE section of the SMF_101_1 record:

      SET IBM_FILE = 'USR1011K';
      DEFINE UPDATE USR_101_1_PACKAGE
        VERSION 'CDP.510'
        FROM SMF_101_1
        SECTION PACKAGE
        WHERE (SUBSTR(QWHCEUTX,1,2) = 'MG') OR
              (QWHCAID = 'U@MUPJ2 ')
        TO &IBM_UPDATE_TARGET
        &IBM_CORRELATION
        AS &IBM_FILE_FORMAT SET(ALL);

      DEFINE TEMPLATE USR_101_1_PACKAGE FOR USR_101_1_PACKAGE
        ORDER
         (SM101TME,
          SM101DTE,
          QPACLOCN,
          QPACCOLN,
          QPACPKID,
          QPACSQLC,
          QPACSCB,
          QPACSCE,
          QPACBJST,
          QPACEJST)
        AS &IBM_FILE_FORMAT;

  DEFINE TEMPLATE
    The template definition name must be the same as the update definition name to replace the default template definition, which streams all fields for the update definition. In the template definition, you must include the date and time fields from the SMF record header for an SMF record, or the timestamp field in the record suffix for an IMS log record. These fields are required for timestamp resolution when you ingest data to your analytics platform. In this example, the fields are SM101DTE and SM101TME.
  For more information about the DEFINE TEMPLATE statement, see DEFINE TEMPLATE statement.
- Validate the syntax of the custom update and template definitions. Use the following example job to verify the members for the custom update and template definitions:
      //HBOBCOL  JOB (),'DUMMY',MSGCLASS=X,MSGLEVEL=(,0),
      //             CLASS=A,NOTIFY=&SYSUID
      //*
      //HBOSMFCB EXEC PGM=HBOPDE,REGION=0M,PARM='SHOWINPUT=YES'
      //STEPLIB  DD DISP=SHR,DSN=hlq.SHBOLOAD
      //HBOOUT   DD SYSOUT=*
      //HBODUMP  DD SYSOUT=*
      //HBOIN    DD DISP=SHR,DSN=hlq.SHBODEFS(HBOCCSV)
      //         DD DISP=SHR,DSN=hlq.SHBODEFS(HBOCCORY)
      //         DD DISP=SHR,DSN=hlq.SHBODEFS(HBOLLSMF)
      //         DD DISP=SHR,DSN=hlq.SHBODEFS(HBORS101)
      //         DD DISP=SHR,DSN=USERID.LOCAL.DEFS(USRUS101)
      //         DD *
      COLLECT SMF WITH STATISTICS
      //*
      //HBOLOG   DD DUMMY

  hlq
    Change hlq to the high-level qualifier for the Z Common Data Provider SMP/E target data set.
  // DD DISP=SHR,DSN=hlq.SHBODEFS(HBORS101)
    HBORS101 contains the record definition SMF_101_1. This member must be included before the member that contains your custom update and template definitions.
  // DD DISP=SHR,DSN=USERID.LOCAL.DEFS(USRUS101)
    Specifies the data set member for the custom update and template definitions.

  Important: Verify that the definitions are error-free by running the validation job before you create the custom data stream.
  If there are no syntax errors, you see the following messages:

      HBO0201I Update USR_101_1_PACKAGE was successfully defined.
      HBO0500I Template USR_101_1_PACKAGE was successfully defined.

  If there are syntax errors, correct the errors according to the messages in the output file that is defined by HBOOUT.
- Create a custom System Data Engine data stream in the Configuration Tool based on the update definition and template definition that are created in the previous steps.
For more information, see Creating a System Data Engine data stream definition. Verify that the data stream name, the custom update definition name, and the custom template definition name are the same, and that you specify the member for the record definition before the member for the custom update and template definitions in the SHBODEFS data set members field.
- Update your analytics platform so that it can process the new data stream.
  - If you are ingesting data to the Elastic Stack, for each data stream, create a field name annotation configuration file and a timestamp resolution configuration file in the Logstash configuration directory.
    If your new data stream is created based on an existing one, you can create the two files by copying and editing the files for the old data stream. In the previous examples, the new data stream USR_101_1_PACKAGE is created based on the existing data stream SMF_101_1_PACKAGE, and the two configuration files are H_SMF_101_1_PACKAGE.conf and N_SMF_101_1_PACKAGE.conf in the Logstash configuration directory. Copy these two files, rename them to H_USR_101_1_PACKAGE.conf and N_USR_101_1_PACKAGE.conf, and then edit them according to the following instructions.
    Field name annotation configuration file
      The file is named H_data_stream_name.conf, for example, H_USR_101_1_PACKAGE.conf. See the following example of the file:

          # CDPz ELK Ingestion
          #
          # Field Annotation for stream zOS-USR_101_1_PACKAGE
          #
          filter {
            if [sourceType] == "zOS-USR_101_1_PACKAGE" {
              csv{
                columns => [ "Correlator", "SM101TME", "SM101DTE", "QPACLOCN",
                             "QPACCOLN", "QPACPKID", "QPACSQLC", "QPACSCB",
                             "QPACSCE", "QPACBJST", "QPACEJST"]
                separator => ","
              }
            }
          }

      sourceType
        The value of sourceType must match the data source type of the data stream. The naming convention is zOS-data_stream_name:
          if [sourceType] == "zOS-USR_101_1_PACKAGE"
      csv{ columns => [] }
        If you have a custom template definition, change the column list to match the fields and their order in the template definition.
    Timestamp resolution configuration file
      The file is named N_data_stream_name.conf, for example, N_USR_101_1_PACKAGE.conf. See the following example of the file:

          # CDPz ELK Ingestion
          #
          # Timestamp Extraction for stream zOS-USR_101_1_PACKAGE
          #
          filter {
            if [sourceType] == "zOS-USR_101_1_PACKAGE" {
              mutate{ add_field => { "[@metadata][timestamp]" => "%{SM101DTE} %{SM101TME}" }}
              date{ match => [ "[@metadata][timestamp]", "yyyy-MM-dd HH:mm:ss:SS" ]}
            }
          }

      sourceType
        The value of sourceType must match the data source type of the data stream. The naming convention is zOS-data_stream_name:
          if [sourceType] == "zOS-USR_101_1_PACKAGE"
      add_field =>
        For an SMF record, you must specify the date and time fields in the SMF record header. In this example, the fields are SM101DTE and SM101TME:
          "[@metadata][timestamp]" => "%{SM101DTE} %{SM101TME}"
        For an IMS log record, you must specify the timestamp field in the record suffix. For example, the timestamp field in the IMS_07 record suffix is DLRSTCK:
          "[@metadata][timestamp]" => "%{DLRSTCK}"
      match =>
        For an SMF record, use the following time format:
          "[@metadata][timestamp]", "yyyy-MM-dd HH:mm:ss:SS"
        For an IMS log record, use the following time format:
          "[@metadata][timestamp]", "yyyy-MM-dd HH:mm:ss.SSSSSS"
  - If you are ingesting data to Splunk, define the layout of the data stream to the Splunk server by creating the props.conf file in the Splunk_Home/etc/apps/ibm_cdpz_buffer/local directory on the Splunk server.
    If your new data stream is created based on an existing one, you can create the file by copying and editing the content for the old data stream. Based on the previous examples, open the props.conf file in the Splunk_Home/etc/apps/ibm_cdpz_buffer/default directory and copy the section for SMF_101_1_PACKAGE. Paste the content into the props.conf file in Splunk_Home/etc/apps/ibm_cdpz_buffer/local and edit it according to the following example. If the props.conf file already exists, append the content to the file.

          #
          # USR_101_1_PACKAGE (zOS-USR_101_1_PACKAGE)
          #
          [zOS-USR_101_1_PACKAGE]
          TIMESTAMP_FIELDS = SM101DTE, SM101TME, timezone
          TIME_FORMAT = %F %H:%M:%S:%2Q %z
          FIELD_NAMES = "sysplex","system","hostname","","","sourcename", "timezone", "Correlator", "SM101TME", "SM101DTE", "QPACLOCN", "QPACCOLN", "QPACPKID", "QPACSQLC", "QPACSCB", "QPACSCE", "QPACBJST", "QPACEJST"
          INDEXED_EXTRACTIONS = csv
          KV_MODE = none
          NO_BINARY_CHECK = true
          SHOULD_LINEMERGE = false
          category = Structured
          disabled = false
          pulldown_type = true
          TRUNCATE = 20000

    [zOS-USR_101_1_PACKAGE]
      You must specify the data source name of the data stream. The naming convention is zOS-data_stream_name.
    TIMESTAMP_FIELDS
      For an SMF record, you must specify the date and time fields in the SMF record header. In this example, the fields are SM101DTE and SM101TME:
        TIMESTAMP_FIELDS = SM101DTE, SM101TME, timezone
      For an IMS log record, you must specify the timestamp field in the record suffix. For example, the timestamp field in the IMS_07 record suffix is DLRSTCK:
        TIMESTAMP_FIELDS = DLRSTCK, timezone
    TIME_FORMAT
      For an SMF record, use the following time format:
        TIME_FORMAT = %F %H:%M:%S:%2Q %z
      For an IMS log record, use the following time format:
        TIME_FORMAT = %F %H:%M:%S.%6Q %z
    FIELD_NAMES
      If you have a custom template definition, change the column list to match the fields and their order in the template definition. If the column Correlator exists, do not remove it.

    In the Splunk user interface, you must also configure the mapping from file name to data source type for the new data stream. The file that the Data Receiver saves is named zOS-data_stream_name-*.cdp. For example, the data stream USR_101_1_PACKAGE has the file that is named CDP-zOS-USR_101_1_PACKAGE-*.cdp.
    Restart the Splunk server after you make the changes. Refer to the Splunk documentation for more information.
- Create or update the policy to add the new System Data Engine data stream.
  - In the Configuration Tool primary window, select the policy that you want to update.
  - In the Policy Profile Edit window, click the Add Data Stream icon.
  - Find and select the new data stream from the list in the Select data stream window.
  - Assign a subscriber for each new data stream.
  - In the Policy Profile Edit window, click SYSTEM DATA ENGINE to ensure that values are provided for the USER Concatenation and CDP Concatenation fields, and click OK. Complete the USER Concatenation field with the data set name of your user concatenation library. Based on the previous examples, USERID.LOCAL.DEFS should be specified for this field.
  - Click Save to save the policy.
    Important: Each time that the associated update definition or template definition is changed, you must edit and save the policy in the Configuration Tool so that the changes are reflected in the policy.
  - Restart the Data Streamer and the System Data Engine. (Example operator commands follow this list.)
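As an illustration, if the System Data Engine and the Data Streamer run as started tasks, you can recycle them with the standard MVS STOP (P) and START (S) operator commands. The procedure names HBOSMF and HBODSPRO below are placeholders, not values taken from this topic; substitute the started task names that are used at your site, and follow your normal operational sequence for stopping and starting them:

      P HBODSPRO
      P HBOSMF
      S HBOSMF
      S HBODSPRO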