Components of the Z Common Data Provider

This topic introduces the components that are included in the Z Common Data Provider.

The Z Common Data Provider includes the following basic components:

  • A Configuration Tool for defining the sources from which you want to collect operational data
  • The data gatherer components (System Data Engine, Log Forwarder, and Data Collector) for gathering different types of operational data
  • A Data Streamer for streaming all data to its destination

Other components include the Open Streaming API for gathering operational data from your own applications, and a Data Receiver that acts as a target subscriber for operational data if the intended subscriber cannot directly ingest the data feed.

The components are illustrated in Figure 1.

Basic components

Configuration Tool
The Z Common Data Provider Configuration Tool is a web-based user interface that is provided as an application for IBM® WebSphere® Application Server for z/OS® Liberty, or as a plug-in for IBM z/OS Management Facility (z/OSMF). In the tool, you specify the configuration information as part of creating a policy for streaming operational data to its destination.

In the policy definition, you must define a data stream for each source from which you want to collect operational data. A stream of data is a set of data that is sent from a common source in a standard format, is routed to, and transformed by, the Data Streamer in a predictable way, and is delivered to one or more subscribers.

You must specify the following information for each data stream in the policy (a hypothetical sketch of these elements follows the list):
  • The source (such as SMF record type 30 or z/OS SYSLOG)
  • The format to which to transform the operational data so that it is consumable by the analytics platform
  • The subscriber or subscribers for the operational data that is streamed by the Z Common Data Provider

    For example, subscribers include Logstash, the Data Receiver, the HTTP Event Collector (HEC) of Splunk, Apache Kafka, and a generic HTTP receiver.
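
Purely as an illustration, the following Python sketch shows the three elements that a stream definition captures. The field names, values, and structure are hypothetical; the Configuration Tool defines the actual policy schema and writes the policy for you.

    # Hypothetical sketch of one data stream definition in a policy.
    # All field names and values are illustrative only; the Configuration
    # Tool defines and writes the real policy format.
    syslog_stream = {
        "source": "zOS-SYSLOG",            # where the data is collected from
        "transform": "UTF-8",              # format applied before delivery
        "subscribers": [                   # one or more destinations
            {"type": "Logstash", "host": "elk.example.com", "port": 8080},
            {"type": "Splunk HEC", "url": "https://splunk.example.com:8088"},
        ],
    }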

Data gatherer components
Each of the following components gathers a different type of data:
System Data Engine
The System Data Engine gathers System Management Facilities (SMF) data and IBM IMS log data in near real time. It can also gather SMF data, IMS data, and DCOLLECT data in batch.
The System Data Engine can process all commonly used SMF record types from the following sources:
  • SMF archive (which is processed only in batch)
  • SMF in-memory resource (by using the SMF real-time interface)
  • SMF user exit
  • SMF log stream
It can also convert SMF records into a consumable format, such as a comma-separated values (CSV) file, or into Db2® UNLOAD format for batch loading into Db2.
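
As an illustration of consuming that batch output, the following Python sketch reads a CSV file that a System Data Engine batch job might produce. The file name and the record layout are assumptions; the actual columns depend on the SMF record type and the record definitions in use.

    import csv

    # Minimal sketch: read CSV output written by a System Data Engine batch
    # job. The file name is an assumption, and the columns vary by SMF
    # record type.
    with open("smf030.csv", newline="", encoding="utf-8") as f:
        for row in csv.reader(f):
            print(row[:5])  # show the first few fields of each record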

The System Data Engine can also be installed as a stand-alone utility to feed SMF data into IBM Db2 Analytics Accelerator for z/OS (IDAA) for use by IBM Z® Performance and Capacity Analytics.

To reduce general CPU usage and costs, you can run the System Data Engine on System z® Integrated Information Processors (zIIPs).

Log Forwarder
The Log Forwarder gathers z/OS log data from the following sources:
  • Job log, which is output that is written to a data definition (DD) by a running job
  • z/OS UNIX log file, including the UNIX System Services system log (syslogd)
  • Entry-sequenced Virtual Storage Access Method (VSAM) cluster
  • z/OS system log
  • IBM Tivoli® NetView for z/OS messages
  • IBM WebSphere Application Server for z/OS High Performance Extensible Logging (HPEL) log
  • z/OS Resource Measurement Facility (RMF) Monitor III reports
  • z/OS sequential data set
To view a complete list of data sources, see Data streams configuration.

To reduce general CPU usage and costs, you can run the Log Forwarder on System z Integrated Information Processors (zIIPs).

Data Collector
The Data Collector provides a lightweight method for accessing IT operational data from z/OS systems and streaming it to Apache Kafka in a consumable format. It provides near real-time and historical feeds of z/OS log data, SMF data, and RMF Monitor III report data to Apache Kafka only. The deployment process is simple, and you can quickly set up the Data Collector without complex configuration.
The Data Collector supports most of the operational data types and can begin streaming this data to Apache Kafka within a few minutes, as sketched below. To view a complete list of data sources, see Data streams configuration.
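
For example, a downstream consumer might read the streamed records directly from Kafka. The following Python sketch assumes the third-party kafka-python package and a hypothetical topic name and broker address.

    from kafka import KafkaConsumer  # third-party package: kafka-python

    # Minimal sketch of a consumer reading records that the Data Collector
    # streams to Apache Kafka. The topic name and broker address are
    # assumptions; use the values configured for your data stream.
    consumer = KafkaConsumer(
        "zos-syslog",  # hypothetical topic name
        bootstrap_servers="kafka.example.com:9092",
        value_deserializer=lambda v: v.decode("utf-8"),
    )
    for message in consumer:
        print(message.value)  # one operational-data record
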
User Application
The Z Common Data Provider Open Streaming API provides an efficient way to gather operational data from your own applications by enabling them to act as data gatherers. You can use the API to send your application data to the Data Streamer, which streams it to your analytics platforms.

For more information about how to send user application data to the Data Streamer, see Sending user application data to the Data Streamer.
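
Conceptually, an application that uses the Open Streaming API becomes a data gatherer by connecting to the Data Streamer and writing its records. The following Python sketch shows only the general shape of that interaction; the host, port, and newline-delimited framing are assumptions, and the actual message format is defined by the Open Streaming API documentation.

    import socket

    # Conceptual sketch only: send one application record to the Data
    # Streamer over a TCP connection. The host, port, and framing are
    # assumptions; see the Open Streaming API documentation for the real
    # protocol.
    DATA_STREAMER = ("localhost", 51401)  # hypothetical host and port

    with socket.create_connection(DATA_STREAMER) as conn:
        record = '{"app": "payroll", "msg": "job PAYROLL1 ended"}\n'
        conn.sendall(record.encode("utf-8"))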

Data Streamer
The Data Streamer streams operational data to configured subscribers in the appropriate format. It receives the data from the data gatherers, alters the data to make it consumable for the subscriber, and sends the data to the subscriber. In altering the data to make it consumable, the Data Streamer can, for example, split the data into individual messages, or translate the data into a different encoding (such as from EBCDIC encoding to UTF-8 encoding).
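
The encoding translation can be illustrated in isolation. The following Python sketch converts EBCDIC bytes to UTF-8, assuming the IBM-037 code page; the code page that applies to your data may differ.

    # Minimal illustration of the kind of encoding translation the Data
    # Streamer performs. The IBM-037 code page ("cp037") is an assumption;
    # z/OS data may use other EBCDIC code pages.
    ebcdic_bytes = "IEF404I PAYROLL1 ENDED".encode("cp037")  # sample EBCDIC data
    utf8_text = ebcdic_bytes.decode("cp037")
    print(utf8_text.encode("utf-8"))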

The Data Streamer can stream data to both on-platform and off-platform subscribers. To reduce general CPU usage and costs, you can run the Data Streamer on System z Integrated Information Processors (zIIPs).

Other components

Depending on your environment, you might also want to use one or both of the following components:
Open Streaming API
The Open Streaming API provides an efficient way to gather operational data from your own applications by enabling them to act as data gatherers. You can use the API to send your application data to the Data Streamer, which streams it to your analytics platforms.
Data Receiver
The Data Receiver is required only if the intended subscriber of a data stream cannot directly ingest the data feed from Z Common Data Provider. The Data Receiver writes any data that it receives to disk files, which can then be ingested into an analytics platform such as Splunk.

The Data Receiver typically runs on the same system as the analytics platform that processes the disk files. This system can be a distributed platform or a z/OS system. For ingesting data into Splunk, install the Data Receiver on each Splunk forwarder on which the IBM Z Operational Log and Data Analytics Splunk application is installed.
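
As a simple illustration, the following Python sketch lists the disk files in a Data Receiver output directory before they are ingested. The directory path is an assumption; it depends on your Data Receiver configuration.

    from pathlib import Path

    # Minimal sketch: inspect the disk files written by the Data Receiver.
    # The output directory is an assumption taken from the Data Receiver
    # configuration in your environment.
    output_dir = Path("/var/cdp/data-receiver")  # hypothetical directory
    for data_file in sorted(output_dir.glob("*")):
        print(data_file.name, data_file.stat().st_size, "bytes")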