Z Common Data Provider overview

The Z Common Data Provider in IBM Z® Anomaly Analytics provides the infrastructure for accessing IT operational data from z/OS® systems and streaming it to IBM Z Anomaly Analytics in a consumable format. It is a single data provider for sources of both structured and unstructured data, and it can provide a near real-time data feed of z/OS log data, IBM® IMS log data, and System Management Facilities (SMF) data to IBM Z Anomaly Analytics.

Overview

The Z Common Data Provider automatically monitors and collects z/OS log data, IMS log data, and SMF data and streams it to the configured destination.

In each logical partition (LPAR) whose z/OS log data, IMS log data, or SMF data you want to analyze, you must install and configure a separate instance of Z Common Data Provider, specifying the type of data to collect and the destination (which is called a subscriber) for that data.

In IBM Z Anomaly Analytics, the Z Common Data Provider streams z/OS system log (SYSLOG) data to the log-based machine learning system, and streams SMF data and IMS log data to the metric-based machine learning system.

If you are also using IBM Z Operational Log and Data Analytics, the Z Common Data Provider can be used to stream other operational data to other destinations. For more information about IBM Z Operational Log and Data Analytics, see the IBM Z Operational Log and Data Analytics V5.1.0 documentation.

Components of Z Common Data Provider

IBM Z Anomaly Analytics uses the following components of the Z Common Data Provider:
Configuration Tool
The Configuration Tool is the web-based GUI that you use to define the sources from which you want to gather operational data. It is provided in the following two forms:
  • As an application for IBM WebSphere® Application Server for z/OS Liberty
  • As a plug-in for the IBM z/OS Management Facility (z/OSMF)

In the Configuration Tool, you create a policy for streaming z/OS system log (SYSLOG) data to log-based machine learning, and for streaming SMF data and IMS log data to metric-based machine learning. The policy is a set of rules that define the type of operational data to be collected and the subscribers for that data.

In the policy definition, you must select the z/OS SYSLOG data stream (from the z/OS Logs category) that you want log-based machine learning to process, and specify the Apache Kafka broker as the subscriber for this stream. You must also select the IBM Z Anomaly Analytics data streams that you want metric-based machine learning to process, and specify either the enterprise data warehouse or the Apache Kafka broker, depending on where metric-based machine learning is installed, as the subscriber for these streams.
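Conceptually, a policy pairs each selected data stream with a subscriber. The following JSON fragment is a rough illustration of that pairing only; the Configuration Tool generates the actual policy files, and the field names and values shown here are hypothetical, not the real policy schema:

```json
{
  "streams": [
    { "name": "zOS-SYSLOG", "category": "z/OS Logs", "subscriber": "kafka-broker" },
    { "name": "SMF-data", "category": "IBM Z Anomaly Analytics", "subscriber": "kafka-broker" }
  ],
  "subscribers": [
    { "id": "kafka-broker", "type": "Apache Kafka", "host": "kafka.example.com", "port": 9092 }
  ]
}
```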

Log Forwarder
The Log Forwarder gathers z/OS log data from the z/OS SYSLOG.

To reduce general CPU usage and costs, you can run the Log Forwarder on IBM System z® Integrated Information Processors (zIIPs).

System Data Engine
The System Data Engine gathers SMF data and IMS log data in near real time and in batch.
The System Data Engine can process SMF record types from the following sources:
  • SMF archive (which is processed only in batch)
  • SMF in-memory resource (by using the SMF real-time interface)
  • SMF user exit
  • SMF log stream

To reduce general CPU usage and costs, you can run the System Data Engine on IBM System z Integrated Information Processors (zIIPs).

Data Streamer
The Data Streamer receives z/OS SYSLOG data from the Log Forwarder and receives SMF data and IMS log data from the System Data Engine. It then transforms the data into a format that IBM Z Anomaly Analytics can consume and streams the data to the Apache Kafka broker.
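The transformation step can be pictured as wrapping each raw log line in a structured message before it is published to the Kafka broker. The following Python sketch is illustrative only; the envelope fields (`systemName`, `sourceType`, and so on) are assumptions for the example, not the Data Streamer's actual output format:

```python
import json
from datetime import datetime, timezone

def to_kafka_message(raw_line: str, system: str, source_type: str) -> bytes:
    """Wrap a raw log line in a structured envelope (illustrative only;
    the field names below are assumptions, not the real output format)."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "systemName": system,       # originating LPAR (assumed field name)
        "sourceType": source_type,  # e.g. "zOS-SYSLOG" (assumed value)
        "message": raw_line.rstrip(),
    }
    return json.dumps(record).encode("utf-8")

# Example: a SYSLOG job-start message becomes one Kafka message value.
msg = to_kafka_message("IEF403I MYJOB - STARTED", system="SYS1",
                       source_type="zOS-SYSLOG")
```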

To reduce general CPU usage and costs, you can run the Data Streamer on IBM System z Integrated Information Processors (zIIPs).

Data Collector
The Data Collector provides a lightweight method for accessing IT operational data from z/OS systems. In IBM Z Anomaly Analytics, it loads historical z/OS log data from the z/OS SYSLOG and sends it directly to the Apache Kafka broker.

Customers who plan to deploy only log-based machine learning (without metric-based machine learning) can use the Data Collector for both historical and near real-time data.

To reduce general CPU usage and costs, you can run the Data Collector on IBM System z Integrated Information Processors (zIIPs).

Data flow among Z Common Data Provider components

The following steps describe the data flow for streaming data among the components of the Z Common Data Provider in IBM Z Anomaly Analytics, which are shown in Figure 1.

Typical data flow in log-based machine learning
  1. The Log Forwarder collects z/OS SYSLOG data from either user exits or the z/OS operations log (OPERLOG).
  2. The Log Forwarder writes the z/OS SYSLOG data to the Data Streamer, which forwards the data in near real time to the Apache Kafka broker.
Typical data flow in metric-based machine learning
  1. The System Data Engine collects SMF data and IMS log data from its configured sources.
  2. The System Data Engine sends the SMF data and IMS log data to the Data Streamer, which forwards the data in near real time to the Apache Kafka broker.
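Both flows end at the Apache Kafka broker, where the machine learning systems consume the records. The following Python sketch shows what a minimal downstream consumer might look like, using the third-party kafka-python package; the broker address and topic name are assumptions for the example, and the sketch assumes JSON message payloads:

```python
import json

# Assumed values; substitute your site's actual broker address and topic name.
BOOTSTRAP_SERVERS = "kafka.example.com:9092"
SYSLOG_TOPIC = "zos-syslog"

def decode_record(raw: bytes) -> dict:
    """Decode one Kafka message value into a dict (assumes a JSON payload)."""
    return json.loads(raw.decode("utf-8"))

def consume_syslog() -> None:
    """Read SYSLOG records from the assumed topic and print each message text."""
    # Imported here so the pure helper above works without kafka-python installed.
    from kafka import KafkaConsumer  # third-party: pip install kafka-python
    consumer = KafkaConsumer(
        SYSLOG_TOPIC,
        bootstrap_servers=BOOTSTRAP_SERVERS,
        auto_offset_reset="earliest",
    )
    for message in consumer:
        record = decode_record(message.value)
        print(record.get("message"))
```

Calling `consume_syslog()` blocks and processes records as they arrive, which mirrors the near real-time nature of the feed.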
Figure 1. Z Common Data Provider overview
The illustration shows the flow of data among the primary components, as described in the text.