A primary goal of the sample analytics pipeline is to
make adoption and setup easy. Therefore, Docker is used extensively to install and configure the
various components. You might need to modify the installation to suit your environment.
The sample analytics pipeline can be installed in Linux® on an x86 system or Linux on IBM Z environments. The installation process uses Docker to download, install, and configure the various open-source components, including Grafana, MariaDB or MySQL, Apache Kafka, and various Python packages.
Before you begin
- Ensure that the required prerequisites, including Docker and Docker Compose, are installed.
Note: These instructions and scripts were implemented and tested on a virtual machine that has no other Docker images or containers. If you use these instructions and scripts on a machine that has other Docker images and containers, carefully review each step, script, and other information to ensure that no undesirable behavior results.
Additionally, these instructions assume that the user ID that is used to perform the installation is authorized to enter Docker commands directly. To ensure that you do not need to issue the sudo command for every Docker command that is entered manually or issued from the included scripts, use the following command to provide authorization to the user ID that will manage the installation:
usermod -aG docker userID
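The group change takes effect after the user logs out and logs back in. To verify that Docker commands no longer require the sudo command, enter a simple Docker command under that user ID, for example:
docker ps
If the command lists the running containers (or an empty table) without a permission error, the authorization is in place.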
- Ensure that you understand the sample analytics pipeline
server configurations and components.
- Determine your preferred server configuration (single or dual) and which servers will run Linux on an x86 system or Linux on IBM Z.
- To use real-time runtime metrics collection, ensure that you have adequate disk space for the real-time runtime metrics collection logs on the collection server, or on the single server in a single-server configuration. For more information, see Runtime metrics collection log files.
Procedure
If you want to set up a dual-server configuration, complete the following steps on both the collection server and the analytics server unless a step explicitly indicates that it applies to only one of the servers. If you want to set up a single-server configuration, complete the following steps on a single server.
- Copy the base/tpfrtmc/bin/tpf_sample_analytics_pipeline.tar.gz file in binary format from your z/TPF source repository to the home directory on your Linux machine. Enter the following command to extract the content from the tar file:
tar -xf tpf_sample_analytics_pipeline.tar.gz
- Default credentials are specified in the tpf_data_sci/tpf_default_credentials.text file. These credentials are used in various scripts that are provided with the sample analytics pipeline. Change the user name and password to values that are more secure for your environment. When you change passwords, you must update various files in the tpf_data_sci/Docker directory.
The sample analytics pipeline uses many components. The component versions that are indicated were stable at the time of release. To use the latest versions of these components, update the version numbers that are specified in the tpf_data_sci/user_files/tpf_prepare_configurations.yml and tpf_data_sci/user_files/tpf_zrtmc_analyzer_files/requirements.txt files.
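For example, a pip requirements file pins each package to an exact version with the == operator. The following line is illustrative only and is not the actual content of the requirements.txt file:
pyyaml==6.0.1
Changing such a pin to a newer version number causes that version to be installed the next time the container image is built.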
- For the collection server or a single-server configuration, copy the base/tpfrtmc/bin/tpfrtmc.tar.gz file in binary format from your z/TPF source repository to the tpf_data_sci/Docker/tpf_rtmc_docker_files/ directory. Enter the following command to extract the content from the tar file:
tar -xf tpfrtmc.tar.gz
- For the analytics server or a single-server configuration, copy the base/tpfrtmc/bin/tpf_zmatc_analyzer.tar.gz file in binary format from your z/TPF source repository to the tpf_data_sci/Docker/tpf_zmatc_analyzer_docker_files/ directory. Enter the following command to extract the content from the tar file:
tar -xf tpf_zmatc_analyzer.tar.gz
- Define your Apache Kafka hosts, encryption settings, topic settings, and
programmatic variables in the tpf_data_sci/user_files/kafka_hosts.yml file. For
more information about how to configure this file, see the comments in the file.
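The authoritative schema for this file is described in its comments. The following sketch only suggests the general shape; every key and value that is shown is a placeholder:
# Placeholder sketch; see the comments in kafka_hosts.yml for the real schema.
hosts:
  - hostname: kafka.example.com   # placeholder host name
    port: 9092                    # placeholder port
    modify_script_variables:      # per-host settings that the tpf_modify_kafka_topics.sh script uses
      retention.ms: 604800000     # placeholder topic setting (7 days)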
- If Python 3.8 and the pyyaml library, which are used by the
tpf_prepare_configurations.sh script, are not installed on your system, enter
the following commands to install them:
- sudo yum install python38
- sudo python3 -m pip install --upgrade pyyaml
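To confirm that both are available, you can check their versions:
python3 --version
python3 -c "import yaml; print(yaml.__version__)"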
- Change your directory to the Docker directory. Enter the following
command:
cd tpf_data_sci/Docker
- For the collection server, analytics server, or single-server configurations, you must prepare your configuration. This configuration determines whether the server uses MariaDB or MySQL, runs on Linux on an x86 system or Linux on IBM Z, uses trusted dependency repositories, and more.
- Define your settings in the tpf_data_sci/user_files/tpf_prepare_configurations.yml file; a placeholder sketch follows this step.
- Enter the following command to configure your server:
./tpf_prepare_configurations.sh
To view which files are edited and what changes are made to achieve your desired settings, see the tpf_prepare_configurations.sh script.
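The authoritative list of settings is in the tpf_prepare_configurations.yml file itself. The following sketch is only a placeholder illustration of the kinds of choices that the file captures; the key names that are shown here are not the real ones:
# Placeholder keys and values; consult the file for the actual settings.
database: mariadb          # or mysql
platform: x86              # or Linux on IBM Z
trusted_repositories: true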
- Enter the docker-compose command to start the Docker containers.
- For the collection server or single-server configurations, take one of the following actions:
- If you are using a MySQL database, enter the following command:
docker-compose --file tpf-insights-dashboard-network.yml --file tpf_mysql.yml --file tpf_kafka.yml up -d --build
- If you are using a MariaDB database, enter the following command:
docker-compose --file tpf-insights-dashboard-network.yml --file tpf_mariadb.yml --file tpf_kafka.yml up -d --build
Note: For Kafka configurations on Linux on IBM Z, if you need to rebuild the Kafka container, first remove all files and folders in the tpf_data_sci/Docker/tpf_kafka_docker_files/volumes/kafka-logs directory by issuing the following command:
rm -rf tpf_data_sci/Docker/tpf_kafka_docker_files/volumes/kafka-logs/*
Otherwise, you might receive the following error from the Kafka broker when the tpf-kafka-broker container starts, because Kafka stores its cluster ID in the log directory and a rebuilt container generates a new cluster ID that no longer matches the stored one:
The Cluster ID jw3FiOddStufuL211VzUjQ doesn't match stored clusterId.
- For the analytics server, take one of the following actions, depending on the database that you are using; a sketch of the commands follows.
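The following commands are a sketch that assumes the analytics server starts the same database compose file as the collection server but omits the tpf_kafka.yml file because Apache Kafka runs on the collection server; verify the file names against your extracted files:
- If you are using a MySQL database, enter the following command:
docker-compose --file tpf-insights-dashboard-network.yml --file tpf_mysql.yml up -d --build
- If you are using a MariaDB database, enter the following command:
docker-compose --file tpf-insights-dashboard-network.yml --file tpf_mariadb.yml up -d --build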
- Set up the database tables and stored procedures by running the SQL script. For the collection server, analytics server, and single-server configurations, enter the following command:
./tpf_setup_db.sh
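To confirm that the tables and stored procedures were created, you can query the database from inside its container. The container name, user ID, and database name depend on your configuration, so the following command is only a sketch:
docker exec -it database-container mysql -u userID -p -e "SHOW TABLES;" databaseName
where database-container, userID, and databaseName are placeholders for the container name that docker ps reports, your database user ID, and your database name.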
- Run the following script for the collection server or single-server configurations:
./tpf_create_kafka_topics.sh
This script creates the Apache Kafka topics.
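To confirm that the topics were created, you can list them from inside the Kafka container. The exact topic script name and path depend on the Kafka image in use, so the following command is only a sketch:
docker exec tpf-kafka-broker kafka-topics.sh --bootstrap-server localhost:9092 --list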
- Run the following script for the collection server or single-server configurations:
./tpf_modify_kafka_topics.sh hostname:port
where hostname:port is the host name and port that are specified in the tpf_data_sci/user_files/kafka_hosts.yml file from step 5.
This script modifies the Apache Kafka topics based on the modify_script_variables settings that are specified for your host in the tpf_data_sci/user_files/kafka_hosts.yml file.
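For example, if your kafka_hosts.yml file defines a host kafka.example.com that listens on port 9092 (placeholder values), enter the following command:
./tpf_modify_kafka_topics.sh kafka.example.com:9092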
- Enter the docker-compose command to start the tpfrtmc Docker containers. For the collection server or single-server configurations, enter the following command:
docker-compose --file tpf-insights-dashboard-network.yml --file tpf_collection_server.yml up -d --build
- Optional: Configure the ZRTMC analyzer
instances to support multiple z/TPF systems.
- Enter the docker-compose command to start the remaining Docker containers. Enter the following command for the analytics server or single-server configurations:
docker-compose --file tpf-insights-dashboard-network.yml --file tpf_analytics_server.yml up -d --build
Note: The ZRTMC analyzer connects to both Apache Kafka and the database upon startup and immediately begins processing any data that is available on the configured Apache Kafka topics. The ZMATC analyzer performs analysis on all available message analysis tool results in the database on the analytics server.
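To confirm that the containers on either server are running, you can list them with their status, for example:
docker ps --format "table {{.Names}}\t{{.Status}}"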
- Optional: If you have an active firewall, ensure that the ports that are specified in the YAML (.yml) files are open. For example:
- Enter the following command for each port that is exposed by the YAML files:
sudo firewall-cmd --zone=public --add-port=portID/tcp --permanent
where portID represents the following ports:
- For MariaDB or MySQL: 3306
- For Grafana: 3000
- For Apache Kafka: 2181, 9092, 9093, 8082, 8000
- For tpfrtmc: 9090
- Reload the firewall by entering the following command:
sudo firewall-cmd --reload
You can add all of the ports first and then enter the reload command once. Additionally, you can use the tpf_data_sci/Docker/tpf_open_firewall_ports.sh script to process all of these commands for the default ports.
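For example, to open only the Grafana port and activate the change, enter the following commands:
sudo firewall-cmd --zone=public --add-port=3000/tcp --permanent
sudo firewall-cmd --reload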
- Optional: If you plan to process tapes that are created by the name-value pair collection process with the ZCNVP command, enter the docker-compose command to start the tpfrtmc Docker container. For the collection server or single-server configurations, enter the following command:
docker-compose --file tpf-insights-dashboard-network.yml --file tpf_zcnvp_tpfrtmc.yml up -d --build
What to do next
The analytics pipeline is now fully functional.