The replay script sends previously captured test data to Apache Kafka to exercise the sample analytics pipeline. Use the replay script to verify that the sample analytics pipeline is installed and functioning properly.
Before you begin
Ensure that the sample analytics pipeline is installed and
started. For more information, see Installing and configuring.
Procedure
- On the collection server or single-server configuration, enter the following command to
change your directory to the Docker directory:
cd tpf_data_sci/Docker
- On the collection server or single-server configuration, enter the following command to
start the replay script:
./tpf_start_replay_script.sh scenario_lowVolTraffic
For more information about the replay script, see the
./tpf_data_sci/tpfReplayScript/README.txt file that is included with the
sample analytics pipeline.
- Open Grafana in a browser at
http://your.server.name.com:3000 or
http://your.analytics.server.name.com:3000, where
your.server.name is the name of your server in a single-server configuration and
your.analytics.server is the name of your analytics server in a dual-server
configuration.
- Open a dashboard, and set the time picker to Last 15 minutes.
The following process occurs:
- The tpf_data_sci/tpfReplayScript/tpf_ReplayDiskToKafka.jar file simulates
real-time data arriving in Apache Kafka by
transferring data from file to Apache Kafka in
time sequence and simulating real-time collection durations.
- The processing that runs in the
tpf_data_sci/Docker/tpf_zrtmc_analyzer_docker_files/tpf_zrtmc_analyzer.py
Python script pulls the data in real time from Apache Kafka, performs calculations, and writes the results
to the database.
- The Grafana dashboards are set up to refresh automatically. On each refresh, a dashboard
runs a variety of analyses that are implemented as SQL SELECT statements and displays the
results of the replay script data analysis.
- For the Application Dimension, select Message Type,
SubType, Origin.
After 15 minutes of baseline data, the message rate from the
low volume message type increases slightly, which corresponds to a rise in CPU utilization. The
Message Type, SubType, Origin Rate Correlated to Actual System CPU panel
indicates that name-value pair collection data is
insufficient for the [Shopping, Air, Terminal] horizontal name-value pair combination.
- Change the Analysis Type to Aggregate to see the
correlation highlighted.
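The time-sequenced replay that this procedure describes can be illustrated with a minimal Python sketch. It is not the code in tpf_ReplayDiskToKafka.jar. The record layout (a capture timestamp plus a payload), the replay_in_time_sequence function, and the speedup parameter are all hypothetical, and a plain callable stands in for an Apache Kafka producer. The sketch only shows the idea of sending captured records in timestamp order while waiting for the gaps that were observed during collection.

```python
import time
from typing import Callable, Iterable, List, Tuple

# Hypothetical record layout: (capture timestamp in seconds, payload bytes).
Record = Tuple[float, bytes]


def replay_in_time_sequence(
    records: Iterable[Record],
    send: Callable[[bytes], None],
    sleep: Callable[[float], None] = time.sleep,
    speedup: float = 1.0,
) -> List[float]:
    """Send records in timestamp order, pausing for the captured gaps.

    `send` stands in for a Kafka producer call (an assumption of this sketch).
    Returns the delay, in seconds, that was applied before each send.
    """
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    delays: List[float] = []
    previous = None
    for timestamp, payload in sorted(records, key=lambda r: r[0]):
        # Wait for the gap between this record and the previous one.
        delay = 0.0 if previous is None else (timestamp - previous) / speedup
        if delay > 0:
            sleep(delay)
        send(payload)
        delays.append(delay)
        previous = timestamp
    return delays


if __name__ == "__main__":
    captured = [(0.0, b"msg-1"), (0.5, b"msg-2"), (2.0, b"msg-3")]
    sent: List[bytes] = []
    # Replay 10 times faster than real time; list.append replaces a producer.
    replay_in_time_sequence(captured, sent.append, speedup=10.0)
    print(sent)
```

Passing the sleep function as a parameter keeps the pacing logic testable without real waiting, and a speedup value greater than 1 shortens a long capture.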
What to do next
If data does not flow, see Diagnosing the sample analytics pipeline when data is not flowing. The JAR files for the
replay script include the source code if you want to modify it.
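Before you start a deeper diagnosis, a quick reachability check can rule out basic connectivity problems. The following Python sketch is an illustration only and is not part of the sample analytics pipeline. The port_is_open function is hypothetical. Port 3000 is the Grafana port that is used in this procedure, and port 9092 is the Apache Kafka default listener port, which is an assumption about your installation.

```python
import socket
from typing import Dict, List


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False


def describe_services(host: str, services: Dict[str, int]) -> List[str]:
    """Build one status line per named service port on the given host."""
    lines = []
    for name, port in services.items():
        state = "reachable" if port_is_open(host, port) else "not reachable"
        lines.append(f"{name} on {host}:{port} is {state}")
    return lines
```

For example, describe_services("your.server.name.com", {"Grafana": 3000, "Apache Kafka": 9092}) returns one status line for each service, where the host name is the placeholder from this procedure. A reachable port shows only that something is listening, not that data is flowing.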