Start of change

Processing test data with the replay script

The replay script sends previously captured test data to Apache Kafka to exercise the sample analytics pipeline. Use the script to verify that the sample analytics pipeline is installed and functioning properly.

Before you begin

Ensure that the sample analytics pipeline is installed and started. For more information, see Installing and configuring.

Procedure

  1. On the collection server or single-server configuration, enter the following command to change your directory to the Docker directory:
    cd tpf_data_sci/Docker
  2. On the collection server or single-server configuration, enter the following command to start the replay script:
    ./tpf_start_replay_script.sh scenario_lowVolTraffic

    For more information about the replay script, see the ./tpf_data_sci/tpfReplayScript/README.txt file that is included with the sample analytics pipeline.

  3. Open Grafana in a browser at http://your.server.name.com:3000 or http://your.analytics.server.name.com:3000,
    where your.server.name is the name of your server in a single-server configuration and your.analytics.server.name is the name of your analytics server in a dual-server configuration.
  4. Open a dashboard. For example, click Home > Dashboards > 03. ZRTMC Analytics Sample > 02. Correlation Analysis. Set the time picker to Last 15 minutes.
    The following process occurs:
    1. The tpf_data_sci/tpfReplayScript/tpf_ReplayDiskToKafka.jar file simulates real-time data arriving in Apache Kafka by transferring data from file to Apache Kafka in time sequence and simulating real-time collection durations.
    2. The processing that runs in the tpf_data_sci/Docker/tpf_zrtmc_analyzer_docker_files/tpf_zrtmc_analyzer.py Python script pulls the data in real time from Apache Kafka, performs calculations, and writes the results to the database.
    3. The Grafana dashboards are set up to refresh automatically. When a dashboard refreshes, it runs a variety of analyses that are implemented as SQL SELECT statements and displays the analyzed data, including the results of the replay script data analysis.
  5. For the Application Dimension, select Message Type, SubType, Origin.
    After 15 minutes of baseline data, the message rate from the low-volume message type increases slightly, corresponding to a rise in CPU utilization. The Message Type, SubType, Origin Rate Correlated to Actual System CPU panel indicates that name-value pair collection data is insufficient for the [Shopping, Air, Terminal] horizontal name-value pair combination.
  6. Change the Analysis Type to Aggregate to see the correlation highlighted.
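
The time-sequenced replay described in step 4 can be illustrated with a small shell sketch. This is not the logic inside tpf_ReplayDiskToKafka.jar; it only shows the general idea of emitting captured records with the same gaps that separated them at collection time. The input format (one record per line, "<epoch_seconds> <payload>", sorted by time), the function name replay_in_time_sequence, and the SLEEP_CMD override are all assumptions made for illustration.

```shell
# Sketch only: NOT the implementation in tpf_ReplayDiskToKafka.jar.
# Assumed (hypothetical) input: lines of "<epoch_seconds> <payload>", sorted by time.
replay_in_time_sequence() {
  capture_file=$1
  sleep_cmd=${SLEEP_CMD:-sleep}   # set SLEEP_CMD=true to skip the waits
  prev_ts=""
  while read -r ts payload || [ -n "$ts" ]; do
    [ -n "$ts" ] || continue      # ignore blank lines
    if [ -n "$prev_ts" ]; then
      # Wait for the same number of seconds that separated the two records.
      gap=$((ts - prev_ts))
      if [ "$gap" -gt 0 ]; then
        "$sleep_cmd" "$gap"
      fi
    fi
    # Emit the record; a real replay would hand it to a Kafka producer instead.
    printf '%s\n' "$payload"
    prev_ts=$ts
  done < "$capture_file"
}
```

Calling replay_in_time_sequence on a capture file writes each payload to standard output after the appropriate delay. Writing to standard output keeps the sketch independent of any particular Kafka client.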

What to do next

If data does not flow, see Diagnosing the sample analytics pipeline when data is not flowing.
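
Before working through the diagnostic topic, it can help to confirm that Grafana itself is answering on port 3000. The following sketch assumes that curl is available and uses the health endpoint that Grafana exposes at /api/health. The function names grafana_health_url and check_grafana are hypothetical, and the host name is the same placeholder that is used in the procedure.

```shell
# Sketch only: a reachability check, not part of the sample analytics pipeline.
# Builds the health-check URL for a Grafana server listening on port 3000.
grafana_health_url() {
  printf 'http://%s:3000/api/health\n' "$1"
}

# Returns 0 if Grafana answers the health request within 5 seconds.
check_grafana() {
  curl -fsS --max-time 5 "$(grafana_health_url "$1")" > /dev/null
}
```

For example, check_grafana your.server.name.com && echo "Grafana is reachable" prints the message only when the request succeeds. A successful check shows that Grafana is up, not that data is flowing through the pipeline.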

The JAR files for the replay script include the source code, in case you want to modify the script.
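
One way to locate the bundled source is to list the contents of a JAR file with the JDK jar tool. The following sketch assumes that a JDK is installed and that the source is stored as .java files inside the archive, which this topic does not confirm. The function name list_jar_sources is hypothetical.

```shell
# Sketch only: assumes the source is packaged as .java entries in the JAR.
# Prints the path of every .java entry in the given JAR file.
list_jar_sources() {
  jar tf "$1" | grep '\.java$'
}
```

For example, list_jar_sources tpf_data_sci/tpfReplayScript/tpf_ReplayDiskToKafka.jar lists any .java entries in the replay JAR file. Running jar xf with the same file name extracts the archive contents into the current directory.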

End of change