Adding telemetry diagnostic tools through the user interface

This topic describes how to add telemetry diagnostic tools for your Presto (Java), Presto (C++), Milvus, Spark, and Gluten accelerated Spark engines through the user interface of IBM® watsonx.data.

watsonx.data on IBM Software Hub

About this task

To add and enable the telemetry diagnostic tools in IBM watsonx.data through the user interface, complete the following steps.

Procedure

  1. Log in to the watsonx.data console.
  2. From the navigation menu, select Configurations, and click OpenTelemetry.
  3. On the OpenTelemetry page, click Diagnostic +.
  4. In the Add telemetry diagnostic window, enter the following details:
    Field | Description
    Available Telemetry Tools | Select the telemetry diagnostic tool (Instana, Prometheus, or Splunk) from the list.
    Telemetry Endpoint | Enter the endpoint URL of the selected tool. Format of the endpoint: http://<host>:<port>/<path> or https://<host>:<port>/<path>. Use port 4317 for OTLP over gRPC and port 4318 for OTLP over HTTP.
    Host Name | If the selected telemetry tool is Instana, enter the host ID of the tool. See Configuring the resource attributes.
    Index | If the selected telemetry tool is Splunk, enter the name of the index in Splunk that is used for grouping.
    Token | If the selected telemetry tool is Splunk, enter the token, which is used like a password.
    Password | Enter the Instana agent key or the Prometheus password.
    TLS enabled | Use the TLS enabled toggle to secure the connection.
    Connection status | Click Test connection to validate the endpoint and credentials.
    Associated diagnostics | Select the checkboxes to associate diagnostic types (logs, metrics, or traces) with the telemetry tool.
    Note: Currently, Instana provides only metrics and traces diagnostic data. Prometheus and Splunk provide only metrics diagnostic data.
  5. Click Add to apply the telemetry tool integration.
    Note: You might need to wait a short time for the pods to restart.
  6. Review the list of diagnostics for the engines supported in your instance. Use the drop-down menu for a diagnostic to perform any of the following actions:
    • Enable: Activate the diagnostic.
    • Disable: Pause data collection.
    • Edit: Update configuration details.
    • View Details: Inspect current setup.
    • Disassociate: Remove the diagnostic from the engine.
    Note: After all supported diagnostic types are successfully associated with one or more telemetry tools, the Add Diagnostic button is disabled.

    To configure a new diagnostic tool or switch to another, you must do one of the following:

    • Edit the existing configuration.
    • Disable the current diagnostic.
    • Disassociate it.
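
The Telemetry Endpoint format from step 4 can be checked before you enter it in the Add telemetry diagnostic window. The following Python snippet is a minimal sketch and is not part of watsonx.data: the function name check_endpoint and the example host collector.example.com are hypothetical, and treating any port other than 4317 or 4318 as "other" rather than as an error is an assumption made here.

```python
from urllib.parse import urlparse

# Ports named in step 4: 4317 for OTLP over gRPC, 4318 for OTLP over HTTP.
OTLP_PORTS = {4317: "grpc", 4318: "http"}


def check_endpoint(url: str) -> str:
    """Check the http(s)://<host>:<port>/<path> shape of an endpoint URL.

    Returns "grpc" or "http" for the two OTLP ports, or "other" for any
    different port. Raises ValueError if the scheme, host, or port is
    missing or invalid. (Hypothetical helper, for illustration only.)
    """
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        raise ValueError(f"scheme must be http or https, got {parsed.scheme!r}")
    if not parsed.hostname:
        raise ValueError("endpoint is missing a host")
    if parsed.port is None:
        raise ValueError("endpoint is missing a port")
    return OTLP_PORTS.get(parsed.port, "other")


if __name__ == "__main__":
    # collector.example.com is a placeholder host, not a real endpoint.
    print(check_endpoint("http://collector.example.com:4317/v1/metrics"))
    print(check_endpoint("https://collector.example.com:4318/v1/metrics"))
```

A check like this only validates the shape of the URL. The Test connection button in the window remains the way to confirm that the endpoint and credentials actually work.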

Follow the steps below to check the diagnostic telemetry data using the tools.

  1. To check the Instana UI for the diagnostic telemetry data:
    1. To see the traces generated for a query:
      1. Go to Analytics in the Instana UI and select Services.
      2. Choose the relevant service from the list: presto for Presto (Java), milvus.<podname> or <podname> for the Milvus service, and Analyticsengine-Spark for the Spark engine.
      3. Filter traces by Service Name, Call Name, or a Retention Period of the last 5, 10, or 30 minutes.
      4. Click a specific trace to view its details.
    2. To see the metrics generated for a query:
      Important: An Instana data source must be available.
      1. Go to Analytics in the Instana UI and select Infrastructure-OpenTelemetry.
      2. Choose the relevant service from the list: presto-jmx-<instance-id> (for Presto (Java)), prestissimo-jmx-<instance-id> (for Presto (C++)), milvus-<instance-id> (for Milvus), or metric.tag.app.id (for the Spark engine).

        Where <instance-id> is the watsonx.data instance ID.

      3. Review the list of custom metrics and their associated attributes.
      4. Click a specific metric to view the live time series.
  2. To check the Grafana UI for the diagnostic telemetry data:
    1. To see the metrics generated for a query:
      Important: A Prometheus data source must be available in Grafana, and the Prometheus Remote Write receiver must be enabled by using the --web.enable-remote-write-receiver flag.
      1. Select Dashboards from the Grafana UI navigation menu and click Add visualization.
      2. Choose the Prometheus data source.
      3. Select the Prometheus metrics that you want to visualize from the query metric drop-down list.
      4. Adjust the retention period to specify the time range for the metrics you want to view.
  3. To check the Splunk UI for the diagnostic telemetry metrics generated:
    1. To see the details of a particular metric generated for a query, go to Search > New Search and run an SPL (Splunk Search Processing Language) query:
      | mpreview index="test-metrics" filter="metric_name=<required-metric-name>"

      Example to view the presto_cpp_memory_manager_total_bytes metric:

      | mpreview index="test-metrics" filter="metric_name=presto_cpp_memory_manager_total_bytes"

      Example to view all metrics emitted from Milvus:

      | mpreview index="test-metrics"
      | search "Instanceid"="1776937388384962"
      | search "service.name"="milvus*"

      You can modify the SPL query based on your monitoring requirements.

    2. To see the list of generated metric names only, run the following SPL query:
      | mcatalog values(metric_name) WHERE index="test-metrics"
      | sort metric_name
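
If you run these searches often, you can generate the SPL strings shown above instead of typing them by hand. The following Python snippet is a minimal sketch under stated assumptions: the helper names mpreview_query and mcatalog_query are hypothetical, the index name test-metrics is taken from the examples in this topic, and rejecting values that contain double quotation marks is a simplification chosen here rather than full SPL escaping. The snippet only builds query text and does not connect to Splunk.

```python
from typing import Optional


def _check(value: str) -> str:
    """Reject values that would break out of the quoted SPL argument."""
    if '"' in value:
        raise ValueError("value must not contain double quotation marks")
    return value


def mpreview_query(index: str, metric_name: Optional[str] = None) -> str:
    """Build an mpreview search, optionally filtered to one metric name."""
    query = f'| mpreview index="{_check(index)}"'
    if metric_name:
        query += f' filter="metric_name={_check(metric_name)}"'
    return query


def mcatalog_query(index: str) -> str:
    """Build the search that lists only the metric names in an index."""
    return (
        f'| mcatalog values(metric_name) WHERE index="{_check(index)}"\n'
        "| sort metric_name"
    )


if __name__ == "__main__":
    # "test-metrics" is the index name used in the examples in this topic.
    print(mpreview_query("test-metrics", "presto_cpp_memory_manager_total_bytes"))
    print(mcatalog_query("test-metrics"))
```

Paste the printed text into Search > New Search, and replace the index name with the value that you entered in the Index field when you added the Splunk tool.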