Troubleshooting
You might encounter the following issues with Generative AI Observability.
Connection errors
Issue: Application logs show connection errors to Instana.
Solution:
For agent mode:
- Verify that the Instana agent is running.
- Check agent logs for errors.
- Ensure the agent host and port are correct.
- Verify network connectivity between your application and the agent.
For agentless mode:
- Verify the backend endpoint URL is correct.
- Check that port 4317 is accessible.
- Verify that firewall rules allow outbound connections.
- Ensure that x-instana-key is correct.
Check TLS settings:
- For agent mode: Usually OTEL_EXPORTER_OTLP_INSECURE=true
- For agentless mode: Usually OTEL_EXPORTER_OTLP_INSECURE=false
- Verify that these settings match your Instana setup.
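The network checks in this section can be approximated from the application host with a short script. The following is a minimal sketch, not an Instana tool: the helper name can_connect and the example host are hypothetical, and a successful TCP connection only shows that the port is reachable, not that authentication or TLS settings are correct.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


if __name__ == "__main__":
    # Hypothetical values; replace them with your agent host or backend endpoint.
    host, port = "localhost", 4317
    status = "reachable" if can_connect(host, port) else "NOT reachable"
    print(f"{host}:{port} is {status}")
```

If the port is not reachable, review the agent status, the host and port values, and the firewall rules before you look at TLS or key settings.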
Cost metrics are not displayed
Issue: Traces and other metrics are displayed, but cost metrics are missing.
Solution:
- Verify that pricing is configured in the dashboard:
  - Go to GenAI observability > Pricing configuration.
  - Check that pricing is set for the models you're using.
  - Model IDs must match exactly (case-sensitive).
- Check model ID format:
  - Ensure that your application is reporting the correct model IDs.
  - Model IDs must match the format in the pricing configuration, for example, "gpt-4" not "GPT-4" or "gpt4".
- If you have set platform-specific pricing, check whether the platform matches:
  - Make sure that the pricing you set applies either to any platform or to the specific platform that you use.
- Wait for data to propagate:
  - After you configure pricing, it might take a few minutes for cost metrics to appear.
  - Generate some new requests to see updated metrics.
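Because model IDs are matched exactly and case-sensitively, a quick offline comparison can help spot mismatches such as "GPT-4" or "gpt4". The following is a minimal sketch under stated assumptions: the function check_model_ids is hypothetical, and both lists of IDs are collected by hand (from your application and from the pricing configuration), not read from Instana.

```python
def check_model_ids(reported, configured):
    """Classify each reported model ID against the configured pricing IDs.

    Returns a dict mapping each reported ID to "ok", "near miss: <id>",
    or "missing". The "ok" match is exact and case-sensitive.
    """
    def normalize(model_id):
        # Ignore case and separators only to suggest the ID that was likely meant.
        return "".join(ch for ch in model_id.lower() if ch.isalnum())

    by_normalized = {normalize(c): c for c in configured}
    result = {}
    for model_id in reported:
        if model_id in configured:
            result[model_id] = "ok"
        elif normalize(model_id) in by_normalized:
            result[model_id] = f"near miss: {by_normalized[normalize(model_id)]}"
        else:
            result[model_id] = "missing"
    return result


if __name__ == "__main__":
    # Hand-collected example values, not data read from Instana.
    for model_id, status in check_model_ids(["gpt-4", "GPT-4", "gpt4"], {"gpt-4"}).items():
        print(f"{model_id}: {status}")
```

A "near miss" result points to an ID that differs only in case or punctuation, which is the situation described in the model ID format check.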
Traces and metrics are not displayed
Issue: The traces or metrics are not displayed in the GenAI observability dashboard.
Solution:
- Check whether OTEL_RESOURCE_ATTRIBUTES="INSTANA_PLUGIN=genai" is set correctly.
  - This is the most common issue. Without this attribute, Instana won't recognize your data as generative AI telemetry.
  - Check your environment variables or configuration files.
  - Restart your application after adding this variable.
- Check the Instana endpoint configuration:
  - Check whether TRACELOOP_BASE_URL points to the correct Instana agent or backend.
  - For agent mode, make sure that the agent is running and accessible.
  - For agentless mode, verify that the backend endpoint URL is correct.
- Verify authentication:
  - Check whether x-instana-key in TRACELOOP_HEADERS is correct.
  - Make sure that the key has permissions for your environment.
- Review application logs:
  - Look for OpenTelemetry connection errors.
  - Check for authentication failures.
  - Verify that there are no network connectivity issues.
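The environment variable checks in this section can be run as a small preflight script inside the application's environment. The following is a minimal sketch: the function check_genai_env is hypothetical, and it only confirms that the values are present and well formed. It cannot confirm that the key itself is valid or has the required permissions.

```python
import os


def check_genai_env(env):
    """Return a list of problems found in the given environment mapping."""
    problems = []

    # OTEL_RESOURCE_ATTRIBUTES is a comma-separated list of key=value pairs.
    attrs = env.get("OTEL_RESOURCE_ATTRIBUTES", "")
    pairs = [p.strip() for p in attrs.split(",") if p.strip()]
    if "INSTANA_PLUGIN=genai" not in pairs:
        problems.append("OTEL_RESOURCE_ATTRIBUTES does not contain INSTANA_PLUGIN=genai")

    if not env.get("TRACELOOP_BASE_URL"):
        problems.append("TRACELOOP_BASE_URL is not set")

    # TRACELOOP_HEADERS is a comma-separated list of name=value pairs.
    headers = env.get("TRACELOOP_HEADERS", "")
    names = [h.split("=", 1)[0].strip() for h in headers.split(",") if h.strip()]
    if "x-instana-key" not in names:
        problems.append("TRACELOOP_HEADERS does not contain x-instana-key")

    return problems


if __name__ == "__main__":
    for problem in check_genai_env(os.environ) or ["No problems found"]:
        print(problem)
```

Run the script with the same environment that the application uses (for example, inside the same container) so that it sees the same variables.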
Conflict with Instana Python sensor
Issue: When both Instana Python sensor and Traceloop are used together, you might encounter the following error.
AttributeError: 'SimpleSpanProcessor' object has no attribute 'record_span'
Possible cause: The Instana Python sensor and Traceloop use different instrumentation approaches that conflict with each other. Running both items simultaneously causes instrumentation conflicts.
For Generative AI applications, use Traceloop only and do not enable the Instana Python sensor.
To fix this issue, try the following steps:
- Remove or comment out the Instana import.
    # import instana  # Remove this line
- Keep only the Traceloop initialization.
    from traceloop.sdk import Traceloop
    Traceloop.init(disable_batch=True)
- Restart your application.
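To confirm that the Instana Python sensor is no longer loaded, you can inspect the modules that the running interpreter has imported. The following is a minimal sketch: the helper instana_sensor_loaded is hypothetical, and it assumes that the sensor is imported under the package name instana, as in the import statement shown in this section.

```python
import sys


def instana_sensor_loaded(modules=None):
    """Return True if the `instana` package appears among the loaded modules."""
    modules = sys.modules if modules is None else modules
    return any(name == "instana" or name.startswith("instana.") for name in modules)


if __name__ == "__main__":
    if instana_sensor_loaded():
        print("The instana package is loaded; remove it before initializing Traceloop.")
    else:
        print("The instana package is not loaded.")
```

Calling this check immediately before Traceloop.init() in your own startup code is one way to catch an import that is pulled in indirectly by another module.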
Python version is not compatible
Issue: When you install IBM watsonx packages (ibm-watsonx-ai or ibm-watson-machine-learning) with Python 3.13, you may encounter a compilation error with pandas:
error: too few arguments to function '_PyLong_AsByteArray'
Possible reason: Python 3.13 introduced breaking changes in the C API that are not yet compatible with pandas and related dependencies used by IBM watsonx packages. Some of these packages have an internal dependency on pandas, and this pandas version requires Python 3.11 or 3.12 to work correctly.
To troubleshoot this issue, try the following steps:
- Install Python 3.11 or 3.12.
  - For Ubuntu/Debian:
      sudo apt update
      sudo apt install software-properties-common -y
      sudo add-apt-repository ppa:deadsnakes/ppa -y
      sudo apt update
      sudo apt install python3.11 python3.11-venv python3.11-dev -y
  - For Fedora/RHEL/CentOS (using dnf):
      sudo dnf install python3.11 python3.11-devel -y
  - For older systems (using yum):
      sudo yum install epel-release -y
      sudo yum install https://repo.ius.io/ius-release-el7.rpm -y
      sudo yum install python311 python311-devel -y
- Create a virtual environment. After you install Python 3.11 or 3.12, create and activate a virtual environment, and then install the packages.
    # Create virtual environment with Python 3.11
    python3.11 -m venv venv
    # Activate the virtual environment
    source venv/bin/activate
    # Upgrade pip
    pip install --upgrade pip
    # Install IBM Watson packages
    pip install ibm-watsonx-ai ibm-watson-machine-learning langchain-ibm traceloop-sdk
- Verify the Python version. To confirm that you're using the correct Python version:
    python --version
  The Python version must be 3.11.x or 3.12.x.
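The version requirement can also be enforced from inside the application so that an unsupported interpreter fails early instead of during package compilation. The following is a minimal sketch: the names SUPPORTED_MINORS and python_version_supported are hypothetical, and the supported set reflects only the versions named in this section.

```python
import sys

# Versions named in this section as working with the IBM watsonx packages.
SUPPORTED_MINORS = {(3, 11), (3, 12)}


def python_version_supported(version_info=None):
    """Return True if the (major, minor) version is in the supported set."""
    version_info = sys.version_info if version_info is None else version_info
    return (version_info[0], version_info[1]) in SUPPORTED_MINORS


if __name__ == "__main__":
    current = f"{sys.version_info[0]}.{sys.version_info[1]}"
    if python_version_supported():
        print(f"Python {current} is supported.")
    else:
        print(f"Python {current} is not supported; use Python 3.11 or 3.12.")
```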
OTel Data Collector is still in use
The OTel Data Collector for generative AI (ODCG) is deprecated. ODCG is no longer required. To remove ODCG, complete the following steps:
- Update your application configuration.
  The key change is adding the INSTANA_PLUGIN=genai resource attribute and removing the separate metrics endpoint.
  - For agent mode (sending data through Instana agent)
    The old configuration is shown in the following example.
      export TRACELOOP_BASE_URL=<instana-agent-host>:4317
      export TRACELOOP_HEADERS="x-instana-key=<agent-key>,x-instana-host=<host>"
      export TRACELOOP_METRICS_ENDPOINT=<odcg-host>:8000
      export TRACELOOP_METRICS_ENABLED=true
      export TRACELOOP_LOGGING_ENABLED=true
      export OTEL_EXPORTER_OTLP_INSECURE=true
    The new configuration is shown in the following example.
      export OTEL_RESOURCE_ATTRIBUTES="INSTANA_PLUGIN=genai"
      export TRACELOOP_BASE_URL=<instana-agent-host>:4317
      export TRACELOOP_HEADERS="x-instana-key=<agent-key>,x-instana-host=<host>"
      export TRACELOOP_METRICS_ENABLED=true
      export TRACELOOP_LOGGING_ENABLED=true
      export OTEL_EXPORTER_OTLP_INSECURE=true
    The following items are changed.
    - OTEL_RESOURCE_ATTRIBUTES="INSTANA_PLUGIN=genai" is added so that your data is processed as generative AI telemetry.
    - TRACELOOP_METRICS_ENDPOINT is removed so that metrics flow through the same endpoint as traces.
  - For agentless mode (sending data directly to Instana backend)
    The old configuration is shown in the following example.
      export TRACELOOP_BASE_URL=<instana-otlp-endpoint>:4317
      export TRACELOOP_HEADERS="x-instana-key=<agent-key>,x-instana-host=<host>"
      export TRACELOOP_METRICS_ENDPOINT=<odcg-host>:8000
      export TRACELOOP_METRICS_ENABLED=true
      export TRACELOOP_LOGGING_ENABLED=true
      export OTEL_EXPORTER_OTLP_INSECURE=false
    The new configuration is shown in the following example.
      export OTEL_RESOURCE_ATTRIBUTES="INSTANA_PLUGIN=genai"
      export TRACELOOP_BASE_URL=<instana-otlp-endpoint>:4317
      export TRACELOOP_HEADERS="x-instana-key=<agent-key>,x-instana-host=<host>"
      export TRACELOOP_METRICS_ENABLED=true
      export TRACELOOP_LOGGING_ENABLED=true
      export OTEL_EXPORTER_OTLP_INSECURE=false
    The following items are changed.
    - OTEL_RESOURCE_ATTRIBUTES="INSTANA_PLUGIN=genai" is added so that your data is processed as generative AI telemetry.
    - TRACELOOP_METRICS_ENDPOINT is removed so that metrics flow through the same endpoint as traces.
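The two configuration changes are the same in both modes, so they can be expressed as one small transformation over a set of environment variables. The following is a minimal sketch: the function migrate_odcg_env is hypothetical, and it assumes that any existing OTEL_RESOURCE_ATTRIBUTES value should be kept and extended rather than overwritten.

```python
def migrate_odcg_env(old_env):
    """Return a copy of old_env without the ODCG metrics endpoint and with
    the generative AI resource attribute added. The input is not modified."""
    new_env = dict(old_env)

    # Metrics now flow through the same endpoint as traces.
    new_env.pop("TRACELOOP_METRICS_ENDPOINT", None)

    # Keep existing resource attributes and append the marker only if absent.
    marker = "INSTANA_PLUGIN=genai"
    existing = new_env.get("OTEL_RESOURCE_ATTRIBUTES", "")
    pairs = [p.strip() for p in existing.split(",") if p.strip()]
    if marker not in pairs:
        pairs.append(marker)
    new_env["OTEL_RESOURCE_ATTRIBUTES"] = ",".join(pairs)
    return new_env


if __name__ == "__main__":
    # Placeholder values that mirror the old agent mode example.
    old = {
        "TRACELOOP_BASE_URL": "<instana-agent-host>:4317",
        "TRACELOOP_METRICS_ENDPOINT": "<odcg-host>:8000",
        "OTEL_EXPORTER_OTLP_INSECURE": "true",
    }
    for name, value in migrate_odcg_env(old).items():
        print(f'export {name}="{value}"')
```

The transformation is idempotent, so applying it to an already migrated configuration leaves that configuration unchanged.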
- Migrate your pricing configuration. Model pricing is managed through the Instana UI instead of a configuration file.
  - Locate your current pricing configuration.
    If the OTel Data Collector is used, its prices.properties file is similar to the following example.
      openai.gpt-4.input=0.03
      openai.gpt-4.output=0.06
      openai.gpt-3.5-turbo.input=0.0015
      openai.gpt-3.5-turbo.output=0.002
      anthropic.claude-2.input=0.008
      anthropic.claude-2.output=0.024
  - Configure pricing on the Instana UI.
    - Log on to the Instana UI.
    - Navigate to the GenAI observability dashboard.
    - Click the Configuration tab. A predefined list of common LLM models with default pricing is displayed.
    - Update pricing for an existing model:
      - Find the model in the list.
      - Click the model name, and then click Edit.
      - Update the input and output token prices.
      - Click Save.
    - Add a new model:
      - Click Add model pricing.
      - Enter the provider (for example, "openai", "anthropic").
      - Enter the model ID (for example, "gpt-4", "claude-2").
      - Enter the platform to set platform-specific pricing (for example, "bedrock", "langchain").
      - Enter input and output token prices.
      - Click Add.
    Figure 1. LLM-pricing-configuration
    Benefits of dashboard-based pricing:
    - Changes take effect immediately, with no redeployment needed
    - Easy to update as pricing changes
    - Centralized management across all your generative AI applications
    - Audit trail of pricing changes
    Note: Cost metrics appear in your dashboards only after you configure pricing. Other metrics (latency, token counts, error rates) are visible regardless of pricing configuration.
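When a prices.properties file contains many entries, it can help to list them per provider and model before you enter them in the UI. The following is a minimal sketch: the function parse_prices is hypothetical, and the key layout provider.model.input or provider.model.output is inferred from the example in this section, so adjust it if your file differs.

```python
def parse_prices(text):
    """Parse `provider.model.input|output=price` lines.

    Returns {(provider, model): {"input": float, "output": float}}.
    """
    prices = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # Skip blank lines and comments.
        key, _, value = line.partition("=")
        # Model IDs can contain dots (for example, gpt-3.5-turbo), so split the
        # provider off the front and the direction off the back.
        provider, _, rest = key.strip().partition(".")
        model, _, direction = rest.rpartition(".")
        if direction not in ("input", "output") or not model:
            raise ValueError(f"unrecognized key: {key!r}")
        prices.setdefault((provider, model), {})[direction] = float(value)
    return prices


if __name__ == "__main__":
    sample = "openai.gpt-4.input=0.03\nopenai.gpt-4.output=0.06\n"
    for (provider, model), p in sorted(parse_prices(sample).items()):
        print(f"{provider}  {model}  input={p.get('input')}  output={p.get('output')}")
```

The printed rows correspond to the provider, model ID, and token price fields that you enter when you add or edit model pricing.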
- After updating the configuration, restart your generative AI application to apply the changes.
  - For Kubernetes deployments:
      kubectl rollout restart deployment/<your-app-deployment> -n <your-namespace>
  - For Red Hat OpenShift deployments:
      oc rollout restart deployment/<your-app-deployment> -n <your-namespace>
  - For standalone applications: Stop and restart the application process.
- Verify that data is flowing correctly. After restarting your application, verify that Instana is receiving data:
  - Check traces:
    - Go to the GenAI observability dashboard in Instana.
    - Verify that new traces are appearing for your application.
    - Verify that traces show LLM calls, token counts, and latency information.
  - Check metrics:
    - In the GenAI observability dashboard, check the metrics view.
    - Check whether metrics appear for token usage (input and output), latency, and cost (if pricing is configured).
  - Check logs:
    - If logging is enabled, verify that logs are ingested.
    - Look for any error messages related to OpenTelemetry or Traceloop.
  - Review application logs:
    - Check your application logs for any OpenTelemetry errors.
    - Look for successful connection messages to the Instana endpoint.
- Optional: Remove the OTel Data Collector.
  After you verify that data is flowing correctly with the new configuration, you can safely remove the OTel Data Collector deployment.
If you encounter issues when you try to remove the OTel Data Collector for generative AI (ODCG), try the following steps:
Review your configuration:
- Ensure that all environment variables are set correctly
- Verify that the Instana endpoint is accessible
- Check that pricing is configured in the dashboard
Contact support:
- Provide your application logs
- Include your configuration (with sensitive data redacted)
- Describe what you've tried and the results
Instana is using the deprecated metrics endpoint
Issue: The OTel Data Collector for generative AI (ODCG) is deprecated. The application is still trying to connect to the OTel Data Collector.
Solution:
- Verify that you removed TRACELOOP_METRICS_ENDPOINT from your configuration.
- Check for environment variables set at different levels:
  - Container or pod level
  - Deployment/StatefulSet level
  - ConfigMap or Secret references
  - System-wide environment variables
- Restart your application after removing the variable.
- Check application logs to confirm that the application is not trying to connect to port 8000.
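Because the variable can be set in several places, searching your deployment files for leftover references can save time. The following is a minimal sketch: the function find_variable is hypothetical, and it only scans text files under a local directory (for example, a checkout of your manifests). It does not inspect a running cluster or the live process environment.

```python
from pathlib import Path


def find_variable(root, name="TRACELOOP_METRICS_ENDPOINT"):
    """Return sorted (path, line_number) pairs where `name` occurs under root."""
    hits = []
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # Skip binary or unreadable files.
        for number, line in enumerate(text.splitlines(), start=1):
            if name in line:
                hits.append((str(path), number))
    return sorted(hits)


if __name__ == "__main__":
    # Scans the current directory; pass another path to scan a different tree.
    for path, number in find_variable("."):
        print(f"{path}:{number}")
```

An empty result for your manifests and scripts suggests that any remaining reference comes from the runtime environment, such as a pod-level or system-wide variable.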