AI gateway

Instana supports connectivity to watsonx.ai or other external LLM services with LLM gateways, enabling various generative AI (gen AI) use cases. By configuring LLM gateways, you gain control over the connections and LLMs powering Instana's gen AI capabilities, ensuring alignment with internal policies, performance requirements, and data governance standards. You can create, monitor, configure, and maintain all LLM gateways within your environment.

Deployment options
- SaaS
- Self-hosted
Supported LLM services
Supported LLMs
Generative AI capabilities
Managing user permissions
Connecting Instana to an LLM gateway
Viewing LLM gateways

Deployment options

SaaS

On SaaS environments, Instana provides default LLM gateways that connect automatically to watsonx.ai to leverage LLMs without additional configuration. Additionally, you can create an LLM gateway to connect to your own watsonx.ai runtime.

Self-hosted

On self-hosted environments (Standard or Custom edition), you can connect to your own watsonx.ai runtime or a vLLM-based inferencing service.

Enabling feature flag

To access AI gateway features on self-hosted Instana environments, you must enable the feature flag for the AI gateway.

To configure the feature flag for AI gateway on Self-Hosted Standard Edition, see Enabling optional features for Standard Edition.

To configure the feature flag for AI gateway on Self-Hosted Custom Edition (Kubernetes or Red Hat OpenShift Container Platform), see Enabling optional features in Custom Edition.

Supported LLM services

You can create LLM gateways to the following LLM services:

watsonx.ai (IBM Cloud)
watsonx.ai (self-hosted on OpenShift with Cloud Pak for Data; Instana version 1.0.313 and later)
vLLM-compatible inferencing services, including the Red Hat AI Inferencing Server (RHAIIS) (Public preview)

Supported LLMs

Generative AI capabilities are supported by the following large language models (LLMs):

Note: For SaaS deployments, Instana connects to the IBM Cloud watsonx.ai regional service in Dallas, US. If you have concerns about using AI capabilities, you can control access through user permissions. For more information, see Gen AI capabilities.

Generative AI capabilities

The following are the supported generative AI capabilities in Instana that can be configured for use with an LLM gateway:

AI assistant
Incident summarization
Incident investigation
Kubernetes AI assistant
Manual action generation
Script generation
SLO assistant
GenAI evaluation

Managing user permissions

The following user permissions control access to the AI gateway and to gen AI capabilities in the Instana UI:

Access AI gateway: Enables read-only access to the AI gateway UI.
Create, configure, and delete LLM gateways: Enables full access to the AI gateway UI.
Access all Gen AI capabilities: Enables access to AI-powered features in the Instana UI.

For more information, see Managing user access.

Connecting Instana to an LLM gateway

To set up and configure an LLM gateway, complete the following steps.

Set up a connection.
1. Click AI gateway in the navigation pane. The LLM gateways pane is displayed.
2. Click Create LLM gateway. The Create an LLM gateway wizard is displayed.
3. Select Set up a connection.
4. In Connection settings, select the LLM service that the gateway connects to:
  - IBM watsonx: If you configure watsonx, enter the following details:
    - watsonx API key: The API key for authenticating access to the watsonx service.
    - watsonx project: A name for the project.
    - watsonx URL: The URL for the watsonx service.
  - vLLM: If you configure an external service that is based on vLLM (virtual large language model), enter the following details:
    - Endpoint URL: The URL where the vLLM service is hosted.
    - Endpoint API key (Optional): The authentication key for accessing the endpoint securely.
5. To verify the configuration, click Test Connection.
  - For IBM watsonx: If the connection fails, verify that the API key, project ID, and URL are correct and valid. A successful test confirms that Instana can communicate with the watsonx service.
  - For vLLM: If the connection fails, verify that the endpoint URL is accessible and the API key is correct. A successful test confirms that Instana can communicate with the vLLM service.
6. Click Next.
Note: The Kubernetes AI assistant capability is not supported for the vLLM service.
Select capability and AI model.
1. In the Select capability and AI model section, select one of the following capabilities:
  - AI assistant
  - Manual action generation
  - Incident summarization
  - Incident investigation
  - Script generation
  - SLO assistant
  - GenAI evaluation
2. Select an AI model:
  - Granite: General-purpose model
  - Mistral (Medium): Balanced performance
  - Mistral (Large): High-capacity for complex tasks
  - openai/gpt-oss-120b: For tasks that require analytics and insights. The SLO assistant requires this model.
3. Click Next to continue.
Configure model.
1. Set the following parameters to fine-tune model behavior:
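  - Token limit (in thousands): Maximum number of tokens per request, in thousands (for example, 100 for 100,000 tokens).
  - Max latency: Maximum acceptable response time (for example, 1 second).
  - Repetition penalty: Discourages repeated phrases. A value of 1 applies no penalty; higher values penalize repetition more strongly (for example, 1).
  - Temperature: Controls the randomness of the output. Lower values make responses more deterministic (for example, a value between 0.2 and 1).
  - top_k: Number of highest-probability tokens that are considered at each generation step (for example, 50).
  - top_p: Cumulative probability threshold. Sampling is restricted to the smallest set of tokens whose combined probability reaches the threshold (for example, 0.5).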
2. Click Next to proceed or Back to revise.
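
To make the interaction of the sampling parameters concrete, the following Python sketch applies repetition penalty, temperature, top_k, and top_p to a toy set of token scores. It is an illustration of the general technique under stated assumptions, not the implementation used by watsonx.ai, vLLM, or Instana: the order of operations and the penalty formula (divide positive scores, multiply negative ones) follow one common convention and may differ in a given service. The function names and the example scores are hypothetical.

```python
import math
import random
from typing import Dict, List, Optional, Sequence, Tuple


def candidate_tokens(
    logits: Dict[str, float],
    generated: Sequence[str],
    temperature: float = 1.0,
    top_k: int = 50,
    top_p: float = 1.0,
    repetition_penalty: float = 1.0,
) -> List[Tuple[str, float]]:
    """Return the (token, probability) pairs that survive the filters, most likely first."""
    if temperature <= 0:
        raise ValueError("temperature must be greater than 0")
    if top_k < 1:
        raise ValueError("top_k must be at least 1")

    scaled = {}
    for token, logit in logits.items():
        if token in generated:
            # Repetition penalty: lower the score of tokens that were already produced.
            logit = logit / repetition_penalty if logit > 0 else logit * repetition_penalty
        # Temperature: values below 1 sharpen the distribution, values above 1 flatten it.
        scaled[token] = logit / temperature

    # Softmax, shifted by the maximum for numerical stability.
    peak = max(scaled.values())
    weights = {token: math.exp(value - peak) for token, value in scaled.items()}
    total = sum(weights.values())
    ranked = sorted(
        ((token, weight / total) for token, weight in weights.items()),
        key=lambda pair: pair[1],
        reverse=True,
    )

    # top_k: keep only the k most likely tokens.
    ranked = ranked[:top_k]

    # top_p: keep the smallest prefix whose cumulative probability reaches the threshold.
    kept: List[Tuple[str, float]] = []
    cumulative = 0.0
    for token, probability in ranked:
        kept.append((token, probability))
        cumulative += probability
        if cumulative >= top_p:
            break
    return kept


def sample_next_token(
    logits: Dict[str, float],
    generated: Sequence[str],
    rng: Optional[random.Random] = None,
    **settings: float,
) -> str:
    """Draw one token from the filtered candidates, weighted by probability."""
    rng = rng or random.Random()
    kept = candidate_tokens(logits, generated, **settings)
    tokens = [token for token, _ in kept]
    weights = [probability for _, probability in kept]
    return rng.choices(tokens, weights=weights, k=1)[0]
```

With the toy scores `{"the": 2.0, "a": 1.0, "cat": 0.5, "dog": -1.0}`, setting top_k to 2 leaves only "the" and "a" as candidates, and setting top_p to 0.5 leaves only "the", because that token alone already carries more than half of the probability mass.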
Enter gateway details.
1. Name: Enter a unique name for the gateway.
2. Description: Enter a description for the gateway.
Click Create to finalize the setup.
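
The wizard choices carry two constraints stated in this procedure: the Kubernetes AI assistant capability is not supported for the vLLM service, and the SLO assistant requires the openai/gpt-oss-120b model. If you plan gateway definitions outside the UI (for example, in a review checklist), a small check like the following Python sketch can catch conflicts early. The `GatewayDraft` type, the `find_conflicts` function, and the service labels `"watsonx"` and `"vllm"` are hypothetical names for this illustration and are not part of any Instana API.

```python
from dataclasses import dataclass
from typing import Iterable, List

SLO_ASSISTANT_MODEL = "openai/gpt-oss-120b"


@dataclass(frozen=True)
class GatewayDraft:
    """A planned LLM gateway; field names are assumptions made for this sketch."""
    name: str
    service: str      # "watsonx" or "vllm" (labels chosen for this sketch)
    capability: str   # for example, "Incident summarization"
    model: str        # for example, "Granite"


def find_conflicts(draft: GatewayDraft, existing_names: Iterable[str] = ()) -> List[str]:
    """Return human-readable problems with the draft; an empty list means none were found."""
    conflicts: List[str] = []
    if not draft.name.strip():
        conflicts.append("Enter a name for the gateway.")
    elif draft.name in set(existing_names):
        conflicts.append(f"The gateway name '{draft.name}' is already in use.")
    if draft.service == "vllm" and draft.capability == "Kubernetes AI assistant":
        conflicts.append("The Kubernetes AI assistant capability is not supported for the vLLM service.")
    if draft.capability == "SLO assistant" and draft.model != SLO_ASSISTANT_MODEL:
        conflicts.append(f"The SLO assistant requires the {SLO_ASSISTANT_MODEL} model.")
    return conflicts
```

The check covers only the constraints that this page states; the wizard itself remains the authority on which combinations are accepted.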

Viewing LLM gateways

To see the LLM gateways that you've configured, click AI gateway in the navigation pane. The LLM gateways table is displayed and shows all configured gateways and their statuses. You can enable or disable them and view configuration details.


Parameter	Description
Name	The name of the gateway configuration. Each gateway serves a specific gen AI use case.
Capability	The gen AI function that the gateway supports (for example, script generation, manual action generation, or incident summarization).
Type	Indicates whether the gateway is a default system configuration or user-defined.
Service used	The LLM service powering the gateway (for example, IBM watsonx).
AI model	The LLM that is used for inference (for example, Granite or Mistral).
Status	Shows whether the gateway is active (Enabled) or inactive (Disabled).
Endpoint URL	The URL endpoint through which the gateway communicates with the AI service.

Testing existing gateways

After you create an LLM gateway, you can test the connection to verify proper operation.

To test an existing gateway, complete the following steps:

1. From the navigation menu in the Instana UI, select AI gateway.
2. In the LLM gateways table, click Test for the gateway that you want to test. The test result shows whether the connection is working correctly.