AI gateway
Instana connects to watsonx.ai and other external LLM services through LLM gateways, which enable its generative AI (gen AI) use cases. By configuring LLM gateways, you control the connections and LLMs that power Instana's gen AI capabilities, so that they align with your internal policies, performance requirements, and data governance standards. You can create, monitor, configure, and maintain all LLM gateways within your environment.
Deployment options
SaaS
On SaaS environments, Instana provides default LLM gateways that connect automatically to watsonx.ai, so you can use LLMs without additional configuration. You can also create an LLM gateway to connect to your own watsonx.ai runtime.
Self-hosted
On self-hosted environments (Standard or Custom edition), you can connect to your own watsonx.ai runtime or a vLLM-based inferencing service.
Enabling feature flag
To access AI gateway features on self-hosted Instana environments, you must enable the feature flag for the AI gateway.
To configure the feature flag for AI gateway on Self-Hosted Standard Edition, see Enabling optional features for Standard Edition.
To configure the feature flag for AI gateway on Self-Hosted Custom Edition (Kubernetes or Red Hat OpenShift Container Platform), see Enabling optional features in Custom Edition.
Supported LLM services
You can create LLM gateways to the following LLM services:
- watsonx.ai (IBM Cloud)
- watsonx.ai (self-hosted on OpenShift with Cloud Pak for Data; Instana version 1.0.313 and later)
- vLLM-compatible inferencing services, including the Red Hat AI Inferencing Server (RHAIIS) (Public preview)
Supported LLMs
Generative AI capabilities are supported by the following Large Language Models (LLMs)
Generative AI capabilities
You can configure the following generative AI capabilities in Instana for use with an LLM gateway:
- AI assistant
- Incident summarization
- Incident investigation
- Kubernetes AI assistant
- Manual action generation
- Script generation
- SLO assistant
- GenAI evaluation
Managing user permissions
The following user permissions control access to the AI gateway and to gen AI capabilities in the Instana UI:
- Access AI gateway: Enables read-only access to the AI gateway UI
- Create, configure, and delete LLM gateways: Enables full access to the AI gateway UI
- Access all Gen AI capabilities: Enables access to AI-powered features in Instana UI
For more information, see Managing user access.
Connecting Instana to an LLM gateway
To set up and configure an LLM gateway, complete the following steps.
- Set up a connection.
  - Click AI gateway in the navigation pane. The LLM gateways pane is displayed.
  - Click Create LLM gateway. The Create an LLM gateway wizard is displayed.
  - Select Set up a connection.
  - In Connection settings, select one of the following services for the LLM gateway to connect to:
    - IBM watsonx: If you configure watsonx, enter the following details:
      - watsonx API key: The API key for authenticating access to the watsonx service.
      - watsonx project: A name for the project.
      - watsonx URL: The URL for the watsonx service.
    - vLLM: If you configure an external vLLM-based (virtual large language model) service, enter the following details:
      - Endpoint URL: The URL where the vLLM service is hosted.
      - Endpoint API key (Optional): The authentication key for accessing the endpoint securely.
  - To verify the configuration, click Test Connection.
    - For IBM watsonx: A successful test confirms that Instana can communicate with the watsonx service. If the connection fails, verify that the API key, project ID, and URL are correct and valid.
    - For vLLM: A successful test confirms that Instana can communicate with the vLLM service. If the connection fails, verify that the endpoint URL is accessible and the API key is correct.
  - Click Next.
  Note: The Kubernetes AI assistant capability is not supported for the vLLM service.
- Select capability and AI model.
  - In the Select capability and AI model section, select one of the following capabilities:
    - AI assistant
    - Manual action generation
    - Incident summarization
    - Incident investigation
    - Script generation
    - SLO assistant
    - GenAI evaluation
  - Select an AI model:
    - Granite: General-purpose model
    - Mistral (Medium): Balanced performance
    - Mistral (Large): High capacity for complex tasks
    - openai/gpt-oss-120b: For tasks that require analytics and insights. The SLO assistant capability requires this model.
  - Click Next to continue.
- Configure model.
  - Set the following parameters to tune model behavior:
    - Token limit (in thousands): Maximum tokens per request (for example, 100).
    - Max latency: Maximum response time (for example, 1 second).
    - Repetition penalty: Discourages repeated phrases (for example, 1).
    - Temperature: Controls randomness (for example, 0.2 to 1).
    - top_k: Number of highest-probability tokens to sample from at each step (for example, 50).
    - top_p: Cumulative probability threshold (for example, 0.5).
  - Click Next to proceed or Back to revise.
- Enter gateway details.
  - Name: Enter a unique name for the gateway.
  - Description: Enter a description for the gateway.
- Click Create to finalize the setup.
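
Before you open the wizard, it can help to write down the intended settings and check them against the constraints stated in the steps above. The following Python sketch is one way to do that. It is not an Instana API: the `GatewayDraft` class, the `validate` function, the service labels `"watsonx"` and `"vllm"`, and the numeric ranges are all assumptions made for illustration. Only two rules come from this page: the Kubernetes AI assistant capability is not supported for vLLM, and the SLO assistant capability requires the openai/gpt-oss-120b model.

```python
from dataclasses import dataclass
from typing import List

# Model required by the SLO assistant capability, as stated in the wizard steps.
SLO_REQUIRED_MODEL = "openai/gpt-oss-120b"


@dataclass
class GatewayDraft:
    """Hypothetical record of the wizard fields (not an Instana data model)."""
    name: str
    service: str                     # assumed labels: "watsonx" or "vllm"
    capability: str                  # for example, "Script generation"
    model: str                       # for example, "Granite"
    token_limit_k: int = 100         # Token limit, in thousands
    max_latency_s: float = 1.0       # Max latency, in seconds
    repetition_penalty: float = 1.0
    temperature: float = 0.2
    top_k: int = 50
    top_p: float = 0.5
    description: str = ""


def validate(draft: GatewayDraft) -> List[str]:
    """Return a list of problems; an empty list means the draft looks consistent."""
    problems: List[str] = []
    if not draft.name.strip():
        problems.append("name must not be empty")
    if draft.service not in ("watsonx", "vllm"):
        problems.append("service must be 'watsonx' or 'vllm'")
    # Documented rule: Kubernetes AI assistant is not supported for vLLM.
    if draft.service == "vllm" and draft.capability == "Kubernetes AI assistant":
        problems.append("Kubernetes AI assistant is not supported for vLLM")
    # Documented rule: SLO assistant requires the openai/gpt-oss-120b model.
    if draft.capability == "SLO assistant" and draft.model != SLO_REQUIRED_MODEL:
        problems.append("SLO assistant requires " + SLO_REQUIRED_MODEL)
    # Assumed ranges, based on common sampling conventions rather than
    # documented Instana limits.
    if draft.token_limit_k <= 0:
        problems.append("token limit must be positive")
    if draft.max_latency_s <= 0:
        problems.append("max latency must be positive")
    if draft.temperature < 0:
        problems.append("temperature must not be negative")
    if draft.top_k < 1:
        problems.append("top_k must be at least 1")
    if not 0 < draft.top_p <= 1:
        problems.append("top_p must be in the range (0, 1]")
    return problems
```

For example, `validate(GatewayDraft(name="slo", service="watsonx", capability="SLO assistant", model="Granite"))` reports the model mismatch, which you can correct before you enter the values in the wizard.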
Viewing LLM gateways
To see the LLM gateways that you've configured, click AI gateway in the navigation pane. The LLM gateways table is displayed and shows all configured gateways and their statuses. You can enable or disable them and view configuration details.
| Parameter | Description |
|---|---|
| Name | The name of the gateway configuration. Each gateway serves a specific gen AI use case. |
| Capability | The gen AI function that the gateway supports (for example, script generation, action generation, incident summarization). |
| Type | Indicates whether the gateway is a default system configuration or user-defined. |
| Service used | The LLM service powering the gateway (for example, IBM watsonx). |
| AI model | The LLM model used for inference (for example, Granite, Mistral). |
| Status | Shows whether the gateway is active (Enabled) or inactive (Disabled). |
| Endpoint URL | The URL endpoint through which the gateway communicates with the AI service. |
Testing existing gateways
After you create an LLM gateway, you can test the connection to verify proper operation.
To test an existing gateway, complete the following steps:
- From the navigation menu in the Instana UI, select AI gateway.
- In the LLM gateways table, click Test for the gateway you want to test. The test result shows whether the connection is working correctly.
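
If the test fails for a vLLM gateway, you can check from outside Instana whether the endpoint is reachable. The following Python sketch queries the model listing route of the OpenAI-compatible API that vLLM servers expose, sending the API key as a bearer token. The function names, the `http://localhost:8000` address, and the assumption that the gateway's Endpoint URL is the server base URL (with or without a trailing `/v1`) are illustrative and not taken from Instana. Run the check from a host with the same network access as your Instana backend, because reachability from a workstation does not prove reachability from Instana.

```python
import json
import urllib.error
import urllib.request
from typing import Optional


def build_models_request(endpoint_url: str, api_key: Optional[str] = None) -> urllib.request.Request:
    """Build a GET request for the OpenAI-compatible /v1/models route."""
    base = endpoint_url.rstrip("/")
    # Accept base URLs given with or without the /v1 suffix (assumption).
    if not base.endswith("/v1"):
        base += "/v1"
    headers = {"Accept": "application/json"}
    if api_key:
        headers["Authorization"] = "Bearer " + api_key
    return urllib.request.Request(base + "/models", headers=headers, method="GET")


def check_endpoint(endpoint_url: str, api_key: Optional[str] = None, timeout: float = 5.0) -> str:
    """Return a short, human-readable status for the endpoint."""
    request = build_models_request(endpoint_url, api_key)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            payload = json.load(response)
    except urllib.error.HTTPError as err:
        # The server answered, so the URL is reachable; the key or path may be wrong.
        return "reachable, but the server returned HTTP %d" % err.code
    except OSError as err:
        # Covers DNS failures, refused connections, and timeouts.
        return "not reachable: %s" % err
    except ValueError:
        return "reachable, but the response was not valid JSON"
    if not isinstance(payload, dict):
        return "reachable, but the response has an unexpected shape"
    models = [m.get("id") for m in payload.get("data", []) if isinstance(m, dict)]
    return "reachable; models served: %s" % models


if __name__ == "__main__":
    # Placeholder address; replace with your own endpoint URL and API key.
    print(check_endpoint("http://localhost:8000", api_key=None))
```

An HTTP 401 result usually points to the API key, while a "not reachable" result points to the URL, DNS, or network path.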