Monitoring OpenAI models
OpenAI provides transformer-based language models for natural language understanding and generation. This guide shows you how to use OpenLLMetry to instrument an application that uses OpenAI models and send the telemetry data to Instana.
Prerequisites
Make sure that the following prerequisites are met:
- Python 3.8 or later
- An OpenAI API key (you can get one from the OpenAI Platform)
- An Instana account
- Review Getting started for information about agent and agentless modes
Instrumenting your OpenAI application
Complete the following steps:

1. Install the required packages.

    ```shell
    pip install openai traceloop-sdk
    ```

2. Export your OpenAI API key.

    ```shell
    export OPENAI_API_KEY="your-openai-api-key"
    ```

3. Create your OpenAI application. Create a Python file named `openai_app.py` with the following code:

    ```python
    import os

    from openai import OpenAI
    from traceloop.sdk import Traceloop
    from traceloop.sdk.decorators import workflow

    # Initialize the OpenAI client
    client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

    # Initialize OpenLLMetry
    Traceloop.init(app_name="openai_chat_service", disable_batch=True)

    @workflow(name="openai_conversation")
    def ask_openai(question: str):
        """Send a question to OpenAI and get a response."""
        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": question}]
        )
        return response.choices[0].message.content

    # Example usage
    if __name__ == "__main__":
        questions = [
            "What is AIOps and how does it help with IT operations?",
            "Explain the benefits of observability in modern applications."
        ]
        for question in questions:
            print(f"\nQuestion: {question}")
            answer = ask_openai(question)
            print(f"Answer: {answer}\n")
            print("-" * 80)
    ```

4. Run your application.

    ```shell
    python3 openai_app.py
    ```

    The application sends the questions to OpenAI and displays the responses. OpenLLMetry automatically captures a trace for each API call and sends it to Instana.

5. View data on Instana.

    After you run your application, the following items are displayed on the Instana Gen AI observability dashboard:

    - Model used
    - Token usage (input and output tokens)
    - Response latency
    - Request and response content
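The model and token fields on the dashboard correspond to attributes that the OpenAI Python SDK reports on every chat completion (`response.model`, `response.usage.prompt_tokens`, `response.usage.completion_tokens`, `response.usage.total_tokens`). As a quick local cross-check, the following sketch (a hypothetical helper, not part of OpenLLMetry or the SDK) formats those same values:

```python
def summarize_usage(response):
    """Format the model name and token counts that a chat completion reports.

    These are the same values (model, input and output tokens) that appear
    on the Instana Gen AI observability dashboard.
    """
    usage = response.usage
    return (
        f"model={response.model} "
        f"input_tokens={usage.prompt_tokens} "
        f"output_tokens={usage.completion_tokens} "
        f"total_tokens={usage.total_tokens}"
    )
```

For example, `print(summarize_usage(response))` after a call to `client.chat.completions.create(...)` lets you compare the SDK-reported counts with what the dashboard shows.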
Using streaming responses
For real-time response streaming, use the streaming API:
```python
@workflow(name="openai_streaming")
def ask_openai_streaming(question: str):
    """Stream responses from OpenAI in real time."""
    stream = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": question}],
        stream=True
    )
    for chunk in stream:
        if chunk.choices[0].delta.content:
            print(chunk.choices[0].delta.content, end="", flush=True)
    print()  # Print a newline after the stream completes
```
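If you need the complete text after streaming finishes (for logging or further processing), you can accumulate the deltas instead of printing them. This is a hypothetical helper, assuming the same chunk shape shown above:

```python
def collect_stream(stream):
    """Join the text deltas from a streamed chat completion into one string.

    Chunks without content (for example, the final chunk that carries only
    the finish reason) are skipped.
    """
    parts = []
    for chunk in stream:
        delta = chunk.choices[0].delta.content
        if delta:
            parts.append(delta)
    return "".join(parts)
```

You would call it with the same `stream` object as in the streaming example, for instance `full_text = collect_stream(stream)`.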
Troubleshooting
For common issues such as traces not appearing or connection errors, see Troubleshooting.
Authentication errors
If you encounter authentication errors:
- Verify that your `OPENAI_API_KEY` environment variable is set correctly
- Check whether your API key is valid in the OpenAI Platform
- Make sure that your API key has not expired or been revoked
- Verify that your account has sufficient credits
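The first check can be done locally before any API call. The sketch below is a hypothetical helper (the `sk-` prefix check reflects the conventional format of OpenAI API keys, not a guarantee):

```python
import os

def check_api_key(key):
    """Return a short diagnostic for an API key value without calling the API."""
    if not key:
        return "missing"
    if not key.startswith("sk-"):
        return "suspicious: OpenAI API keys conventionally start with 'sk-'"
    return "ok"

if __name__ == "__main__":
    # Inspect the environment variable that the sample application reads
    print(check_api_key(os.getenv("OPENAI_API_KEY")))
```

A `missing` result means the `export OPENAI_API_KEY=...` step was skipped or ran in a different shell.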
Rate limiting errors
If you encounter rate limit errors:
- Check your OpenAI account's rate limits
- Add delays between requests if making multiple calls
- Consider upgrading your OpenAI plan for higher limits
- Implement exponential backoff for retries
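The last two suggestions can be combined in a small retry wrapper. The sketch below is a hypothetical helper, not part of the OpenAI SDK; in practice you would pass the SDK's `openai.RateLimitError` as the retryable exception type:

```python
import random
import time

def retry_with_backoff(fn, retryable=(Exception,), max_retries=5, base_delay=1.0):
    """Call fn(), retrying on retryable errors with exponential backoff and jitter."""
    for attempt in range(max_retries):
        try:
            return fn()
        except retryable:
            if attempt == max_retries - 1:
                raise  # Out of retries; surface the error to the caller
            # Delay doubles each attempt: base, 2*base, 4*base, ... plus jitter
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            time.sleep(delay)
```

For example: `retry_with_backoff(lambda: ask_openai("What is AIOps?"), retryable=(openai.RateLimitError,))`. The jitter spreads retries out so that several clients that were rate limited at the same moment do not all retry in lockstep.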
Model not found errors
If you encounter model not found errors:
- Verify that the model name is correct (for example, `gpt-4o-mini`, `gpt-4o`, or `gpt-3.5-turbo`)
- Check whether your API key has access to the specified model
- Make sure the model is available in your region
- Refer to OpenAI's model documentation for available models
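To see which models your key can actually access, you can query the models endpoint (`client.models.list()` in the OpenAI Python SDK). The helper name below is our own:

```python
def available_model_ids(client):
    """Return the sorted IDs of the models that the given OpenAI client can access."""
    return sorted(model.id for model in client.models.list())
```

For example, `print(available_model_ids(client))` with the client from the sample application, then check whether the model name you pass to `chat.completions.create` appears in the list.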
Next steps
- Explore other LLM providers supported by Instana
- Learn about cost calculation for your LLM usage
- Set up alerts for your OpenAI API usage
- Review OpenAI documentation for model capabilities