Tutorial: Getting the error rate for a service
You can use Instana's REST API to retrieve the error rate for a specific service. The error rate is a crucial metric for monitoring the health and reliability of applications and infrastructure. It's calculated as the number of error responses divided by the total number of requests over a specific time period, often expressed as a percentage.
Context
In the observability space, the error rate refers to the percentage of all requests to a service that result in errors. This metric is crucial to monitor the health and reliability of applications and infrastructure. The following list explains what the error rate can indicate and how it is used:
- Definition: The error rate is calculated as the number of error responses divided by the total number of requests over a specific time period, often expressed as a percentage. For example, if a service receives 1000 requests in an hour and 100 of those requests result in errors, the error rate is 10%.
- Types of errors: Errors can include client-side errors (4XX HTTP status codes), server-side errors (5XX HTTP status codes), timeouts, and application-specific errors.
- Importance: A high error rate might indicate issues, such as bugs in the code, resource limitations (like CPU or memory), network problems, or upstream service failures. Teams can quickly identify and address issues by monitoring this rate.
- Thresholds and alerts: Teams often set thresholds for acceptable error rates based on the criticality of the service and the user impact. If the error rate exceeds a threshold, an alert is triggered so that teams can investigate.
- Analysis and response: Observability tools provide detailed error diagnostics to help pinpoint the source of the problem, such as stack traces, logs, or transaction traces. This enables a more effective and targeted response to incidents.
- Continuous improvement: By analyzing trends and patterns in the error rate, organizations can proactively improve their codebase and infrastructure, leading to a more stable and reliable service.
Error rate is a fundamental metric in any observability or monitoring strategy, as it directly impacts user experience and service reliability.
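As a minimal sketch of the definition above, the following Python snippet computes an error rate from request counts and compares it with a threshold. The function names and the 5% threshold are illustrative assumptions, not Instana defaults.

```python
def error_rate(error_count: int, total_count: int) -> float:
    """Return the error rate as a fraction between 0 and 1."""
    if total_count == 0:
        return 0.0  # assumption: no traffic is treated as a zero error rate
    return error_count / total_count


def exceeds_threshold(rate: float, threshold: float = 0.05) -> bool:
    """Return True if the rate is above the (hypothetical) alert threshold."""
    return rate > threshold


# Example from the definition: 100 errors out of 1000 requests
rate = error_rate(100, 1000)
print(f"{rate:.0%}")            # 10%
print(exceeds_threshold(rate))  # True
```

Keeping the rate as a fraction and formatting it as a percentage only for display matches how the metric catalog later in this tutorial describes the value (between 0 and 1).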
The following details describe how to pull the error rate for a particular service running on your system and being monitored by Instana.
Prerequisites
Before you use the Instana REST API endpoints in this tutorial, see General prerequisites. There are no specific prerequisites for this tutorial.
API endpoints
In this tutorial, two different API endpoints from Application Monitoring are used.
Endpoint | Description | Documentation | Required Permissions |
---|---|---|---|
GET /api/application-monitoring/catalog/metrics | Retrieves a list of metric types that Instana monitors. From this list, you can select the metricId for the data that you want to retrieve for a specific service. | GET application catalog metrics | General Applications permission. |
POST /api/application-monitoring/metrics/services | Retrieves the specified metrics for a service. Each metric includes an aggregation, which identifies the statistical summary method to use. | GET service metrics | General Applications permission. |
The tutorial
There are two steps that are required to retrieve metric data, such as the error rate for a service in Instana:
- Pull the metric catalog to get a list of supported metrics. From here you can find the metricId for the metric data you want to get for a particular service.
- Get the data for a specified metric type (metricId) and aggregation. In this example we're looking for the average error rate for a service.
1. Getting the metric ID and aggregation type from the metric catalog
To list all types of metrics that are available, send a GET request to the `/api/application-monitoring/catalog/metrics` endpoint.
Here are the details for that request:
GET /api/application-monitoring/catalog/metrics
Host: {tenant}-{unit}.instana.io
Authorization: apiToken {api_token}
Accept: application/json
Sample curl request
A curl request to this endpoint requires no query parameters and no request payload.
curl -XGET "https://{tenant}-{unit}.instana.io/api/application-monitoring/catalog/metrics" \
  -H "Accept: application/json" \
  -H "authorization: apiToken {apiToken}"
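The same request can be sketched in Python. This is a minimal example that uses only the standard library (`urllib.request`) so that it runs without extra packages; the function names, the placeholder host, and the 30-second timeout are assumptions for illustration.

```python
import json
import urllib.request


def build_catalog_request(base_url: str, api_token: str) -> urllib.request.Request:
    """Build (but do not send) the GET request for the metric catalog."""
    url = f"{base_url.rstrip('/')}/api/application-monitoring/catalog/metrics"
    headers = {
        "Accept": "application/json",
        "Authorization": f"apiToken {api_token}",
    }
    return urllib.request.Request(url, headers=headers, method="GET")


def fetch_metric_catalog(base_url: str, api_token: str, timeout: float = 30.0) -> list:
    """Send the request and return the parsed JSON metric catalog."""
    request = build_catalog_request(base_url, api_token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Building the request does not touch the network.
# The host and token below are placeholders, not real values.
request = build_catalog_request("https://tenant-unit.instana.io", "my-api-token")
print(request.get_method(), request.full_url)
```

Splitting request construction from sending makes it easy to check the URL and headers before any call reaches your Instana instance.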
Sample response payload
The response is a metric catalog, which is a list of metric types that are supported. You can scroll through it to find the metricId and aggregation for the metric data that you want to pull for a service.
[
  {
    "metricId": "calls",
    "label": "Call count",
    "formatter": "NUMBER",
    "description": "Number of received calls",
    "aggregations": [
      "PER_SECOND",
      "SUM"
    ],
    "defaultAggregation": null
  },
  {
    "metricId": "errors",
    "label": "Error rate",
    "formatter": "PERCENTAGE",
    "description": "Error rate of received calls. A value between 0 and 1.",
    "aggregations": [
      "MEAN"
    ],
    "defaultAggregation": "MEAN"
  },
  // More metric types...
]
What data do I need?
Consider one of the metric types from the catalog that was returned in the preceding section. You must retrieve two pieces of information from the metric catalog entry:
- metricId - a unique identifier for the type of metric
- aggregation - one of the available statistical aggregations for the metric. A metric type can have one or more aggregations.
For the desired "error rate" metric type, you can see there's only one available aggregation: "MEAN".
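Instead of scrolling through the catalog by hand, the lookup can be sketched in a few lines of Python. The helper name `find_metric` and the fallback rule (use `defaultAggregation` if set, otherwise the first listed aggregation) are assumptions for illustration; the sample data is a trimmed copy of the catalog response shown earlier.

```python
def find_metric(catalog: list, metric_id: str):
    """Return the catalog entry whose metricId matches, or None if absent."""
    for entry in catalog:
        if entry.get("metricId") == metric_id:
            return entry
    return None


# Trimmed copy of the sample catalog response
sample_catalog = [
    {"metricId": "calls", "label": "Call count",
     "aggregations": ["PER_SECOND", "SUM"], "defaultAggregation": None},
    {"metricId": "errors", "label": "Error rate",
     "aggregations": ["MEAN"], "defaultAggregation": "MEAN"},
]

entry = find_metric(sample_catalog, "errors")
# Prefer the default aggregation; fall back to the first listed one
aggregation = entry["defaultAggregation"] or entry["aggregations"][0]
print(entry["metricId"], aggregation)  # errors MEAN
```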
2. Getting the metric data for a service, such as error rate
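To get the metric data for a service, send a POST request to the /api/application-monitoring/metrics/services endpoint. Here are the details for that request: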
POST /api/application-monitoring/metrics/services
Host: {tenant}-{unit}.instana.io
Authorization: apiToken {api_token}
Accept: application/json
Sample curl request
You can test this endpoint from the command line. Doing so lets you quickly determine whether you have the right information to make the HTTP REST request and the appropriate access permissions. It also returns the response payload, which you can inspect.
curl -XPOST "https://{tenant}-{unit}.instana.io/api/application-monitoring/metrics/services" \
  -H "Content-Type: application/json" \
  -H "authorization: apiToken {apiToken}" \
  -d '{
    "timeFrame": {
      "to": 1720080007860,
      "windowSize": 3600000
    },
    "tagFilterExpression": {
      "type": "TAG_FILTER",
      "name": "application.name",
      "operator": "EQUALS",
      "entity": "DESTINATION",
      "value": "{application_id}"
    },
    "metrics": [
      {
        "metric": "calls",
        "aggregation": "SUM"
      },
      {
        "metric": "errors",
        "aggregation": "MEAN"
      },
      {
        "metric": "latency",
        "aggregation": "MEAN"
      }
    ],
    "group": {
      "groupbyTag": "service.name",
      "groupbyTagEntity": "DESTINATION"
    }
  }'
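The timeFrame object in the request above can be built programmatically. The following sketch assumes, based on the sample values, that `to` is a Unix timestamp in milliseconds marking the end of the window and that `windowSize` is the window length in milliseconds. The helper name `build_time_frame` is hypothetical.

```python
import time
from typing import Optional


def build_time_frame(window_minutes: int, to_ms: Optional[int] = None) -> dict:
    """Build a timeFrame object: `to` is the end of the window and
    `windowSize` its length, both in milliseconds (assumed units)."""
    if to_ms is None:
        to_ms = int(time.time() * 1000)  # current time as epoch milliseconds
    return {"to": to_ms, "windowSize": window_minutes * 60 * 1000}


# Reproduces the one-hour window used in the sample curl request
print(build_time_frame(60, to_ms=1720080007860))
# {'to': 1720080007860, 'windowSize': 3600000}
```

Calling `build_time_frame(60)` without `to_ms` would describe the most recent hour under the same assumptions.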
Sample Python code
To programmatically retrieve metric data for a specific service, you can try the following Python function, which uses the requests library to call the GET service metrics endpoint.
If you do not have a Python environment set up on your local machine, you can try this function in a Jupyter Notebook in Google Colab, which provides a browser-based environment to write and run Python code. To use Google Colab, you need a Google account.
Requirements
Make sure that the following criteria are met:
- Python 3 is installed on your system
- The requests library is installed (run pip install requests if you have not installed it yet)
Python function
# import the required library
import requests


def get_service_metrics(base_url, api_token, service_id, metric_id, aggregation):
    """
    Retrieves metric data for a service from the Instana REST API by using the GET service metrics endpoint.

    Args:
        base_url (str): The base URL of the Instana API, for example 'https://{tenant}-{unit}.instana.io'.
        api_token (str): The API token for authentication.
        service_id (str): The unique identifier for a service being monitored in your instance of Instana.
        metric_id (str): The metricId from the metric catalog, for example 'errors'.
        aggregation (str): The aggregation to apply to the metric, for example 'MEAN'.

    Returns:
        dict: A dictionary containing the JSON response with the requested metric data for the service.
        Returns None if the request fails.
    """
    # URL for the service metrics endpoint, built from the base URL that is passed in
    api_endpoint_url = f"{base_url}/api/application-monitoring/metrics/services"
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"apiToken {api_token}"
    }
    # request payload; the function arguments are passed as values, not as literal strings
    data = {
        "metrics": [
            {
                "aggregation": aggregation,
                "metric": metric_id
            }
        ],
        "applicationBoundaryScope": "INBOUND",
        "serviceId": service_id
    }
    try:
        response = requests.post(api_endpoint_url, headers=headers, json=data)
        response.raise_for_status()  # Raise error for bad status codes
        return response.json()  # Return JSON response
    except requests.exceptions.RequestException as e:
        print(f"Error: {e}")
        return None  # Return None on error
Sample usage of the Python function
You can use the get_service_metrics function as follows to obtain the error rate for a service:
BASE_URL = "https://{your_tenant}-{your_unit}.instana.io"
API_TOKEN = "{your_api_token}"
SERVICE_ID = "{service_id}"
METRIC_ID = "errors"
AGGREGATION = "MEAN"
service_metrics = get_service_metrics(BASE_URL, API_TOKEN, SERVICE_ID, METRIC_ID, AGGREGATION)
if service_metrics is not None:
    print(service_metrics)
Sample response
After you make the API call, you might receive a JSON response that looks similar to the following code block:
{
  "items": [
    {
      "service": {
        "id": "service_id_1",
        "label": "service_label_1",
        "types": [
          "HTTP"
        ],
        "technologies": [],
        "snapshotIds": [],
        "entityType": "SERVICE"
      },
      "metrics": {
        "errors.mean": [
          [
            1720629650000,
            0.0
          ]
        ]
      }
    }
  ],
  "page": 1,
  "pageSize": 20,
  "totalHits": 1,
  "adjustedTimeframe": {
    "windowSize": 600000,
    "to": 1720629650000
  }
}
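To turn a response like this into readable values, you might flatten it as in the following sketch. It assumes that each entry under `errors.mean` is a `[timestamp in milliseconds, value]` pair, as the sample suggests, and relies on the catalog description that the value lies between 0 and 1. The helper name `extract_error_rates` is hypothetical.

```python
from datetime import datetime, timezone


def extract_error_rates(response: dict, metric_key: str = "errors.mean") -> list:
    """Return (service label, UTC time, error rate in percent) tuples."""
    rows = []
    for item in response.get("items", []):
        label = item["service"]["label"]
        for timestamp_ms, value in item["metrics"].get(metric_key, []):
            # Assumed: the first element is an epoch timestamp in milliseconds
            when = datetime.fromtimestamp(timestamp_ms / 1000, tz=timezone.utc)
            rows.append((label, when, value * 100))  # fraction -> percent
    return rows


# Trimmed copy of the sample response
sample_response = {
    "items": [
        {
            "service": {"id": "service_id_1", "label": "service_label_1"},
            "metrics": {"errors.mean": [[1720629650000, 0.0]]},
        }
    ]
}

for label, when, percent in extract_error_rates(sample_response):
    print(f"{label}: {percent:.2f}% at {when.isoformat()}")
```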
Summary and additional resources
Additional information about what data and analysis Instana provides for traces and calls can be found in Analyzing traces and calls.
For more information about API usage and best practices, see API documentation.
You can also join the IBM TechXchange Community.