Service level objectives

The service level objectives (SLOs) section in the navigation sidebar provides a focused experience and meaningful data to monitor service performance. You can track applications, websites, synthetic tests, and infrastructure entities over both fixed and rolling time periods.

Note: SLOs and SLO widgets (legacy) are separate capabilities. Widgets that correspond to SLOs are available separately.

You can use an overview dashboard to view all of your SLOs and their status, and a detailed page to view the indicator, error budget, and traffic across time windows.

Note: SLOs are not supported on Self-hosted Classic Edition (Docker).

For information about permissions and access requirements for Service levels, see Managing user access - Service levels.

Dashboard overview

In the Instana UI, hover over the navigation sidebar and select Service levels. The Service levels page includes the following tabs:

Service level objectives - View and manage SLOs.
Apdex - View and manage Apdex configurations. For more information, see Apdex.
Smart Alerts - Configure alerts for SLOs and Apdex configurations. For more information, see Smart Alerts for service levels.
Correction windows - Manage correction windows for SLOs.

The SLO tab displays an overview of all SLOs, including their status in alphanumeric order. You can sort the table by name, entity, blueprint, or status. You can also search by SLO name or use the filter panel to refine the view.

Filtering SLOs

Click the filter icon to open the filter panel, which allows you to filter SLOs by:
- Status: Filter by SLO performance status
  - All: Display all SLOs
  - In target: Display only SLOs meeting their target
  - Out of target: Display only SLOs not meeting their target
- Tags: Select one or more tags to filter SLOs by their assigned tags
- Blueprint: Filter by indicator type
  - All: Display all blueprint types
  - Availability: Display only availability SLOs
  - Custom: Display only custom SLOs
  - Latency: Display only latency SLOs
  - Saturation: Display only saturation SLOs
  - Traffic: Display only traffic SLOs
- Entity Type: Filter by the type of entity being monitored
  - Application
  - Website
  - Synthetic Tests
  - Infrastructure
After selecting your filter criteria, click Filter to apply the filters. Click Clear to reset all filters and return to the unfiltered view.

SLO table

The table displays the following information:

Name: The name of the SLO. The color of the border on the left signifies the current status of the service level indicator (SLI).
Entity type: The type of entity that is being monitored, such as an application, website, synthetic test, host, or Kubernetes cluster.
Entity name: The name of the specific application, website, or synthetic test that is associated with this SLO. For infrastructure SLOs, the value is shown as "aggregated" because the entity is dynamic.

Note: If the associated entity is deleted or no longer visible, this column shows "Unknown Entity". The SLO definition remains, but data for deleted entities is no longer collected.

Blueprint: The indicator type that is being measured.
- Latency: Measures how fast the service responds relative to a set threshold.
- Availability: Measures how often a service responds without errors.
- Traffic: Measures how much traffic a service is receiving.
- Saturation: Measures resource utilization relative to a set threshold.
- Custom: Measures the count of good or bad events by using user-defined filters.
Error Budget Remaining: Indicates how much of your error budget is available. The value is represented in minutes, calls, beacons, results, or metric snapshots, depending on the selected entity type and indicator type. Details about the time window configuration of SLOs are also displayed.
Status/Target: Displays the current status of the SLO as a percentage value, along with the desired target. The status is colored green while it meets the target. After your error budget is depleted, it is colored red.
Tags: Displays the tags that are associated with the SLO, which can be used to filter the view for related SLOs.
Teams: Lists the teams associated with the SLO, which determines access permissions.
Actions: Contains edit, copy, and delete buttons.

SLO details

Select an SLO from the dashboard data table to view its details. The page shows a summary of the SLO status, error budget, and traffic, with detailed charts of the indicator, error budget, and traffic over time. You can toggle the view between the configured SLO time window and the time range that is selected in the UI.

The page displays the following information:

SLO Name: The name of the SLO.
Analyze: Opens a menu to review entity data in the Analytics page. This option is available only for Application and Website entity types. For applications, you can select Calls or Bad calls. For websites, you can select HTTP requests or Bad HTTP requests.
Analyze infrastructure: Opens the Analyze infrastructure view. This option is available only for Infrastructure entity types.
View correction windows: Opens a dialog to temporarily show the effects of toggling correction windows on the SLO status and error budget. Only active correction windows that apply to this SLO are shown. The changes reset when you refresh the page.
Entity: The name of the application, website, synthetic test, or infrastructure entity that is associated with this SLO. Any declared scope configurations are also shown.

Note: If the associated entity is deleted or is no longer visible to you, this column displays “Unknown Entity”. The SLO definition remains, but the SLO data for deleted entities is no longer captured.

Tags: Displays the tags that are associated with the SLO, which can be used to filter the view for related SLOs.
Teams: Lists the teams associated with the SLO, which determines access permissions.

The Summary tab displays aggregated information about the SLO for the selected time period, with cards and charts that show KPIs for the performance of the SLO.

Time content switcher: Toggles between the time window from the SLO configuration and the time range that is selected.
Configured SLO Time Window: Displays the configured time window for the SLO that determines the error budget, along with the configured time zone.

Note: If the configured time zone of the SLO differs from your current browser time zone, an informational banner appears at the top of the page. The banner displays: "This SLO was created in [SLO timezone]. Your current time zone is set to [Your timezone]." Click Edit SLO time zone in the banner to modify the SLO's time zone configuration.
Matching SLO Time Windows: Displays the configured time windows that span the selected time. If the selected time spans multiple windows, they will be depicted in unique colors in the corresponding charts.
Status: Displays the current status of the SLO represented in percentage value, along with the configured target.
Error Budget Remaining: Displays the remaining and total error budget. The value is represented in minutes, calls, beacons, results, or metric snapshots, depending on the selected entity type and indicator type.
Burn rate: Displays the burn rate of the error budget.
Traffic: Displays the number of calls the configured application or website is receiving, or the total results of the configured synthetic tests.

The following charts show the trend over the time frames selected:

Indicator: Displays the indicator metric relative to the configured threshold. The data on this chart is aggregated for larger time windows. To see more granular data, select a shorter time window.

Note: When the indicator chart is viewed in hourly granularity, each data point represents the maximum of the underlying minute-level aggregated values within that hour. Values in this chart might differ from those in other entity-level charts due to differences in both aggregation methods and data granularity.

Error budget: Displays the remaining error budget.
Burn rate: Displays the burn rate over the SLO time window.
Traffic: Displays the number of calls the configured application or website is receiving, or the total results of the configured synthetic tests.
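
The hourly rollup that is described in the note on the indicator chart can be illustrated with a minimal sketch. It assumes that minute-level values arrive as hypothetical (epoch seconds, value) pairs; the function name and data shape are illustrative only and are not part of the product.

```python
from collections import defaultdict


def hourly_max(minute_values):
    """Roll minute-level values up to hourly points by taking the maximum.

    minute_values: iterable of (epoch_seconds, value) pairs, one per minute.
    Returns a dict that maps the start of each hour (epoch seconds) to the
    largest minute-level value observed in that hour.
    """
    buckets = defaultdict(list)
    for timestamp, value in minute_values:
        hour_start = timestamp - (timestamp % 3600)
        buckets[hour_start].append(value)
    return {hour: max(values) for hour, values in sorted(buckets.items())}


# Hypothetical minute-level latency values in milliseconds.
samples = [(0, 120.0), (60, 480.0), (120, 95.0), (3600, 110.0), (3660, 130.0)]
hourly = hourly_max(samples)
print(hourly)  # {0: 480.0, 3600: 130.0}
```

In this sketch, the first hour is reported as 480.0 even though the mean of its minutes is much lower, which shows why an hourly indicator point can differ from a chart that averages the same data.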

SLO summary and configuration view

As shown in the following example image of the application perspective page, a correction marker is displayed as an annotation on every graph in your environment. This marker serves as a reference point to help you quickly identify changes, such as regressions or improvements, in the performance of your applications.

Figure 1. Correction marker on charts in SLO dashboard

As shown in the preceding image of the application perspective page, the correction marker contains the following information:

Name: The user-defined name of the correction.
Correction time: The point in time when the correction happened.

To help ensure that markers are readable and don't overlap, corrections are grouped into clusters when they fall within the same chart bucket. All buckets of a cluster are highlighted when you hover over the corresponding correction icon. If a bucket has a high capacity, corrections that happened during the bucket's time window are clustered.
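
The grouping of corrections by chart bucket can be sketched as follows. This is a simplified illustration, not the product's implementation: it assumes fixed-width buckets, and the correction names, the `chart_start` value, and the `bucket_seconds` parameter are all hypothetical.

```python
from collections import defaultdict


def cluster_corrections(corrections, chart_start, bucket_seconds):
    """Group corrections that fall into the same chart bucket.

    corrections: list of (name, epoch_seconds) pairs.
    chart_start: epoch seconds of the left edge of the chart.
    bucket_seconds: width of one chart bucket in seconds (assumed fixed).
    Returns a dict that maps a bucket index to the correction names in it.
    """
    clusters = defaultdict(list)
    for name, timestamp in corrections:
        index = (timestamp - chart_start) // bucket_seconds
        clusters[index].append(name)
    return dict(clusters)


# Hypothetical corrections on a chart with five-minute buckets.
corrections = [("deploy-fix", 30), ("db-maintenance", 250), ("network-outage", 400)]
clusters = cluster_corrections(corrections, chart_start=0, bucket_seconds=300)
print(clusters)  # {0: ['deploy-fix', 'db-maintenance'], 1: ['network-outage']}
```

With wider buckets, more corrections share a bucket, so a single marker would stand in for a larger cluster.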

The Configuration tab displays configuration information of the SLO. There are options to edit, copy, or delete the SLO.

The Smart Alerts tab displays information on the configured Smart Alerts for this SLO.

The Correction windows tab displays correction windows that are associated with this SLO.

AI-powered SLO analysis

Instana provides AI-powered summarization and explanation capabilities to help you quickly understand SLO performance and identify actionable insights.

Generating AI summaries

On the SLO details page, you can generate AI-powered summaries to analyze your SLO performance.

To generate the summary, complete the following steps:

From the Service levels dashboard, select an SLO.
Click Generate AI summarization in the SLO details page.
The AI summarization dialog opens and displays a comprehensive analysis of your SLO.

AI summarization dialog

The AI summarization dialog provides two views:

Summary view (default)

The summary view displays a concise, one-paragraph executive summary that includes the SLO name, target percentage, entity name, current SLI value, error budget status, and compliance status (meeting or below target). It provides a quick, at-a-glance overview of your SLO health.

Full details view

Click View more to expand the dialog and access a comprehensive analysis, which is organized into the following sections:

SLO overview: Complete configuration details including name, target percentage, indicator type (availability, latency, traffic, saturation, or custom), time window, and time zone.
Entity context: Full entity information including type (application, website, synthetic test, or infrastructure), name, identifiers, labels, and scope configuration.
Performance status: Current SLI percentage with status indicator, error budget remaining (negative values indicate budget exhausted), error budget spent with comparison to total budget, and error burn rate relative to target.
Key insights: In-depth analysis that includes sustained error accumulation patterns, burn rate trends, chronic over-consumption indicators, data quality issues (zero errors, outages, missing data), error spikes correlated with budget changes, and continuous violation status.
Recommendations: Detailed, prioritized actions such as root cause investigation guidance, alerting threshold adjustments, redundancy and failover improvements, error definition refinements, instrumentation enhancements, and SLO target adjustments.

Dialog actions

Close: Dismiss the dialog and return to the SLO details page.
Copy icon: Copy the summary text to your clipboard for use in reports or other communications.
Download as PDF: Export the full AI analysis as a PDF document for sharing or archival purposes (available in full details view).

Use cases

AI-powered SLO analysis is useful for:

Quick status checks: Get an instant understanding of SLO health without manually analyzing charts.
Incident response: Rapidly identify root causes and recommended actions during outages.
Reporting: Generate executive summaries for stakeholders.
Trend analysis: Understand patterns in SLO violations over time.
Optimization: Receive data-driven recommendations for improving service reliability.

Notes

AI summaries are generated based on the currently selected time window (either the configured SLO time window or the time range that is selected in the UI).
The analysis considers both time-based and event-based SLO configurations.
Recommendations are tailored to the specific indicator type (availability, latency, traffic, saturation, or custom).
Supported for all entity types: applications, websites, synthetic tests, and infrastructure.
For event-based SLOs, the error budget is dynamic and changes as traffic volume fluctuates.

Creating an SLO

To create an SLO, click Create service level objective. This opens a window where you can create a new SLO following these steps:

Select entity type

Select the entity type to measure for your SLO. You can define and measure performance targets for the following entity types:

Application: Measure the performance of calls in an application perspective.
Website: Measure the performance of beacons and traces in a website.
Synthetic Tests: Measure the performance of the results of one or more synthetic tests.
Infrastructure: Measure the performance of infrastructure entities such as hosts, containers, or other infrastructure components.

Select entity

After you select the entity type, choose the specific entity to monitor.

For Applications

Select an application perspective from the searchable list of available application perspectives in your environment.

For Websites

Select a website from the searchable list of available websites in your environment.

For Synthetic Tests

Based on individual synthetic tests: Click Add synthetic test to open a dialog where you can search and select one or more synthetic tests from the available tests in your environment. You can filter the tests by attributes such as location and type, and select multiple tests from the search results. The selected tests form a static list, which means only the chosen tests are included in the SLO evaluation.
- Synthetic test results: Set the toggle to on to include on‑demand synthetic test results in SLO calculations. By default, the option is off, and all on‑demand test results are excluded.
Based on filters: You can also define the SLO scope by using a tag filter expression. With this option, you specify filters such as synthetic test name, location ID, application ID, or other supported attributes. Filters use operators such as equals, starts with, or other supported matching rules, and can support wildcard-style matching depending on the operator. All synthetic tests that match the specified filters are automatically included in the SLO calculation. Unlike selecting individual tests, this approach creates a dynamic scope, which means the SLO automatically includes new synthetic tests that match the filter criteria as they are created.

For Infrastructure

The infrastructure type selection is part of the scope configuration in the next step.

Set scope

Configure the scope to specify what data to measure. The scope configuration depends on the entity type.

Applications

Under Calls in scope, choose which calls to include for determining service and endpoint availability:

Calls in scope
- Inbound calls: Include calls that are initiated from outside the application and whose destination service is part of the selected application perspective.
- All calls: Include both inbound calls from outside the application and calls that occur within the application perspective.
Include hidden calls (optional)
- Internal calls: Calls that represent work that is done inside a service. These can be created from intermediate spans that are sent through custom tracing.
- Synthetic calls: Calls with a synthetic endpoint as the destination, such as calls to health-check endpoints.
Services and endpoints: Choose how to define the scope:
- Select: Use dropdown menus to select from available services and endpoints
  - Select a specific Service in your application or use the default of All services to include the entire application perspective.
  - Select an Endpoint from the specified service or use the default of All endpoints to apply to the entire service.
- Filter: Specify custom filters by using the available tags to define the scope of the measured calls for the application. These filters can be combined to create composite queries as needed.

Websites

Beacon: Select the beacon type to monitor. Currently, only HTTP requests are supported. An informational banner displays this limitation.
Custom filters (optional): Add custom filters to narrow down the scope of the selected beacon type. You can combine multiple filters to create composite queries based on attributes such as geolocation, browser, or user.

Infrastructure

Infrastructure type: Select the infrastructure entity type to monitor from the available types in your environment (for example, hosts, containers, Kubernetes clusters, or other infrastructure components).
Custom filters (optional): Add a custom filter to narrow down the scope of the infrastructure. You can combine multiple filters to create composite queries. These filters use tag filter expressions with logical operators (AND, OR, NOT) and comparison operators (equals, not equals, exists).
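
To illustrate how logical and comparison operators combine into a composite query, the following sketch evaluates a filter expression against the tags of an entity. The nested-tuple representation, the operator names, and the tag names (`zone`, `decommissioned`) are hypothetical and chosen only for illustration; they are not the format that Instana uses.

```python
def matches(expr, tags):
    """Evaluate a hypothetical filter expression against an entity's tags.

    expr is a nested tuple whose first element names the operator:
      ("AND", [expr, ...]), ("OR", [expr, ...]), ("NOT", expr),
      ("EQUALS", key, value), ("NOT_EQUALS", key, value), ("EXISTS", key)
    tags is a dict of tag name to tag value.
    """
    op = expr[0]
    if op == "AND":
        return all(matches(child, tags) for child in expr[1])
    if op == "OR":
        return any(matches(child, tags) for child in expr[1])
    if op == "NOT":
        return not matches(expr[1], tags)
    if op == "EQUALS":
        return tags.get(expr[1]) == expr[2]
    if op == "NOT_EQUALS":
        return tags.get(expr[1]) != expr[2]
    if op == "EXISTS":
        return expr[1] in tags
    raise ValueError(f"unsupported operator: {op}")


# Composite query: hosts in zone "eu-1" that are not marked as decommissioned.
scope = ("AND", [("EQUALS", "zone", "eu-1"), ("NOT", ("EXISTS", "decommissioned"))])

print(matches(scope, {"zone": "eu-1"}))                            # True
print(matches(scope, {"zone": "eu-1", "decommissioned": "true"}))  # False
print(matches(scope, {"zone": "us-2"}))                            # False
```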

Set indicator

Define the SLI to measure for the entity.

You can select a Blueprint to define the type of indicator. The available blueprints depend on the entity type:

For Application and Website entities: Latency, Availability, Traffic, and Custom
For Synthetic Tests entities: Latency, Availability, and Traffic
For Infrastructure entities: Saturation and Custom

The indicator is used to specify the metric and measurement type that are used to calculate the SLO status and error budget.

Latency: Measures response time for calls, beacons, or test results against a specified threshold. The threshold is specified in milliseconds (ms). The Latency blueprint supports two types of measurements:
- Time-based: The response times for all calls, beacons, or test results are aggregated into one-minute buckets. The error budget is measured in minutes and is a static value that is derived from the SLO target and the duration of the SLO time window. If the aggregated result for a minute exceeds the specified threshold, that minute is deemed a bad minute and the error budget is reduced. You can also specify the type of aggregation from the following options:
  - Mean
  - Min
  - Max
  - Percentile (25, 50, 75, 90, 95, 98, 99)
- Event-based: The response time for each call, beacon, or test result is compared to the specified threshold and determined to be a good or bad event. Error budget is measured in calls, beacons, or results and is derived from the SLO target and the total number of events during the SLO time window. Since the total number of events is changing over the duration of the SLO time window, the total error budget is a dynamic value.
Availability: Measures the success rate of calls, beacons, or test results over a defined time period. The threshold is set as a percentage (%). The Availability blueprint supports two types of measurements:
- Time-based: The success rates for all calls, beacons, or test results are aggregated into one-minute buckets. The error budget is measured in minutes and is a static value that is derived from the SLO target and the duration of the SLO time window. If the aggregated result for a minute is less than the specified threshold, that minute is deemed a bad minute and the error budget is reduced. Currently, only mean aggregation is supported for Availability blueprints.
- Event-based: The overall success rate for the calls, beacons, or test results is calculated over the SLO time period. The error budget is measured in calls, beacons, or results and is derived from the SLO target and the total number of events during the SLO time window. Since the total number of events is changing over the duration of the SLO time window, the total error budget is a dynamic value.
Traffic: Measures the load (calls, beacons, or test results) a system encounters over time. This is a time‑based measurement where the number of events per minute is compared to a defined threshold. If the threshold is not met, that minute is considered bad and the error budget is reduced. You can also specify the event type to count:
- All beacon count (for websites), All calls (for applications), or All results (for synthetic tests): This can be used to measure the overall system traffic for the specified entities.
- Beacon error count (for websites), Erroneous calls (for applications), or Erroneous results (for synthetic tests): This can be used to focus the measurement only on erroneous traffic.
Saturation: Measures resource utilization against a specified threshold. The Saturation blueprint is only available for Infrastructure entity types. There are two types of measurements for the Saturation blueprint:
- Time-based: Infrastructure metric values are aggregated into one-minute buckets. The error budget is measured in minutes and is a static value that is derived from the SLO target and the duration of the SLO time window. If the aggregated result for each minute exceeds the specified threshold, it is deemed to be a bad minute and the error budget is reduced. You must specify:
  - Metric: The infrastructure metric to measure
  - Aggregation: The type of aggregation (Mean, Min, Max, or Percentile: 25, 50, 75, 90, 95, 98, 99)
  - Operator: The comparison operator (>, >=, <, <=)
  - Threshold: The threshold value for comparison
- Event-based: Infrastructure metric values are retrieved at 10-second intervals. The error budget is measured in metric snapshots and is a static value that is derived from the SLO target and the duration of the SLO time window. Since metric snapshots are taken at a constant 10-second interval, the total number of snapshots is predetermined and does not change over the duration of the SLO time window. You must specify:
  - Metric: The infrastructure metric to measure
  - Operator: The comparison operator (>, >=, <, <=)
  - Threshold: The threshold value for comparison
  The aggregation type is automatically selected based on the threshold operator:
  - For > or >= operators: MAX aggregation is used to ensure that any spike above the threshold within the 10-second window is captured as a potential breach.
  - For < or <= operators: MIN aggregation is used to ensure that any dip below the threshold within the 10-second window is captured as a potential breach.
Custom: Use custom filters to specify the definition of good and bad events. This is an event-based measurement.
- For Application and Website entity types: You can define filters for good and bad calls or beacons:
  - Successful events: Specify filters to identify successful beacons or calls. These filters can be combined to create composite queries as needed. If only successful events are defined, events that do not match the filter are assumed to be unsuccessful.
  - Unsuccessful events (optional): Optionally specify filters for unsuccessful beacons or calls. If the unsuccessful events filter is defined, events that do not match either the successful or unsuccessful filters are excluded from the measurement.
- For Infrastructure entity types: You can define metric-based conditions for good and bad events. You must specify:
  - Good events: Define the metric, operator, and threshold for good events
  - Bad events: Define the metric, operator, and threshold for bad events
  The metrics used to define good and bad events can be the same or different. Unlike saturation‑based (time‑ or event‑based) indicators, this approach allows flexible, custom SLO definitions based on your infrastructure monitoring needs.
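
The difference between time-based and event-based latency measurements, and the way Custom filters classify events, can be sketched as follows. This is an illustration under stated assumptions, not the product's implementation: events are hypothetical (epoch seconds, latency in ms) pairs, the percentile uses the nearest-rank method, minutes without events are ignored, and an event that matches both Custom filters is treated as good.

```python
import math
from collections import defaultdict
from statistics import mean


def percentile(p):
    """Build a nearest-rank percentile aggregator (the method is an assumption)."""
    def aggregate(values):
        ordered = sorted(values)
        rank = max(1, math.ceil(p / 100 * len(ordered)))
        return ordered[rank - 1]
    return aggregate


def count_bad_minutes(events, threshold_ms, aggregate=mean):
    """Time-based: aggregate latencies per one-minute bucket, then count the
    minutes whose aggregated value exceeds the threshold."""
    buckets = defaultdict(list)
    for timestamp, latency_ms in events:
        buckets[timestamp // 60].append(latency_ms)
    return sum(1 for values in buckets.values() if aggregate(values) > threshold_ms)


def count_bad_events(events, threshold_ms):
    """Event-based: compare every single event with the threshold."""
    return sum(1 for _, latency_ms in events if latency_ms > threshold_ms)


def classify_custom(event, is_good, is_bad=None):
    """Custom blueprint: return "good", "bad", or None when the event is excluded."""
    if is_good(event):
        return "good"
    if is_bad is None:
        return "bad"  # only the successful filter is defined
    return "bad" if is_bad(event) else None


# Hypothetical calls: (epoch_seconds, latency_ms).
events = [(0, 100), (10, 900), (70, 200), (80, 250), (130, 700)]
print(count_bad_minutes(events, 400))                            # 2 (mean)
print(count_bad_minutes(events, 400, aggregate=min))             # 1
print(count_bad_minutes(events, 400, aggregate=percentile(90)))  # 2
print(count_bad_events(events, 400))                             # 2

# Hypothetical Custom filters on an HTTP status field.
good = lambda e: e["status"] < 400
bad = lambda e: e["status"] >= 500
print(classify_custom({"status": 404}, good))       # bad
print(classify_custom({"status": 404}, good, bad))  # None
```

The last two calls show the rule for Custom filters: with only a successful-events filter, a non-matching event counts as unsuccessful, whereas with both filters defined, an event that matches neither is left out of the measurement.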

Set objective

Define the overall SLO target value and details for the SLO time window.

SLO Target: Specify the target percentage that your entity should be meeting when it is performing correctly.

Note: The maximum precision that is currently supported is four nines (99.99%).

Time Window: Specify the type of time window for the SLO.
- Fixed: A time window with a distinct start time and duration. For example, you can configure a fixed one-week window that starts on 2020-01-01. The time window will be automatically reset to the next week (2020-01-08) when the week is completed.
- Rolling: A dynamic time window with a fixed window size, where the end is defined by the end date and time that is selected in the global time picker. For example, a rolling one-week window always shows the last week.
Length: The duration of the SLO time window, which is specified in days, weeks, or calendar months.
- For calendar month selection: Available only for fixed time windows. The time window aligns with calendar month boundaries (first day to last day of each month). If an SLO is created mid-month, the initial measurement period runs from the creation date through the end of that month (a partial period), with subsequent periods following complete calendar months.
Bind time zone: Enable or disable the binding of the time zone to the SLO time window. If it is not bound to any time zone, UTC is selected by default.
Time zone: Specify the time zone for the SLO time window. For reference, the full list of valid time zone IDs can be found here.

Note: The maximum length of an SLO that is supported is 4 weeks.

Start: For Fixed time windows, the day and time to initiate the SLO calculation.
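
The reset behavior of a fixed window, and the way a rolling window follows the selected end time, can be sketched with plain date arithmetic. The function names are hypothetical, and the sketch covers only windows whose length is given in days or weeks; calendar-month windows and time zone binding are not handled.

```python
from datetime import datetime, timedelta, timezone


def current_fixed_window(start, length, now):
    """Return (window_start, window_end) of the fixed window that contains now.

    A fixed window begins at start and resets every length.
    """
    if now < start:
        raise ValueError("the SLO time window has not started yet")
    elapsed_windows = (now - start) // length  # whole windows completed so far
    window_start = start + elapsed_windows * length
    return window_start, window_start + length


def rolling_window(length, end):
    """Return (window_start, window_end) of a rolling window that ends at end."""
    return end - length, end


start = datetime(2020, 1, 1, tzinfo=timezone.utc)
now = datetime(2020, 1, 10, 12, 0, tzinfo=timezone.utc)
one_week = timedelta(weeks=1)

fixed = current_fixed_window(start, one_week, now)
rolling = rolling_window(one_week, now)
print(fixed[0].date(), fixed[1].date())      # 2020-01-08 2020-01-15
print(rolling[0].date(), rolling[1].date())  # 2020-01-03 2020-01-10
```

For the one-week window that starts on 2020-01-01, a point in time on 2020-01-10 falls into the second window, which begins on 2020-01-08.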

After entering values for SLO Target and Time window, the error budget is displayed.

For time-based blueprints, this is a true error budget in minutes as determined by the time window length and the SLO target.
For event-based blueprints, this is an estimated error budget based on the calls or beacons encountered over the prior SLO time period, along with the specified time window length and SLO target.
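
As a worked illustration of how an error budget follows from the SLO target and the time window, the following sketch applies the conventional derivation: the budget is the share of the window, or of the events, that is allowed to be bad. The function names are hypothetical, and the exact rounding that the product applies is not documented here, so treat the numbers as approximate.

```python
def time_based_budget_minutes(target_percent, window_days):
    """Static budget in minutes for time-based blueprints."""
    window_minutes = window_days * 24 * 60
    return window_minutes * (100 - target_percent) / 100


def event_based_budget(target_percent, total_events):
    """Budget in events; it changes as the total number of events changes."""
    return total_events * (100 - target_percent) / 100


def snapshot_budget(target_percent, window_days, interval_seconds=10):
    """Static budget in metric snapshots, assuming one snapshot per interval."""
    total_snapshots = window_days * 24 * 60 * 60 // interval_seconds
    return total_snapshots * (100 - target_percent) / 100


# 99.9% target over a 7-day window: 10,080 minutes in the window.
print(round(time_based_budget_minutes(99.9, 7), 2))  # 10.08

# 99.5% target with 2,000,000 calls observed in the window.
print(event_based_budget(99.5, 2_000_000))           # 10000.0

# 99% target over 7 days with 10-second snapshots: 60,480 snapshots.
print(round(snapshot_budget(99.0, 7), 1))            # 604.8
```

The time-based and snapshot budgets depend only on the target and the window length, so they stay constant. The event-based budget is recomputed from the event total, which is why the value shown before the window has any data can only be an estimate.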

Enter details

Name: Specify the SLO name.
Tags (optional): Specify a set of tags that can be used to categorize or sort the SLO.
Teams (optional): Assign teams to manage access permissions for this SLO.
To create the SLO configuration, click Create.