Service level objective (SLO) Configuration Examples

Example 1: Application SLO with latency blueprint

Objective: Ensure 90% of the calls of the "Robot-shop" application have an average latency of better than 100 ms over a rolling period of 1 week.

The configuration of the SLO would be:

Entity: Robot-shop application
- Scope:
  - Boundary: All services
  - Include internal calls: false
  - Include synthetic calls: false
- Service: All services
- Endpoint: All endpoints
Indicator:
- Blueprint: Latency
- Type: Time
- Aggregation: mean
- Threshold: 100 ms
Objective:
- SLO target: 90%
- Time window type: Rolling
- Time window length: 1 week

Scenario: Assuming the SLO had 400 bad minutes over the week-long SLO time window (400 minutes had mean latency > 100 ms) starting from 2025-03-04:

The error budget for this SLO would be calculated as:

Minutes in time period x (1 - SLO target percentage)
- Total minutes in time window: 24 × 60 × 7 minutes in 1 week = 10080 minutes
- SLO target percentage: 90% (0.9)
- Error budget: 10080 x (1 - 0.9) = 1008 minutes

The SLO status would be calculated as:

SLO status = 100% x (total minutes in time window - bad minutes in time window) / total minutes in time window
- 100% x (10080 total minutes - 400 bad minutes) / 10080 total minutes = 96.03%
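The two formulas above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the product's implementation: the function name `time_based_slo` is hypothetical, and rounding the budget to the nearest whole minute is an assumption inferred from the examples in this section.

```python
def time_based_slo(total_minutes: int, bad_minutes: int, target: float) -> dict:
    """Error budget and status for a time-based SLO, following the formulas above."""
    # Assumption: the budget is rounded to the nearest whole minute.
    budget = round(total_minutes * (1 - target))
    status = 100 * (total_minutes - bad_minutes) / total_minutes
    return {
        "error_budget": budget,
        "budget_remaining": budget - bad_minutes,
        "status_pct": round(status, 2),
    }


# Example 1: 1-week window, 400 bad minutes, 90% target.
example_1 = time_based_slo(total_minutes=24 * 60 * 7, bad_minutes=400, target=0.90)
print(example_1)
```

With the values from Example 1, the sketch reproduces the error budget of 1008 minutes and the SLO status of 96.03%.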

Example 2: Website SLO with event-based availability blueprint

Objective: Ensure that HTTP requests to the shopping cart page (cart.html) of the Demo website achieve 95% availability over a fixed period of 4 days beginning on 1 March 2025.

The configuration of the SLO would be:

Entity: Demo website
- Beacon: HTTP requests
- Custom filter: Location > Page name = Cart
Indicator:
- Blueprint: Availability
- Type: Event count (Total count of good calls vs bad calls)
Objective:
- SLO target: 95%
- Time window type: Fixed
- Time window length: 4 days
- Start: 2025-03-01 0:00

Scenario: Assuming there are 234 successful HTTP requests and 11 failed HTTP requests during the SLO time window starting from 1 March 2025:

The error budget for this SLO would be calculated as:

Event:
- Good events count: 234 beacons
- Bad events count: 11 beacons
- Total events count: 234 + 11 = 245 beacons
- SLO target percentage: 95% (0.95)
- Error budget: 245 x (1 - 0.95) = 12.25 beacons (rounded to 12)
- Error budget remaining: 12 - 11 = 1 beacon
- Error budget remaining percentage: 100% * (12 - 11) / 12 = 8.33%

The SLO status would be calculated as:

SLO status = 100% x total good events count in time window / total events count in time window
- 100% x 234 beacons / 245 beacons = 95.51%
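For event-based SLOs, the same calculation works on event counts instead of minutes. The following sketch is illustrative only: `event_based_slo` is a hypothetical name, and rounding the budget to the nearest whole event is an assumption inferred from the examples in this section.

```python
import math


def event_based_slo(good: int, bad: int, target: float) -> dict:
    """Error budget, remaining budget, and status for an event-based SLO."""
    total = good + bad
    # Assumption: the budget is rounded half-up to whole events.
    budget = math.floor(total * (1 - target) + 0.5)
    remaining = budget - bad
    return {
        "total": total,
        "error_budget": budget,
        "budget_remaining": remaining,
        # A zero budget would make the percentage undefined, so report None.
        "budget_remaining_pct": round(100 * remaining / budget, 2) if budget else None,
        "status_pct": round(100 * good / total, 2),
    }


# Example 2: 234 good beacons, 11 bad beacons, 95% target.
example_2 = event_based_slo(good=234, bad=11, target=0.95)
print(example_2)
```

A negative `budget_remaining` indicates that the error budget is overspent.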

Example 3: Synthetic monitoring with traffic blueprint

Objective: Ensure that 3 synthetic monitoring tests (shopping cart, home page, and product list page) run at least 15 times every minute for 99% of the time over a fixed period of 1 week beginning on 2025-03-18.

The configuration of the SLO would be:

Entities:
- shopping cart test
- home page test
- product list page test
Indicator:
- Blueprint: Traffic
- Threshold: > 15 results per minute
Objective:
- SLO target: 99%
- Time window type: Fixed
- Time window length: 1 week
- Start: 2025-03-18 0:00

Scenario: Assuming the SLO had 21 minutes in which the synthetic tests ran fewer than 15 times during the SLO time window starting from 2025-03-18:

The error budget for this SLO would be calculated as:

Minutes in time period x (1 - SLO Target Percentage)
- Total minutes: 24 × 60 × 7 minutes in 1 week = 10080 minutes
- SLO target Percentage: 99% (0.99)
- Error budget: 10080 x (1 - 0.99) = 100.8 minutes (rounded to 101)
- Error budget remaining: 101 - 21 = 80 minutes
- Error budget remaining percentage: 100% * (101 - 21) / 101 = 79.21%

The SLO status would be calculated as:

SLO status = 100% x (total minutes in time window - bad minutes in time window) / total minutes in time window
- 100% x (10080 total minutes - 21 bad minutes) / 10080 total minutes = 99.79%
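To illustrate where the bad-minute count in this scenario comes from, the sketch below counts the minutes whose number of test results falls short of the expected 15. The per-minute sample data and the name `count_bad_minutes` are hypothetical, and treating "fewer than 15 results" as the bad condition follows the scenario wording rather than a documented rule.

```python
def count_bad_minutes(results_per_minute: list[int], expected_runs: int = 15) -> int:
    """Count the minutes with fewer test results than expected."""
    return sum(1 for n in results_per_minute if n < expected_runs)


# Hypothetical week of data: 21 minutes with only 9 results, the rest with 15.
samples = [15] * (10080 - 21) + [9] * 21
bad = count_bad_minutes(samples)
status_pct = round(100 * (len(samples) - bad) / len(samples), 2)
print(bad, status_pct)
```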

Example 4: Synthetic monitoring SLO with event-based availability using tag filters

Objective: Ensure that synthetic checkout journey tests achieve at least 99% availability over a fixed period of 1 day.

The configuration of the SLO would be:

Entity: Synthetic
Scope:
- Selection method: Based on filters
- Tag Filter Expression: synthetic.testName startsWith "checkout"
Indicator:
- Blueprint: Availability
- Type: Event count (Total count of good test results vs bad test results)
- Good event: call.erroneous = false
- Bad event: call.erroneous = true
Objective:
- SLO target: 99%
- Time window type: Fixed
- Time window length: 1 day
- Start: 2026-01-18 0:00

Scenario: Assume 14,850 successful synthetic test executions and 150 failed executions during the SLO time window that starts on 2026-01-18.

The error budget for this SLO would be calculated as:

Event:
- Good events count: 14,850 test executions
- Bad events count: 150 test executions
- Total events count: 14,850 + 150 = 15,000 executions
- SLO target percentage: 99% (0.99)
- Error budget: 15,000 × (1 − 0.99) = 150 executions
- Error budget remaining: 150 − 150 = 0 executions
- Error budget remaining percentage: 100% × (0 / 150) = 0%

The SLO status would be calculated as:

SLO status = 100% × total good events count / total events count
- 100% × 14,850 / 15,000 = 99%
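The following sketch shows how a `startsWith` tag filter and the good and bad event definitions combine to produce the event counts. It is a simplified model: the `SyntheticResult` class and its fields are hypothetical stand-ins for the `synthetic.testName` and `call.erroneous` tags, not the product's data model.

```python
from dataclasses import dataclass


@dataclass
class SyntheticResult:
    test_name: str   # stands in for synthetic.testName
    erroneous: bool  # stands in for call.erroneous


def count_events(results: list[SyntheticResult], prefix: str = "checkout") -> tuple[int, int]:
    """Return (good, bad) counts for results whose test name starts with prefix."""
    in_scope = [r for r in results if r.test_name.startswith(prefix)]
    good = sum(1 for r in in_scope if not r.erroneous)
    return good, len(in_scope) - good


results = [
    SyntheticResult("checkout-guest", False),
    SyntheticResult("checkout-member", True),
    SyntheticResult("checkout-guest", False),
    SyntheticResult("login", True),  # out of scope: name does not start with "checkout"
]
good, bad = count_events(results)
print(good, bad)
```

Only results that match the filter contribute to the counts, so the failed `login` result does not consume error budget.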

Example 5: Application SLO with Custom blueprint

Objective: Ensure 98% of the calls of the “Robot-shop” application do not result in an HTTP status code of 400 over a rolling period of 1 day.

The configuration of the SLO would be:

Entity: Robot-shop application
- Scope:
  - Boundary: All services
  - Include internal calls: false
  - Include synthetic calls: false
- Service: All services
- Endpoint: All endpoints
Indicator:
- Blueprint: Custom
- Type: Event count (Total count of good calls vs bad calls)
- Good Filter: HTTP status code != 400
- Bad Filter: HTTP status code = 400
Objective:
- SLO target: 98%
- Time window type: Rolling
- Time window length: 1 day

Scenario: Assuming the SLO had 25000 good calls and 200 bad calls over the one-day SLO time window starting from 2025-03-10:

The error budget for this SLO would be calculated as:

Event:
- Good events count: 25000 calls
- Bad events count: 200 calls
- Total events count: 25000 + 200 = 25200 calls
- SLO target percentage: 98% (0.98)
- Error budget: 25200 x (1 - 0.98) = 504 calls
- Error budget remaining: 504 - 200 = 304 calls
- Error budget remaining percentage: 100% * (504 - 200) / 504 = 60.32%

The SLO status would be calculated as:

SLO status = 100% x total good events count in time window / total events count in time window
- 100% x 25000 calls / 25200 calls = 99.21%

Example 6: Application SLO with event-based latency blueprint

Objective: Ensure 92% of the calls of the “Robot-shop” application have latency of better than 100 ms over a fixed period of 2 weeks.

The configuration of the SLO would be:

Entity: Robot-shop application
- Scope:
  - Boundary: All services
  - Include internal calls: false
  - Include synthetic calls: false
- Service: All services
- Endpoint: All endpoints
Indicator:
- Blueprint: Latency
- Type: Event count (Total count of good calls vs bad calls)
- Good call: Latency < 100 ms
- Bad call: Latency >= 100 ms
Objective:
- SLO target: 92%
- Time window type: Fixed
- Time window length: 2 weeks
- Start: 2025-03-10 0:00

Scenario: Assuming the SLO had 50000 good calls and 1000 bad calls over the two-week SLO time window starting from 2025-03-10:

The error budget for this SLO would be calculated as:

Event:
- Good events count: 50000 calls
- Bad events count: 1000 calls
- Total events count: 50000 + 1000 = 51000 calls
- SLO target percentage: 92% (0.92)
- Error budget: 51000 x (1 - 0.92) = 4080 calls
- Error budget remaining: 4080 - 1000 = 3080 calls
- Error budget remaining percentage: 100% * (4080 - 1000) / 4080 = 75.49%

The SLO status would be calculated as:

SLO status = 100% x total good events count in time window / total events count in time window
- 100% x 50000 calls / 51000 calls = 98.04%
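The good and bad call definitions in this example leave no gap at the boundary: a call with a latency of exactly 100 ms counts as bad. A minimal sketch of that classification, with a hypothetical function name and sample latencies:

```python
def split_by_latency(latencies_ms: list[float], threshold_ms: float = 100.0) -> tuple[int, int]:
    """Return (good, bad) call counts: good is strictly below the threshold."""
    good = sum(1 for ms in latencies_ms if ms < threshold_ms)
    return good, len(latencies_ms) - good


# Hypothetical sample: 100 ms itself falls on the bad side (Latency >= 100 ms).
good_calls, bad_calls = split_by_latency([20.0, 99.9, 100.0, 250.0])
print(good_calls, bad_calls)
```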

Example 7: Website SLO with time-based availability blueprint

Objective: Ensure that HTTP requests to the Demo website achieve 92% availability, with an error rate of less than 5%, over a rolling period of 3 days.

The configuration of the SLO would be:

Entity: Demo website
- Beacon: HTTP requests
Indicator:
- Blueprint: Availability
- Type: Time
- Error Rate Threshold: 5%
Objective:
- SLO target: 92%
- Time window type: Rolling
- Time window length: 3 days

Scenario: Assuming the SLO had 200 bad minutes over the 3-day SLO time window (200 minutes had a mean error rate greater than 5%) starting from 2025-03-05:

The error budget for this SLO would be calculated as:

Minutes in time period x (1 - SLO target percentage)
- Total minutes in time window: 24 × 60 × 3 minutes in 3 days = 4320 minutes
- SLO target percentage: 92% (0.92)
- Error budget: 4320 x (1 - 0.92) = 345.6 minutes (rounded to 346)

The SLO status would be calculated as:

SLO status = 100% x (total minutes in time window - bad minutes in time window) / total minutes in time window
- 100% x (4320 total minutes - 200 bad minutes) / 4320 total minutes = 95.37%
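In a time-based availability SLO, each minute is classified as good or bad by comparing its error rate with the threshold. The sketch below models that decision under two assumptions: a minute is bad only when the error rate is strictly greater than the threshold (as the scenario states), and a minute without traffic is not counted as bad. The function name and sample values are hypothetical.

```python
def is_bad_minute(failed: int, total: int, threshold: float = 0.05) -> bool:
    """Classify one minute of HTTP requests against the error rate threshold."""
    if total == 0:
        # Assumption: a minute without traffic does not consume error budget.
        return False
    return failed / total > threshold


# Hypothetical per-minute (failed, total) request counts.
minutes = [(0, 120), (6, 100), (5, 100), (0, 0), (30, 90)]
bad_minutes = sum(is_bad_minute(failed, total) for failed, total in minutes)
print(bad_minutes)
```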

Example 8: Infrastructure SLO with time-based saturation blueprint (CPU)

Objective: Ensure CPU utilization on production hosts stays below 75% for 99% of the time over a rolling period of 7 days.

The configuration of the SLO would be:

Entity: Infrastructure
- Infrastructure type: Host
- Tag Filter Expression: availabilityZone = "us-east-1"
Indicator:
- Blueprint: Saturation
- Type: Time
- Metric: cpu.used
- Aggregation: Mean
- Operator: >=
- Threshold: 75%
Objective:
- SLO target: 99%
- Time window type: Rolling
- Time window length: 7 days

Scenario: Assuming the SLO had 25 bad minutes over the 7-day SLO time window (25 minutes had mean CPU usage ≥ 75%) starting from 2025-03-15:

The error budget for this SLO would be calculated as:

Minutes in time period x (1 - SLO target percentage)
- Total minutes in time window: 24 × 60 × 7 minutes in 1 week = 10080 minutes
- SLO target percentage: 99% (0.99)
- Error budget: 10080 x (1 - 0.99) = 100.8 minutes (rounded to 101)

The SLO status would be calculated as:

SLO status = 100% x (total minutes in time window - bad minutes in time window) / total minutes in time window
- 100% x (10080 total minutes - 25 bad minutes) / 10080 total minutes = 99.75%

Example 9: Infrastructure SLO with event-based saturation blueprint (Memory)

Objective: Ensure memory utilization on database servers stays below 85% for 99.9% of metric snapshots over a rolling period of 1 day.

The configuration of the SLO would be:

Entity: Infrastructure
- Infrastructure type: Host
- Tag Filter Expression: host.fqdn contains "company-name" AND availabilityZone = "us-east-1"
Indicator:
- Blueprint: Saturation
- Type: Event count (Total count of good metric snapshots vs bad metric snapshots)
- Metric: memory.used
- Operator: >=
- Threshold: 85%
Objective:
- SLO target: 99.9%
- Time window type: Rolling
- Time window length: 1 day

Scenario: Assuming there are 8,640 good metric snapshots (memory < 85%) and 5 bad metric snapshots (memory ≥ 85%) during the SLO time window starting from 2025-03-20:

The error budget for this SLO would be calculated as:

Event:
- Good events count: 8,640 snapshots
- Bad events count: 5 snapshots
- Total events count: 8,640 + 5 = 8,645 snapshots
- SLO target percentage: 99.9% (0.999)
- Error budget: 8,645 x (1 - 0.999) = 8.645 snapshots (rounded to 9)
- Error budget remaining: 9 - 5 = 4 snapshots
- Error budget remaining percentage: 100% * (9 - 5) / 9 = 44.44%

The SLO status would be calculated as:

SLO status = 100% x total good events count in time window / total events count in time window
- 100% x 8,640 snapshots / 8,645 snapshots = 99.94%

Example 10: Infrastructure SLO with custom blueprint for Kubernetes cluster

Objective: Ensure the production Kubernetes cluster maintains at least 6 available nodes for 99.9% of the time over a rolling period of 7 days.

The configuration of the SLO would be:

Entity: Infrastructure
- Infrastructure type: Kubernetes Cluster
- Tag Filter Expression: kubernetes.cluster.name = "prod-cluster"
Indicator:
- Blueprint: Custom
- Type: Event count (Total count of good metric snapshots vs bad metric snapshots)
- Good events: nodes.count >= 6
- Bad events: nodes.count < 6
Objective:
- SLO target: 99.9%
- Time window type: Rolling
- Time window length: 7 days

Scenario: Assuming there are 60,400 good events (metric snapshots where node count ≥ 6) and 80 bad events (metric snapshots where node count < 6) during the 7-day SLO time window starting from 2025-04-01:

The error budget for this SLO would be calculated as:

Event:
- Good events count: 60,400 snapshots
- Bad events count: 80 snapshots
- Total events count: 60,400 + 80 = 60,480 snapshots
- SLO target percentage: 99.9% (0.999)
- Error budget: 60,480 x (1 - 0.999) = 60.48 snapshots (rounded to 60)
- Error budget remaining: 60 - 80 = -20 snapshots (exceeded)
- Error budget remaining percentage: 100% x (-20 / 60) = -33.33% (overspent)

The SLO status would be calculated as:

SLO status = 100% x total good events count in time window / total events count in time window
- 100% x 60,400 snapshots / 60,480 snapshots = 99.87%

Example 11: SLO behavior with time zone binding

Objective: Ensure that the SLO time window follows calendar time in a specific time zone, so that each day of the window starts and ends at the same local time even across a Daylight Saving Time (DST) transition, for accurate and consistent reporting.

The configuration of the SLO would be:

Entity: Robot-shop application
- Scope:
  - Boundary: All services
  - Include internal calls: false
  - Include synthetic calls: false
- Service: All services
- Endpoint: All endpoints
Indicator:
- Blueprint: Latency
- Type: Time
- Aggregation: mean
- Threshold: 100 ms
Objective:
- SLO target: 90%
- Time window type: Fixed
- Time window length: 3 days
- Bind Time zone: Enable
- Time zone: Europe/Berlin

Scenario: The user has a 3-day SLO starting from 2025-03-29. This SLO overlaps the Daylight Saving Time transition in Berlin. Its time windows should remain consistent: each day should start and end at the same hour and minute, including 2025-03-30, when the DST change occurs.
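The effect of binding the window to Europe/Berlin can be illustrated with Python's standard `zoneinfo` module. The sketch only demonstrates the calendar behavior described in the scenario (day boundaries stay at 00:00 local time, so the day of the DST change is one hour shorter). It does not describe how the product computes error budgets for such a window, and the helper names are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

berlin = ZoneInfo("Europe/Berlin")
start = datetime(2025, 3, 29, 0, 0, tzinfo=berlin)


def daily_boundaries(start: datetime, days: int) -> list[datetime]:
    # Adding a timedelta to an aware datetime keeps the wall-clock time,
    # so every boundary stays at 00:00 local time.
    return [start + timedelta(days=d) for d in range(days + 1)]


def elapsed_minutes(a: datetime, b: datetime) -> int:
    # Convert to UTC first: subtracting two datetimes that share a tzinfo
    # would otherwise ignore the change in UTC offset.
    delta = b.astimezone(timezone.utc) - a.astimezone(timezone.utc)
    return int(delta.total_seconds() // 60)


bounds = daily_boundaries(start, 3)
day_lengths = [elapsed_minutes(a, b) for a, b in zip(bounds, bounds[1:])]
print([b.strftime("%Y-%m-%d %H:%M %z") for b in bounds])
print(day_lengths)  # [1440, 1380, 1440]
```

Each boundary is at 00:00 local time, while 2025-03-30 contains only 1380 elapsed minutes because the clock moves forward by one hour.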

Example 12: SLO with team association

Objective: Assign the SLO to one or more teams associated with the user. These team associations are then utilized to enforce access restrictions.

The following example shows the configuration of an SLO with team association:

Entity: Robot-shop application
- Scope:
  - Boundary: All services
  - Include internal calls: false
  - Include synthetic calls: false
- Service: All services
- Endpoint: All endpoints
Indicator:
- Blueprint: Latency
- Type: Time
- Aggregation: mean
- Threshold: 100 ms
Objective:
- SLO target: 90%
- Time window type: Rolling
- Time window length: 1 week
Details:
- Name: Sample SLO
- Tags (optional): Sample Tag
- Teams: Team 1, Team 2

Scenario: The user has an SLO assigned to both Team 1 and Team 2. The corresponding team labels are visible in the SLO table and on the configuration page. When switching to Team Scope, the user's view of this SLO is restricted based on the selected team.

Example 13: Calendar month SLO created mid-month

Objective: Ensure that 99% of calls to the "Payment Service" application have an average latency less than 200 ms over calendar-month periods, with the SLO created on 15 January.

The configuration of the SLO would be:

Entity: Payment Service application
- Scope:
  - Boundary: Inbound calls
  - Include internal calls: false
  - Include synthetic calls: false
- Service: All services
- Endpoint: All endpoints
Indicator:
- Blueprint: Latency
- Type: Time
- Aggregation: mean
- Threshold: 200 ms
Objective:
- SLO target: 99%
- Time window type: Fixed
- Time window length: 1 calendar month
- Bind Time zone: Enable
- Time zone: America/New_York
- Start: 2025-01-15 00:00

Scenario: The SLO is created on 15 January 2025. Calendar month SLOs align with month boundaries, creating a partial first period when created mid-month.

Initial period (January 15-31, 2025):

Duration: 17 days (January 15 through January 31)
Total minutes: 17 × 24 × 60 = 24,480 minutes
Error budget: 24,480 × (1 - 0.99) = 245 minutes
Bad minutes recorded: 30 minutes
SLO status: 100% × (24,480 - 30) / 24,480 = 99.88%
Error budget remaining: 245 - 30 = 215 minutes

Second period (February 1-28, 2025):

Duration: 28 days (complete calendar month)
Total minutes: 28 × 24 × 60 = 40,320 minutes
Error budget: 40,320 × (1 - 0.99) = 403 minutes
Bad minutes recorded: 50 minutes
SLO status: 100% × (40,320 - 50) / 40,320 = 99.88%
Error budget remaining: 403 - 50 = 353 minutes

Third period (March 1-31, 2025):

Duration: 31 days (complete calendar month)
Total minutes: 31 × 24 × 60 = 44,640 minutes
Error budget: 44,640 × (1 - 0.99) = 446 minutes
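The period lengths and error budgets above can be reproduced with a short sketch. It counts whole 24-hour days, as the calculations in this example do, and does not adjust for DST transitions in the bound time zone. The function name `month_period` is hypothetical, and rounding the budget to the nearest minute is an assumption inferred from the figures above.

```python
import calendar
from datetime import date


def month_period(start: date, target: float) -> dict:
    """Period from start through the last day of the same calendar month."""
    last_day = calendar.monthrange(start.year, start.month)[1]
    days = last_day - start.day + 1
    total_minutes = days * 24 * 60
    return {
        "days": days,
        "total_minutes": total_minutes,
        # Assumption: the budget is rounded to the nearest whole minute.
        "error_budget": round(total_minutes * (1 - target)),
    }


periods = [month_period(d, 0.99) for d in (date(2025, 1, 15), date(2025, 2, 1), date(2025, 3, 1))]
for p in periods:
    print(p)
```

The first period is partial (17 days) because the SLO starts mid-month; every later period starts on the first day of its month.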

Example 14: Calendar month SLO created on first day of month

Objective: Ensure that HTTP requests to the "E-commerce website" achieve 95% availability over calendar-month periods, with the SLO created on 1 March.

The configuration of the SLO would be:

Entity: E-commerce Website
- Beacon: HTTP requests
- Custom filter: None
Indicator:
- Blueprint: Availability
- Type: Time
- Error Rate Threshold: 5%
Objective:
- SLO target: 95%
- Time window type: Fixed
- Time window length: 1 calendar month
- Bind Time zone: Enable
- Time zone: UTC
- Start: 2025-03-01 00:00

Scenario: The SLO is created on 1 March 2025 (the first day of the month). All measurement periods are complete calendar months.

First period (March 1-31, 2025):

Duration: 31 days (complete calendar month)
Total minutes: 31 × 24 × 60 = 44,640 minutes
Error budget: 44,640 × (1 - 0.95) = 2,232 minutes
Bad minutes recorded: 400 minutes
SLO status: 100% × (44,640 - 400) / 44,640 = 99.10%
Error budget remaining: 2,232 - 400 = 1,832 minutes

Second period (April 1-30, 2025):

Duration: 30 days (complete calendar month)
Total minutes: 30 × 24 × 60 = 43,200 minutes
Error budget: 43,200 × (1 - 0.95) = 2,160 minutes
Bad minutes recorded: 350 minutes
SLO status: 100% × (43,200 - 350) / 43,200 = 99.19%
Error budget remaining: 2,160 - 350 = 1,810 minutes

Service levels smart alerts configuration examples

Example 1: Service levels smart alert to monitor the status of an SLO

Objective: Alert and raise an issue if the status of the Vending Machine Reliability SLO configuration is less than 90%.

The configuration of the Service levels smart alert would be:

Rule:
  Alert Type: Service Levels Objective
  Metric: Status
Threshold:
  Operator: <
  value: 0.90
SLOs: Vending Machine Reliability
Time Threshold:
  Expiry: 5 Minutes
  Time window: 10 Minutes

Once the Smart Alert configuration is set up, the system will begin monitoring the status of the Vending Machine Reliability SLO configuration.

Scenario: Monitoring and Event Triggering

SLO Drops Below Threshold
- Assume the Vending Machine Reliability SLO drops to 89%, below the defined threshold of 90%.
- With the time threshold set to 10 minutes, the system will wait for the entire 10-minute window before taking any action.
- If the SLO remains below 90% after 10 minutes, the system triggers an event, raising an issue.
SLO Returns Above Threshold
- If, after some time, the SLO status recovers and rises above 90%, the system continues monitoring.
- If the status stays above 90%, the system waits for the expiry time threshold of 5 minutes.
- If the SLO remains above 90% for the full 5 minutes, the event will be automatically closed.
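The open and close behavior in this scenario can be modeled as a small state machine. The sketch below rests on assumptions that are not stated in this document: the status is evaluated once per minute, the event opens after 10 consecutive violating minutes, and it closes after 5 consecutive non-violating minutes. The function name and the sample series are hypothetical.

```python
def simulate_alert(status_per_minute: list[float], threshold: float = 0.90,
                   window: int = 10, expiry: int = 5) -> list[tuple[int, str]]:
    """Return (minute, "open" | "close") transitions for a status series."""
    transitions = []
    is_open = False
    streak = 0  # consecutive minutes that argue for flipping the current state
    for minute, status in enumerate(status_per_minute, start=1):
        violating = status < threshold
        # A closed event needs violating minutes; an open one needs healthy minutes.
        streak = streak + 1 if violating != is_open else 0
        if streak >= (expiry if is_open else window):
            is_open = not is_open
            transitions.append((minute, "open" if is_open else "close"))
            streak = 0
    return transitions


# 3 healthy minutes, 12 minutes at 89%, then 6 minutes back at 93%.
series = [0.95] * 3 + [0.89] * 12 + [0.93] * 6
transitions = simulate_alert(series)
print(transitions)  # [(13, 'open'), (20, 'close')]
```

In this series the event opens at minute 13, after the tenth consecutive violating minute, and closes at minute 20, after the fifth consecutive healthy minute.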

Example 2: Service levels smart alert to monitor the error budget of an SLO

Objective: Alert and raise an issue if the error budget consumption percentage of the Vending Machine Reliability SLO configuration is more than 50%.

The configuration of the Service levels smart alert is:

Rule:
  Alert Type: Error Budget
  Metric: Burned Percentage
Threshold:
  Operator: >
  value: 0.50
SLOs: Vending Machine Reliability
Time Threshold:
  Expiry: 5 Minutes
  Time window: 10 Minutes

Once the smart alert configuration is set up, the system will begin monitoring the error budget consumption percentage of the Vending Machine Reliability SLO configuration.

Scenario: Monitoring and Event Triggering

Error budget consumption exceeds 50%
- Assume the error budget consumption exceeds 50%.
- With the time threshold set to 10 minutes, the system will wait for the full 10-minute window before taking any action.
- If the error budget consumption remains above 50% after 10 minutes, the system triggers an event, raising an issue.
Error budget consumption drops below 50%
- If, after some time, the error budget consumption drops back below 50%, the system will continue monitoring.
- If the error budget consumption stays below 50%, the system will wait for the expiry time threshold of 5 minutes.
- If the error budget consumption remains below 50% for the full 5 minutes, the event will be automatically closed.

Service levels burn rate smart alert calculation

The burn rate is calculated using the formula:

Burn Rate = (Error Budget Consumed * SLO Time Window) / Alerting Window

For example:
- Assume the error budget consumed over the last 12 hours is 70%, and the SLO time window for the Vending Machine Reliability SLO is 1 day (24 hours).
- The burn rate for the last 12 hours would be: (0.70 * 24) / 12 = 1.4
- Similarly, if the error budget consumed over the last 2 hours is 20%, the burn rate for the last 2 hours would be: (0.20 * 24) / 2 = 2.4
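The burn rate formula translates directly into code. In this sketch the function and parameter names are hypothetical; the SLO time window and the alerting window must use the same unit (hours here).

```python
def burn_rate(budget_consumed: float, slo_window_hours: float, alert_window_hours: float) -> float:
    """Burn Rate = (Error Budget Consumed * SLO Time Window) / Alerting Window."""
    return budget_consumed * slo_window_hours / alert_window_hours


# Values from the example: a 24-hour SLO time window.
rate_12h = burn_rate(0.70, slo_window_hours=24, alert_window_hours=12)
rate_2h = burn_rate(0.20, slo_window_hours=24, alert_window_hours=2)
print(round(rate_12h, 2), round(rate_2h, 2))
```

A burn rate of 1 means that the error budget would be used up exactly at the end of the SLO time window if consumption continued at the same pace; values above 1 mean it would run out earlier.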

Example 3: Smart Alert to monitor the burn rate of an SLO with a single alerting window and threshold

Objective: Alert and raise an issue if the burn rate of the Vending Machine Reliability SLO configuration is more than 1 for the last 12 hours.

The configuration of the Service levels smart alert should be:

Rule:
  Alert Type: Error Budget
  Metric: Burn Rate V2
Burn Rate Config:
[
  Alert Window Type: SINGLE
  Duration: 12 Hours
  Duration Unit Type: Hour
  Threshold:
    Operator: >
    Value: 1
]
SLOs: Vending Machine Reliability
Time Threshold:
  Expiry: 5 Minutes
  Time window: 10 Minutes

After the Smart Alert configuration is set up, the system begins monitoring the burn rate of the Vending Machine Reliability SLO configuration for the specified alerting window.

Scenario: Monitoring and Event Triggering

Burn rate exceeds 1 for the alerting window (Last 12 hours)
- Assume the calculated burn rate for the last 12 hours starts exceeding 1.
- With the time threshold set to 10 minutes, the system waits for the full 10-minute window before taking any action.
- If the burn rate remains above 1 after 10 minutes, the system triggers an alerting event, raising an issue.
Burn rate drops below 1 for the alerting window (last 12 hours)
- If, after some time, the burn rate for the alerting window drops below 1, the system continues monitoring.
- If the burn rate stays below 1, the system will wait for the expiry time threshold of 5 minutes.
- If the burn rate remains below 1 for the full 5 minutes, the event will be automatically closed.

Example 4: Smart Alert to monitor the burn rate of an SLO with multiple alerting windows and respective thresholds

Objective: Alert and raise an issue if the burn rate of the Vending Machine Reliability SLO configuration is more than 1 for the last 24 hours and more than 4 for the last 2 hours.

The configuration of the Service levels smart alert should be:

Rule:
  Alert Type: Error Budget
  Metric: Burn Rate V2
Burn Rate Config:
[
  Alert Window Type: LONG
  Duration: 24 Hours
  Duration Unit Type: Hour
  Threshold:
    Operator: >
    Value: 1
  ,
  Alert Window Type: SHORT
  Duration: 2 Hours
  Duration Unit Type: Hour
  Threshold:
    Operator: >
    Value: 4
]
SLOs: Vending Machine Reliability
Time Threshold:
  Expiry: 5 Minutes
  Time window: 10 Minutes

After the Smart Alert configuration is set up, the system begins monitoring the burn rate of the Vending Machine Reliability SLO configuration for both long and short alerting windows.

Scenario: Monitoring and Event Triggering

Burn rate exceeds the thresholds for both the long and short alerting windows (last 24 hours and last 2 hours)
- Assume the calculated burn rate for the last 24 hours starts exceeding 1 and for the last 2 hours starts exceeding 4.
- With the time threshold set to 10 minutes, the system waits for the full 10-minute window before taking any action.
- If the burn rate of both alerting windows still violates the thresholds after 10 minutes, the system triggers an alerting event, raising an issue.
Burn rate drops below 1 for the long alerting window but stays above 4 for the short alerting window
- If, after some time, the burn rate for the long alerting window drops below 1, but the burn rate for the short alerting window stays above 4, the system continues monitoring.
- If the burn rate remains below 1 for the long alerting window, the system waits for the expiry time threshold of 5 minutes.
- If the burn rate stays below 1 for the long alerting window for the full 5 minutes, the event is automatically closed regardless of the short alerting window's value, because both thresholds must be violated to send an alert. The same applies in reverse, when the short alerting window drops below its threshold but the long alerting window still violates its threshold.
Burn rate drops below 1 for the long alerting window and below 4 for the short alerting window
- If, after some time, the burn rate for both alerting windows drops below their respective thresholds, the system continues monitoring.
- If the burn rate remains below the thresholds, the system waits for the expiry time threshold of 5 minutes.
- If the burn rate stays below the thresholds for the full 5 minutes, the event is automatically closed.

Note: The burn rate alert with multiple windows requires both thresholds to be violated (AND condition) in order to send an alert. If even one threshold is not violated, no alert is sent.
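The AND condition can be expressed as a single boolean check. This is a minimal sketch with hypothetical names, using the thresholds from Example 4 as defaults:

```python
def multi_window_violated(long_rate: float, short_rate: float,
                          long_threshold: float = 1.0, short_threshold: float = 4.0) -> bool:
    """True only when both alerting windows exceed their thresholds."""
    return long_rate > long_threshold and short_rate > short_threshold


# Hypothetical burn rates for the long (24 h) and short (2 h) windows.
for long_rate, short_rate in [(1.5, 5.0), (0.8, 5.0), (1.5, 3.0), (0.8, 3.0)]:
    print(long_rate, short_rate, multi_window_violated(long_rate, short_rate))
```

Only the first pair violates both thresholds; the other three pairs do not qualify for an alert.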

Troubleshooting

The following suggestions can help resolve common problems with SLO configurations.

Problem: No error budget is consumed; the SLO status is always 100%.
- Solution: Use the indicator chart on the SLO dashboard to check whether the indicator ever exceeds the threshold during the SLO time window. If it never does, no error budget is consumed. Consider adjusting the threshold accordingly.
Problem: No error budget is consumed; the SLO status is always 100%.
- Solution: Use the traffic chart on the SLO dashboard to verify the entity is receiving traffic during the SLO time window. If not, the error budget and SLO status will not be impacted.
Problem: Error budget is consistently consumed rapidly, SLO status remains negative.
- Solution: Use the indicator chart on the SLO dashboard to verify if the indicator is consistently exceeding the threshold during the SLO time window, resulting in rapid consumption of error budget. You may consider modifying the threshold accordingly.
Problem: Burn rate alert is not triggered due to time window misalignment.
- Solution:
  - Fixed time window SLO: If the SLO is configured with a fixed time window, the alert might not trigger if the burn rate calculation is based on an alerting window that is longer than the actual elapsed time in the SLO time period. For example, if the alerting window requires data from a 12-hour period, but the SLO time window just started, there might not be enough time for the burn rate to exceed the threshold. As a result, no alert is triggered even if the burn rate is high during the elapsed time.
  - Rolling time window SLO: If the SLO is set to a rolling time window, the burn rate calculation might not trigger an alert if the alerting window extends beyond the SLO's creation time. For instance, if the alerting window goes past the period when the SLO was created or active, the burn rate cannot be calculated properly because the data is not available for the full alerting window.
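Both cases come down to the same check: whether enough time has elapsed since the start of the fixed SLO time window, or since the SLO was created, to cover the alerting window. The sketch below illustrates that check; the function name and the timestamps are hypothetical, and it is not a description of how the product evaluates alerts.

```python
from datetime import datetime, timedelta


def alerting_window_covered(data_start: datetime, now: datetime, alerting_window: timedelta) -> bool:
    """True when data exists for the whole alerting window ending at now.

    data_start is the start of the current fixed SLO time window,
    or the creation time of a rolling SLO.
    """
    return now - data_start >= alerting_window


window_start = datetime(2025, 3, 10, 0, 0)
early = alerting_window_covered(window_start, datetime(2025, 3, 10, 3, 0), timedelta(hours=12))
later = alerting_window_covered(window_start, datetime(2025, 3, 10, 12, 0), timedelta(hours=12))
print(early, later)
```

Three hours into the window, a 12-hour alerting window is not yet covered; after 12 hours it is.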