Detailed metrics on the serviceability dashboard

The serviceability dashboard provides detailed metrics data for service-level objective (SLO) metrics.

Expand a row in the serviceability dashboard to view the detailed metrics data for an SLO metric. For each SLO metric, eight cards show the metrics data for latency and error rate. Five cards provide information about the response time, or latency, for each SLO metric. Three cards provide information about the error rate for each SLO metric.

Note: For each SLO metric on the serviceability dashboard, the first data point, which is the metric data of the first incoming call, is not counted or shown on the dashboard. The underlying Prometheus Query Language (PromQL) query, increase(...[$__range]), calculates the increase of the time series within the range vector, so the first sample of a series has no earlier sample to be compared against.
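
The effect that the note describes can be illustrated with a minimal Python sketch. It is a simplification, not the Prometheus implementation: it assumes a single counter series with no counter resets and ignores the extrapolation that Prometheus applies at the range boundaries. The function name simple_increase is hypothetical.

```python
def simple_increase(samples):
    """Rough stand-in for PromQL increase() on one counter series.

    samples: counter values scraped inside the range, oldest first.
    Assumption: no counter resets, no boundary extrapolation.
    """
    if len(samples) < 2:
        # With fewer than two samples there is nothing to compare,
        # so this sketch reports "no data".
        return None
    return float(samples[-1] - samples[0])


# The counter first appears with value 1 (the first call),
# then two more calls arrive.
print(simple_increase([1, 2, 3]))  # the first call is not part of the increase
print(simple_increase([1]))        # only the first call exists so far
```

In this sketch, three calls produce an increase of 2 because the series starts at 1 rather than 0.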

Bucket heat maps

Bucket heat maps show the distribution of response times, in seconds, of incoming HTTP requests.

The underlying query expression, which is written in Prometheus Query Language (PromQL), calculates, for each histogram bucket, the increase in the number of HTTP requests that are served during the selected time range. For more information, see Histograms and summaries.
Note: The value of the label le denotes the inclusive upper bound of the bucket.
Query expression for Maximo® Assist
sum(round(increase(mas_assist_query_querydocsapi_duration_seconds_bucket{masinstanceid="$instanceId"}[$__range]))) by (le)
Query expression for Maximo Health
sum(round(increase(mas_health_matrix_matrixcountapi_duration_seconds_bucket{masinstanceid="$instanceId"}[$__range]))) by (le)
Query expression for Maximo Optimizer
sum(round(increase(mas_optimizer_api_create_job_duration_seconds_bucket{masinstanceid="$instanceId"}[$__range]))) by (le)
Query expression for Maximo Predict
sum(round(increase(mas_predict_asset_predictions_load_duration_seconds_bucket{masinstanceid="$instanceId"}[$__range]))) by (le)
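
Because the le label is an inclusive upper bound, Prometheus histogram buckets are cumulative: each bucket counts every request whose duration is less than or equal to its bound. The following Python sketch illustrates that relationship. The bucket bounds and durations are hypothetical values chosen for illustration, not the bounds that the Maximo metrics use.

```python
import math


def cumulative_bucket_counts(durations, upper_bounds):
    """Count observations per cumulative bucket, keyed by the le bound."""
    bounds = sorted(upper_bounds) + [math.inf]  # the +Inf bucket always exists
    return {le: sum(1 for d in durations if d <= le) for le in bounds}


def per_bucket_counts(cumulative):
    """Convert cumulative counts into counts for each individual bucket."""
    counts = {}
    previous = 0
    for le in sorted(cumulative):
        counts[le] = cumulative[le] - previous
        previous = cumulative[le]
    return counts


durations = [0.2, 0.5, 1.0, 3.7, 12.0]    # seconds, hypothetical
cumulative = cumulative_bucket_counts(durations, [0.5, 1, 4, 10])
print(cumulative)                     # counts grow with le, ending at the total
print(per_bucket_counts(cumulative))  # requests that fall into each bucket
```

A request that takes exactly 0.5 seconds is counted in the le="0.5" bucket and in every larger bucket.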

Response time in last 30 mins

Response time in the last 30 minutes indicates the 90th, 95th, and 99th percentiles of request durations over the last 30 minutes.

The underlying query expression calculates the 90th, 95th, and 99th percentiles of request durations over the last 30 minutes by applying the histogram_quantile function to the bucket rates, aggregated by the le label. For more information, see histogram_quantile.

Query expression for Maximo Assist
histogram_quantile(.9, sum(rate(mas_assist_query_querydocsapi_duration_seconds_bucket{masinstanceid="$instanceId"}[30m:])) by (le))

histogram_quantile(.95, sum(rate(mas_assist_query_querydocsapi_duration_seconds_bucket{masinstanceid="$instanceId"}[30m:])) by (le))

histogram_quantile(.99, sum(rate(mas_assist_query_querydocsapi_duration_seconds_bucket{masinstanceid="$instanceId"}[30m:])) by (le))
Query expression for Maximo Health
histogram_quantile(.9, sum(rate(mas_health_matrix_matrixcountapi_duration_seconds_bucket{masinstanceid="$instanceId"}[30m:])) by (le))

histogram_quantile(.95, sum(rate(mas_health_matrix_matrixcountapi_duration_seconds_bucket{masinstanceid="$instanceId"}[30m:])) by (le))

histogram_quantile(.99, sum(rate(mas_health_matrix_matrixcountapi_duration_seconds_bucket{masinstanceid="$instanceId"}[30m:])) by (le))
Query expression for Maximo Optimizer
histogram_quantile(.9, sum(rate(mas_optimizer_api_create_job_duration_seconds_bucket{masinstanceid="$instanceId"}[30m:])) by (le))

histogram_quantile(.95, sum(rate(mas_optimizer_api_create_job_duration_seconds_bucket{masinstanceid="$instanceId"}[30m:])) by (le))

histogram_quantile(.99, sum(rate(mas_optimizer_api_create_job_duration_seconds_bucket{masinstanceid="$instanceId"}[30m:])) by (le))
Query expression for Maximo Predict
histogram_quantile(.9, sum(rate(mas_predict_asset_predictions_load_duration_seconds_bucket{masinstanceid="$instanceId"}[30m:])) by (le))

histogram_quantile(.95, sum(rate(mas_predict_asset_predictions_load_duration_seconds_bucket{masinstanceid="$instanceId"}[30m:])) by (le))

histogram_quantile(.99, sum(rate(mas_predict_asset_predictions_load_duration_seconds_bucket{masinstanceid="$instanceId"}[30m:])) by (le))
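
To show how a percentile is estimated from cumulative buckets, here is a simplified Python sketch of the linear interpolation that histogram_quantile performs on classic histograms. It is an approximation for illustration only: it assumes at least one finite bucket, a quantile strictly between 0 and 1, and monotonically increasing counts. The bucket values are hypothetical.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate the q-quantile from cumulative buckets.

    buckets: list of (upper_bound, cumulative_count) sorted by upper_bound,
    where the last entry is the +Inf bucket.
    """
    if not 0 < q < 1:
        raise ValueError("this sketch only handles 0 < q < 1")
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    # First bucket whose cumulative count reaches the target rank.
    index = next(i for i, (_, count) in enumerate(buckets) if count >= rank)
    upper, count = buckets[index]
    if math.isinf(upper):
        # The quantile lies beyond the largest finite bound, so that
        # bound is the best available estimate.
        return buckets[-2][0]
    # The lowest bucket is assumed to start at 0.
    lower, below = buckets[index - 1] if index > 0 else (0.0, 0)
    # Interpolate linearly inside the selected bucket.
    return lower + (upper - lower) * (rank - below) / (count - below)


buckets = [(0.5, 50), (1.0, 80), (4.0, 95), (10.0, 99), (math.inf, 100)]
for q in (0.9, 0.95, 0.99):
    print(q, histogram_quantile(q, buckets))
```

The estimate can never be more precise than the bucket layout allows, which is why a percentile that falls in the +Inf bucket is reported as the largest finite bound.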

Average response time

Average response time indicates the average response time in milliseconds of incoming requests.

The underlying query expression calculates the average response time of requests that are served during the selected time range.

Query expression for Maximo Assist
sum(rate(mas_assist_query_querydocsapi_duration_seconds_sum{masinstanceid="$instanceId"}[$__range]))/sum(rate(mas_assist_query_querydocsapi_duration_seconds_count{masinstanceid="$instanceId"}[$__range]))
Query expression for Maximo Health
sum(rate(mas_health_matrix_matrixcountapi_duration_seconds_sum{masinstanceid="$instanceId"}[$__range]))/sum(rate(mas_health_matrix_matrixcountapi_duration_seconds_count{masinstanceid="$instanceId"}[$__range]))
Query expression for Maximo Optimizer
sum(rate(mas_optimizer_api_create_job_duration_seconds_sum{masinstanceid="$instanceId"}[$__range]))/sum(rate(mas_optimizer_api_create_job_duration_seconds_count{masinstanceid="$instanceId"}[$__range]))
Query expression for Maximo Predict
sum(rate(mas_predict_asset_predictions_load_duration_seconds_sum{masinstanceid="$instanceId"}[$__range]))/sum(rate(mas_predict_asset_predictions_load_duration_seconds_count{masinstanceid="$instanceId"}[$__range]))
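
The arithmetic behind these expressions is a ratio of two counters: total time spent serving requests divided by the number of requests. Because both rates cover the same range, the range length cancels out. The Python sketch below uses hypothetical values. The metric names end in _duration_seconds_, so the ratio itself is in seconds, and the conversion to milliseconds shown here is an assumption about how the card displays the value.

```python
def average_response_time(duration_sum_increase, request_count_increase):
    """Average duration in seconds over a range.

    duration_sum_increase: growth of the *_sum counter (seconds)
    request_count_increase: growth of the *_count counter (requests)
    """
    if request_count_increase == 0:
        # No requests in the range, so there is no average to report.
        return None
    return duration_sum_increase / request_count_increase


avg_seconds = average_response_time(12.5, 50)  # hypothetical counters
avg_milliseconds = avg_seconds * 1000          # assumed display conversion
print(avg_seconds, avg_milliseconds)
```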

% of requests which broke SLO

% of requests which broke SLO indicates the percentage of requests whose duration exceeds the threshold in the SLO metric definition, for example, 4 seconds.

The underlying query expression calculates the percentage of requests whose duration exceeds the threshold in the SLO metric definition during the selected time range.

Query expression for Maximo Assist
(1-sum(rate(mas_assist_query_querydocsapi_duration_seconds_bucket{masinstanceid="$instanceId",le="10"}[$__range]))/sum(rate(mas_assist_query_querydocsapi_duration_seconds_bucket{masinstanceid="$instanceId",le="+Inf"}[$__range])))*100
Query expression for Maximo Health
(1-sum(rate(mas_health_matrix_matrixcountapi_duration_seconds_bucket{masinstanceid="$instanceId",le="4"}[$__range]))/sum(rate(mas_health_matrix_matrixcountapi_duration_seconds_bucket{masinstanceid="$instanceId",le="+Inf"}[$__range])))*100
Query expression for Maximo Optimizer
(1-sum(rate(mas_optimizer_api_create_job_duration_seconds_bucket{masinstanceid="$instanceId",le="4"}[$__range]))/sum(rate(mas_optimizer_api_create_job_duration_seconds_bucket{masinstanceid="$instanceId",le="+Inf"}[$__range])))*100
Query expression for Maximo Predict
(1-sum(rate(mas_predict_asset_predictions_load_duration_seconds_bucket{masinstanceid="$instanceId",le="5"}[$__range]))/sum(rate(mas_predict_asset_predictions_load_duration_seconds_bucket{masinstanceid="$instanceId",le="+Inf"}[$__range])))*100

% of requests which met SLO

% of requests which met SLO indicates the percentage of requests whose duration is within the threshold in the SLO metric definition, for example, 4 seconds.

The underlying query expression calculates the percentage of requests whose duration is within the threshold in the SLO metric definition during the selected time range.

Query expression for Maximo Assist
sum(rate(mas_assist_query_querydocsapi_duration_seconds_bucket{masinstanceid="$instanceId",le="10"}[$__range]))/sum(rate(mas_assist_query_querydocsapi_duration_seconds_bucket{masinstanceid="$instanceId",le="+Inf"}[$__range]))*100
Query expression for Maximo Health
sum(rate(mas_health_matrix_matrixcountapi_duration_seconds_bucket{masinstanceid="$instanceId",le="4"}[$__range]))/sum(rate(mas_health_matrix_matrixcountapi_duration_seconds_bucket{masinstanceid="$instanceId",le="+Inf"}[$__range]))*100
Query expression for Maximo Optimizer
sum(rate(mas_optimizer_api_create_job_duration_seconds_bucket{masinstanceid="$instanceId",le="4"}[$__range]))/sum(rate(mas_optimizer_api_create_job_duration_seconds_bucket{masinstanceid="$instanceId",le="+Inf"}[$__range]))*100
Query expression for Maximo Predict
sum(rate(mas_predict_asset_predictions_load_duration_seconds_bucket{masinstanceid="$instanceId",le="5"}[$__range]))/sum(rate(mas_predict_asset_predictions_load_duration_seconds_bucket{masinstanceid="$instanceId",le="+Inf"}[$__range]))*100
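
The SLO percentages are ratios of two cumulative buckets: the bucket whose le value equals the SLO threshold counts the requests that met the objective, and the le="+Inf" bucket counts all requests. The following Python sketch shows that arithmetic with hypothetical counts. The function name slo_percentages is an illustrative assumption.

```python
def slo_percentages(within_threshold, total):
    """Return (percent met, percent broke) for one SLO metric.

    within_threshold: requests counted in the bucket at the SLO threshold
    total: requests counted in the le="+Inf" bucket
    """
    if total == 0:
        # No requests in the range, so neither percentage is defined.
        return None
    ratio = within_threshold / total
    met = ratio * 100
    broke = (1 - ratio) * 100
    return met, broke


# Hypothetical: 150 of 200 requests finished within the threshold.
met, broke = slo_percentages(150, 200)
print(met, broke)
```

The two percentages always add up to 100 because every request is counted in the le="+Inf" bucket.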

Error rate

Error rate indicates the percentage of erroneous calls, for example, calls that return an HTTP status code in the 500 range.

The underlying query expression calculates the percentage of requests served during the selected time range that return a status code that starts with 5, for example, 500, 502, or 503.

Query expression for Maximo Assist
sum(increase(mas_assist_query_querydocsapi_duration_seconds_count{masinstanceid="$instanceId",code=~"5.."}[$__range]))/sum(increase(mas_assist_query_querydocsapi_duration_seconds_count{masinstanceid="$instanceId"}[$__range]))*100
Query expression for Maximo Health
sum(increase(mas_health_matrix_matrixcountapi_duration_seconds_count{masinstanceid="$instanceId",code=~"5.."}[$__range]))/sum(increase(mas_health_matrix_matrixcountapi_duration_seconds_count{masinstanceid="$instanceId"}[$__range]))*100
Query expression for Maximo Optimizer
sum(increase(mas_optimizer_api_create_job_duration_seconds_count{masinstanceid="$instanceId",code=~"5.."}[$__range]))/sum(increase(mas_optimizer_api_create_job_duration_seconds_count{masinstanceid="$instanceId"}[$__range]))*100
Query expression for Maximo Predict
sum(increase(mas_predict_asset_predictions_load_duration_seconds_count{masinstanceid="$instanceId",code=~"5.."}[$__range]))/sum(increase(mas_predict_asset_predictions_load_duration_seconds_count{masinstanceid="$instanceId"}[$__range]))*100
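
The matcher code=~"5.." selects every series whose code label is a 5 followed by exactly two characters, because PromQL regular expressions are fully anchored. The Python sketch below mimics that selection with re.fullmatch and computes the same percentage. The request counts are hypothetical.

```python
import re

# Equivalent of the PromQL matcher code=~"5.." (fully anchored).
SERVER_ERROR = re.compile(r"5..")


def error_rate_percent(requests_by_code):
    """Percentage of requests whose status code matches 5xx.

    requests_by_code: mapping of status code string to request count.
    """
    total = sum(requests_by_code.values())
    if total == 0:
        # No requests in the range, so the rate is undefined.
        return None
    errors = sum(
        count
        for code, count in requests_by_code.items()
        if SERVER_ERROR.fullmatch(code)
    )
    return errors * 100 / total


counts = {"200": 90, "404": 5, "500": 3, "503": 2}  # hypothetical
print(error_rate_percent(counts))
```

Client errors such as 404 count toward the total but not toward the error rate.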

Calls

Calls indicates the ratio of calls with different HTTP status codes, for example, 200, 400, 401, 403, 500, and so on.

The underlying query expression calculates the count of requests aggregated by the status code.

Query expression for Maximo Assist
sum(round(increase(mas_assist_query_querydocsapi_duration_seconds_count{masinstanceid="$instanceId"}[$__range]))) by (code)
Query expression for Maximo Health
sum(round(increase(mas_health_matrix_matrixcountapi_duration_seconds_count{masinstanceid="$instanceId"}[$__range]))) by (code)
Query expression for Maximo Optimizer
sum(round(increase(mas_optimizer_api_create_job_duration_seconds_count{masinstanceid="$instanceId"}[$__range]))) by (code)
Query expression for Maximo Predict
sum(round(increase(mas_predict_asset_predictions_load_duration_seconds_count{masinstanceid="$instanceId"}[$__range]))) by (code)

Error rate in last 30 mins

Error rate in last 30 minutes indicates the percentage of erroneous calls over the last 30 minutes.

The underlying query expression calculates the percentage of requests served over the last 30 minutes that return a status code that starts with 5, for example, 500, 502, or 503.

Query expression for Maximo Assist
(sum(increase(mas_assist_query_querydocsapi_duration_seconds_count{masinstanceid="$instanceId",code=~"5.."}[30m])))/(sum(increase(mas_assist_query_querydocsapi_duration_seconds_count{masinstanceid="$instanceId"}[30m])))*100
Query expression for Maximo Health
(sum(increase(mas_health_matrix_matrixcountapi_duration_seconds_count{masinstanceid="$instanceId",code=~"5.."}[30m])))/(sum(increase(mas_health_matrix_matrixcountapi_duration_seconds_count{masinstanceid="$instanceId"}[30m])))*100
Query expression for Maximo Optimizer
(sum(increase(mas_optimizer_api_create_job_duration_seconds_count{masinstanceid="$instanceId",code=~"5.."}[30m])))/(sum(increase(mas_optimizer_api_create_job_duration_seconds_count{masinstanceid="$instanceId"}[30m])))*100
Query expression for Maximo Predict
(sum(increase(mas_predict_asset_predictions_load_duration_seconds_count{masinstanceid="$instanceId",code=~"5.."}[30m])))/(sum(increase(mas_predict_asset_predictions_load_duration_seconds_count{masinstanceid="$instanceId"}[30m])))*100