Metrics of data planes
Monitor key performance metrics such as status, performance, throughput, response time, latency, and error rate to assess the health and efficiency of data planes. It is crucial to maintain the reliability, security, and optimal performance of all runtimes associated with data planes.
The widgets on the Monitor data planes page are explained in detail as follows.
Status, performance, and used capacity
The widget provides a comprehensive view of all data planes by displaying their status, performance, and used capacity.
Status
The status of a data plane indicates its health. It is determined by the availability of all runtimes that are associated with that data plane. A data plane can have the following health statuses.
| Status | Description |
|---|---|
| | Indicates that the data plane is in a healthy state. It means that the status of all runtimes in that data plane is green and available. |
| | Indicates that the status of at least one runtime in that data plane is red. |
| | Indicates that the data plane is not in a healthy state. It means that the status of all runtimes in that data plane is red. |
| | Indicates that there is not enough data to compute the availability of runtimes in the data plane. For example, if you registered a runtime an hour ago and view the dashboard with the Last 24 hours filter applied, the status appears gray because of the lack of available data during that time period. |
For details about how to determine the availability of runtimes, see the Availability section in Metrics of runtimes.
Performance
Performance shows how well each data plane is performing based on two metrics: error rate and latency. The thresholds for latency and error rate are specified on the Settings page, where you can customize the display settings of the widgets and the threshold values for the parameters that control the monitoring capability of the assets.
A data plane can have the following performance statuses.
| Performance status | Description |
|---|---|
| Green | Both latency and error rate are within the acceptable limit. |
| Amber | Either latency or error rate is greater than the acceptable limit and within the warning limit. |
| Red | Both latency and error rate are greater than the unacceptable limit. |
| Gray | There is not enough data to compute the performance of the runtimes in the data plane. For example, if you registered a runtime an hour ago and view the dashboard with the Last 24 hours filter applied, the status appears gray because of the lack of available data during that time period. |
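How to determine the latency for all data planes?
Latency of the data plane = (Total latency of runtime A + Total latency of runtime B + …) / (Total transactions of runtime A + Total transactions of runtime B + …)
Example,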
To determine the latency of all data planes for 6 hours, assume,
| Runtime | Total latency | Total transactions |
|---|---|---|
| A | 0.05 ms | 60 |
| B | 0.08 ms | 80 |
Latency of the data plane for 6 hours = (0.05+0.08)/(60+80) = 0.13/140 ≈ 0.00093 ms
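The ratio used in this example can be sketched in Python as follows. This is only an illustration of the arithmetic, not the product's implementation: the function name `data_plane_latency` and the tuple layout are hypothetical, and returning `None` when there are no transactions is an assumption.

```python
def data_plane_latency(runtimes):
    """Divide the summed latency by the summed transaction count.

    `runtimes` is a list of (total_latency_ms, total_transactions) pairs,
    one pair per runtime. Returns None when no transactions were recorded
    (an assumption; the documentation does not cover this case).
    """
    total_latency = sum(latency for latency, _ in runtimes)
    total_transactions = sum(count for _, count in runtimes)
    if total_transactions == 0:
        return None
    return total_latency / total_transactions


# Runtimes A and B from the table above: (total latency in ms, transactions).
latency = data_plane_latency([(0.05, 60), (0.08, 80)])
print(f"{latency:.5f} ms")
```

With the table values as given, the sum of the latencies is 0.13 ms and the sum of the transactions is 140.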
Used capacity
The used capacity of the data plane shows the percentage of capacity that is used by all runtimes in the data plane. A data plane can have the following used capacity statuses.
| Used capacity status | Description |
|---|---|
| Green | The capacity rate is less than or equal to 30%. |
| Amber | The capacity rate is greater than 30% and less than or equal to 70%. |
| Red | The capacity rate is greater than 70%. |
| Gray | Indicates one of the following: |
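How to determine the used capacity of the data plane?
Used capacity of the data plane = [(Actual TPS of runtime A + Actual TPS of runtime B + …) / (Expected TPS of runtime A + Expected TPS of runtime B + …)] * 100
Example,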
To determine the used capacity of a data plane, assume that the capacity is defined as 10000 transactions per hour for runtime A. For more details about how to define the capacity of a runtime, see Managing runtimes in federated API management.
The expected TPS of the runtime A = 10000 / (1 hr * 60 min * 60 secs) = 2.77 transactions per second.
Assume that runtime A performed 39000 transactions in the last 6 hours.
The actual TPS of the runtime A = 39000 / (6 hrs * 60 mins * 60 secs) = 1.81 transactions per second.
Assume that the expected TPS of runtime B = 3.33 transactions per second.
Actual TPS of runtime B = 1.89 transactions per second.
Used capacity of the data plane = ((1.81+1.89)/(2.77+3.33))*100 ≈ 60%
The capacity of the data plane is displayed as 60% in amber.
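The calculation above can be reproduced with a short Python sketch. The function names are hypothetical, the color bands come from the used capacity status table, and mapping a missing value to gray is an assumption. Because the sketch does not round the intermediate TPS values, it yields roughly 60.5% rather than exactly the figure in the example.

```python
def tps(transactions, hours):
    """Convert a transaction count over a period into transactions per second."""
    return transactions / (hours * 60 * 60)


def used_capacity_percent(runtimes):
    """`runtimes` is a list of (actual_tps, expected_tps) pairs, one per runtime."""
    actual = sum(a for a, _ in runtimes)
    expected = sum(e for _, e in runtimes)
    if expected == 0:
        return None  # assumption: no defined capacity means no value
    return actual / expected * 100


def capacity_color(percent):
    """Map a used capacity percentage to the color bands in the status table."""
    if percent is None:
        return "gray"  # assumption: treated here as "no data"
    if percent <= 30:
        return "green"
    if percent <= 70:
        return "amber"
    return "red"


# Runtime A: 39000 transactions in 6 hours against 10000 transactions per hour.
runtime_a = (tps(39000, 6), tps(10000, 1))
# Runtime B: values taken directly from the example.
runtime_b = (1.89, 3.33)

percent = used_capacity_percent([runtime_a, runtime_b])
print(f"{percent:.1f}% -> {capacity_color(percent)}")
```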
Problematic data planes by
The widget helps identify issues across data planes by analyzing total transaction volumes, error rate, response time, and latency of all associated runtimes. Choose a business metric to view its corresponding least and most problematic data plane. The data plane split and line graph display the data planes in sequence from the most problematic to the least. Click the Show as table icon to view the line graph in table format. Click the ellipsis icon to download the table in CSV, PNG, or JPG formats.
Transactions
The widget provides a comprehensive overview of total transaction volumes across all data planes with a trend percentage. It provides detailed insights into the best-performing and least-performing data planes to help you make informed business decisions. It presents both pictorial and graphical representations of the data planes based on performance, highlighting maximum and minimum transaction counts.
How to determine the total transactions and trend analysis for all data planes?
Total transactions of the data plane = Total transactions of runtime A + Total transactions of runtime B + …
With Transactions trend analysis, you can compare current and past transactions based on the filter you choose to spot changes.
- A positive value is displayed in green with an upward arrow, which indicates that the transaction volume has increased.
- A negative value is displayed in red with a downward arrow, which indicates that the transaction volume has decreased.
Example,
To determine the transactions trend percentage for all data planes for 24 hours, assume,
| Total transactions of data plane in the last 24 hours | Total transactions of data plane in the previous 24 hours (that is 24 hours prior to the last 24 hours) |
|---|---|
| 1000 | 500 |
Transactions trend percentage for 24 hours for all data planes = [(1000-500)/500]*100 = 100%
The determined transactions trend percentage, 100% (positive), appears in green with an upward arrow indicating increased transactions compared to the previous 24 hours.
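As a rough sketch, the total and the trend percentage can be computed as follows. The function names are hypothetical, and the handling of a zero previous value or an unchanged value (returning `None`) is an assumption, because the documentation does not describe those cases.

```python
def total_transactions(runtime_counts):
    """Sum the transaction counts of all runtimes in the data plane."""
    return sum(runtime_counts)


def trend_percent(current, previous):
    """Percentage change of the current period relative to the previous period."""
    if previous == 0:
        return None  # assumption: trend is undefined without a baseline
    return (current - previous) / previous * 100


def transactions_trend_display(percent):
    """Return (color, arrow) for a transactions trend value."""
    if percent is None or percent == 0:
        return None  # assumption: behavior for these cases is not documented
    if percent > 0:
        return ("green", "up")
    return ("red", "down")


# Values from the example: 1000 transactions now, 500 in the previous 24 hours.
trend = trend_percent(1000, 500)
print(trend, transactions_trend_display(trend))
```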
Availability
With the widget, you can monitor the availability of all data planes and the health status of all its associated runtimes.
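How to determine the availability and trend analysis for all data planes?
Availability of a data plane = [(Sum, over all runtimes, of heartbeats received / heartbeats expected) / Number of runtimes] * 100
Example,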
To determine the availability of a data plane,
- If federated API management receives the heartbeat status from the runtime, the value of the heartbeat status is represented as 1.
- If federated API management does not receive the heartbeat status from the runtime, the value of the heartbeat status is represented as 0.
The frequency at which the runtime must send the heartbeat status to federated API management is defined in the federated API management Agent configuration. For details about how to configure the federated API management Agent, see the Connecting IBM webMethods API Gateway to federated API management section in the IBM webMethods API Gateway Administration guide. For details about how to develop an agent by using the SDK, see Agent implementation approaches.
To determine the availability of a data plane with 2 runtimes for 1-hour time period, assume,
| Runtime | Sum of the heartbeats received | Sum of the heartbeats expected |
|---|---|---|
| A | 55 | 60 |
| B | 60 | 60 |
Number of minutes in 1 hour = 60
Availability of a data plane = [((55/60) + (60/60))/2]*100 = 95.8%
You can replace 1 hour with the time range that you select in the filter.
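The availability calculation can be sketched in Python as the average of the per-runtime heartbeat ratios. The function name is hypothetical, and skipping runtimes with no expected heartbeats is an assumption made only to avoid a division by zero.

```python
def data_plane_availability(runtimes):
    """Average the received/expected heartbeat ratio across runtimes.

    `runtimes` is a list of (heartbeats_received, heartbeats_expected) pairs.
    Runtimes with no expected heartbeats are skipped (an assumption).
    Returns a percentage, or None when no runtime has usable data.
    """
    ratios = [received / expected for received, expected in runtimes if expected > 0]
    if not ratios:
        return None
    return sum(ratios) / len(ratios) * 100


# Runtimes A and B from the table above, for a 1-hour time period.
availability = data_plane_availability([(55, 60), (60, 60)])
print(f"{availability:.1f}%")
```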
With Availability trend analysis, you can compare current and past availability status based on the filter you choose to spot changes.
- A positive value is displayed in green with an upward arrow, which indicates that the availability has increased.
- A negative value is displayed in red with a downward arrow, which indicates that the availability has decreased.
Example,
To determine the availability trend percentage for all data planes for 24 hours, assume,
| Availability of all data planes in the last 24 hours | Availability of all data planes in the previous 24 hours (that is 24 hours prior to the last 24 hours) |
|---|---|
| 98% | 80% |
Availability trend percentage for 24 hours filter for all data planes = [(0.98-0.80)/0.80]*100 ≈ 22%
The determined availability trend percentage, 22% (positive), appears in green with an upward arrow, which indicates that the availability has increased compared to the previous 24 hours.
Error rate
The error rate indicates the percentage of errors that occurred during API transactions. This widget provides a comprehensive overview of the error rate of all data planes. It provides the details of the data planes with the maximum and minimum error rates to help you make informed business decisions. Click the Top data planes or All data planes tab to view the respective error rates in the data plane split and line graph, ordered from maximum to minimum. Click the Show as table icon to view the line graph in table format. Click the ellipsis icon to download the table in CSV, PNG, or JPG formats.
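How to determine the error rate and trend analysis for all data planes?
Error rate of all data planes = [(Total error count of runtime A + Total error count of runtime B + …) / (Total transactions of runtime A + Total transactions of runtime B + …)] * 100
Example,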
To determine the error rate of all data planes for 6 hours, assume,
| Runtime | Total error count | Total transactions |
|---|---|---|
| A | 30 | 60 |
| B | 25 | 80 |
Error rate of all data planes for 6 hours = [(30+25)/(60+80)] * 100 ≈ 39%
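A minimal Python sketch of this error rate calculation follows. The function name and the `None` result for zero transactions are assumptions for illustration only.

```python
def error_rate_percent(runtimes):
    """Total errors divided by total transactions, as a percentage.

    `runtimes` is a list of (error_count, total_transactions) pairs.
    Returns None when there are no transactions (an assumption).
    """
    errors = sum(e for e, _ in runtimes)
    transactions = sum(t for _, t in runtimes)
    if transactions == 0:
        return None
    return errors / transactions * 100


# Runtimes A and B from the table above.
rate = error_rate_percent([(30, 60), (25, 80)])
print(f"{rate:.2f}%")
```

The unrounded result is about 39.29%.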
With Error rate trend analysis, you can compare current and past error rates based on the filter you choose to spot changes.
- A positive value is displayed in red with an upward arrow, which indicates that the error rate has increased.
- A negative value is displayed in green with a downward arrow, which indicates that the error rate has decreased.
Example,
To determine the error rate trend percentage for all data planes for 24 hours, assume,
| Error rate of all data planes in the last 24 hours | Error rate of all data planes in the previous 24 hours (that is 24 hours prior to the last 24 hours) |
|---|---|
| 39% | 52% |
Error rate trend percentage for 24 hours filter for all data planes = [(0.39-0.52)/0.52] * 100 = -25%
The determined error rate trend percentage, -25% (negative), appears in green with a downward arrow, which indicates that the error rate has decreased compared to the previous 24 hours.
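The trend examples for availability and error rate use the same percentage-change arithmetic but opposite color conventions: a rise is good for availability and bad for error rate. The sketch below captures that with a hypothetical `higher_is_better` flag. The function names and the `None` results for undefined or unchanged values are assumptions. The unrounded availability trend for the values in this section is 22.5%.

```python
def trend_percent(current, previous):
    """Percentage change of the current period relative to the previous period."""
    if previous == 0:
        return None  # assumption: trend is undefined without a baseline
    return (current - previous) / previous * 100


def trend_display(percent, higher_is_better):
    """Return (color, arrow) for a trend value.

    The arrow follows the sign of the change. The color depends on whether
    an increase is desirable for the metric (availability: yes, error rate: no).
    """
    if percent is None or percent == 0:
        return None  # assumption: behavior for these cases is not documented
    rising = percent > 0
    color = "green" if rising == higher_is_better else "red"
    return (color, "up" if rising else "down")


# Availability example: 98% now, 80% in the previous 24 hours.
availability_trend = trend_percent(0.98, 0.80)
print(round(availability_trend, 1), trend_display(availability_trend, higher_is_better=True))

# Error rate example: 39% now, 52% in the previous 24 hours.
error_trend = trend_percent(0.39, 0.52)
print(round(error_trend, 1), trend_display(error_trend, higher_is_better=False))
```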