Monitoring Microservices Runtime

Overview of Monitoring Microservices Runtime

Microservices Runtime provides capabilities for monitoring the health of a Microservices Runtime and gathering metrics about the server and the microservices it contains. External applications, such as container management and monitoring tools, can use the metrics and data supplied by Microservices Runtime to help determine whether the Microservices Runtime is performing correctly or optimally.

Microservices Runtime exposes the monitoring features through endpoints. A request sent to an endpoint invokes an internal service that gathers data and then returns the status, the payload, or both to the requester.

Microservices Runtime includes the following monitoring capabilities:

  • Health gauge, which returns an overall UP or DOWN status for a Microservices Runtime based on a set of health indicators. The health gauge endpoint is http://host:port/health
  • Metrics, which returns server and service metrics in Prometheus format. The metrics endpoint is http://host:port/metrics
Note: The monitoring features are available by default for Microservices Runtime. To use the monitoring features with Integration Server, your Integration Server must have additional licensing.

About the Health Gauge

The health gauge returns an overall UP or DOWN status for the Microservices Runtime based on the collective status of enabled health indicators. When the health endpoint is invoked, Microservices Runtime executes all of the enabled health indicators. A health indicator determines the UP or DOWN status of a specific component of Microservices Runtime. If all of the health indicators return an UP status, the entire Microservices Runtime is considered to be up, and the health gauge returns an HTTP 200 status code to the requester. If even one of the health indicators returns a DOWN status, the entire Microservices Runtime is considered to be down, and the health gauge returns an HTTP 503 status code to the requester.

Regardless of the overall status, the response includes a payload in JSON format that contains more details for each health indicator, including whether the indicator returned a status of UP or DOWN. For example, the ServiceThread health indicator returns the current number of service threads in use, the current number of available threads, and the maximum number of available threads. By default, the health indicators in the response appear in ASCII order.
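
The aggregation rule described above can be sketched in a few lines of Python. This is an illustration only, not the product's implementation: the IndicatorResult class and the overall_health function are hypothetical names, and the behavior when no indicator is enabled (treated here as UP) is an assumption.

```python
from dataclasses import dataclass


@dataclass
class IndicatorResult:
    """Hypothetical record for one health indicator; not a product class."""
    name: str
    status: str           # "UP" or "DOWN"
    enabled: bool = True


def overall_health(results):
    """Return (overall_status, http_status_code) for a list of results.

    Disabled indicators are skipped. The gauge is UP only if every
    enabled indicator is UP; a single DOWN makes the whole runtime DOWN.
    Assumption: with no enabled indicators, the result is UP.
    """
    enabled = [r for r in results if r.enabled]
    all_up = all(r.status == "UP" for r in enabled)
    return ("UP", 200) if all_up else ("DOWN", 503)


if __name__ == "__main__":
    sample = [
        IndicatorResult("Diskspace", "UP"),
        IndicatorResult("Memory", "UP"),
        IndicatorResult("JMS", "DOWN", enabled=False),  # disabled, so ignored
    ]
    print(overall_health(sample))
```

A monitoring tool usually needs only the HTTP status code for an up or down decision; the JSON payload is useful for finding out which indicator caused a DOWN result.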

Predefined Health Indicators

Microservices Runtime includes predefined health indicators for some of the basic components of a Microservices Runtime. Some, but not all, of the health indicators have a configurable property that you can use to specify the threshold at which a health indicator returns an UP or DOWN status.

The following table describes the predefined health indicators included with Microservices Runtime.

Indicator Name Description
Cluster Checks for the number of available servers in a cluster of Microservices Runtime servers. Returns a status of UP if the number of available servers is greater than or equal to the defined minimum in the Number of cluster hosts property for the Cluster health indicator. Otherwise, returns a status of DOWN. This health indicator is returned only for a Microservices Runtime that is a member of a configured cluster. That is, the Microservices Runtime must be a member of a stateful cluster.
Diskspace Checks for low disk space. Returns a status of UP if the percentage of free disk space is greater than what is specified in the Free disk space threshold property for the Diskspace health indicator. Otherwise, returns a status of DOWN.
HybridConnections

Checks all the listed tenant connection aliases and associated accounts for webMethods Cloud. Returns a status of UP if all of the listed tenant connections and associated accounts are UP. Returns a status of DOWN if any of the listed tenant or account connections are down.

The health indicator includes individual statuses for tenant connections and associated accounts, along with the overall hybrid connection status.

If an enabled tenant does not have an account alias or the account alias is not enabled, the health indicator does not list the tenant alias.

If a tenant is disabled and the associated account is enabled, the health indicator shows the account status as UP. This behavior is as designed.

Note: The hybrid connectivity alerts were introduced as part of PIE-81173 in IS_10.5_Core_Fix23.
JDBC Checks for available JDBC connections across all JDBC functions, such as ISInternal and ISCoreAudit. Returns a status of UP if, for each JDBC connection pool, Microservices Runtime can obtain a valid JDBC connection before a 200-millisecond timeout elapses. Otherwise, returns a status of DOWN. The timeout value is not configurable. The JDBC health indicator skips any JDBC functions that do not have an associated pool.
JMS Checks that JMS connection aliases are available. Returns a status of UP if all enabled JMS connection aliases are active, meaning that Microservices Runtime can ping the JMS provider or create a connection successfully. Otherwise, returns a status of DOWN.
JNDIAliases Checks that the connections for JNDI aliases are up by attempting to make a JNDI connection.
Memory Checks for low available memory. Returns a status of UP if the percentage of free memory is greater than what is specified for the Free memory threshold property for the Memory health indicator. Otherwise, returns a status of DOWN.
RemoteServers Checks the status of remote servers. Returns a status of UP if Microservices Runtime can successfully invoke the internal service wm.server:ping on each server for which there exists a remote server alias. Otherwise, returns a status of DOWN.
SFTPServers Checks the connection to remote SFTP servers for which an SFTP server alias is configured. Returns a status of UP if a connection can be obtained for all SFTP server aliases that have at least one SFTP user alias.
ServiceThread Checks for low available server threads. Returns a status of UP if the percentage of available server threads is greater than what is specified in the Available threads threshold property for the ServiceThread health indicator. Otherwise, returns a status of DOWN.
Sessions Checks for low available licensed sessions. Returns a status of UP if the percentage of used licensed sessions is less than the value specified for the Used licenses threshold property for the Sessions health indicator. Otherwise, returns a status of DOWN.
UMAliases Checks that the Universal Messaging connection aliases for webMethods messaging are available. Returns a status of UP if all of the enabled Universal Messaging connection aliases are available. Otherwise, returns a status of DOWN.
Note: Products installed on top of Microservices Runtime might provide their own health indicators.

Enabling and Disabling Health Indicators

About this task

Whether a health indicator is enabled determines if the health gauge includes the indicator when determining the UP or DOWN status of the Microservices Runtime. If you do not want the health gauge to include a particular indicator when determining the overall UP or DOWN status, disable the indicator. A disabled indicator does not execute when the health endpoint is invoked.

To enable or disable a health indicator

Procedure

  1. In the Microservices menu of the Navigation panel, click Health Gauge.
  2. In the Health Indicator list, do one of the following:
    • To enable a disabled health indicator, click No in the Enabled column.
    • To disable an enabled health indicator, click Yes in the Enabled column.

Health Indicator Properties

Some of the health indicators have a configurable property that determines when a health indicator returns a status of UP or DOWN. For example, the ServiceThread health indicator has the Available threads threshold property, which specifies the percentage of the server threads that must be available for the indicator to return a status of UP. You can edit the threshold to tailor the indicator to your environment.

The following table identifies the configurable properties for the predefined health indicators.

Health Indicator Property Name Value
Cluster Number of cluster hosts Specify the minimum number of cluster members that must be available for the Cluster health indicator to return a status of UP. When the number of servers in the cluster is less than the specified minimum number, the Cluster health indicator returns a status of DOWN. The default is 2.
Diskspace Free disk space threshold (as percentage of maximum available disk space) Specify the percentage of free disk space out of the maximum available disk space above which the Diskspace health indicator returns a status of UP. When free disk space on the host or container on which Microservices Runtime resides is less than or equal to the specified percentage, the Diskspace health indicator returns a status of DOWN. The default is 10 percent.
Memory Free memory threshold (as percentage of maximum memory) Specify the percentage of free memory above which the Memory health indicator returns a status of UP. When free JVM memory for Microservices Runtime is less than or equal to the specified percentage, the Memory health indicator returns a status of DOWN. The default is 10 percent.
ServiceThread Available threads threshold (as percentage of maximum server threads) Specify the percentage of available server threads in the server thread pool above which the ServiceThread health indicator returns a status of UP. When the percentage of available threads is less than or equal to the specified percentage, the ServiceThread health indicator returns a status of DOWN. The default is 10 percent.
Sessions Used licenses threshold (as percentage of total licensed sessions) Specify the percentage of used licensed sessions at which the Sessions health indicator returns a status of DOWN. When the percentage of used licensed sessions is greater than or equal to the specified percentage, the Sessions health indicator returns a status of DOWN. The default is 85 percent.
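
As a rough illustration of how the Cluster, Diskspace, Memory, and ServiceThread thresholds in the table above translate into UP or DOWN decisions, consider the following Python sketch. The function and constant names are hypothetical and only restate the rules from the table; they are not part of Microservices Runtime.

```python
# Default values taken from the table above.
DEFAULT_MIN_CLUSTER_HOSTS = 2
DEFAULT_FREE_DISK_PCT = 10
DEFAULT_FREE_MEMORY_PCT = 10
DEFAULT_AVAILABLE_THREADS_PCT = 10


def percent_status(observed_pct, threshold_pct):
    """Rule shared by Diskspace, Memory, and ServiceThread.

    UP only when the observed percentage is strictly above the
    threshold; a value less than or equal to the threshold is DOWN.
    """
    return "UP" if observed_pct > threshold_pct else "DOWN"


def cluster_status(available_hosts, min_hosts=DEFAULT_MIN_CLUSTER_HOSTS):
    """Cluster rule: UP when the available hosts meet the minimum."""
    return "UP" if available_hosts >= min_hosts else "DOWN"


if __name__ == "__main__":
    print(percent_status(10, DEFAULT_FREE_DISK_PCT))    # exactly at threshold
    print(percent_status(25, DEFAULT_FREE_MEMORY_PCT))  # comfortably above
    print(cluster_status(1))                            # below the minimum
```

Note the boundary case: a value exactly equal to a percentage threshold yields DOWN, whereas a cluster with exactly the minimum number of hosts yields UP.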

Configuring Health Indicator Properties

About this task
You can edit the properties of a health indicator to tailor the indicator for your environment. A health indicator with one or more configurable properties appears as a hypertext link in the Health Indicator list on the Microservices > Health Gauge page.

To configure health indicator properties

Procedure
  1. In the Microservices menu of the Navigation panel, click Health Gauge.
  2. In the Health Indicator list, click the name of the health indicator you want to configure.
  3. On the Microservices > Health Gauge > IndicatorName Properties page, click Edit next to the property name.
  4. On the Microservices > Health Gauge > IndicatorName Properties > Edit page, in the Value field, set a new threshold value for the property.
  5. Click Save Changes.

Invoking the Health Gauge

About this task

You can invoke the health gauge via the health endpoint on the Microservices Runtime. When Microservices Runtime runs in a Docker container, you can use the health endpoint to monitor the state of the container from tools such as Kubernetes.

The request URL for the health endpoint is:

http://<hostname>:<port>/health

Where <hostname> is the IP address or name of the machine and <port> is the port number where Microservices Runtime is running.

Invocation of the health endpoint is restricted to users with Administrator access.
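
The following Python sketch shows one way a client might call the health endpoint and interpret the result. It assumes that the server accepts HTTP Basic authentication for the Administrator user; the host, port, and credentials are placeholders, and build_request and check_health are hypothetical helper names rather than product APIs.

```python
import base64
import json
import urllib.error
import urllib.request


def build_request(host, port, user, password, path="/health"):
    """Build a GET request that carries an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    req = urllib.request.Request(f"http://{host}:{port}{path}")
    req.add_header("Authorization", f"Basic {token}")
    return req


def check_health(host, port, user, password, timeout=5.0):
    """Return (http_status, parsed_json_or_None) from the health endpoint."""
    req = build_request(host, port, user, password)
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            status, body = resp.status, resp.read()
    except urllib.error.HTTPError as err:
        # urllib raises for non-2xx codes such as 503; the body is still readable.
        status, body = err.code, err.read()
    try:
        payload = json.loads(body)
    except ValueError:
        payload = None  # body was empty or not valid JSON
    return status, payload


if __name__ == "__main__":
    # Placeholder values; replace them with those of your own environment.
    try:
        status, payload = check_health("localhost", 5555, "Administrator", "changeme")
        print("UP" if status == 200 else "DOWN", payload)
    except OSError as exc:
        print("Health endpoint not reachable:", exc)
```

Because a DOWN result arrives as an HTTP 503 response, the sketch treats the error response as a normal outcome and still reads its JSON payload.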

Note: The health endpoint is a predefined URL alias named “health” for the internal service that executes all enabled health indicators. Software AG does not recommend editing the predefined “health” URL alias. If you migrate to Microservices Runtime version 10.3 or higher from an earlier version and you already have a URL alias named “health”, Microservices Runtime does not create a health URL alias that points to the internal service. In that case, invocations of the health endpoint do not execute the health indicators. To use the health gauge and the associated health indicators, rename your existing health URL alias. Upon restart, Microservices Runtime creates a new health URL alias that corresponds to the health endpoint.

Obtaining Metrics for a Microservices Runtime

Microservices Runtime can generate metrics about the server and the services on the server. A Prometheus server can use these metrics to provide insight into the operation of the Microservices Runtime and the services it contains. Microservices Runtime generates metrics in Prometheus format. Prometheus is an open-source monitoring and alerting toolkit that is frequently used for monitoring microservices.

Microservices Runtime exposes the metrics generating feature via the metrics endpoint. When the metrics endpoint is invoked, Microservices Runtime gathers server and service-level metrics and returns the data in a Prometheus format.

For a detailed list of the metrics returned by Microservices Runtime, see Prometheus Metrics Returned by Microservices Runtime.
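
To give a feel for the Prometheus text format, the following Python sketch parses a small sample into metric names, labels, and values. The metric names in SAMPLE are invented for illustration and are not the names returned by Microservices Runtime. The parser is deliberately simplified: it assumes label values contain no commas or escaped quotes and ignores optional timestamps.

```python
def parse_prometheus_text(text):
    """Parse simple Prometheus text-format samples.

    Returns a list of (metric_name, labels_dict, value) tuples.
    Comment lines (# HELP, # TYPE) and blank lines are skipped.
    """
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        labels = {}
        if "{" in line:
            # Form: name{key="value",...} value
            name, rest = line.split("{", 1)
            label_part, value_part = rest.rsplit("}", 1)
            for pair in filter(None, label_part.split(",")):
                key, raw = pair.split("=", 1)
                labels[key.strip()] = raw.strip().strip('"')
        else:
            # Form: name value
            name, value_part = line.split(None, 1)
        value = float(value_part.split()[0])
        samples.append((name.strip(), labels, value))
    return samples


# Hypothetical sample; real metric names come from the metrics endpoint.
SAMPLE = """\
# HELP example_threads_in_use Hypothetical gauge for illustration.
# TYPE example_threads_in_use gauge
example_threads_in_use 12
example_service_calls_total{service="demo:hello"} 42
"""

if __name__ == "__main__":
    for name, labels, value in parse_prometheus_text(SAMPLE):
        print(name, labels, value)
```

In practice a Prometheus server scrapes and parses the endpoint itself; a hand-written parser such as this one is mainly useful for quick checks or tests.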

Note: The Microservices Runtime documentation assumes a familiarity with Prometheus technology. An in-depth discussion of Prometheus is beyond the scope of this guide but is available elsewhere.

Invoking the Metrics Endpoint

About this task

To instruct Microservices Runtime to gather metrics, invoke the metrics endpoint on the Microservices Runtime. The request URL for the metrics endpoint is:

http://<hostname>:<port>/metrics

Where <hostname> is the IP address or name of the machine and <port> is the port number where Microservices Runtime is running.

Invocation of the metrics endpoint is restricted to users with Administrator access.

Note: The metrics endpoint is a predefined URL alias named “metrics” for the internal service that gathers statistics. Software AG does not recommend editing the predefined “metrics” URL alias. If you migrate to Microservices Runtime version 10.3 or higher from an earlier version and you already have a URL alias named “metrics”, Microservices Runtime does not create a metrics URL alias that points to the internal service. In that case, invocations of the metrics endpoint do not gather or return metrics. To use the metrics gathering functionality, rename your existing metrics URL alias. Upon restart, Microservices Runtime creates a new metrics URL alias that corresponds to the metrics endpoint.