IBM Support

Why is it so important to run the Health Monitor Agent in Sterling Order Management (OMS)?

Technical Blog Post


Abstract

Why is it so important to run the Health Monitor Agent in Sterling Order Management (OMS)?

Body

 

IBM Sterling Order Management (OMS) records an entry in the YFS_HEARTBEAT table for each application server and agent/integration server that starts up. These entries enable OMS to manage servers and to broadcast cached data updates to them.

 

When an OMS server is shut down normally, the corresponding YFS_HEARTBEAT record is marked inactive. When an OMS server ends abnormally (or whenever an application server ends), the corresponding record can remain in the YFS_HEARTBEAT table as active even though it no longer points to a valid running server. These records that point to servers that are no longer running are known as "stale entries." A large number of stale entries can slow down the management of the servers. For example, the cache refresh broadcast has to try to notify every server that a stale entry points to.

 

Periodically, each JVM updates its status in its YFS_HEARTBEAT record. By default, that refresh interval is set to yantra.statistics.persist.interval / 2.
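
As a rough illustration of the default described above, the following Python sketch derives the heartbeat update interval from a persist interval. The function name and the sample value of 600 are hypothetical, and the unit is whatever your instance uses for the property; it is not stated here.

```python
def default_heartbeat_update_interval(persist_interval: float) -> float:
    """Return the default interval at which a JVM updates its YFS_HEARTBEAT record.

    Mirrors the rule stated above: yantra.statistics.persist.interval / 2.
    The result is in the same (unspecified) unit as the input.
    """
    if persist_interval <= 0:
        raise ValueError("persist_interval must be positive")
    return persist_interval / 2


# Hypothetical sample value, chosen only for illustration.
print(default_heartbeat_update_interval(600))  # 300.0
```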

 

To eliminate stale entries from the JNDI tree, you should run the Health Monitor Agent periodically. To run the Health Monitor Agent, run the startHealthMonitor.sh script (or startHealthMonitor.cmd on Windows) located in your <INSTALL_DIR>/bin directory.
 

 

FAQ
Question: Why do we see different results when querying the YFS_HEARTBEAT table for active servers while the Health Monitor Agent is running?
SQL Query:
select count(*) from yfs_heartbeat where SERVER_TYPE='APPSERVER' and status='00';
 
Answer:
Check the yfs.heartbeat.refresh.interval and yfs.yantra.statistics.persist.interval properties in your instance. The Health Monitor Agent decides whether a JVM is stale based on yfs.heartbeat.refresh.interval, so that value should be greater than the value of yfs.yantra.statistics.persist.interval. If yfs.heartbeat.refresh.interval is smaller, the Health Monitor Agent might treat active JVMs as stale and mark them inactive, which would explain varying results from the query above.
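
The effect of a too-small threshold can be sketched with a simplified Python model. This is not the Health Monitor Agent's actual implementation: the HeartbeatRecord class, its field names, the helper functions, and all numeric values are hypothetical and serve only to illustrate the reasoning in the answer.

```python
from dataclasses import dataclass


@dataclass
class HeartbeatRecord:
    server_name: str      # hypothetical field name
    last_update: float    # time of the JVM's last heartbeat update
    active: bool = True


def find_stale(records, now, stale_threshold):
    """Return names of active records whose last update is older than the threshold."""
    return [
        r.server_name
        for r in records
        if r.active and now - r.last_update > stale_threshold
    ]


def intervals_follow_guidance(refresh_interval, persist_interval):
    """Return True when the refresh interval is greater than the persist interval."""
    return refresh_interval > persist_interval


# A JVM that is still running and last updated its record 200 time units ago.
records = [HeartbeatRecord("agent_1", last_update=800.0)]
now = 1000.0

# With a generous threshold the running JVM is left alone.
print(find_stale(records, now, stale_threshold=900))   # []
# With a threshold shorter than the update gap it is wrongly flagged as stale.
print(find_stale(records, now, stale_threshold=100))   # ['agent_1']

print(intervals_follow_guidance(refresh_interval=900, persist_interval=600))  # True
print(intervals_follow_guidance(refresh_interval=100, persist_interval=600))  # False
```

In this model, the second call flags a JVM that is still running, which is the situation the answer warns about when the staleness threshold is shorter than the time between heartbeat updates.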

 

 


UID

ibm11124637