High availability services and service agents
High availability services are cluster management services that are controlled by the high availability (HA) manager. The HA manager uses HA service agents to monitor the services directly. Service agents can start, stop, and monitor one or more service daemons, and report their status to the HA manager.
Services run only on the active management node. If a service stops or quits unexpectedly, it is restarted on the same node. When a failover occurs, the failover node takes over as the management node and takes over all the running services.
To allow services to switch nodes, service access points are defined. A service access point defines a virtual IP address that is used to reach IBM® Spectrum Cluster Foundation Community Edition services on the active management node. Compute nodes use the virtual IP address of the provision network to access the active management node. The virtual IP address of the public network is used to access the active management node and the Web Portal.
In a failover, the new active management node also takes over the virtual IP address. If multiple provision networks are connected to a management node, an IP alias must be defined for each provision network. If the management node connects to a public network, define an IP alias for the public network as well.
The virtual IP addresses for service access points are defined in the high availability definition file.
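Because the virtual IP address follows the active management node, one rough way to tell whether a given node currently holds it is to look for the address in that node's interface list. The following is a minimal sketch and not part of the product: has_virtual_ip is a hypothetical helper, 192.0.2.10 is a placeholder address, and the sketch assumes a Linux management node that provides the iproute2 ip command.

```shell
#!/bin/sh
# has_virtual_ip (hypothetical helper, not a product command):
# reads the output of `ip -o -4 addr show` on standard input and
# succeeds if the given IPv4 address is assigned to any interface.
has_virtual_ip() {
    # In `ip -o -4 addr show` output, field 4 is ADDRESS/PREFIX.
    awk -v vip="$1" '
        { split($4, a, "/"); if (a[1] == vip) found = 1 }
        END { exit !found }
    '
}

# Example usage (192.0.2.10 is a placeholder virtual IP address):
#   if ip -o -4 addr show | has_virtual_ip 192.0.2.10; then
#       echo "This node holds the virtual IP address"
#   fi
```

The helper reads from standard input rather than calling ip itself, so it can also be applied to output captured from another node.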
The following service agents are controlled by the high availability system:
- High availability controller (PCMHA)
  The high availability controller service agent (PCMHA) monitors the virtual IP addresses and coordinates the HA manager to monitor and manage actions.
- Database (PCMDB)
  The database service agent (PCMDB) monitors the PostgreSQL database, which stores cluster configuration information and monitoring data.
- Cluster management service (XCAT)
  The cluster management service agent (XCAT) provides the basic cluster management functions, such as node provisioning and hardware remote control.
- Web Portal (WEBGUI)
  The Web Portal service agent (WEBGUI) provides the web-based IBM Spectrum Cluster Foundation Community Edition console.
- Loader controller (PLC)
  The loader controller service agent (PLC) controls the data loaders that collect data from the system and write the data into the database.
- Data purger (PURGER)
  The data purger service agent (PURGER) keeps the size of the database in check by purging old records.
- Data transformer controller (PTC)
  The data transformer controller service agent (PTC) controls the data transformers that convert data in the relational database into a format usable by the reporting feature.
To manage high availability services, use the pcm-ha-support command and specify the start or stop parameter. For example, to stop and start the PURGER service, run the pcm-ha-support stop --service PURGER and pcm-ha-support start --service PURGER commands.
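The stop and start sequence can be wrapped in a small shell function. This is only a sketch: restart_ha_service and the RUN variable are hypothetical names introduced here, and the only product commands used are the pcm-ha-support stop and start invocations quoted above. Setting RUN=echo prints the commands instead of running them.

```shell
#!/bin/sh
# restart_ha_service (hypothetical helper): stops and then starts one
# HA service by name, for example PURGER.
# RUN is an assumed dry-run switch: leave it unset to run the commands,
# or set RUN=echo to print them instead.
restart_ha_service() {
    service_name="$1"
    if [ -z "$service_name" ]; then
        echo "usage: restart_ha_service SERVICE_NAME" >&2
        return 2
    fi
    # Start the service only if the stop command succeeded.
    ${RUN:-} pcm-ha-support stop --service "$service_name" &&
        ${RUN:-} pcm-ha-support start --service "$service_name"
}

# Example usage:
#   RUN=echo; restart_ha_service PURGER   # print the two commands
#   RUN=;     restart_ha_service PURGER   # run them on the management node
```

Chaining the two commands with && means a failed stop does not trigger a start, which keeps the helper from masking an error reported by the first command.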
To view a detailed status of the HA manager and corresponding service agents, run the service pcm status command.
To view the current HA settings, run the pcmhatool info command. HA settings include the virtual IP address, the management node name, and a list of shared directories.
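The two inspection commands can be run together to get a quick overview. The following is a minimal sketch: show_ha_overview and the RUN variable are hypothetical names, the section headings are illustrative, and the output of the underlying commands is not shown because it depends on the cluster.

```shell
#!/bin/sh
# show_ha_overview (hypothetical helper): prints the HA manager and
# service agent status, followed by the current HA settings.
# RUN is an assumed dry-run switch: set RUN=echo to print the commands
# instead of running them.
show_ha_overview() {
    echo "== HA manager and service agent status =="
    ${RUN:-} service pcm status
    echo "== Current HA settings =="
    ${RUN:-} pcmhatool info
}

# Example usage on the management node:
#   show_ha_overview
```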