Configuring advanced node failure detection on hardware management console (HMC) with CIM server

A Hardware Management Console (HMC) can be used with advanced node failure detection to prevent cluster partitions when a cluster node has actually failed.

Prior to configuration, verify that the *CIMOM TCP and the *SSHD TCP servers are running.
  1. Ensure the *CIMOM TCP server is running on your IBM® i. Look for the QUMECIMOM job within the QSYSWRK subsystem to determine whether it is running. If the server is not running, start it using the command STRTCPSVR *CIMOM.
  2. Ensure the *SSHD TCP server is running on your IBM i. At the command line display, enter STRTCPSVR *SSHD. To start the *SSHD server, the QSHRMEMCTL system value must be set to 1.
Note: You must have access to the HMC either through the physical monitor and keyboard or remotely through a configured SSH client. You cannot access the HMC with Telnet or a web interface. Detailed information about connections to the HMC is found in the section, Setting up secure script execution between SSH clients and the HMC.
The *CIMOM TCP server must be configured and started on each cluster node that has a cluster monitor configured on it. The default configuration of the *CIMOM server that is provided by the installation of the 5770-UME LP must be changed so that the IBM i system can communicate with the CIM server.
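
The two server starts above can also be issued from a PASE shell rather than from the command line display. The following is a minimal sketch, assuming the PASE `system` utility is available to run CL commands. The `run_cl` helper and the `DRY_RUN` variable are hypothetical names introduced only for this illustration. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Hypothetical helper: run a CL command through the PASE "system" utility.
# With DRY_RUN=1 (the default here) the command is only printed, not run.
run_cl() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf 'system "%s"\n' "$1"
    else
        system "$1"
    fi
}

# Start the two TCP servers named in the prerequisites.
run_cl "STRTCPSVR *CIMOM"
run_cl "STRTCPSVR *SSHD"
```

Set DRY_RUN=0 only on an IBM i system, and only after confirming that the QSHRMEMCTL system value is set to 1.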

Use these steps to set up the HMC to monitor for node failures. The cluster nodes to be monitored must register with the HMC through a CIM server on one of the nodes. Registration between the CIM server and the cluster requires a digital certificate file. The HMC must be used to copy this file to the cluster node with the following steps:

  1. Access the HMC or connect to it with a Secure Shell (SSH) session.
  2. At the HMC or in the SSH session, locate the security certificate file to share with your IBM i cluster node, /etc/Pegasus/server.pem. Prepare to copy the file to your system with the SCP command, which has a form similar to scp /etc/Pegasus/server.pem QSECOFR@LP0236A:/server_name.pem. Before initiating this secure copy, change the location name from LP0236A to the name of your IBM i system, and change the file name server_name.pem to the name of your HMC, for example, yourHMC.pem.
  3. At the HMC or in the SSH session, copy the file to your IBM i cluster node using the secure copy (SCP) command with your modified names: scp /etc/Pegasus/server.pem QSECOFR@YOUR_IBM_i_system:/yourHMC.pem.
    You must have a home directory associated with your profile on the IBM i. For example: if using the QSECOFR profile as the profile running the SCP command, you need a /home/QSECOFR directory in the integrated file system on the IBM i. Verify that you have the directory created in the correct profile.
  4. Sign off the HMC or close your SSH session.
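
The name substitution in the copy steps above can be sketched as a small script that assembles the SCP command from its three variable parts. This is an illustration only: `build_scp_command` and the three variables are hypothetical names, and the values are the placeholders used in the steps. The script prints the command rather than running it, so the result can be typed at the HMC.

```shell
#!/bin/sh
# Placeholder values from the steps above; replace with your own.
IBMI_USER="QSECOFR"            # profile that needs a /home/<profile> directory on IBM i
IBMI_HOST="YOUR_IBM_i_system"  # name of the IBM i cluster node
HMC_NAME="yourHMC"             # used to name the copied certificate file

# Print the scp command that copies the HMC certificate to the IBM i system.
# Arguments: user profile, IBM i system name, HMC name.
build_scp_command() {
    printf 'scp /etc/Pegasus/server.pem %s@%s:/%s.pem\n' "$1" "$2" "$3"
}

build_scp_command "$IBMI_USER" "$IBMI_HOST" "$HMC_NAME"
```

With the placeholder values, the script prints the same command shown in step 3.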

With the digital certificate on your IBM i cluster node, follow this procedure to enter the file into the truststore:

  1. Sign on your IBM i system and open the command line display.
  2. In the command line display, enter call qp2term to enter the PASE shell environment.
  3. Move the HMC digital certificate into the truststore directory with the MOVE command: mv /yourHMC.pem /QOpenSys/QIBM/UserData/UME/Pegasus/ssl/truststore/yourHMC.pem.
  4. Add this digital certificate to your IBM i system truststore with the CIMTRUST command: /QOpenSys/QIBM/ProdData/UME/Pegasus/bin/cimtrust -a -U QSECOFR -f /QOpenSys/QIBM/UserData/UME/Pegasus/ssl/truststore/yourHMC.pem -T s.
  5. Press F3 to exit the PASE environment.
  6. On the command line display enter ENDTCPSVR *CIMOM to end the CIM server.
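
The PASE portion of the procedure above can be sketched as a short script. It assumes that the certificate was copied to the root directory as /yourHMC.pem and that it is first moved into the truststore directory and then registered with cimtrust. The `run` and `add_hmc_certificate` helpers and the `DRY_RUN` variable are hypothetical names introduced for this illustration. By default the script only prints the two commands.

```shell
#!/bin/sh
# Paths taken from the steps above.
TRUSTSTORE="/QOpenSys/QIBM/UserData/UME/Pegasus/ssl/truststore"
CIMTRUST="/QOpenSys/QIBM/ProdData/UME/Pegasus/bin/cimtrust"

# Hypothetical helper: print the command when DRY_RUN=1 (default), else run it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Move the certificate (assumed to be in /) into the truststore directory,
# then add it to the truststore. Argument: certificate file name.
add_hmc_certificate() {
    run mv "/$1" "$TRUSTSTORE/$1" &&
    run "$CIMTRUST" -a -U QSECOFR -f "$TRUSTSTORE/$1" -T s
}

add_hmc_certificate "yourHMC.pem"
```

Set DRY_RUN=0 only inside the PASE shell on the IBM i system. The CIM server still has to be ended and restarted afterward, as the remaining steps describe.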

To configure the *CIMOM server to communicate with the IBM i, change the enableAuthentication and sslClientVerificationMode security settings following these steps:

  1. To restart the CIM server and pick up the new certificate, enter STRTCPSVR *CIMOM on the command line.
  2. In the command line display, enter call qp2term to start the PASE shell and run the CIMCONFIG command.
  3. Enter /QOpenSys/QIBM/ProdData/UME/Pegasus/bin/cimconfig -s enableAuthentication=false -p.
  4. Enter /QOpenSys/QIBM/ProdData/UME/Pegasus/bin/cimconfig -s sslClientVerificationMode=optional -p.
    These two changes alter the security configuration attributes, permitting the IBM i to communicate with the CIM server. See Authentication on CIMOM for more information about the sslClientVerificationMode attribute.
  5. Exit the PASE shell with F3 and end the *CIMOM server using ENDTCPSVR *CIMOM.
  6. Restart the *CIMOM server from the command line with STRTCPSVR *CIMOM.
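
The two cimconfig calls in the steps above can be sketched as one small script. The `run` and `set_planned_property` helpers and the `DRY_RUN` variable are hypothetical names introduced for this illustration. By default the script only prints the commands.

```shell
#!/bin/sh
# Path taken from the steps above.
CIMCONFIG="/QOpenSys/QIBM/ProdData/UME/Pegasus/bin/cimconfig"

# Hypothetical helper: print the command when DRY_RUN=1 (default), else run it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Set one configuration property with the -s and -p options used in the steps.
# Arguments: property name, property value.
set_planned_property() {
    run "$CIMCONFIG" -s "$1=$2" -p
}

set_planned_property enableAuthentication false
set_planned_property sslClientVerificationMode optional
```

In OpenPegasus, the -p option of cimconfig applies a setting to the planned configuration, which takes effect at the next server start. That is consistent with the end and restart of the *CIMOM server in steps 5 and 6.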

Installing a new version of software on the HMC partition may generate a new certificate. Communication between the HMC partition and the cluster node then fails, producing error CPFBBCB with error code 4. If this occurs, add the new digital certificate to the truststore on the nodes that have that HMC or VIOS partition configured in a cluster monitor.

You are ready to perform the cluster configuration sequence using either the ADDCLUMON command or the IBM Navigator for i. Follow the instructions in the topic, Add a cluster monitor to a node. For additional information about ADDCLUMON, see the Add Cluster Monitor (ADDCLUMON) command in the Knowledge Center.