Host unreachable when the compute node has multiple IPv4 addresses

Problem

After you add a host, the Host details page of the management node web UI shows the host Health status as Critical - Host is unreachable. In /var/log/nova/ibm-health.log, the IP address that the PING command targets is not the IP address that you entered on the add host page (or, if you entered a hostname, not the IP address that the hostname resolves to).

2020-*-* 03:12:38.356 3536774 INFO powervc_oslo.config.data_utils [req-62a76b8c-9ead-4cf8-8076-77d52084830d - - - - -] PING 172.26.2.26 (172.26.2.26) 56(84) bytes of data.
2020-*-* 03:12:38.356 3536774 INFO powervc_oslo.config.data_utils [req-62a76b8c-9ead-4cf8-8076-77d52084830d - - - - -] From 172.26.2.18 icmp_seq=1 Destination Host Unreachable
2020-*-* 03:12:38.356 3536774 INFO powervc_oslo.config.data_utils [req-62a76b8c-9ead-4cf8-8076-77d52084830d - - - - -] From 172.26.2.18 icmp_seq=2 Destination Host Unreachable
2020-*-* 03:12:38.357 3536774 INFO powervc_oslo.config.data_utils [req-62a76b8c-9ead-4cf8-8076-77d52084830d - - - - -]
2020-*-* 03:12:38.357 3536774 INFO powervc_oslo.config.data_utils [req-62a76b8c-9ead-4cf8-8076-77d52084830d - - - - -] --- 172.26.2.26 ping statistics ---
2020-*-* 03:12:38.357 3536774 INFO powervc_oslo.config.data_utils [req-62a76b8c-9ead-4cf8-8076-77d52084830d - - - - -] 2 packets transmitted, 0 received, +2 errors, 100% packet loss, time 63ms
2020-*-* 03:12:38.357 3536774 INFO powervc_oslo.config.data_utils [req-62a76b8c-9ead-4cf8-8076-77d52084830d - - - - -] pipe 2
2020-*-* 03:12:38.362 3536774 INFO powervc_health.abstract_health [req-62a76b8c-9ead-4cf8-8076-77d52084830d - - - - -] *** new hs ('id=34,', 'status=CRITICAL,', 'value=20,', 'reason=Host is unreachable.,unknown_reason_details=')
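
To compare the pinged address with the one entered on the add host page, the address can be pulled out of the health log with a few lines of Python. This is a minimal sketch: the function name pinged_address and the regular expression are illustrative assumptions based only on the log format shown above.

```python
import re
from typing import Optional

# Matches the header line that ping prints, for example:
#   PING 172.26.2.26 (172.26.2.26) 56(84) bytes of data.
PING_HEADER = re.compile(r"PING \S+ \((\d{1,3}(?:\.\d{1,3}){3})\)")


def pinged_address(log_text: str) -> Optional[str]:
    """Return the IPv4 address from the most recent PING header, or None."""
    matches = PING_HEADER.findall(log_text)
    return matches[-1] if matches else None


# Example use on the management node (path taken from the text above):
#   with open("/var/log/nova/ibm-health.log") as fh:
#       print(pinged_address(fh.read()))
```

If the printed address differs from the address entered on the add host page, the host is affected by the problem described here.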

The Host IP that is listed for the hypervisor is the same address that follows PING in the log above, and it is also not the IP address that you entered when you added the host.

[root@kvmcore18 ~]# openstack hypervisor list
+----+---------------------+-----------------+-------------+-------+
| ID | Hypervisor Hostname | Hypervisor Type | Host IP     | State |
+----+---------------------+-----------------+-------------+-------+
| 35 | kvmcore25           | QEMU            | 172.26.2.26 | up    |
+----+---------------------+-----------------+-------------+-------+

Explanation

If the compute node has multiple IPv4 addresses, the IP address that connects to the default external gateway is written to the nova database, and the host uses this address to connect to the management network. If this address is not the one that you entered on the add host page, the management node fails to connect to the host.
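
The selection described above can be illustrated with a short Python sketch that asks the operating system which source address it would use for traffic toward an outside address, that is, the address on the default route. This illustrates the idea and is not presented as nova's own code. The function names and the probe address 192.0.2.1 (from a documentation-only range) are assumptions; connecting a UDP socket sends no packets.

```python
import ipaddress
import socket
from typing import Optional


def default_route_ipv4(probe_addr: str = "192.0.2.1", probe_port: int = 80) -> Optional[str]:
    """Return the local IPv4 address the OS would use to reach probe_addr.

    Returns None when there is no usable route (for example, no default gateway).
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        try:
            # For UDP, connect() only selects a route and source address.
            sock.connect((probe_addr, probe_port))
        except OSError:
            return None
        return sock.getsockname()[0]


def same_address(entered: str, selected: Optional[str]) -> bool:
    """Compare the address entered on the add host page with the selected one."""
    if selected is None:
        return False
    return ipaddress.ip_address(entered) == ipaddress.ip_address(selected)
```

Running default_route_ipv4() on the compute node and comparing the result with the entered address through same_address() indicates whether the two differ.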

Resolution

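  1. On the compute node, edit /etc/nova/nova.conf and, in the appropriate section (in upstream OpenStack Nova, my_ip is a [DEFAULT] option), add a new configuration line: my_ip = ... (the IP address that you entered on the add host page; if you entered a hostname when you added the host, use the IP address that it resolves to). Save the file.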

  2. Run /opt/ibm/icic/bin/icic-services restart on the compute node and wait for it to complete.

  3. In the management node web UI, click the Hosts tab and open the host's details page. Wait about 2 to 5 minutes, then check the Health field; its value should be OK. You can refresh it manually by clicking the Refresh button.
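
The value written in step 1 can be sanity-checked with a small Python sketch before the services are restarted. It resolves a hostname to an IPv4 address and reads my_ip back from the configuration text. The function names are hypothetical, and the default section name DEFAULT is an assumption taken from upstream OpenStack Nova.

```python
import configparser
import socket
from typing import Optional


def resolve_ipv4(host_or_ip: str) -> str:
    """Return the first IPv4 address for a hostname; an IP literal is returned as is."""
    infos = socket.getaddrinfo(host_or_ip, None, family=socket.AF_INET,
                               type=socket.SOCK_STREAM)
    return infos[0][4][0]


def configured_my_ip(conf_text: str, section: str = "DEFAULT") -> Optional[str]:
    """Return the my_ip value found in the given section, or None if it is not set."""
    # strict=False tolerates repeated options; interpolation=None keeps '%' literal.
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(conf_text)
    return parser.get(section, "my_ip", fallback=None)


# Example use on the compute node (path taken from step 1):
#   with open("/etc/nova/nova.conf") as fh:
#       print(configured_my_ip(fh.read()))
```

If configured_my_ip() returns the same address that resolve_ipv4() gives for the value entered on the add host page, the setting from step 1 is in place.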