Health status is not 'OK'
After certain actions, the health of a resource might change to 'Warning', 'Attention', or 'Critical'. Further action is required to change the health status back to 'OK'.
Problem
The health status of a resource is a value other than 'OK', but a potential resolution is not apparent. Either the complete problem is not known or no recourse is provided.
Explanation
Actions such as changing a resource password or regenerating a certificate can adversely affect the relationship between PowerVC and a resource. If problems arise, they are reflected in the health status of the resource in PowerVC. Whenever the health status is 'Warning', 'Attention', or 'Critical', you might need to take steps to repair the relationship and change the status back to 'OK'.
Investigation: Health Manager Service overview
The PowerVC Health Manager Service provides resource health state processing that is based on defined health policies. Knowing how those policies work might help you solve problems that are related to the health of a resource. The following information details the properties that PowerVC uses to derive health for each type of monitored resource. It also references the policies that PowerVC supplies and uses to determine the health of each resource.
- Hardware Management Console
  - Hardware Management Console health is derived from the HMC management service state. For example, if an HMC management service state value is 'connection_failed', the HMC health status value is 'Attention' and the following explanation is given: The Access State of HMC HMC_ID is Connection failed. For more information, see the hmc-health-policy.json JSON encoded health policy in the /etc/nova/powervc-health-policy directory.
- Hosts
  - Host health is derived from the following properties:
    - Local hypervisor state
    - Related Nova host service state
    The following is an example of an explanation that is given: The Hypervisor State of Host host_ID is "Error". For more information, see the hypervisor-health-policy.json JSON encoded health policy in the /etc/nova/powervc-health-policy directory.
- Virtual servers
  - Virtual server health is derived from the following properties:
    - Local power state
    - Local virtual machine state
    - Local remote restart state
    - Local Resource Monitoring and Control (RMC) state
    - Related hypervisor state
    - Related Nova host service state
    - Related volume status
    The following is an example of an explanation that is given: The RMC state of virtual machine VM_name is Inactive. For more information, see the server-health-policy.json JSON encoded health policy in the /etc/nova/powervc-health-policy directory.
- Storage providers
  - Storage provider health is derived from the following properties:
    - Local storage provider access state
    - Related Cinder host service state
    The following is an example of an explanation that is given: The Access State of Host storage_provider_host_ID is "Authentication Error". For more information, see the storage-provider-health-policy.json JSON encoded health policy in the /etc/cinder/powervc-health-policy directory.
- Storage volumes
  - Storage volume health is derived from the following properties:
    - Local volume status state
    - Related storage provider access status
    - Related Cinder host service state
    The following is an example of an explanation that is given: The Status of Volume volume_ID is "Error". For more information, see the volume-health-policy.json JSON encoded health policy in the /etc/cinder/powervc-health-policy directory.
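The policy files that are named in the preceding list are JSON documents, so they can be inspected with any JSON tool. The following Python sketch is one possible way to get an overview of them. It assumes only that the two directories hold files that end in .json; the internal layout of a policy is not described in this topic, so the sketch reports top-level structure and does not interpret any rules. The function names summarize_policy and summarize_policies are hypothetical and are not part of PowerVC.

```python
import json
from pathlib import Path

# Directories named in this topic. The structure inside each policy file is
# not documented here, so this sketch only reports the top-level shape.
POLICY_DIRS = [
    Path("/etc/nova/powervc-health-policy"),
    Path("/etc/cinder/powervc-health-policy"),
]


def summarize_policy(path):
    """Load one JSON policy file and describe its top-level structure."""
    with open(path, encoding="utf-8") as handle:
        data = json.load(handle)
    if isinstance(data, dict):
        return {"file": str(path), "type": "object", "keys": sorted(data)}
    if isinstance(data, list):
        return {"file": str(path), "type": "array", "length": len(data)}
    return {"file": str(path), "type": type(data).__name__}


def summarize_policies(directories=POLICY_DIRS):
    """Summarize every *.json file found in the given directories."""
    summaries = []
    for directory in directories:
        directory = Path(directory)
        if not directory.is_dir():
            continue  # Directory is absent on this system; skip it.
        for path in sorted(directory.glob("*.json")):
            summaries.append(summarize_policy(path))
    return summaries


if __name__ == "__main__":
    for summary in summarize_policies():
        print(summary)
```

The sketch only reads the files. Reading the directories might require elevated privileges on the management server, and a file that is not valid JSON causes json.load to raise an error rather than being skipped.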
Resolution
Complete the following steps to resolve the problem and change the health status of a resource back to 'OK'.
- On the properties page for a resource, the Health and Fault fields provide some information about the reason for a status other than 'OK'. Review that information and take any corrective actions that seem appropriate.
- Look for messages that are related to the resource and take any recommended actions.
- If you are unable to determine and resolve the problem with the information that is provided in the health status field or the messages, implement basic troubleshooting procedures. For example, ensure that the resource is accessible and check its state on the management console.
- For virtual servers, if the RMC health state is not 'OK' even after you install cloud-init and RSCT Utilities, complete one of the following steps:
  - Disable the firewall.
  - Enable port 657 on the virtual server.
- If the previous steps did not result in a health status of 'OK', further investigation is needed. For more information and potential solutions, search the PowerVC knowledge center or the resource documentation.
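For the RMC step above, it can help to confirm that port 657 on the virtual server is reachable before and after you change the firewall. The following Python sketch attempts a TCP connection to that port. It assumes that a TCP check is meaningful for your environment, because this topic does not state which protocol must be allowed. The function name rmc_port_reachable is hypothetical.

```python
import socket


def rmc_port_reachable(host, port=657, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Port 657 is the port named in this topic. Checking TCP only is an
    assumption of this sketch.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

For example, calling rmc_port_reachable with the address of the virtual server returns False when a firewall drops or rejects the connection. A result of True shows only that the port accepts connections from the system where the check runs; it does not prove that the RMC state is active.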