Cluster Health view

Host health is one of the main factors that affect the efficiency of your cluster. Healthy hosts mean better throughput and a more stable production environment.

The Cluster Health view is a window into your cluster. Use this dashboard to get an overall picture of cluster health and early warning of any systemic problems. The Cluster Health view summarizes the following information about your cluster:
  • Status of cluster hosts
  • The primary, SD, and failover host attributes
  • CPU utilization levels

By default, you see the Cluster Health view expanded. If you collapse the view, you see a summary of the hosts within the cluster.

Within the Cluster Health view, you can see the following information:
Host Status
A host can be in one of three states: OK, Unavailable, or Closed. The host status view lists the number and percentage of hosts in each state for the cluster. A pie chart shows the same percentages. If you mouse over a segment of the pie chart, you can also view the number of hosts in that state.
Host Attributes
Mouse over the Master & SD Host icon for a summary of the host's real-time attributes. If SD is not installed on the primary host, the SD host is shown separately. Failover hosts, if applicable, are also shown separately. Clicking an icon opens the respective host list page.
CPU Usage for All Hosts in Cluster
Monitor how heavily your hosts are loaded. Use this chart for a glimpse of CPU usage across all hosts relative to the capacity of the cluster.
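
The Host Status figures described above are simple counts and percentages per state. The following is a minimal sketch of that arithmetic, assuming you have a list of host states from some other source. The function name and the sample data are hypothetical and are not part of the console.

```python
from collections import Counter

# The three host states named in the Host Status view.
HOST_STATES = ("OK", "Unavailable", "Closed")


def summarize_host_status(states):
    """Return {state: (count, percentage)} for each of the three host states.

    `states` is a list of state strings, one per host (hypothetical input).
    """
    counts = Counter(states)
    total = len(states)
    summary = {}
    for state in HOST_STATES:
        count = counts.get(state, 0)
        # Avoid dividing by zero when the host list is empty.
        pct = 100.0 * count / total if total else 0.0
        summary[state] = (count, round(pct, 1))
    return summary


# Illustrative data only: 8 hosts, 6 of them OK.
sample = ["OK", "OK", "OK", "Closed", "Unavailable", "OK", "OK", "OK"]
for state, (count, pct) in summarize_host_status(sample).items():
    print(f"{state}: {count} hosts ({pct}%)")
```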

For your cluster to be healthy and working efficiently, your hosts must be running or available to run workload units. They must also be able to run workload units at an acceptable CPU utilization level. Use the cluster management console to monitor for a large number of closed or unavailable hosts, or hosts running at very high or very low CPU utilization. Either of these indicators means that your cluster is not running as efficiently as possible and may need some attention.
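
If you export host data for your own checks, the two indicators above can be expressed as a small rule set. This is only a sketch: the tuple layout, the function name, and all three thresholds are assumptions chosen for illustration, not values defined by the product.

```python
def cluster_warnings(hosts, max_down_fraction=0.2, low_cpu=10.0, high_cpu=95.0):
    """Return a list of warning strings for a cluster snapshot.

    `hosts` is a list of (name, state, cpu_percent) tuples (hypothetical layout).
    All thresholds are illustrative assumptions; tune them for your workload.
    """
    warnings = []
    if not hosts:
        return warnings

    # Indicator 1: a large share of closed or unavailable hosts.
    down = [name for name, state, _ in hosts if state in ("Closed", "Unavailable")]
    if len(down) / len(hosts) > max_down_fraction:
        warnings.append(f"{len(down)} of {len(hosts)} hosts are closed or unavailable")

    # Indicator 2: running hosts at very high or very low CPU utilization.
    for name, state, cpu in hosts:
        if state != "OK":
            continue
        if cpu >= high_cpu:
            warnings.append(f"{name}: very high CPU utilization ({cpu}%)")
        elif cpu <= low_cpu:
            warnings.append(f"{name}: very low CPU utilization ({cpu}%)")
    return warnings


# Illustrative snapshot only.
snapshot = [
    ("hostA", "OK", 50.0),
    ("hostB", "Closed", 0.0),
    ("hostC", "Unavailable", 0.0),
    ("hostD", "OK", 98.0),
    ("hostE", "OK", 5.0),
]
for line in cluster_warnings(snapshot):
    print(line)
```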

Decide what the optimal CPU utilization level is for your workload. For example, 90% is acceptable in some cases and 70% is better in others, depending on your goal:
  • Predictability and output: If you want your cluster to be as predictable as possible, and you want workload units to finish and produce results even at some cost in speed, ensure that your hosts run at 70% CPU utilization or less.
  • Fast turnaround time: If you require speed (that is, you do not mind if some workload units fail as long as most of them finish as quickly as possible), then 90% CPU utilization is probably the correct level for you.

Try to identify the highest CPU utilization level that your hosts can sustain without frequently becoming unavailable.
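
One way to estimate that level from your own observations is to group samples by CPU utilization and check how often hosts became unavailable in each group. The sketch below assumes you have recorded pairs of (CPU percentage, whether the host became unavailable shortly afterward). The data format, the bucket size, and the tolerated rate are all hypothetical choices, not settings of the console.

```python
from collections import defaultdict


def highest_sustainable_level(samples, bucket_size=10, max_unavailable_rate=0.1):
    """Return the lower bound of the highest CPU bucket with an acceptable
    unavailability rate, or None if even the lowest observed bucket fails.

    `samples` is an iterable of (cpu_percent, became_unavailable) pairs
    (hypothetical input). Buckets are checked from low to high, and the
    search stops at the first bucket whose rate exceeds the tolerance.
    """
    totals = defaultdict(int)
    failures = defaultdict(int)
    for cpu, became_unavailable in samples:
        # Map e.g. 83.5 to bucket 80; fold 100% into the top bucket.
        bucket = min(int(cpu // bucket_size) * bucket_size, 100 - bucket_size)
        totals[bucket] += 1
        if became_unavailable:
            failures[bucket] += 1

    best = None
    for bucket in sorted(totals):
        rate = failures[bucket] / totals[bucket]
        if rate > max_unavailable_rate:
            break
        best = bucket
    return best


# Illustrative observations only: failures become frequent above 90%.
observations = (
    [(75.0, False)] * 20
    + [(85.0, False)] * 19 + [(85.0, True)]
    + [(95.0, False)] * 10 + [(95.0, True)] * 10
)
print(highest_sustainable_level(observations))
```

With this sample data the 80 to 89% bucket is the highest one that stays within the tolerated rate, so hosts in this hypothetical cluster would be kept below 90%.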