Monitoring multicluster clusters

A multicluster administrator, or a user with the SMC_CLUSTER_CONTROL permission, can use the Resources Dashboard to monitor the health of clusters in terms of hosts, slots, CPU usage, and memory usage from a single view.

Procedure

From the multicluster cluster management console, click Resources > Dashboard.
The dashboard shows an overview of the clusters you manage:
Cluster
Shows the total number of clusters participating in multicluster features. Each cluster is listed in alphabetical order by name.
Hosts
Shows a breakdown of the hosts within the cluster by state: OK, unavailable, and closed. A horizontal bar chart visually displays the proportion of OK, unavailable, and closed hosts for each cluster.

A high proportion of OK hosts indicates better throughput and a more stable production environment. Prefer clusters with many available hosts.
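
As an illustration of comparing clusters by host health, the following minimal sketch ranks clusters by their share of OK hosts. The cluster names, the host counts, and the ok_fraction helper are hypothetical; the counts stand in for values read from the Hosts column of the dashboard.

```python
def ok_fraction(ok, unavailable, closed):
    """Return the share of hosts in the OK state (0.0 for an empty cluster)."""
    total = ok + unavailable + closed
    return ok / total if total else 0.0

# Hypothetical host counts per cluster: (OK, unavailable, closed)
clusters = {
    "cluster_a": (40, 5, 5),
    "cluster_b": (10, 8, 2),
}

# Clusters with the largest share of OK hosts come first
ranked = sorted(clusters, key=lambda name: ok_fraction(*clusters[name]), reverse=True)
print(ranked)  # ['cluster_a', 'cluster_b']
```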

Slots
Shows a breakdown of the slots within the cluster by state: allocated and free. A horizontal bar chart visually displays the proportion of allocated and free slots for each cluster. The number of allocated slots in relation to free slots indicates how busy the cluster is (for example, 100 allocated slots versus five free slots indicates a cluster that is close to fully used).
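
The allocated-to-free comparison can be expressed as a single utilization figure. The sketch below works through the example of 100 allocated slots and five free slots; the slot_utilization function is a hypothetical helper for illustration and is not part of the product.

```python
def slot_utilization(allocated, free):
    """Return the fraction of slots that are allocated (0.0 if there are no slots)."""
    total = allocated + free
    return allocated / total if total else 0.0

# Example from the text: 100 allocated slots versus five free slots
utilization = slot_utilization(100, 5)
print(f"{utilization:.1%}")  # 95.2%
```

A value close to 100% corresponds to a cluster that is close to fully used.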
Host CPU usage
Shows the number of hosts within the cluster using a high percentage of CPU versus a low percentage. A horizontal bar chart visually displays the usage by quartile for each cluster: starting with the minimum usage, the first quartile, the median usage (depicted by a horizontal line), third quartile, and the maximum usage.

Determine the optimal CPU usage level for your workload. In some cases 90% is acceptable; in others, 70% is a better target. Look for the highest CPU usage level that your hosts can sustain without frequently becoming unavailable.

If you want the clusters to be as predictable as possible, so that workload units reliably finish and produce results at a possible cost of speed, ensure that your hosts run at 70% CPU usage or less.

If speed is the priority (you do not care if some workload units fail as long as most of them finish as quickly as possible), opt for 90% CPU usage.
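
The choice between the two targets can be sketched as a simple threshold check. The example below assumes you have per-host CPU usage percentages available from some other source; the sample values and the hosts_over_target helper are hypothetical, and only the 70% and 90% targets come from the guidance above.

```python
PREDICTABLE_TARGET = 70  # favors reliable completion of workload units
SPEED_TARGET = 90        # favors throughput, tolerating some failures

def hosts_over_target(cpu_percentages, target):
    """Return the CPU usage values that exceed the chosen target."""
    return [usage for usage in cpu_percentages if usage > target]

# Hypothetical per-host CPU usage percentages for one cluster
cpu_usage = [55, 68, 72, 91, 95]

print(hosts_over_target(cpu_usage, PREDICTABLE_TARGET))  # [72, 91, 95]
print(hosts_over_target(cpu_usage, SPEED_TARGET))        # [91, 95]
```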

Host memory usage
Shows the number of hosts within the cluster using a high percentage of memory versus a low percentage. A horizontal bar chart visually displays the usage by quartile for each cluster: starting with the minimum usage, the first quartile, the median usage (depicted by a horizontal line), third quartile, and the maximum usage.
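
To make the quartile view concrete, the following sketch computes the same five values (minimum, first quartile, median, third quartile, maximum) for a list of per-host usage percentages, using the Python standard library. The sample data and the five_number_summary helper are hypothetical, and the quartile method the dashboard uses is not documented here; the "inclusive" method below is an assumption, so results can differ slightly from the chart.

```python
import statistics

def five_number_summary(usages):
    """Return (min, Q1, median, Q3, max) for a list of usage percentages."""
    if len(usages) < 2:
        raise ValueError("need at least two usage values")
    # Assumption: inclusive quartiles; the dashboard may use another method
    q1, median, q3 = statistics.quantiles(usages, n=4, method="inclusive")
    return (min(usages), q1, median, q3, max(usages))

# Hypothetical per-host usage percentages (CPU or memory) for one cluster
print(five_number_summary([10, 20, 30, 40, 50]))  # (10, 20.0, 30.0, 40.0, 50)
```

A wide gap between the third quartile and the maximum suggests that a few hosts are much busier than the rest of the cluster.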