IBM Cloud Private System Healthcheck service
The IBM Cloud Private system healthcheck service is a REST API that provides the status of your nodes, the Kubernetes API server, unhealthy pods, and the IBM Cloud Private management services and their dependencies.
The following table describes the health status details that the system healthcheck service provides:
| Status | Output description |
|---|---|
| IBM Cloud Private cluster node | |
| Kubernetes API server | |
| Unhealthy pods | |
| IBM Cloud Private cluster management service | |
See IBM Cloud Private components for a description of the components and their dependencies.
Prerequisite: Install IBM Cloud Private with Kubernetes. For more information, see the Kubernetes settings section on the Customizing the cluster with the config.yaml file page.
Important: By default, the system healthcheck service is not enabled during installation.
Enabling the system healthcheck service after installation from the management console
As a cluster administrator, complete the following steps to enable the healthcheck service:
- Log in to your IBM Cloud Private cluster from the management console.
- Install the `system-healthcheck-service` chart: click Catalog, then select the `system-healthcheck-service` chart.
- Enter a value for the Helm release name.
- Select the `kube-system` namespace from the Target namespace menu.
- Click Configure.
- To uninstall the `system-healthcheck-service`:
  - From the navigation menu, click Manage > Helm Repositories.
  - Click the Options icon for the `system-healthcheck-service` chart.
  - Click Delete.
Enabling the system healthcheck service after installation from the command line interface (CLI)
To enable the system healthcheck service, deploy it onto your IBM Cloud Private cluster. Complete the following steps:
- Install the `system-healthcheck-service` chart. You must first add the internal Helm repository; for more information, see Adding the internal Helm repository to Helm CLI. Install the `system-healthcheck-service` by running the following command (a verification sketch follows these steps):

  ```
  helm install mgmt-charts/system-healthcheck-service --name <release-name> --namespace kube-system --tls
  ```

- To uninstall the `system-healthcheck-service` chart, run the following command:

  ```
  helm delete <release-name> --purge --tls
  ```
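To confirm that the chart deployed, you can check the Helm release and its pods. This is a minimal verification sketch; the pod name filter is an assumption about how the chart names its pods, so adjust it if your pod names differ.

```
# Check the Helm release that you installed (Helm 2 CLI with TLS, matching the
# install command above).
helm status <release-name> --tls

# Look for the service pods in kube-system; the name filter below is an
# assumption about how the chart names its pods.
kubectl get pods -n kube-system | grep system-healthcheck-service
```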
Enabling the system healthcheck service after installation from the IBM Multicloud Manager management console
As a cluster administrator, complete the following steps to enable the system healthcheck service:
- Log in to your IBM Multicloud Manager cluster from the management console.
- Install the `system-healthcheck-service` chart onto your hub cluster: click Catalog, then select the `system-healthcheck-service` chart.
- Enter a value for the Helm release name.
- Select the `kube-system` namespace from the Target namespace menu.
- Click Configure.
- To uninstall the `system-healthcheck-service`:
  - From the navigation menu, click Manage > Helm Repositories.
  - Click the Options icon for the `system-healthcheck-service` chart.
  - Click Delete.
Enabling the system healthcheck service for IBM Multicloud Manager from the CLI
To enable the system healthcheck service for IBM Multicloud Manager, deploy it onto your hub cluster. Complete the following steps:
- Install the `system-healthcheck-service` chart. You must first add the internal Helm repository; for more information, see Adding the internal Helm repository to Helm CLI. Install the `system-healthcheck-service` by running the following command:

  ```
  helm install mgmt-charts/system-healthcheck-service --name <release-name> --namespace kube-system --tls
  ```

- To uninstall the `system-healthcheck-service` chart, run the following command:

  ```
  helm delete <release-name> --purge --tls
  ```
Getting the cluster service status
Cluster service status is a CustomResourceDefinition (CRD) that the system healthcheck service provides. `ClusterServiceStatus` resources report the health status, failures, and dependencies of all IBM Cloud Private management services.
Note: The resource objects are updated every 10 minutes.
Required access: At least an operator role.
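Optionally, before you query the statuses, you can confirm that the CRD is registered. This is a sketch; the CRD name is inferred from the `clusterhealth.ibm.com/v1` API group and the `clusterservicestatuses` resource path that appear in the `kubectl describe` output later in this procedure.

```
# Confirm that the ClusterServiceStatus CRD exists; the name is inferred from
# the self link /apis/clusterhealth.ibm.com/v1/clusterservicestatuses/... shown
# in the describe output below.
kubectl get crd clusterservicestatuses.clusterhealth.ibm.com
```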
- Get the health status of your cluster by running the following command (a sketch for filtering this output follows the procedure):

  ```
  kubectl get clusterservicestatus
  ```

  Your output might resemble the following information:

  ```
  NAME                SERVICE NAME    SERVICE VERSION   STATUS
  icp-audit-logging   audit-logging                     Running
  icp-auth-apikeys    auth-apikeys                      NotInstalled
  icp-auth-idp        auth-idp                          Running
  icp-auth-pap        auth-pap                          Running
  icp-auth-pdp        auth-pdp                          Running
  icp-catalog-ui      catalog-ui                        Running
  icp-heapster        heapster                          NotInstalled
  icp-helm-api        helm-api                          Running
  ```

- Get a description of any management service status. For example, run the following command to get a description of the `icp-monitoring` service status:

  ```
  kubectl describe clusterservicestatus icp-monitoring
  ```

  Your output might resemble the following information:

  ```
  Name:         icp-monitoring
  Namespace:
  Labels:       app.kubernetes.io/managed-by=icp-system-healthcheck-service
                clusterhealth.ibm.com/service-name=monitoring
  Annotations:  <none>
  API Version:  clusterhealth.ibm.com/v1
  Kind:         ClusterServiceStatus
  Metadata:
    Creation Timestamp:  2019-07-23T19:46:13Z
    Generation:          2
    Resource Version:    6676898
    Self Link:           /apis/clusterhealth.ibm.com/v1/clusterservicestatuses/icp-monitoring
    UID:                 87136981-ad82-11e9-9e5a-00000a150b81
  Status:
    Current State:  Pending
    Pod Failure Status:
      Monitoring - Prometheus - Alertmanager - 5 F 6595 D 54 - 7 Frkq:
        Image:          hyc-cloud-private-edge-docker-local.artifactory.swg-devops.com/ibmcom-amd64/alertmanager:v0.15.0-f
        Image ID:
        Last State:
        Name:           alertmanager
        Ready:          false
        Restart Count:  0
        State:
          Waiting:
            Message:  Back-off pulling image "hyc-cloud-private-edge-docker-local.artifactory.swg-devops.com/ibmcom-amd64/alertmanager:v0.15.0-f"
            Reason:   ImagePullBackOff
        Image:          hyc-cloud-private-edge-docker-local.artifactory.swg-devops.com/ibmcom-amd64/configmap-reload:v0.2.2-f3
        Image ID:
        Last State:
        Name:           configmap-reload
        Ready:          false
        Restart Count:  0
        State:
          Waiting:
            Reason:  PodInitializing
        Image:          hyc-cloud-private-edge-docker-local.artifactory.swg-devops.com/ibmcom-amd64/icp-management-ingress:latest
        Image ID:
        Last State:
        Name:           router
        Ready:          false
        Restart Count:  0
        State:
          Waiting:
            Reason:  PodInitializing
    Status Dependencies:
      iam
  Events:  <none>
  ```
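If you only want to see which services are not yet healthy, you can project the state field and filter the result. This is a sketch that relies on standard `kubectl` output options; the `.status.currentState` path is inferred from the `Current State` field in the describe output above and might differ in other releases.

```
# Show each ClusterServiceStatus with its current state, then hide the ones
# that are already Running.
kubectl get clusterservicestatus \
  -o custom-columns=NAME:.metadata.name,STATE:.status.currentState | grep -v Running
```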
Getting the hub cluster service status
- Get the health status of your hub cluster by running the following command:

  ```
  kubectl get clusterservicestatus
  ```

  Your output might resemble the following information:

  ```
  NAME                SERVICE NAME    SERVICE VERSION   STATUS
  icp-audit-logging   audit-logging                     Running
  icp-auth-apikeys    auth-apikeys                      NotInstalled
  icp-auth-idp        auth-idp                          Running
  icp-auth-pap        auth-pap                          Running
  icp-auth-pdp        auth-pdp                          Running
  icp-catalog-ui      catalog-ui                        Running
  icp-heapster        heapster                          NotInstalled
  icp-helm-api        helm-api                          Running
  ```

- Get a description of any management service status on your hub cluster. For example, run the following command to get a description of the `icp-monitoring` service status:

  ```
  kubectl describe clusterservicestatus icp-monitoring
  ```

  Your output might resemble the following information:

  ```
  Name:         icp-monitoring
  Namespace:
  Labels:       app.kubernetes.io/managed-by=icp-system-healthcheck-service
                clusterhealth.ibm.com/service-name=monitoring
  Annotations:  <none>
  API Version:  clusterhealth.ibm.com/v1
  Kind:         ClusterServiceStatus
  Metadata:
    Creation Timestamp:  2019-07-23T19:46:13Z
    Generation:          2
    Resource Version:    6676898
    Self Link:           /apis/clusterhealth.ibm.com/v1/clusterservicestatuses/icp-monitoring
    UID:                 87136981-ad82-11e9-9e5a-00000a150b81
  Status:
    Current State:  Pending
    Pod Failure Status:
      Monitoring - Prometheus - Alertmanager - 5 F 6595 D 54 - 7 Frkq:
        Image:          hyc-cloud-private-edge-docker-local.artifactory.swg-devops.com/ibmcom-amd64/alertmanager:v0.15.0-f
        Image ID:
        Last State:
        Name:           alertmanager
        Ready:          false
        Restart Count:  0
        State:
          Waiting:
            Message:  Back-off pulling image "hyc-cloud-private-edge-docker-local.artifactory.swg-devops.com/ibmcom-amd64/alertmanager:v0.15.0-f"
            Reason:   ImagePullBackOff
        Image:          hyc-cloud-private-edge-docker-local.artifactory.swg-devops.com/ibmcom-amd64/configmap-reload:v0.2.2-f3
        Image ID:
        Last State:
        Name:           configmap-reload
        Ready:          false
        Restart Count:  0
        State:
          Waiting:
            Reason:  PodInitializing
        Image:          hyc-cloud-private-edge-docker-local.artifactory.swg-devops.com/ibmcom-amd64/icp-management-ingress:latest
        Image ID:
        Last State:
        Name:           router
        Ready:          false
        Restart Count:  0
        State:
          Waiting:
            Reason:  PodInitializing
    Status Dependencies:
      iam
  Events:  <none>
  ```
Getting the IBM Multicloud Manager managed clusters service status from the hub cluster
When you deploy the system healthcheck service onto your hub cluster, the `view-cluster-service-status` resource view is created. Complete the following steps to get the resource view:
- Get the resource view by running the following command from the command line interface (CLI):

  ```
  kubectl get resourceviews -n kube-system
  ```

  Your output might resemble the following response:

  ```
  NAME                          CLUSTER SELECTOR   STATUS      REASON   AGE
  view-cluster-service-status   <none>             Completed            6s
  ```

- Get the `view-cluster-service-status` resource view to get the cluster service status for all of your managed clusters by running the following command (a sketch for printing the raw results follows this list):

  ```
  kubectl get resourceviews view-cluster-service-status -n kube-system
  ```

  Your output might resemble the following response:

  ```
  CLUSTER         NAME                      SERVICE NAME          SERVICE VERSION   STATUS
  tony-boy        icp-metrics-server        metrics-server                          Running
  tony-boy        icp-mgmt-repo             mgmt-repo                               Running
  tony-boy        icp-monitoring            monitoring                              Pending
  tony-boy        icp-security-onboarding   security-onboarding                     Succeeded
  tony-boy        icp-service-catalog       service-catalog                         Running
  local-cluster   icp-audit-logging         audit-logging                           Pending
  local-cluster   icp-auth-apikeys          auth-apikeys                            NotInstalled
  local-cluster   icp-auth-idp              auth-idp                                Running
  local-cluster   icp-auth-pap              auth-pap                                Running
  ```

- You can also view the cluster service status from the console:
  - Log in to your IBM Multicloud Manager cluster.
  - From the navigation menu, click Search.
  - Click the search bar. Options to filter your search appear. Click kind.
  - Continue to filter your query by selecting `clusterservicestatus`.
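To inspect the raw results that the resource view gathered from each managed cluster, you can also print the resource as YAML. This is a generic `kubectl` sketch; the exact layout of the `status` section depends on your IBM Multicloud Manager version.

```
# Print the full view-cluster-service-status resource, including the results
# section that is summarized in the table output above.
kubectl get resourceviews view-cluster-service-status -n kube-system -o yaml
```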
System healthcheck service API details
System healthcheck service endpoints are accessible from your cluster URL. Your URL might resemble the following example: https://<CLUSTER_IP>:8443/cluster-health/swagger-ui/.
Note: Your Kubernetes token is required to run the `/nodestatus` and `/clusterstatus` endpoints. From the Swagger user interface, click Try it out and enter your token value. Your input might resemble the following example: `Bearer <kube-token>`.
The following endpoints are available:
- `/health`
- `/nodestatus`
- `/clusterstatus`
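For example, you can call an endpoint directly with `curl`. This is a sketch that assumes the endpoints are served under the same `/cluster-health` path prefix as the Swagger UI and that the token is passed in the `Authorization` header; confirm the exact routes and headers in the API reference.

```
# Query the node status endpoint; -k skips certificate verification for a
# self-signed cluster certificate. Replace <CLUSTER_IP> and <kube-token> with
# your own values. The path prefix is an assumption based on the Swagger URL.
curl -k -H "Authorization: Bearer <kube-token>" \
  "https://<CLUSTER_IP>:8443/cluster-health/nodestatus"
```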
See the IBM Cloud Private system healthcheck service API for more information.