Configuring Kubernetes monitoring

After you install your Monitoring server, you can configure the Kubernetes data collector to monitor the applications in your Kubernetes environment. Use the Kubernetes data collector to manage the collection, enrichment, and dispatch of Kubernetes topology, events, and performance data.

Prerequisites

Considerations

Connectivity check

About this task

Deploying the Kubernetes data collector involves downloading the data collector installation eImage, logging in to the IBM Cloud Pak console, downloading the data collector configuration package, installing the data collector, and validating the installation.

The eImage is the data collector package and contains all the installable data collectors. The configuration package (ConfigPack) contains the ingress URLs and authentication information that the data collector package needs to communicate with the Monitoring server.

Procedure

Download the eImage data collectors installation tar file and the data collector configuration package:

  1. If you haven't already, download the data collectors installation eImage from IBM Passport Advantage. For more information, see Part numbers.

  2. Download the data collector configuration package:

    1. Go to Administer > Monitoring > Integrations on the console.
    2. Click the New integration button.
    3. In the Standard monitoring agents section, select the Monitoring Data Collectors Configure button.
    4. Select Download file and specify the directory where you want to save the compressed data collector configuration package, ibm-cloud-apm-dc-configpack.tar.
  3. Move the downloaded installation package and the configuration package to a node in the cluster that you want to monitor:

    Examples using secure copy:

    scp <my_path_to_download>/app_mgmt_k8sdc.tar.gz root@<my.env.com>:/<my_path_to_destination>
    scp <my_path_to_download>/ibm-cloud-apm-dc-configpack.tar root@<my.env.com>:/<my_path_to_destination>
    

    where

    • <my_path_to_download> is the path to where the installation tar file or configuration package file was downloaded.
    • root@<my.env.com> is your user ID on the system where the kubectl client is configured to point to the environment to be monitored.
    • <my_path_to_destination> is the path to the environment that you want to monitor.

Install the Kubernetes data collector in the Kubernetes cluster that you want to monitor:

  4. If you are not installing from your master node, configure the kubectl client to point to the master node of the cluster that you want to monitor.

    This step isn't needed if you are installing from your master node because the kubectl client points to the node that you are on by default.

    In the IBM Cloud Pak console, you can click the user icon > Configure client and follow the instructions to run the kubectl config commands.
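
    If you prefer to run the commands by hand, the following is a minimal sketch of the kind of kubectl config commands the console generates. The API server URL, port, token, and context names here are placeholders for illustration, not values from your console:

    # Define the cluster, credentials, and context, then make the context active
    kubectl config set-cluster myCluster --server=https://<master_node>:<api_port> --insecure-skip-tls-verify=true
    kubectl config set-credentials myCluster-user --token=<my_api_token>
    kubectl config set-context myCluster-context --cluster=myCluster --user=myCluster-user --namespace=default
    kubectl config use-context myCluster-context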

  5. Initialize Helm:

    helm init
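
    If you want to confirm that Helm is initialized before you continue, you can check the client and server (Tiller) versions. This is an optional verification, not part of the documented procedure:

    # Both a Client and a Server version should be reported once Tiller is ready
    helm version
    # The Tiller pod created by helm init should be Running in kube-system
    kubectl get pods --namespace kube-system -l app=helm,name=tiller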
    
  6. Log in to your Docker registry. The Docker registry must be the same one referenced in the Ansible script command.

    docker login -u <my_username> -p <my_password> <my_clustername>:<my_clusterport>
    

    where

    • <my_username> and <my_password> are the user name and password for the Docker registry
    • <my_clustername> is the name of the cluster that you're monitoring
    • <my_clusterport> is the port number for the Docker registry
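
    For example, if your data collector images go to the default OpenShift internal registry route shown later in Table 1, a hedged example of the login is shown below; it assumes an OpenShift cluster and that the oc CLI is installed and already logged in:

    # Log in to the OpenShift internal registry route with your OpenShift user and token
    docker login -u $(oc whoami) -p $(oc whoami -t) default-route-openshift-image-registry.apps.<CLUSTER_DOMAIN_NAME>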
  7. Extract the Kubernetes data collector package from the installation tar file that you moved to the node in step 3, and move ibm-cloud-apm-dc-configpack.tar to your working directory:

    tar -xvf cp4mcm_DataCollectors_2.3.tar.gz
    cd cp4mcm_DataCollectors_2.3
    tar -xvf app_mgmt_k8sdc.tar.gz
    cd app_mgmt_k8sdc
    mv <my_path_to_configpack>/ibm-cloud-apm-dc-configpack.tar .
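
    If you want to confirm that the files the later steps need are in the working directory, an optional check such as the following can be run; it is only a sanity check, not part of the documented procedure:

    # Both the Helm chart archive and the ConfigPack should be present in app_mgmt_k8sdc
    ls -l app_mgmt_k8sdc_helm.tar.gz ibm-cloud-apm-dc-configpack.tar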
    
  8. If you want to use the default environment size, which is size0, skip this step. To use another deployment size, edit the values.yaml file to specify environmentSize and repackage the chart:

    tar -xvf app_mgmt_k8sdc_helm.tar.gz  --warning=no-timestamp
    sed -i 's/environmentSize:.*/environmentSize: "size1"/' k8monitor/values.yaml
    mv app_mgmt_k8sdc_helm.tar.gz app_mgmt_k8sdc_helm_old.tar.gz 
    tar -cvf app_mgmt_k8sdc_helm.tar k8monitor/
    gzip app_mgmt_k8sdc_helm.tar
    

    where size1 is the environment size that you want to use.
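
    If you want to confirm that the sed edit took effect before you continue, you can check the value that was packed into the new archive; this is an optional verification, not part of the documented procedure:

    # The values.yaml used to repackage the chart should show the size that you set
    grep environmentSize k8monitor/values.yaml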

  9. Before you run the installation script, check Table 1 to decide which configuration options to specify. Then run the Ansible script with the configuration options and defaults that are required for the Kubernetes environment that you are monitoring:

    ansible-playbook helm-main.yaml --extra-vars="configOption1=configValue1 configOption2=configValue2"
    

    Examples:

    ansible-playbook helm-main.yaml --extra-vars="cluster_name=myCluster release_name=camserver namespace=default docker_group=default tls_enabled=true"
    
Table 1. Configuration options for the installation script

cluster_name
    Description: Unique name that distinguishes your cluster from the other clusters being monitored. Only alphanumeric characters and - are supported, with no spaces. If you enter invalid characters, they are removed from the name. The assigned cluster name in the example is myCluster. Changing the cluster name after deployment is not recommended. If you must change cluster_name after deployment, see the What to do next section later in this topic and be aware of the following: the change is not immediately available because you must wait for the monitor to restart and complete one collection cycle; previous thresholds and policies that reference the cluster might need to be manually updated; preexisting incidents reference the old cluster name until they expire.
    Required: No
    Default: UnnamedCluster

release_name
    Description: The Helm release name for the Kubernetes data collector. Choose a release name that does not already exist in your environment.
    Required: No
    Default: icam-kubernetes-resources

namespace
    Description: The Kubernetes namespace where you want your Kubernetes data collector and configuration secrets to be created. The namespace must already exist in your cluster because it is not created for you.
    Required: No
    Default: default

docker_registry
    Description: The host and port of the Docker registry where you want to store the data collector images. Example: <CLUSTER_DOMAIN_NAME>
    Required: No
    Default: default-route-openshift-image-registry.apps.<CLUSTER_DOMAIN_NAME>

docker_group
    Description: The Docker group in the registry where you want to store your images. Example: myRegistry:1000/mydockergroup. If you install the data collector in a namespace other than default, you must also set docker_group to the same name as the namespace.
    Required: No
    Default: default

tls_enabled
    Description: Specifies whether TLS (Transport Layer Security) is enabled in your environment: true or false.
    Required: Yes
    Default: No default. You must provide a value: true or false.
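
If you install into a namespace other than default on an OpenShift cluster, a hedged example that combines the options from Table 1 might look like the following; the cluster name, namespace, and registry route are placeholders, not values from your environment:

    ansible-playbook helm-main.yaml --extra-vars="cluster_name=prodCluster release_name=camserver namespace=monitoring docker_group=monitoring docker_registry=default-route-openshift-image-registry.apps.<CLUSTER_DOMAIN_NAME> tls_enabled=true"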

Troubleshooting: In a slower environment, the Ansible script might fail during the TASK [Gather Facts] stage because of a timeout and return a message such as "Timer expired after 10 seconds". If this happens, edit the Ansible configuration file at /etc/ansible/ansible.cfg: uncomment the # gather_timeout = 10 line and increase the timeout value (30 is usually sufficient).
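
A hedged sketch of that edit, assuming the stock ansible.cfg layout in which the setting appears as a commented line such as # gather_timeout = 10:

    # Uncomment gather_timeout and raise it from 10 to 30 seconds
    sed -i 's/^#* *gather_timeout = 10/gather_timeout = 30/' /etc/ansible/ansible.cfg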

Validate the deployment

  1. After the installation script completes, wait for the deployment to become ready. You can check the deployment status with the following command:

    kubectl get deployment <my_ReleaseName>-k8monitor --namespace=<my_ReleaseNamespace>
    

    Depending on the size and health of your environment, it can take up to 10 minutes for the Kubernetes data collector to start up and output logs that you can review. (For more information, see Checking the Kubernetes installation logs.) The data collector startup creates a Kubernetes event, which generates an informational incident.
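
    If you prefer to block until the rollout finishes rather than polling kubectl get deployment, a hedged alternative is kubectl rollout status, which waits until the deployment reports ready or the timeout expires:

    # Waits until the data collector deployment is fully rolled out (or times out)
    kubectl rollout status deployment/<my_ReleaseName>-k8monitor --namespace=<my_ReleaseNamespace> --timeout=10m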

  2. View the data collector metrics and incidents in the console to confirm that the data collector is successfully monitoring:

    • Select the Resources tab. Find your Kubernetes resource types. For instructions, see Viewing your managed resources. If this is your first installation, you'll see 1 cluster.

    • Select the Incidents tab and click All incidents, then click the filter icon and filter by Priority 4 incidents. You should see incidents about Kubernetes monitoring availability. For more information, see Managing incidents.

      You can also review the logs. For more information, see Checking the Kubernetes installation logs.
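
      If you want a quick look at the data collector logs directly from the command line (the full procedure is in Checking the Kubernetes installation logs), a hedged example is:

      # Shows recent log output from one pod of the data collector deployment
      kubectl logs deployment/<my_ReleaseName>-k8monitor --namespace=<my_ReleaseNamespace> --tail=100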

Results

What to do next