Kubernetes Cluster Provisioning

To gather the data needed for container cost allocation, you need to deploy the IBM FinOps Agent, the next-generation data collection agent for Cloudability Containers. This document includes references to a migration scenario from the legacy Cloudability metrics agent; you can safely disregard them if you are provisioning for the first time.

Provision the IBM FinOps Agent for Cloudability

This is achieved through a Helm deployment provisioned for each cluster. The Helm commands can be generated by following these steps:

  1. Navigate to Insights > Containers.

  2. Select the Provision Clusters button.

  3. Fill out the form with your cluster name and either your Kubernetes version or your OpenShift version.

  4. Click Next > Generate Command.

Deploy the Unified Agent

Prerequisites:

Google Kubernetes Engine (GKE) specific instructions

You need to add a cluster label in each cluster as follows:

  • Key: gke-cluster
  • Value: The cluster name you set as the Cluster Id in the Helm command. This allows Cloudability to map GKE clusters to line items in the GCP billing file and allocate costs to your clusters. It must be unique across all of your provisioned GKE clusters.

Cloudability will need to ingest a billing file with the cluster labels you added, which can take up to 48 hours. Once Apptio has processed the new billing file, you need to create a new tag mapping in Cloudability. Set a Cloudability tag dimension as "GKE Cluster Name" and map this to the gke-cluster tag.
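
As a rough sketch, the gke-cluster label could be applied with the gcloud CLI. The helper name label_gke_cluster and the example arguments are hypothetical; the label value should be whatever cluster name you pass to Helm as the clusterId.

```shell
# Hypothetical helper: apply the gke-cluster label to an existing GKE cluster.
# Arguments: <gke-cluster-name> <location> <cluster-id-used-in-helm>
label_gke_cluster() {
  cluster="$1"
  location="$2"
  cluster_id="$3"
  gcloud container clusters update "$cluster" \
    --location "$location" \
    --update-labels "gke-cluster=$cluster_id"
}

# Example with placeholder values:
#   label_gke_cluster prod-gke us-central1 prod-gke
```

GCP label values are restricted to lowercase letters, digits, underscores, and dashes, so a clusterId that follows the same rules keeps the label and the Helm value identical.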

Ensure that your account has the cluster-admin role before deploying the agent. By default, a user account does not have the cluster-admin role. Use the following command on the GKE cluster to grant a user the cluster-admin role:

kubectl create clusterrolebinding username-cluster-admin-binding --clusterrole=cluster-admin --user=username@emailaddress.com
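
To confirm the binding took effect, one option is to ask the API server whether the current user can perform any action on any resource. This is a minimal sketch; the function name has_cluster_admin is hypothetical, and a "yes" answer indicates cluster-admin-equivalent access rather than proving the exact role name.

```shell
# Hypothetical helper: succeeds when the current kubectl user is allowed
# every verb on every resource in all namespaces.
has_cluster_admin() {
  [ "$(kubectl auth can-i '*' '*' --all-namespaces 2>/dev/null)" = "yes" ]
}

# Example:
#   has_cluster_admin && echo "cluster-admin access confirmed"
```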

Networking Requirements

The unified agent requires outbound access to the following locations:

  • Cloudability API Endpoints
  • Frontdoor API Endpoints
  • S3 Upload Buckets

For the outbound network details, refer to the README on GitHub for the unified agent: IBM FinOps Agent Helm Chart
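
A quick way to test outbound access is to request each endpoint and see whether any HTTP response comes back. The sketch below assumes curl is available wherever you run it (ideally from a pod inside the cluster, so the same network policies and proxies apply). The function name check_outbound is hypothetical, and the URLs must be taken from the README; none are hard-coded here.

```shell
# Hypothetical helper: report whether each URL passed as an argument is reachable.
# Any HTTP response counts as reachable; only connection failures and
# timeouts are reported as unreachable.
check_outbound() {
  failed=0
  for url in "$@"; do
    if curl --silent --output /dev/null --max-time 10 "$url"; then
      echo "reachable: $url"
    else
      echo "unreachable: $url"
      failed=1
    fi
  done
  return "$failed"
}

# Example (replace with the endpoints listed in the README):
#   check_outbound https://icr.io
```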

Storage Requirements:

The IBM FinOps Agent requires a persistent volume claim of configurable size (default 8Gi). This reduces the chance of data loss when the agent is rescheduled: the agent stores samples on this volume and attempts to recover them after a restart, which greatly improves its ability to recover data in failure scenarios.
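
Once the agent has been installed, you may want to check that its persistent volume claim was actually bound, since an unbound claim usually leaves the pod stuck in Pending. This is a sketch only; the function name agent_pvcs_bound is hypothetical, and it assumes the agent runs in the ibm-finops-agent namespace used by the install command in this document.

```shell
# Hypothetical helper: succeeds only if at least one PVC exists in the
# agent namespace and every PVC there reports the Bound phase.
agent_pvcs_bound() {
  phases="$(kubectl get pvc -n ibm-finops-agent -o jsonpath='{.items[*].status.phase}')" || return 1
  [ -n "$phases" ] || return 1
  for phase in $phases; do
    [ "$phase" = "Bound" ] || return 1
  done
}

# Example:
#   agent_pvcs_bound && echo "agent volume is bound"
```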

Container Registry change:

The IBM FinOps Agent image is not stored in Docker like the existing metrics-agent. Instead, it is stored in ICR, so you may need to update your container registry whitelisting to allow the IBM FinOps Agent deployment to pull the ICR image. The IBM FinOps Agent container image is:

icr.io/ibm-finops/agent:vx.x.x

If you need to pull the image locally to copy it to your own container registry, you can run the following docker/podman pull command:

 podman pull icr.io/ibm-finops/agent:v0.0.25 --platform=linux/amd64
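
If the goal is to mirror the image into a private registry, the pull is typically followed by a tag and a push. The sketch below shows that sequence with podman; the function name mirror_agent_image and the target repository are hypothetical placeholders.

```shell
# Hypothetical helper: copy the agent image into a private registry.
# Arguments: <version-tag> <target-registry/repository>
mirror_agent_image() {
  tag="$1"
  target="$2"
  src="icr.io/ibm-finops/agent:$tag"
  podman pull --platform=linux/amd64 "$src" &&
    podman tag "$src" "$target:$tag" &&
    podman push "$target:$tag"
}

# Example with a placeholder registry:
#   mirror_agent_image v0.0.25 registry.example.com/finops/agent
```

The deployment then needs to reference the mirrored image. The chart parameter for that is not covered here, so check the chart's supported parameters.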

Authentication:

Note that the legacy Cloudability metrics agent used a Containers-specific API key. The IBM FinOps Agent does not use or support this API key. Instead, customers need to create an API key for the agent in Frontdoor and gather their Frontdoor Environment ID.

Creating a user to manage the container API key:

Your Frontdoor environment must have API keys enabled for the Container Insights feature to work.

It is recommended that customers create a container-specific "service account" user within their own domain to manage their API key going forward. This way, the uploading API key is not associated with one specific user that may be deactivated in the future. The person creating the new user and API key needs to be an admin user in their Frontdoor environment.

  1. Navigate to “Access Wizard” in “Access Administration”, select “Add User(s)”, set “Customer” and “Environment”. Finally, hit “Confirm”

  2. Enter user information and hit “Next”

  3. Select “Grant Role(s)”, “Do not send User an activation Email” and hit “Confirm” to create the user

  4. Ensure user is selected for granting roles

  5. Grant the “CloudabilityContainerUploader” Role to the user and hit “Next”/“Confirm”.

  6. Navigate to “Home”, search for the newly created user, and click on their “Username”

  7. Hit “View User Profile”

  8. Add an API key for the user

  9. Enter a name for the API key (e.g., IBM-Finops-agent), set “No Expiration” if not already set, and hit “Confirm”.

  10. Store the API Key Credentials (Public Key and Private Key) for later use in the Helm installation of the agent.

  11. Hit “Grant Access”

  12. Select your environment and hit “Next”

  13. Add the “CloudabilityContainerUploader” role to the key and hit “Next”. This role has limited access to only the containers uploading endpoint.

  14. Check that your key has been created and has the correct Role

Gathering Frontdoor Environment ID

  1. Navigate to Frontdoor

  2. In the top right select the Profile logo and click on “User Account”

  3. Navigate to the Environment Access tab

  4. Gather the environment ID under the Environment tab in the table

    1. Ensure the environment is the same one you use to access Cloudability

    2. Example id format: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
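
Before passing the environment ID to Helm, it can help to check that it matches the 8-4-4-4-12 format shown above, which catches copy-and-paste mistakes early. This is a minimal sketch; the function name is_env_id is hypothetical, and it assumes the ID uses hexadecimal characters like a standard UUID.

```shell
# Hypothetical helper: succeeds when the argument looks like
# xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx (assumed hexadecimal digits).
is_env_id() {
  printf '%s\n' "$1" |
    grep -Eq '^[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}$'
}

# Example:
#   is_env_id "$FD_ENV_ID" || echo "environment ID looks malformed"
```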

Deploying the IBM FinOps Agent with Helm

Prerequisites:

  • Helm must be installed to use the charts. Refer to Helm’s documentation to get started.

  • Update the cluster’s network policies to allow for the Networking Requirements above (if necessary)

  • Gather API Public and Private Keys

  • Gather Frontdoor Environment ID

Important:

We strongly recommend using a unique clusterId (cluster name) for each Kubernetes cluster. Using duplicate names may lead to data discrepancies or inaccurate cost allocation in Cloudability reporting.

Once Helm has been set up correctly, add the repo as follows:

helm repo add ibm-finops https://kubecost.github.io/finops-agent-chart

If you added this repo earlier, update it:

helm repo update

Install the Helm chart:

helm install ibm-finops-agent ibm-finops/finops-agent \
--set agent.cloudability.enabled=true \
--set agent.cloudability.uploadRegion="<uploadRegion>" \
--set agent.cloudability.parseMetricData=false \
--set agent.cloudability.secret.create=true \
--set agent.cloudability.secret.cloudabilityAccessKey="<PublicKey>" \
--set agent.cloudability.secret.cloudabilitySecretKey="<PrivateKey>" \
--set agent.cloudability.secret.cloudabilityEnvId="<FDEnvID>" \
--set clusterId="<ClusterName>" \
--create-namespace -n ibm-finops-agent

To uninstall the Helm release:

helm uninstall ibm-finops-agent -n ibm-finops-agent

If your cluster is currently running the old Cloudability metrics-agent, feel free to keep that running until you see the new IBM FinOps Agent start uploading successfully for 24 hours. The IBM FinOps Agent can be installed and run in parallel to the Cloudability metrics-agent but it is recommended to spin down the Cloudability metrics-agent once the new agent is stable in your cluster.

The uploadRegion value depends on the region in which the customer’s Cloudability environment exists. The supported values are:

  • US: us (or us-west-2)

  • EU: eu (or eu-central-1)

  • AU: au (or ap-southeast-2)

  • ME: me (or me-central-1)

  • CA: ca (or ca-central-1)

  • Hybrid EU (customers who have an EU Frontdoor but upload containers data to the US region): hybrid-eu

  • Hybrid AU (customers who have an AU Frontdoor but upload containers data to the US region): hybrid-au

  • Hybrid ME (customers who have an ME Frontdoor but upload containers data to the US region): hybrid-me
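
Because a mistyped region is easy to miss in a long Helm command, a small pre-flight check against the list above can be useful. This is a sketch under the assumption that values must be given exactly as listed (lowercase); the function name is_upload_region is hypothetical.

```shell
# Hypothetical helper: succeeds when the argument is one of the
# uploadRegion values listed above (matched exactly, lowercase).
is_upload_region() {
  case "$1" in
    us|us-west-2|eu|eu-central-1|au|ap-southeast-2) return 0 ;;
    me|me-central-1|ca|ca-central-1) return 0 ;;
    hybrid-eu|hybrid-au|hybrid-me) return 0 ;;
    *) return 1 ;;
  esac
}

# Example:
#   is_upload_region "$UPLOAD_REGION" || echo "unsupported uploadRegion"
```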

The clusterId should be the unique cluster name. If the cluster was previously provisioned with the legacy metrics-agent, the clusterId should match its CLOUDABILITY_CLUSTER_NAME to prevent cost ingestion issues.
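
If you are unsure what CLOUDABILITY_CLUSTER_NAME the legacy agent uses, it can usually be read from the running deployment. The sketch below assumes the legacy agent is a Deployment named metrics-agent in the cloudability namespace and that the variable is set as a plain value. Both defaults are assumptions, so pass your own deployment name and namespace if they differ. The function name legacy_cluster_name is hypothetical.

```shell
# Hypothetical helper: print the CLOUDABILITY_CLUSTER_NAME configured on
# the legacy agent. Arguments (optional): <deployment-name> <namespace>
legacy_cluster_name() {
  kubectl get deployment "${1:-metrics-agent}" -n "${2:-cloudability}" \
    -o jsonpath='{.spec.template.spec.containers[*].env[?(@.name=="CLOUDABILITY_CLUSTER_NAME")].value}'
}

# Example:
#   legacy_cluster_name
#   legacy_cluster_name my-agent my-namespace
```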

Note:

The unified agent supports many of the same configurations as the outgoing Cloudability metrics-agent. If your existing metrics-agent has any specific configurations (for example, proxy configurations), please check the supported Helm parameters here.

You can add these to your install command, for example:

helm install ibm-finops-agent ibm-finops/finops-agent \
--set agent.cloudability.enabled=true \
--set agent.cloudability.uploadRegion="<uploadRegion>" \
--set agent.cloudability.parseMetricData=false \
--set agent.cloudability.secret.create=true \
--set agent.cloudability.secret.cloudabilityAccessKey="<PublicKey>" \
--set agent.cloudability.secret.cloudabilitySecretKey="<PrivateKey>" \
--set agent.cloudability.secret.cloudabilityEnvId="<FDEnvID>" \
--set agent.cloudability.outboundProxy="http://x.x.x.x:8080" \
--set agent.cloudability.parseMetricsData="true" \
--set clusterId="<ClusterName>" \
--create-namespace -n ibm-finops-agent

After installing, verify the deployment:

  1. Ensure the ibm-finops-agent pod is running

    kubectl get pods -n ibm-finops-agent
    
    ### Example Output Below ###
    NAME                                READY   STATUS    RESTARTS   AGE
    ibm-finops-agent-7bbf99d9fb-kmhh9   1/1     Running   0          1m
  2. Check the pod's logs to confirm the agent is successfully uploading data to Cloudability. It takes 10 minutes for the first successful upload log to appear.

    kubectl logs <POD_NAME> -n ibm-finops-agent
    
    ### Example Output Below ###
    INF Starting IBM Finops Agent...
    DBG HTTP server started on port 9003
    INF Initializing cldy emitter
    INF emitting sample to Cldy 0
    INF added sample to Cldy
    INF emitting sample to Cldy 1
    INF added sample to Cldy
    INF emitting sample to Cldy 2
    INF added sample to Cldy
    INF Attempt 1: performing login request to FrontDoor using KeyAccess and KeySecret
    INF Attempt 1: acquiring presigned URL from Cloudability with acquired Open-token
    INF Attempt 1: uploading sample to Cloudability S3 using presigned URL
    INF successfully uploaded metric sample xxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx_xxxx-xx-xx-xx-xx-xx.tgz to cloudability

Once again, it takes 10 minutes for the log “successfully uploaded metric sample to cloudability” to appear. This step is a common point of failure for agents that do not have the correct whitelisting/proxy settings enabled.
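
Rather than scanning the log by eye, you can filter it for the upload confirmation. The sketch below reads log text on standard input and matches the message shown in the example output above; the function name upload_confirmed is hypothetical.

```shell
# Hypothetical helper: reads agent logs on stdin and succeeds if at least
# one successful upload message is present.
upload_confirmed() {
  grep -q 'successfully uploaded metric sample'
}

# Example:
#   kubectl logs <POD_NAME> -n ibm-finops-agent | upload_confirmed \
#     && echo "upload confirmed" || echo "no successful upload yet"
```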

After the IBM FinOps Agent has been running and uploading successfully for 24 hours, you may tear down any Cloudability metrics-agent deployment that is still running and keep only the IBM FinOps Agent Helm chart running.

Note:

Cluster data should show up in Container Insights within 24-48 hours. If you run into any issues with the deployment, contact IBM Apptio support.