Amazon Managed Service for Prometheus
Overview
Kubecost leverages the open-source Prometheus project as a time series database and post-processes the data in Prometheus to perform cost allocation calculations and provide optimization insights for your Kubernetes clusters, such as Amazon Elastic Kubernetes Service (Amazon EKS). Prometheus runs as a single, statically resourced container on one machine, so as your cluster grows or scales out, the workload can exceed the scraping capacity of a single Prometheus server. In collaboration with Amazon Web Services (AWS), Kubecost integrates with Amazon Managed Service for Prometheus (AMP), a managed Prometheus-compatible monitoring service, so customers can easily monitor Kubernetes costs at scale.
Reference resources
Architecture
The architecture of this integration is similar to Amazon EKS cost monitoring with Kubecost, which is described in the previous blog post, with some enhancements as follows:
In this integration, an additional AWS SigV4 container is added to the cost-analyzer pod, acting as a proxy to help query metrics from Amazon Managed Service for Prometheus using the AWS SigV4 signing process. It enables passwordless authentication to reduce the risk of exposing your AWS credentials.
When the Amazon Managed Service for Prometheus integration is enabled, the bundled Prometheus server in the Kubecost Helm chart is configured in remote_write mode. The bundled Prometheus server sends the collected metrics to Amazon Managed Service for Prometheus using the AWS SigV4 signing process. All metrics and data are stored in Amazon Managed Service for Prometheus, and Kubecost queries the metrics directly from Amazon Managed Service for Prometheus instead of the bundled Prometheus. This relieves customers of maintaining and scaling a local Prometheus instance.
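Concretely, the remote-write portion of the bundled Prometheus configuration ends up roughly along these lines (an illustrative sketch with placeholder values, not the chart's exact rendered config):

```yaml
# Illustrative only: ship scraped samples to AMP, signing each request with
# SigV4. Angle-bracket values are placeholders.
remote_write:
  - url: https://aps-workspaces.<AWS_REGION>.amazonaws.com/workspaces/<AMP_WORKSPACE_ID>/api/v1/remote_write
    sigv4:
      region: <AWS_REGION>
```

Prometheus supports this native `sigv4` block for remote write, which is what makes the passwordless authentication described above possible.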
There are two architectures you can deploy:
- The Quick-Start architecture supports a small multi-cluster setup of up to 100 clusters.
- The Federated architecture supports a large multi-cluster setup for over 100 clusters.
Quick-Start architecture
The infrastructure can manage up to 100 clusters. The following architecture diagram illustrates the small-scale infrastructure setup:
Federated architecture
To support the large-scale infrastructure of over 100 clusters, Kubecost leverages a Federated ETL architecture. In addition to Amazon Prometheus Workspace, Kubecost stores its extract, transform, and load (ETL) data in a central S3 bucket. Kubecost's ETL data is a computed cache based on Prometheus's metrics, from which users can perform all possible Kubecost queries. By storing the ETL data on an S3 bucket, this integration offers resiliency to your cost allocation data, improves the performance and enables high availability architecture for your Kubecost setup.
The following architecture diagram illustrates the large-scale infrastructure setup:
Instructions
Prerequisites
- You have an existing AWS account.
- You have IAM credentials to create Amazon Managed Service for Prometheus workspaces and IAM roles programmatically.
- You have an existing Amazon EKS cluster with OIDC enabled.
- Your Amazon EKS clusters have the Amazon EBS CSI driver installed.
Create Amazon Managed Service for Prometheus workspace:
Step 1: Run the following command to get the information of your current EKS cluster:
kubectl config current-context
The example output should be in this format:
arn:aws:eks:${AWS_REGION}:${YOUR_AWS_ACCOUNT_ID}:cluster/${YOUR_CLUSTER_NAME}
Step 2: Run the following command to create a new Amazon Managed Service for Prometheus workspace:
export AWS_REGION=<YOUR_AWS_REGION>
aws amp create-workspace --alias kubecost-amp --region $AWS_REGION
The Amazon Managed Service for Prometheus workspace should be created in a few seconds. Run the following command to get the workspace ID:
export AMP_WORKSPACE_ID=$(aws amp list-workspaces --region ${AWS_REGION} --output json --query 'workspaces[?alias==`kubecost-amp`].workspaceId | [0]' | cut -d '"' -f 2)
echo $AMP_WORKSPACE_ID
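Before continuing, you may want to confirm the workspace has reached the ACTIVE state. A minimal sketch (the command is printed rather than executed here, since running it requires valid AWS credentials; the fallback values are placeholders):

```shell
# Hedged sketch: build the describe-workspace command used to check status.
# ws-example and us-east-2 are placeholder fallbacks, not real values.
STATUS_CMD="aws amp describe-workspace \
  --workspace-id ${AMP_WORKSPACE_ID:-ws-example} \
  --region ${AWS_REGION:-us-east-2} \
  --query workspace.status.statusCode --output text"

# When executed with valid credentials, this command prints the workspace
# status (ACTIVE once the workspace is ready).
echo "$STATUS_CMD"
```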
Setting up the environment:
Step 1: Set environment variables for integrating Kubecost with Amazon Managed Service for Prometheus
Run the following command to set environment variables for integrating Kubecost with Amazon Managed Service for Prometheus:
export RELEASE="kubecost"
export YOUR_CLUSTER_NAME=<YOUR_EKS_CLUSTER_NAME>
export AWS_REGION=${AWS_REGION}
export VERSION="{X.XXX.X}"
export KC_BUCKET="kubecost-etl-metrics" # Remove this line if you want to set up small-scale infrastructure
export AWS_ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
export REMOTEWRITEURL="https://aps-workspaces.${AWS_REGION}.amazonaws.com/workspaces/${AMP_WORKSPACE_ID}/api/v1/remote_write"
export QUERYURL="http://localhost:8005/workspaces/${AMP_WORKSPACE_ID}"
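Several later steps silently produce broken resources if one of these variables is empty. The following optional guard is a minimal sketch (an addition of this guide, not part of Kubecost) that fails fast when a required variable is unset:

```shell
# Fail fast if any variable needed by later steps is unset or empty.
require_vars() {
  for name in "$@"; do
    # Indirectly expand the variable named in $name.
    if [ -z "$(eval "printf '%s' \"\$$name\"")" ]; then
      echo "missing required variable: $name" >&2
      return 1
    fi
  done
  echo "all required variables are set"
}

# Example usage once the exports above have run:
# require_vars RELEASE YOUR_CLUSTER_NAME AWS_REGION VERSION AMP_WORKSPACE_ID
```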
Step 2: Set up S3 bucket, IAM policy and Kubernetes secret for storing Kubecost ETL files
Note: You can ignore Step 2 for the small-scale infrastructure setup.
a. Create an S3 bucket to store the Kubecost ETL metrics. Run the following command in your workspace:
aws s3 mb s3://${KC_BUCKET}
b. Create IAM Policy to grant access to the S3 bucket. The following policy is for demo purposes only. You may need to consult your security team and make appropriate changes depending on your organization's requirements.
Run the following command in your workspace:
# create policy-kubecost-aws-s3.json file
cat << EOF > policy-kubecost-aws-s3.json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": [
                "s3:ListBucket",
                "s3:GetBucketLocation"
            ],
            "Resource": "arn:aws:s3:::${KC_BUCKET}"
        },
        {
            "Sid": "VisualEditor1",
            "Effect": "Allow",
            "Action": [
                "s3:PutObject",
                "s3:GetObject",
                "s3:ListBucketMultipartUploads",
                "s3:AbortMultipartUpload",
                "s3:ListBucket",
                "s3:DeleteObject",
                "s3:ListMultipartUploadParts"
            ],
            "Resource": [
                "arn:aws:s3:::${KC_BUCKET}",
                "arn:aws:s3:::${KC_BUCKET}/*"
            ]
        }
    ]
}
EOF

# create the AWS IAM policy
aws iam create-policy \
  --policy-name kubecost-s3-federated-policy-$YOUR_CLUSTER_NAME \
  --policy-document file://policy-kubecost-aws-s3.json
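A common failure mode here is a policy document that is not valid JSON after variable substitution (for example, an unset bucket variable). The following hedged sanity check, not part of the official instructions, renders an abbreviated policy with an example bucket name and validates it before calling `aws iam create-policy`:

```shell
# Placeholder bucket name for illustration only; substitute your own.
KC_BUCKET="kubecost-etl-metrics-example"

# Render an abbreviated one-statement policy to a scratch file.
cat << EOF > /tmp/policy-check.json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
      "Resource": "arn:aws:s3:::${KC_BUCKET}"
    }
  ]
}
EOF

# python3 -m json.tool exits non-zero on malformed JSON.
python3 -m json.tool /tmp/policy-check.json > /dev/null && echo "policy JSON is valid"
```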
c. Create a Kubernetes secret to allow Kubecost to write ETL files to the S3 bucket. Run the following command in your workspace:
# create manifest file for the secret
cat << EOF > federated-store.yaml
type: S3
config:
  bucket: "${KC_BUCKET}"
  endpoint: "s3.amazonaws.com"
  region: "${AWS_REGION}"
  insecure: false
  signature_version2: false
  put_user_metadata:
    "X-Amz-Acl": "bucket-owner-full-control"
  http_config:
    idle_conn_timeout: 90s
    response_header_timeout: 2m
    insecure_skip_verify: false
  trace:
    enable: true
  part_size: 134217728
EOF

# create Kubecost namespace and the secret from the manifest file
kubectl create namespace ${RELEASE}
kubectl create secret generic \
  kubecost-object-store -n ${RELEASE} \
  --from-file federated-store.yaml
Step 3: Set up IRSA to allow Kubecost and Prometheus to read and write metrics from Amazon Managed Service for Prometheus
The following commands automate these tasks:
- Create an IAM role with the AWS-managed IAM policy and trust policy for the service accounts `kubecost-cost-analyzer-amp` and `kubecost-prometheus-server-amp`.
- Annotate the existing Kubernetes service accounts so they assume the new IAM role.
Run the following command in your workspace:
# Remove the kubecost-s3-federated-policy --attach-policy-arn line below if you
# want to set up small-scale infrastructure
eksctl create iamserviceaccount \
  --name kubecost-cost-analyzer-amp \
  --namespace ${RELEASE} \
  --cluster ${YOUR_CLUSTER_NAME} --region ${AWS_REGION} \
  --attach-policy-arn arn:aws:iam::aws:policy/AmazonPrometheusQueryAccess \
  --attach-policy-arn arn:aws:iam::aws:policy/AmazonPrometheusRemoteWriteAccess \
  --attach-policy-arn arn:aws:iam::${AWS_ACCOUNT_ID}:policy/kubecost-s3-federated-policy-${YOUR_CLUSTER_NAME} \
  --override-existing-serviceaccounts \
  --approve
eksctl create iamserviceaccount \
  --name kubecost-prometheus-server-amp \
  --namespace ${RELEASE} \
  --cluster ${YOUR_CLUSTER_NAME} --region ${AWS_REGION} \
  --attach-policy-arn arn:aws:iam::aws:policy/AmazonPrometheusQueryAccess \
  --attach-policy-arn arn:aws:iam::aws:policy/AmazonPrometheusRemoteWriteAccess \
  --override-existing-serviceaccounts \
  --approve
For more information, see the AWS documentation on IAM roles for service accounts, and learn more about the Amazon Managed Service for Prometheus managed policies in Identity-based policy examples for Amazon Managed Service for Prometheus.
Integrating Kubecost with Amazon Managed Service for Prometheus
Preparing the configuration file
Run the following command to create a file called config-values.yaml, which contains the defaults that Kubecost will use for connecting to your Amazon Managed Service for Prometheus workspace.
cat << EOF > config-values.yaml
global:
  amp:
    enabled: true
    prometheusServerEndpoint: http://localhost:8005/workspaces/${AMP_WORKSPACE_ID}
    remoteWriteService: https://aps-workspaces.${AWS_REGION}.amazonaws.com/workspaces/${AMP_WORKSPACE_ID}/api/v1/remote_write
  sigv4:
    region: ${AWS_REGION}

sigV4Proxy:
  region: ${AWS_REGION}
  host: aps-workspaces.${AWS_REGION}.amazonaws.com
EOF
Primary cluster
Run this command to install Kubecost and integrate it with the Amazon Managed Service for Prometheus workspace as the primary:
# For the federated (large-scale) setup, keep the primary-federator.yaml values
# file and the federatedETL.federator lines below; remove them if you want to
# set up small-scale infrastructure
helm upgrade -i ${RELEASE} \
  oci://public.ecr.aws/kubecost/cost-analyzer --version $VERSION \
  --namespace ${RELEASE} --create-namespace \
  -f https://tinyurl.com/kubecost-amazon-eks \
  -f config-values.yaml \
  -f https://raw.githubusercontent.com/kubecost/poc-common-configurations/main/etl-federation/primary-federator.yaml \
  --set global.amp.prometheusServerEndpoint=${QUERYURL} \
  --set global.amp.remoteWriteService=${REMOTEWRITEURL} \
  --set kubecostProductConfigs.clusterName=${YOUR_CLUSTER_NAME} \
  --set kubecostProductConfigs.projectID=${AWS_ACCOUNT_ID} \
  --set prometheus.server.global.external_labels.cluster_id=${YOUR_CLUSTER_NAME} \
  --set federatedETL.federator.primaryClusterID=${YOUR_CLUSTER_NAME} \
  --set serviceAccount.create=false \
  --set prometheus.serviceAccounts.server.create=false \
  --set serviceAccount.name=kubecost-cost-analyzer-amp \
  --set prometheus.serviceAccounts.server.name=kubecost-prometheus-server-amp \
  --set federatedETL.federator.useMultiClusterDB=true
Additional clusters
These installation steps are similar to those for the primary cluster, except that you do not need to repeat the steps in the section "Create Amazon Managed Service for Prometheus workspace", and you need to update the environment variables below to match each additional cluster. Note that `AMP_WORKSPACE_ID` and `KC_BUCKET` are the same as on the primary cluster.
export RELEASE="kubecost"
export YOUR_CLUSTER_NAME=<YOUR_EKS_CLUSTER_NAME>
export AWS_REGION="<YOUR_AWS_REGION>"
export VERSION="1.103.4"
export KC_BUCKET="kubecost-etl-metrics"
export AWS_ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
export REMOTEWRITEURL="https://aps-workspaces.${AWS_REGION}.amazonaws.com/workspaces/${AMP_WORKSPACE_ID}/api/v1/remote_write"
export QUERYURL="http://localhost:8005/workspaces/${AMP_WORKSPACE_ID}"
Run this command to install Kubecost and integrate it with the Amazon Managed Service for Prometheus workspace as the additional cluster:
# For the federated (large-scale) setup, keep the agent-federated.yaml values
# file below; remove it if you want to set up small-scale infrastructure
helm upgrade -i ${RELEASE} \
  oci://public.ecr.aws/kubecost/cost-analyzer --version $VERSION \
  --namespace ${RELEASE} --create-namespace \
  -f https://tinyurl.com/kubecost-amazon-eks \
  -f config-values.yaml \
  -f https://raw.githubusercontent.com/kubecost/poc-common-configurations/main/etl-federation/agent-federated.yaml \
  --set global.amp.prometheusServerEndpoint=${QUERYURL} \
  --set global.amp.remoteWriteService=${REMOTEWRITEURL} \
  --set kubecostProductConfigs.clusterName=${YOUR_CLUSTER_NAME} \
  --set kubecostProductConfigs.projectID=${AWS_ACCOUNT_ID} \
  --set prometheus.server.global.external_labels.cluster_id=${YOUR_CLUSTER_NAME} \
  --set serviceAccount.create=false \
  --set prometheus.serviceAccounts.server.create=false \
  --set serviceAccount.name=kubecost-cost-analyzer-amp \
  --set prometheus.serviceAccounts.server.name=kubecost-prometheus-server-amp \
  --set federatedETL.useMultiClusterDB=true
Your Kubecost setup is now writing metrics to and querying data from AMP. Data should be ready for viewing within 15 minutes.
To verify that the integration is set up, go to Settings in the Kubecost UI, and check the Prometheus Status section.
Read our Custom Prometheus integration troubleshooting guide if you run into any errors while setting up the integration. For support from AWS, you can submit a support request through your existing AWS support contract.
Add recording rules (optional)
You can add these recording rules to improve performance. Recording rules allow you to precompute frequently needed or computationally expensive expressions and save the results as a new set of time series. Querying a precomputed result is often much faster than evaluating the original expression every time it is needed. Follow these instructions to add the following rules:
groups:
  - name: CPU
    rules:
      - expr: sum(rate(container_cpu_usage_seconds_total{container_name!=""}[5m]))
        record: cluster:cpu_usage:rate5m
      - expr: rate(container_cpu_usage_seconds_total{container_name!=""}[5m])
        record: cluster:cpu_usage_nosum:rate5m
      - expr: avg(irate(container_cpu_usage_seconds_total{container_name!="POD",container_name!=""}[5m])) by (container_name,pod_name,namespace)
        record: kubecost_container_cpu_usage_irate
      - expr: sum(container_memory_working_set_bytes{container_name!="POD",container_name!=""}) by (container_name,pod_name,namespace)
        record: kubecost_container_memory_working_set_bytes
      - expr: sum(container_memory_working_set_bytes{container_name!="POD",container_name!=""})
        record: kubecost_cluster_memory_working_set_bytes
  - name: Savings
    rules:
      - expr: sum(avg(kube_pod_owner{owner_kind!="DaemonSet"}) by (pod) * sum(container_cpu_allocation) by (pod))
        record: kubecost_savings_cpu_allocation
        labels:
          daemonset: "false"
      - expr: sum(avg(kube_pod_owner{owner_kind="DaemonSet"}) by (pod) * sum(container_cpu_allocation) by (pod)) / sum(kube_node_info)
        record: kubecost_savings_cpu_allocation
        labels:
          daemonset: "true"
      - expr: sum(avg(kube_pod_owner{owner_kind!="DaemonSet"}) by (pod) * sum(container_memory_allocation_bytes) by (pod))
        record: kubecost_savings_memory_allocation_bytes
        labels:
          daemonset: "false"
      - expr: sum(avg(kube_pod_owner{owner_kind="DaemonSet"}) by (pod) * sum(container_memory_allocation_bytes) by (pod)) / sum(kube_node_info)
        record: kubecost_savings_memory_allocation_bytes
        labels:
          daemonset: "true"
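One way to load recording rules into an AMP workspace is as a rule groups namespace via the AWS CLI. The sketch below builds that command, assuming you have saved the rules to a file named kubecost-rules.yaml; the namespace name is an arbitrary example, and the command is printed rather than executed since running it requires valid AWS credentials:

```shell
# Hypothetical file and namespace names chosen for this example.
RULES_FILE="kubecost-rules.yaml"
NAMESPACE_NAME="kubecost-recording-rules"

# ws-example is a placeholder fallback for the workspace ID.
UPLOAD_CMD="aws amp create-rule-groups-namespace \
  --workspace-id ${AMP_WORKSPACE_ID:-ws-example} \
  --name ${NAMESPACE_NAME} \
  --data fileb://${RULES_FILE}"

echo "$UPLOAD_CMD"
```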
Troubleshooting
The queries below must return data for Kubecost to calculate costs correctly. For them to work, set the following environment variables:
KUBECOST_NAMESPACE=kubecost
KUBECOST_DEPLOYMENT=kubecost-cost-analyzer
CLUSTER_ID=YOUR_CLUSTER_NAME
- Verify the connection to AMP and that the metric `container_memory_working_set_bytes` is available. If you have set `kubecostModel.promClusterIDLabel`, you will need to change the query label (`CLUSTER_ID`) to match it (typically `cluster` or `alpha_eksctl_io_cluster_name`).

kubectl exec -i -t -n $KUBECOST_NAMESPACE \
  deployments/$KUBECOST_DEPLOYMENT -c cost-analyzer-frontend \
  -- curl "0:9090/model/prometheusQuery?query=container_memory_working_set_bytes\{CLUSTER_ID=\"$CLUSTER_ID\"\}" \
  | jq
The output should contain a JSON entry similar to the following. The value of `cluster_id` should match the value of `kubecostProductConfigs.clusterName`.

{
  "status": "success",
  "data": {
    "resultType": "vector",
    "result": [
      {
        "metric": {
          "__name__": "container_memory_working_set_bytes",
          "cluster_id": "qa-eks1",
          "alpha_eksctl_io_cluster_name": "qa-eks1",
          "alpha_eksctl_io_nodegroup_name": "qa-eks1-nodegroup",
          "beta_kubernetes_io_arch": "amd64",
          "beta_kubernetes_io_instance_type": "t3.medium",
          "beta_kubernetes_io_os": "linux",
          "eks_amazonaws_com_capacityType": "ON_DEMAND",
          "eks_amazonaws_com_nodegroup": "qa-eks1-nodegroup",
          "id": "/",
          "instance": "ip-10-10-8-66.us-east-2.compute.internal",
          "job": "kubernetes-nodes-cadvisor"
        },
        "value": [1697630036, "3043811328"]
      }
    ]
  }
}
- Verify Kubecost metrics are available in AMP:
kubectl exec -i -t -n $KUBECOST_NAMESPACE \
  deployments/$KUBECOST_DEPLOYMENT -c cost-analyzer-frontend \
  -- curl "0:9090/model/prometheusQuery?query=node_total_hourly_cost\{CLUSTER_ID=\"$CLUSTER_ID\"\}" \
  | jq
The output should contain a JSON entry similar to:
{
  "status": "success",
  "data": {
    "resultType": "vector",
    "result": [
      {
        "metric": {
          "__name__": "node_total_hourly_cost",
          "cluster_id": "qa-eks1",
          "alpha_eksctl_io_cluster_name": "qa-eks1",
          "arch": "amd64",
          "instance": "ip-192-168-47-226.us-east-2.compute.internal",
          "instance_type": "t3.medium",
          "job": "kubecost"
        },
        "value": [1697630306, "0.04160104542160034"]
      }
    ]
  }
}
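If you want to script this verification rather than eyeball the JSON, the value can be extracted programmatically. A minimal sketch using the sample response above (assumes python3 is available; any JSON tool works equally well):

```shell
# Sample response taken from the verification step above.
RESPONSE='{"status":"success","data":{"resultType":"vector","result":[{"metric":{"__name__":"node_total_hourly_cost","cluster_id":"qa-eks1"},"value":[1697630306,"0.04160104542160034"]}]}}'

# The second element of "value" is the metric value, returned as a string.
HOURLY_COST=$(printf '%s' "$RESPONSE" | python3 -c '
import json, sys
data = json.load(sys.stdin)
print(data["data"]["result"][0]["value"][1])
')

echo "node_total_hourly_cost: $HOURLY_COST"
```

An empty `result` array here means the metric is missing, which is exactly the failure the troubleshooting steps below diagnose.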
If the above queries fail, check the following:
- Check the logs of the `sigv4proxy` container (it may be in the Kubecost deployment or the Prometheus server deployment, depending on your setup):

kubectl logs deployments/$KUBECOST_DEPLOYMENT -c sigv4proxy --tail -1

A correctly working `sigv4proxy` emits very few logs. Example output from a working proxy:

time="2023-09-21T17:40:15Z" level=info msg="Stripping headers []" StripHeaders="[]"
time="2023-09-21T17:40:15Z" level=info msg="Listening on :8005" port=":8005"
- Check the logs of the `cost-model` container for Prometheus connection issues:

kubectl logs deployments/$KUBECOST_DEPLOYMENT -c cost-model --tail -1 | grep -i err
Example errors:
ERR CostModel.ComputeAllocation: pod query 1 try 2 failed: avg(kube_pod_container_status_running...
Prometheus communication error: 502 (Bad Gateway) ...