Container platform monitored resources
After validating your targets, Turbonomic updates the supply chain with the entities that it discovered. The following table describes the entity mapping between the target and Turbonomic.
Container platform | Turbonomic |
---|---|
Service | Service |
Container | Container |
Container spec | Container Spec |
Pod | Container Pod |
Controller | Workload Controller |
Namespace | Namespace |
Cluster | Container platform cluster |
Node | Virtual Machine |
Persistent Volume (PV) | Volume |

Note: If a container pod is attached to a volume, Turbonomic discovers it as a Persistent Volume (PV) and shows which pods are connected to the PV.
Monitored resources for services
Turbonomic monitors the following resources:
- Response time

  Response time is the elapsed time between a request and the response to that request. Response time is typically measured in seconds (s) or milliseconds (ms).

  For LLM inference workloads, response time is the turnaround time for each request, including both queuing time and service time. When there is no request, response time is unavailable.

- Transaction

  Transaction is a value that represents the per-second utilization of the transactions that are allocated to a given entity.

  For LLM inference workloads, Transaction is the total number of tokens per second, which includes both input tokens and generated tokens. When there is no request, Transaction is zero.

- Number of replicas

  Number of replicas is the number of Application Component replicas running over a given time period.

- Concurrent queries

  For LLM inference workloads, concurrent queries is the number of concurrent queries to a workload. When there is no request, concurrent queries is zero.

- Queueing time

  For LLM inference workloads, queueing time is the amount of time that a request spends in a queue before it is processed. When there is no request, queueing time is zero.

- Service time

  For LLM inference workloads, service time is the amount of processing time needed to generate the next token. This metric is relatively stable for a given model and GPU resource. When there is no request, service time is unavailable.
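For LLM inference workloads, the metrics above relate to each other in a simple way: response time is queueing time plus service time, and Transaction counts input and generated tokens together. The following is a minimal sketch of those relationships using hypothetical per-request records; the field names and the `service_metrics` helper are illustrative, not Turbonomic's data model.

```python
# Hypothetical sketch of how the LLM inference metrics above relate.
# Record fields (queue_s, service_s, input_tokens, output_tokens) are illustrative.

def service_metrics(requests, window_seconds):
    """Compute service-level metrics from per-request records over a time window."""
    if not requests:
        # No requests: response time is unavailable, Transaction is zero.
        return {"response_time": None, "transaction": 0.0}
    # Response time is queueing time plus service time for each request.
    response_times = [r["queue_s"] + r["service_s"] for r in requests]
    # Transaction counts both input and generated tokens per second.
    tokens = sum(r["input_tokens"] + r["output_tokens"] for r in requests)
    return {
        "response_time": sum(response_times) / len(response_times),
        "transaction": tokens / window_seconds,
    }

reqs = [
    {"queue_s": 0.2, "service_s": 0.8, "input_tokens": 120, "output_tokens": 80},
    {"queue_s": 0.4, "service_s": 1.0, "input_tokens": 60, "output_tokens": 140},
]
# Average response time of 1.2 s; 400 tokens over 10 s gives 40 tokens per second.
print(service_metrics(reqs, window_seconds=10))
```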
Monitored resources for containers
Turbonomic monitors the following resources:
- Virtual memory (vMem)

  Virtual memory (vMem) is the virtual memory utilized by a container against the memory limit. If no limit is set, node capacity is used.

- vMem request

  If applicable, vMem request is the virtual memory utilized by a container against the memory request.

- vCPU

  vCPU is the virtual CPU (in mCores) utilized by a container against the CPU limit. If no limit is set, node capacity is used.

- vCPU request

  If applicable, vCPU request is the virtual CPU (in mCores) utilized by a container against the CPU request.

- vCPU throttling

  vCPU throttling is the throttling of container virtual CPU that could impact response time, expressed as the percentage of throttling for all containers associated with a Container Spec. In the Capacity and Usage chart for containers, used and utilization values reflect the actual throttling percentage, while the capacity value is always 100%.
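The limit-with-fallback behavior described above can be sketched as follows. This is a hypothetical helper for illustration only, assuming usage and capacities in mCores; it is not Turbonomic's implementation.

```python
# Illustrative sketch: container vCPU utilization is measured against the
# CPU limit, falling back to node capacity when no limit is set.
# All values are in mCores; the function name is hypothetical.

def vcpu_utilization(used_mcores, limit_mcores, node_capacity_mcores):
    """Return utilization as a fraction of the effective capacity."""
    capacity = limit_mcores if limit_mcores is not None else node_capacity_mcores
    return used_mcores / capacity

# Container with a 500 mCore limit using 250 mCores: 50% utilized.
print(vcpu_utilization(250, 500, 4000))   # 0.5
# Same usage with no limit set: measured against a 4000 mCore node instead.
print(vcpu_utilization(250, None, 4000))  # 0.0625
```

The same pattern applies to vMem against the memory limit.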
Monitored resources for container specs
Turbonomic monitors the historical usage of any instance of a container running for the workload (assuming the workload name stays the same). Charts show the trend of usage even with restarts or redeployments.
Monitored resources for container pods
Turbonomic discovers pods with the following Status in Kubernetes and Red Hat OpenShift and matches them to a specific State in Turbonomic.
Kubernetes Status | Turbonomic State |
---|---|
Running | Active |
ImagePullBackOff | Unknown |
CrashLoopBackOff | Unknown |
Error | Unknown |
Because of this mapping, and because Turbonomic does not discover Pending pods or Completed job pods, the total number of pods that Turbonomic shows for a cluster is not expected to match the total reported by Kubernetes.
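The status-to-state mapping above amounts to a simple lookup. The sketch below is a hypothetical helper, not Turbonomic's implementation; statuses outside the table return nothing, mirroring the fact that such pods are not discovered.

```python
# Hypothetical lookup of the Kubernetes-status-to-Turbonomic-state mapping
# from the table above. Statuses outside the table (for example, Pending
# or Completed) are not discovered, so they map to None here.
POD_STATE = {
    "Running": "Active",
    "ImagePullBackOff": "Unknown",
    "CrashLoopBackOff": "Unknown",
    "Error": "Unknown",
}

def turbonomic_state(kubernetes_status):
    """Map a pod's Kubernetes status to a Turbonomic state, or None if undiscovered."""
    return POD_STATE.get(kubernetes_status)

print(turbonomic_state("Running"))  # Active
print(turbonomic_state("Pending"))  # None (Pending pods are not discovered)
```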
Turbonomic monitors the following resources:
- vMem

  vMem is the virtual memory utilized by a pod against the node physical capacity.

- vMem request

  vMem request is the virtual memory request allocated by a pod against the node allocatable capacity.

- vCPU

  vCPU is the virtual CPU (in mCores) utilized by a pod against the node physical capacity.

- vCPU request

  vCPU request is the virtual CPU request (in mCores) allocated by a pod against the node allocatable capacity.

- vMem request quota

  If applicable, vMem request quota is the amount of virtual memory request that a pod has allocated against the namespace quota.

- vCPU request quota

  If applicable, vCPU request quota is the amount of virtual CPU request (in mCores) that a pod has allocated against the namespace quota.

- vMem limit quota

  If applicable, vMem limit quota is the amount of virtual memory limit that a pod has allocated against the namespace quota.

- vCPU limit quota

  If applicable, vCPU limit quota is the amount of virtual CPU limit (in mCores) that a pod has allocated against the namespace quota.
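Note that pod usage and pod requests are measured against different denominators: usage against the node's physical capacity, requests against the node's allocatable capacity. A minimal sketch of that distinction, with hypothetical names and mCore values:

```python
# Illustrative sketch of the pod-level vCPU metrics above: usage is measured
# against node physical capacity, while the request is measured against node
# allocatable capacity. All values in mCores; names are hypothetical.

def pod_vcpu_metrics(used, requested, node_physical, node_allocatable):
    return {
        # vCPU: actual usage against the node's physical capacity.
        "vcpu_utilization": used / node_physical,
        # vCPU request: the guaranteed request against allocatable capacity,
        # which can be less than physical capacity.
        "vcpu_request_utilization": requested / node_allocatable,
    }

# 500 mCores used and 1000 mCores requested on a node with 4000 mCores
# physical and 3600 mCores allocatable: 12.5% utilized, ~27.8% of allocatable.
print(pod_vcpu_metrics(used=500, requested=1000, node_physical=4000, node_allocatable=3600))
```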
Monitored resources for workload controllers
Turbonomic monitors quotas (limits and requests) for vCPU and vMem, and associates how much each Workload Controller is contributing to a quota based on all replicas. This allows Turbonomic to generate rightsizing decisions, and manage the quota as a constraint to rightsizing. Metrics on resource consumption are shown in the Container Spec, Container, and Container Pod views.
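As a rough sketch of the idea, a controller's contribution to a namespace quota can be thought of as its per-replica request multiplied by the replica count. The helper and numbers below are hypothetical and do not reflect Turbonomic's internal model.

```python
# Hypothetical sketch: a Workload Controller's contribution to namespace
# quotas, taken as per-replica requests multiplied by the replica count.

def controller_quota_contribution(replicas, vcpu_request_mcores, vmem_request_mib):
    return {
        "vcpu_request_quota_mcores": replicas * vcpu_request_mcores,
        "vmem_request_quota_mib": replicas * vmem_request_mib,
    }

# A controller with 3 replicas, each requesting 250 mCores and 512 MiB,
# contributes 750 mCores and 1536 MiB toward the namespace quotas.
print(controller_quota_contribution(3, 250, 512))
```

This is also why rightsizing a controller must treat the quota as a constraint: resizing the per-replica request changes the contribution for every replica at once.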
Monitored resources for namespaces
Turbonomic monitors the following resources:
- vMem request quota

  vMem request quota is the total amount of virtual memory request for all pods allocated to the namespace against the namespace quota.

- vCPU request quota

  vCPU request quota is the total amount of virtual CPU request (in mCores) for all pods allocated to the namespace against the namespace quota.

- vMem limit quota

  vMem limit quota is the total amount of virtual memory limit for all pods allocated to the namespace against the namespace quota.

- vCPU limit quota

  vCPU limit quota is the total amount of virtual CPU limit (in mCores) for all pods allocated to the namespace against the namespace quota.
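Each namespace-level value is the total of the corresponding per-pod allocations, measured against the namespace quota. A minimal sketch with hypothetical pod records and quota:

```python
# Illustrative sketch: namespace quota usage is the sum of per-pod
# allocations, measured against the namespace quota. Values in mCores.

def namespace_vcpu_request_quota(pod_requests_mcores, quota_mcores):
    """Return (total allocated, utilization) for the namespace."""
    total = sum(pod_requests_mcores)
    return total, total / quota_mcores

# Three pods requesting 500, 250, and 250 mCores against a 2000 mCore quota:
total, utilization = namespace_vcpu_request_quota([500, 250, 250], 2000)
print(total, utilization)  # 1000 0.5
```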
Monitored resources for container platform clusters
Turbonomic monitors resources for the containers, pods, nodes (VMs), and volumes in a cluster.
Monitored resources for nodes (VMs)
Turbonomic monitors the following resources for nodes that host pods. These resources are monitored along with the resources from the infrastructure probes, such as vCenter or a public cloud mediation probe.
- vMem

  vMem is the virtual memory currently used by all containers on the node. The capacity for this resource is the node physical capacity.

- vCPU

  vCPU is the virtual CPU currently used by all containers on the node. The capacity for this resource is the node physical capacity.

- Memory request allocation

  Memory request allocation is the memory available to the node to support the ResourceQuota request parameter for a given Kubernetes namespace or Red Hat OpenShift project.

- CPU request allocation

  CPU request allocation is the CPU available to the node to support the ResourceQuota request parameter for a given Kubernetes namespace or Red Hat OpenShift project.

- Virtual memory request

  Virtual memory request is the memory currently guaranteed by all containers on the node with a memory request. The capacity for this resource is the node allocatable capacity, which is the amount of resources available for pods and can be less than the physical capacity.

- Virtual CPU request

  Virtual CPU request is the CPU currently guaranteed by all containers on the node with a CPU request. The capacity for this resource is the node allocatable capacity, which is the amount of resources available for pods and can be less than the physical capacity.

- Memory allocation

  Memory allocation is the memory ResourceQuota limit parameter for a given Kubernetes namespace or Red Hat OpenShift project.

- CPU allocation

  CPU allocation is the CPU ResourceQuota limit parameter for a given Kubernetes namespace or Red Hat OpenShift project.
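The physical-versus-allocatable distinction above can be sketched as follows: allocatable capacity is what remains of physical capacity after system reservations, and node usage versus node requests are measured against those two different denominators. The helper and numbers are hypothetical.

```python
# Hypothetical sketch of node-level vCPU capacities: allocatable capacity is
# physical capacity minus resources reserved for the system, so usage and
# requests are measured against different denominators. Values in mCores.

def node_vcpu_metrics(used, requested, physical, system_reserved):
    allocatable = physical - system_reserved
    return {
        # vCPU: usage by all containers against physical capacity.
        "vcpu_utilization": used / physical,
        # Virtual CPU request: guaranteed requests against allocatable capacity.
        "vcpu_request_utilization": requested / allocatable,
    }

# A 4000 mCore node reserving 400 mCores for system daemons leaves
# 3600 mCores allocatable: 25% of physical used, 50% of allocatable requested.
print(node_vcpu_metrics(used=1000, requested=1800, physical=4000, system_reserved=400))
```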