Container spec policies - declarative configuration

This policy controls the analysis of resize actions generated on container specs. To define the acceptance mode of these actions, create a workload controller policy.

Application owners who do not have access to the Turbonomic user interface can create the ContainerVerticalScalePolicy Custom Resource (CR) in their container platform clusters. This CR is a YAML resource that specifies the container spec policy settings (download a sample here).

Kubeturbo discovers the settings in this CR and then displays them as a container spec policy in the user interface. This policy is read-only (since the CR is the source of truth) and is synced with the CR every ten minutes.

Important:

Before creating the ContainerVerticalScalePolicy CR, be sure to add this CRD to the container platform cluster. This CRD is mandatory and is intended to ensure the validity of the ContainerVerticalScalePolicy CR.

The following example shows a sample CR:

apiVersion: policy.turbonomic.io/v1alpha1
kind: ContainerVerticalScale
metadata:
  labels:
    app.kubernetes.io/name: containerverticalscale
    app.kubernetes.io/instance: containerverticalscale-sample
    app.kubernetes.io/part-of: turbo-policy
    app.kubernetes.io/managed-by: kustomize
    app.kubernetes.io/created-by: turbo-policy
  name: <Your_Value>
spec:
  settings:
    limits:
      cpu:
        max: <Your_Value>
        min: <Your_Value>
        recommendAboveMax: <Your_Value>
        recommendBelowMin: <Your_Value>
      memory:   
        max: <Your_Value>
        min: <Your_Value>
        recommendAboveMax: <Your_Value>
        recommendBelowMin: <Your_Value>
    requests:
      cpu:
        min: <Your_Value>
        recommendBelowMin: <Your_Value>
      memory:
        min: <Your_Value>
        recommendBelowMin: <Your_Value>
    increments:
      cpu: <Your_Value>
      memory: <Your_Value>
    observationPeriod:
      min: <Your_Value>
      max: <Your_Value>
    rateOfResize: <Your_Value>
    aggressiveness: <Your_Value>
    cpuThrottlingTolerance: <Your_Value>
  behavior:
    resize: <Your_Value>

The following sections describe the CR settings that you need to configure.

Name

Setting:
  name:

Example:
  name: container-vertical-scale-sample

Specify the name of the policy. In the user interface, this policy will be added to Settings > Policies.

CPU and memory limits

Settings:
    limits:
      cpu:
        max:
        min:
        recommendAboveMax:
        recommendBelowMin:
      memory:
        max:
        min:
        recommendAboveMax:
        recommendBelowMin:

Example:
    limits:
      cpu:
        max: 64
        min: 500m
        recommendAboveMax: true
        recommendBelowMin: false
      memory:
        max: 104857M
        min: 10M
        recommendAboveMax: true
        recommendBelowMin: true

Specify the capacity range (maximum and minimum thresholds) for resize limits.

  • Resize CPU limits, in cores

    Add m if the value is in mCores.

  • Resize memory limits, in GB

    Add M if the value is in MB.

For recommendAboveMax and recommendBelowMin values, set to either true or false.

  • true – When resize values fall outside the capacity range, Turbonomic will post resize actions for you to review. You can only execute these actions outside Turbonomic.

  • false – Turbonomic will not generate resize actions if resize values fall outside the capacity range.
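
As an illustration of how the capacity range and the two flags interact, the following Python sketch classifies a proposed limit value. It is not Turbonomic code: the function name classify_resize and the labels it returns are hypothetical, and the values are in cores, taken from the example CPU limits above.

```python
def classify_resize(target, minimum, maximum,
                    recommend_above_max, recommend_below_min):
    """Classify a proposed limit value against the capacity range.

    Hypothetical helper for illustration only; the returned labels
    are not Turbonomic terminology.
    """
    if target > maximum:
        # Above the range: posted for review only if the flag is true.
        return "review-only" if recommend_above_max else "no-action"
    if target < minimum:
        # Below the range: posted for review only if the flag is true.
        return "review-only" if recommend_below_min else "no-action"
    # Inside the range: no special handling by these flags.
    return "in-range"


# Example CPU limits: max 64 cores, min 500m (0.5 cores),
# recommendAboveMax: true, recommendBelowMin: false
print(classify_resize(80, 0.5, 64, True, False))    # review-only
print(classify_resize(0.25, 0.5, 64, True, False))  # no-action
print(classify_resize(8, 0.5, 64, True, False))     # in-range
```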

CPU and memory requests

Settings:
    requests:
      cpu:
        min:
        recommendBelowMin:
      memory:
        min:
        recommendBelowMin:

Example:
    requests:
      cpu:
        min: 10m
        recommendBelowMin: true
      memory:
        min: 10M
        recommendBelowMin: true

Specify the minimum thresholds for resize requests.

  • Resize CPU requests, in cores

    Add m if the value is in mCores.

  • Resize memory requests, in GB

    Add M if the value is in MB.

For recommendBelowMin values, set to either true or false.

  • true – When resize values fall below the minimum thresholds, Turbonomic will post resize actions for you to review. You can only execute these actions outside Turbonomic.

  • false – Turbonomic will not generate resize actions if resize values fall below the minimum thresholds.

Increment constants

Settings:
    increments:
      cpu:
      memory:

Example:
    increments:
      cpu: 100m
      memory: 128M

Turbonomic recommends resize changes in multiples of the specified increment constants.

  • Resize increment for CPU, in cores

    Add m if the value is in mCores.

  • Resize increment for memory, in GB

    Add M if the value is in MB.

For example, assume the vCPU request increment constant is 100 mCores and you have requested 800 mCores for a container. Turbonomic could recommend reducing the request by 100, down to 700 mCores.

For vMem, do not set the increment value to be lower than what is necessary for the container to operate. If the vMem increment value is too low, Turbonomic might allocate insufficient vMem. For a container that is underutilized, Turbonomic will reduce vMem allocation by the increment constant, but it will not leave a container with zero vMem. For example, if you set the memory increment to 128M, then Turbonomic cannot reduce the vMem to less than 128 MB.
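
The stepping behavior described above can be sketched in a few lines of Python. This is an illustrative model only, not Turbonomic's implementation. The function name reduce_by_increment is hypothetical, and the rule that a reduction is skipped when it would leave less than one increment is an assumption drawn from the 128 MB example.

```python
def reduce_by_increment(current, increment):
    """Reduce a value by one increment without dropping below one increment.

    Hypothetical helper; current and increment use the same unit
    (for example, mCores or MB).
    """
    reduced = current - increment
    # Assumption: skip the reduction if it would leave less than one increment.
    return reduced if reduced >= increment else current


print(reduce_by_increment(800, 100))  # 700 (the vCPU request example)
print(reduce_by_increment(256, 128))  # 128
print(reduce_by_increment(128, 128))  # 128 (cannot go lower)
```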

Rate of resize

Setting:
    rateOfResize:

Example:
    rateOfResize: high

When resizing resources, Turbonomic calculates the optimal values for vCPU and vMem, but it does not necessarily change to that value in one action. Turbonomic uses the rate of resize setting to determine how much of the change to make in a single action.

  • low

    Change the value by one increment constant only. For example, if the resize action calls for increasing vMem, and the increment constant is set at 128M, Turbonomic increases vMem by 128 MB.

  • medium

    Change the value by an amount that is 1/4 of the difference between the current value and the optimal value. For example, if the current vMem is 2 GB and the optimal vMem is 10 GB, the difference is 8 GB, so Turbonomic will raise vMem by 2 GB to 4 GB (or as close to that as the increment constant will allow).

  • high

    Change the value to be the optimal value. For example, if the current vMem is 2 GB and the optimal vMem is 8 GB, then Turbonomic will raise vMem to 8 GB (or as close to that as the increment constant will allow).
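
The three rates can be compared with a short Python sketch that reproduces the vMem examples above. It is a simplified model, not Turbonomic's actual algorithm. The function name next_resize_value is hypothetical, all values are in MB, and rounding the step down to a whole number of increments is an assumption about what "as close as the increment constant will allow" means.

```python
def next_resize_value(current_mb, optimal_mb, increment_mb, rate):
    """Return the value after one resize action (all values in MB).

    Hypothetical helper for illustration only.
    """
    diff = optimal_mb - current_mb
    if diff == 0:
        return current_mb
    direction = 1 if diff > 0 else -1
    if rate == "low":
        steps = 1  # exactly one increment
    elif rate == "medium":
        # 1/4 of the difference, in whole increments (assumed rounding down)
        steps = abs(diff) // 4 // increment_mb
    elif rate == "high":
        # the full difference, in whole increments (assumed rounding down)
        steps = abs(diff) // increment_mb
    else:
        raise ValueError(f"unknown rate: {rate}")
    return current_mb + direction * steps * increment_mb


GB = 1024  # MB per GB
print(next_resize_value(2 * GB, 8 * GB, 128, "low"))      # 2176 (2 GB + 128 MB)
print(next_resize_value(2 * GB, 10 * GB, 128, "medium"))  # 4096 (4 GB)
print(next_resize_value(2 * GB, 8 * GB, 128, "high"))     # 8192 (8 GB)
```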

Aggressiveness and observation period

Settings:
    observationPeriod:
      min:
      max:

    aggressiveness:

Example:
    observationPeriod:
      min: 1d
      max: 30d

    aggressiveness: p99

Turbonomic uses these settings to calculate utilization percentiles for vCPU and vMem. It then recommends actions to improve utilization based on the observed values for a given time period.

Aggressiveness

When evaluating vCPU and vMem performance, Turbonomic considers resource utilization as a percentage of capacity. The utilization drives actions to scale the available capacity either up or down. To measure utilization, the analysis considers a given utilization percentile. For example, assume a 99th percentile. The percentile utilization is the highest value that 99% of the observed samples fall below. Compare that to average utilization, which is the average of all the observed samples.

Using a percentile, Turbonomic can recommend more relevant actions. This is important in the cloud, so that analysis can better exploit the elasticity of the cloud. For scheduled policies, the more relevant actions will tend to remain viable when their execution is put off to a later time.

For example, consider decisions to reduce the capacity for vCPU on a container. Without using a percentile, Turbonomic never resizes below the recognized peak utilization. For most containers there are moments when peak vCPU reaches high levels. Assume utilization for a container peaked at 100% just once. Without the benefit of a percentile, Turbonomic will not reduce allocated vCPU for that container.

With Aggressiveness, instead of using the single highest utilization value, Turbonomic uses the percentile you set. For the previous example, assume a single vCPU burst to 100%, but for 99% of the samples vCPU never exceeded 50%. If you set Aggressiveness to 99th Percentile, then Turbonomic can see this as an opportunity to reduce vCPU allocation for the container.

In summary, a percentile evaluates the sustained resource utilization, and ignores bursts that occurred for a small portion of the samples. You can think of this as aggressiveness of resizing, as follows:

  • p100

    Least aggressive, recommended for critical workloads that need maximum guaranteed performance at all times

  • p99

    Recommended setting to achieve maximum performance

  • p90

    Most aggressive, recommended for non-production workloads that can stand higher resource utilization
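
To make the difference between peak-based and percentile-based sizing concrete, the following Python sketch computes percentiles for a set of utilization samples with a single burst to 100%. It uses the nearest-rank percentile method as an assumption; the document does not state which method Turbonomic uses, and the function name percentile is chosen for illustration.

```python
def percentile(samples, p):
    """Nearest-rank percentile: smallest value with at least p% of samples at or below it.

    Illustrative only; p is an integer from 1 to 100.
    """
    ordered = sorted(samples)
    # Integer ceiling of p * n / 100, avoiding floating-point rounding.
    rank = (p * len(ordered) + 99) // 100
    return ordered[rank - 1]


# 100 utilization samples (%): 99 at 50%, one burst to 100%.
samples = [50] * 99 + [100]

print(percentile(samples, 100))  # 100 (peak: no room to reduce vCPU)
print(percentile(samples, 99))   # 50  (the single burst is ignored)
print(percentile(samples, 90))   # 50
```

With p100 the single burst determines the result, while p99 reflects the sustained 50% utilization, which is what allows a reduction to be recommended.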

Max Observation Period

To refine the calculation of resource utilization percentiles, you can set the sample time to consider. Turbonomic uses historical data from up to the number of days that you specify as a sample period. (If the database has fewer days' data then it uses all of the stored historical data.)

A shorter period means there are fewer data points to account for when Turbonomic calculates utilization percentiles. This results in more dynamic, elastic resizing, while a longer period results in more stable or less elastic resizing. You can make the following settings:

  • 90d – Less elastic

  • 30d – Recommended

  • 7d – More elastic

Min Observation Period

This setting requires that historical data be available for a minimum number of days before Turbonomic generates an action based on the percentile set in Aggressiveness. This guarantees a minimum set of data points behind each action.

Especially for scheduled actions, it is important that resize calculations use enough historical data to generate actions that will remain viable even during a scheduled maintenance window. A maintenance window is usually set for "down" time, when utilization is low. If analysis uses enough historical data for an action, then the action is more likely to remain viable during the maintenance window.

  • None (empty) – More elastic

  • 1d – Recommended

  • 3d or 7d – Less elastic
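
The combined effect of the minimum and maximum observation periods can be sketched as follows. This is a simplified Python model, not Turbonomic's implementation. The function name select_history is hypothetical, and it assumes one stored value per day, ordered oldest first.

```python
def select_history(daily_samples, min_days, max_days):
    """Return the samples to analyze, or None if there is too little history.

    Hypothetical helper: daily_samples holds one value per day, oldest first.
    """
    if len(daily_samples) < min_days:
        # Not enough history yet: no percentile-based action is generated.
        return None
    # Use at most the last max_days days; with fewer days stored, use them all.
    return daily_samples[-max_days:]


history = list(range(40))                       # 40 days of stored data
print(len(select_history(history, 1, 30)))      # 30 (capped by max)
print(len(select_history(history[:10], 1, 30))) # 10 (all stored data)
print(select_history([], 1, 30))                # None (below min)
```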

Max CPU throttling tolerance

Setting:
    cpuThrottlingTolerance:

Example:
    cpuThrottlingTolerance: 20%

This value defines your acceptable level of throttling and directly impacts the resize actions generated on CPU limits.

A low percentage value indicates more sensitivity to throttling, while a high value indicates more tolerance for throttling and a higher risk of congestion.

Learn more about CPU throttling here.

Resize behavior

Setting:
  behavior:
    resize:

Example:
  behavior:
    resize: Manual

Specify the degree of automation for the generated resize actions.

To turn on resizes, set the value to Automatic, Manual, or Recommend. To turn off, set the value to Disabled. See Action Acceptance Modes for details.