Actions for AWS VMs

Turbonomic supports the following actions:

  • Scale

    Change the VM instance to use a different instance type or tier to optimize performance and costs.

    See additional information for scale actions.

  • Discount-related actions

    If you have a high percentage of on-demand VMs, you can reduce your monthly costs by increasing RI coverage. To increase coverage, scale VMs to instance types that have existing RI capacity.

    If you need more capacity, then Turbonomic will recommend actions to purchase additional RIs.

    Purchase actions should be taken along with the related VM scaling actions. To purchase discounts for VMs at their current sizes, run a Buy VM Reservation Plan.

  • Stop and start (also known as 'parking' actions)

    Stop a VM for a given period of time to reduce your cloud expenses, and then start it at a later time.

    For details, see Parking: Stop or Start Cloud Resources.

Controlling scale actions for AWS VMs

For scale actions, you can create policies to control the scale actions that Turbonomic recommends. In those policies, choose from the following options:

  • Cloud scale all – execute all scaling actions

  • Cloud scale for performance – only execute scaling actions that improve performance

  • Cloud scale for savings – only execute scaling actions that reduce costs

The default action acceptance mode for these actions is Manual. When you examine the pending actions, only actions that satisfy the policy are allowed to execute. All other actions are read-only.

When policy conflicts arise, Cloud scale all overrides the other two scaling options in most cases. For more information, see Default and User-defined Automation Policies.

Supported instance types for AWS VM scale actions

In the user interface, you can view the instance types that Turbonomic currently supports.

  1. Navigate to Settings > Policies.

  2. In the Policy Management page, search for and click Virtual Machine Defaults.

  3. In the Configure Virtual Machine Policy page:

    1. Scroll to the end of the page.

    2. Click Add Scaling Constraint.

    3. Choose Cloud Instance Types.

    4. Click Edit.

Expand an instance family to see individual instance types and the resources allocated to them. An example of an instance family is a1.

By default, Turbonomic considers all instance types that are currently available for scaling when making scaling decisions. However, you may have set up some of your cloud VMs to only scale to certain instance types to reduce complexity and cost, improve discount utilization, or meet application demand. To limit scaling to certain instance types, create policies for the affected cloud VMs and configure an inclusion or exclusion list as a scaling constraint.
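Before building an inclusion or exclusion list, it can help to see which instance types exist in a family and what resources they allocate. As one illustrative approach, the AWS CLI can list them directly (assumes a configured AWS CLI with valid credentials; the a1 family is just an example):

```shell
# List instance types in the a1 family with their default vCPU count
# and memory size (MiB); requires the AWS CLI and valid credentials.
aws ec2 describe-instance-types \
  --filters 'Name=instance-type,Values=a1.*' \
  --query 'InstanceTypes[].[InstanceType,VCpuInfo.DefaultVCpus,MemoryInfo.SizeInMiB]' \
  --output table
```

The same filter pattern works for any family (for example, `m5.*` or `g4dn.*`) when deciding which types to allow in a scaling constraint.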

Scale actions for AWS VMs running EC2 GPU instance types

Currently, Turbonomic supports the following EC2 GPU instance type families with Linux AMIs.

  • G3 instance family (based on NVIDIA Tesla M60 GPUs)

  • G4dn instance family (based on NVIDIA T4 GPUs)

  • G5 instance family (based on NVIDIA A10G Tensor Core GPUs)

  • G5g instance family (based on NVIDIA T4G Tensor Core GPUs)

  • P2 instance family (based on NVIDIA Kepler K80 GPUs)

  • P3/P3dn instance family (based on NVIDIA Volta V100 GPUs)

  • P4d instance family (based on NVIDIA A100 Tensor Core GPUs)

In the Turbonomic user interface, the default virtual machine policy shows the currently supported EC2 GPU instance type families and the resources allocated to them. These instance type families are grouped under the Accelerated Computing category. See the previous section for the steps to view the default virtual machine policy.

Turbonomic collects NVIDIA GPU metrics for VMs running these instance types and then uses these metrics to generate VM scale actions that optimize performance and costs.

Note:

For additional information about these metrics, see the NVIDIA documentation.

  • GPU Count – Number of GPU cards in use

    Actions: Scale down the number of GPU cards within the same instance type.

  • GPU Memory – Amount of GPU memory in use

    Actions: Scale GPU memory up or down within the same instance type.

  • GPU Memory BW (bandwidth) – Fraction of cycles where data was sent to or received from device memory, measured in GB/s

    Actions: Scale GPU memory BW up or down. This action moves the VM from its current instance type to another instance type with the same (or higher) GPU count and GPU memory capacity.

  • GPU FP16 – Fraction of cycles the FP16 (half precision) pipe was active

    Actions: Scale GPU FP16 up or down. This action moves the VM from its current instance type to another instance type with the same (or higher) GPU count and GPU memory capacity.

  • GPU FP32 – Fraction of cycles the FP32 (single precision) pipe was active

    Actions: Scale GPU FP32 up or down. This action moves the VM from its current instance type to another instance type with the same (or higher) GPU count and GPU memory capacity.

  • GPU FP64 – Fraction of cycles the FP64 (double precision) pipe was active

    Actions: Scale GPU FP64 up or down. This action moves the VM from its current instance type to another instance type with the same (or higher) GPU count and GPU memory capacity.

  • GPU Tensor – Fraction of cycles the Tensor (mixed/multi-precision) pipe was active

    Actions: Scale GPU Tensor up or down. This action moves the VM from its current instance type to another instance type with the same (or higher) GPU count and GPU memory capacity.

Note:

For GPU VMs with accelerator cards, GPU card information is available when you set the scope to a VM and view the Entity Information chart. For example, the Accelerator Model field shows the GPU model.

To enable metrics discovery, configure AWS CloudWatch and NVIDIA Data Center GPU Manager (DCGM) as described in this topic.
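As a quick sanity check that DCGM is reporting GPU metrics on a VM, you can query it from the guest OS. This is an illustrative sketch, not the Turbonomic configuration itself; it assumes the DCGM `dcgmi` tool is installed, and the field IDs and profiling-metric availability depend on your DCGM version and GPU model:

```shell
# Verify that DCGM can see the GPUs on this instance.
dcgmi discovery -l

# Sample GPU utilization (field 203), framebuffer memory used (252),
# and tensor-pipe activity (1004) five times, once per second.
# Profiling fields (1001 and up) require a supported data-center GPU.
dcgmi dmon -e 203,252,1004 -c 5
```

If these commands return data, the metrics pipeline from DCGM to CloudWatch has a working source to collect from.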

Turbonomic can also recommend actions that scale standard VM resources (such as vCPU and vMem) to the supported GPU instance types and the G4ad instance family (based on AMD Radeon Pro V520 GPUs).

When generating or executing scale actions, Turbonomic considers the following policies and settings as constraints:

  • Read-only tier exclusion policies

    Turbonomic automatically creates these policies and displays them in the Policy Management page (Settings > Policies).

    • Cross-target policies

      Cross-target policies ensure that VMs with certain GPU types only scale to an instance type with the same (or higher) GPU card count and memory per card.

      An example policy is AWS GPU Nvidia - Cloud Compute Tier Exclusion Policy.

    • Per-target policies

      Per-target policies ensure that any VMs in GPU-supported instance families do not scale to instance families that do not support GPUs.

      An example policy is Cloud Compute Tier AWS:gpu - Cloud Compute Tier Exclusion Policy.

  • Scaling target utilization

    Turbonomic uses scaling target utilization values for GPU resources in conjunction with aggressiveness constraints to control scale actions for VMs. You can configure utilization values in automation policies for cloud VMs. For more information, see this topic.

  • Ignore NVIDIA GPU compute capability constraints

    This constraint is a setting that you can choose to turn on in automation policies for cloud VMs. When turned on, scale actions that change the GPU compute capability of VMs are allowed to execute in Turbonomic. When turned off, actions are only executable in the cloud provider console (web-based UI or CLI). For more information, see this topic.

Support for AWS EC2 accelerator instance types

Turbonomic can recommend actions to scale standard VM resources (such as vCPU and vMem) to the following AWS EC2 Accelerator instance types.

  • Inf1 instance family (based on AWS Inferentia chips)

  • Inf2 instance family (based on AWS Inferentia2 chips)

Turbonomic also creates the appropriate read-only tier exclusion policies and displays them in the Policy Management page (Settings > Policies).

  • Cross-target policies

    Cross-target policies ensure that AWS VMs with certain Accelerator types only scale to an instance type with the same Accelerator configuration (card count and memory per card). Policies include:

    • AWS ML_ACCELERATOR - Inferentia1 - Cloud Compute Tier Exclusion Policy

    • AWS ML_ACCELERATOR - Inferentia2 - Cloud Compute Tier Exclusion Policy

  • Per-target policies

    Per-target policies ensure that any VMs in Inferentia instance families do not scale to other instance families. Policies include:

    • Cloud Compute Tier AWS:inf1 - Cloud Compute Tier Exclusion Policy

    • Cloud Compute Tier AWS:inf2 - Cloud Compute Tier Exclusion Policy

Scaling prerequisites for AWS VMs

In AWS, some instance types require VMs to be configured in specific ways before they can scale to those instance types. If Turbonomic recommends scaling a VM that is not suitably configured to one of these instance types, it sets the action to Recommend and describes the reason. Turbonomic will not automate the action, even if you have set the action acceptance mode for that scope to Automatic. You can execute the action manually after you have properly configured the VM.

Note that if you have VMs that you cannot configure to support these requirements, you can set up a policy to keep Turbonomic from making these recommendations. Create a group that contains these VMs, and then create a policy for that scope. In the policy, exclude instance types by configuring the Cloud Instance Types scaling constraint. For information about excluding instance types, see Cloud Instance Types.

The instance requirements that Turbonomic recognizes are:

  • Enhanced Network Adapters

    Some VMs can run on instances that support Enhanced Networking via the Elastic Network Adapter (ENA), while others can run on instances that do not offer this support. Turbonomic can recommend scaling a VM that does not support ENA onto an instance that does. However, you must enable ENA on the VM before executing the scaling action. If you scale a non-ENA VM to an instance that requires ENA, then AWS cannot start up the VM after the scaling action.

    For information about ENA configuration, visit this page.

  • Linux AMI Virtualization Type

    An Amazon Linux AMI can use ParaVirtual (PV) or Hardware Virtual Machine (HVM) virtualization. Turbonomic can recommend scaling a PV VM to an HVM instance that does not include the necessary PV drivers.

    To check the virtualization type of an instance, open the Amazon EC2 console to the Details pane, and review the Virtualization field for that instance.

  • 64-bit vs 32-bit

    Not all AWS instance types support 32-bit VMs. Turbonomic can recommend scaling a 32-bit VM to an instance type that only supports 64-bit platforms.

  • NVMe Block

    Some instances expose EBS volumes as NVMe block devices, but not all VMs are configured with NVMe drivers. Turbonomic can recommend scaling such a VM to an instance that supports NVMe. Before executing the action, you must install the NVMe drivers on the VM.
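Before executing one of these actions manually, you can verify the prerequisites above from the AWS CLI and from inside the guest OS. The following is a minimal check sketch under stated assumptions: the instance ID is a placeholder, the AWS CLI is configured with valid credentials, and the guest is Linux:

```shell
# Placeholder instance ID; replace with your own.
INSTANCE_ID=i-0123456789abcdef0

# ENA: check whether enhanced networking is enabled on the instance.
aws ec2 describe-instances --instance-ids "$INSTANCE_ID" \
  --query 'Reservations[].Instances[].EnaSupport'

# Virtualization type: returns hvm or paravirtual.
aws ec2 describe-instances --instance-ids "$INSTANCE_ID" \
  --query 'Reservations[].Instances[].VirtualizationType'

# Architecture: returns x86_64, i386, or arm64.
aws ec2 describe-instances --instance-ids "$INSTANCE_ID" \
  --query 'Reservations[].Instances[].Architecture'

# Inside the guest OS: confirm the ENA and NVMe kernel modules exist.
modinfo ena  >/dev/null 2>&1 && echo "ENA driver present"
modinfo nvme >/dev/null 2>&1 && echo "NVMe driver present"
```

If a check fails (for example, `EnaSupport` is false or a driver is missing), configure the VM before executing the recommended scale action.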

In addition, Turbonomic recognizes processor types that you currently use for your VM. For scale actions, Turbonomic keeps your VMs on instance types with compatible processors. For example, if your VM is on an ARM-based instance, then Turbonomic will only recommend scaling to other compatible ARM-based instance types.

Scaling storage for AWS VMs

When a VM needs more storage capacity, Turbonomic recommends actions to scale its volume to an instance type that provides more storage. Note that AWS supports both Elastic Block Store (EBS) and Instance storage. Turbonomic recognizes these storage types as it recommends volume actions.

If the root storage for your VM is Instance Storage, then Turbonomic will not recommend an action. This is because Instance Storage is ephemeral, and such an action would cause the VM to lose all the stored data.

If the root storage is EBS, then Turbonomic recommends volume actions. EBS is persistent, and the data will remain after the action. However, if the VM uses Instance Storage for extra storage, then Turbonomic does not include that storage in its calculations or actions.

Nodes in AWS EMR clusters

Turbonomic treats nodes in AWS EMR clusters like regular VMs. As such, it could incorrectly generate scale actions for such nodes. After a node action executes, AWS detects the change as a defect, terminates the node, and replaces it with a new instance of the original size. To avoid this issue, disable scale actions for nodes in EMR clusters.

AWS automatically assigns system tags to EMR clusters. To disable scale actions, create a VM group that uses these tags as a filter, and then create a VM policy that disables the Cloud Scale All action type for the VM group.
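To see which instances such a group would cover, you can query by the EMR system tag first. As one illustrative approach with the AWS CLI (assumes valid credentials; AWS applies the `aws:elasticmapreduce:job-flow-id` system tag to EMR cluster nodes):

```shell
# List instance IDs that belong to any EMR cluster, together with
# the cluster (job flow) ID, by filtering on the EMR system tag.
aws ec2 describe-instances \
  --filters 'Name=tag-key,Values=aws:elasticmapreduce:job-flow-id' \
  --query 'Reservations[].Instances[].[InstanceId,Tags[?Key==`aws:elasticmapreduce:job-flow-id`]|[0].Value]' \
  --output table
```

The same tag key can then serve as the filter when you define the VM group in Turbonomic.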