Actions for AWS VMs

Turbonomic supports the following actions:

  • Scale

    Change the VM instance to use a different instance type or tier to optimize performance and costs.

    See additional information for scale actions.

  • Discount-related actions

    If you have a high percentage of on-demand VMs, you can reduce your monthly costs by increasing RI coverage. To increase coverage, scale VMs to instance types that have existing RI capacity.

    If you need more capacity, then Turbonomic will recommend actions to purchase additional RIs.

    Purchase actions should be taken along with the related VM scaling actions. To purchase discounts for VMs at their current sizes, run a Buy VM Reservation Plan.

  • Stop and start (also known as 'parking' actions)

    Stop a VM for a given period of time to reduce your cloud expenses, and then start it at a later time.

    For details, see Parking: Stop or Start Cloud Resources.

Controlling scale actions for AWS VMs

For scale actions, you can create policies to control the scale actions that Turbonomic recommends. In those policies, choose from the following options:

  • Cloud scale all – execute all scaling actions

  • Cloud scale for performance – only execute scaling actions that improve performance

  • Cloud scale for savings – only execute scaling actions that reduce costs

The default action acceptance mode for these actions is Manual. When you examine the pending actions, only actions that satisfy the policy are allowed to execute. All other actions are read-only.

When policy conflicts arise, Cloud scale all overrides the other two scaling options in most cases. For more information, see Default and User-defined Automation Policies.

Supported instance types for AWS VM scale actions

In the user interface, you can view the instance types that Turbonomic currently supports.

  1. Navigate to Settings > Policies.

  2. In the Policy Management page, search for and click Virtual Machine Defaults.

  3. In the Configure Virtual Machine Policy page:

    1. Scroll to the end of the page.

    2. Click Add Scaling Constraint.

    3. Choose Cloud Instance Types.

    4. Click Edit.

Expand an instance family to see individual instance types and the resources allocated to them. An example of an instance family is a1.

By default, Turbonomic considers all instance types that are currently available for scaling when making scaling decisions. However, you may have set up some of your cloud VMs to only scale to certain instance types to reduce complexity and cost, improve discount utilization, or meet application demand. To limit scaling to certain instance types, create policies for the affected cloud VMs and configure an inclusion or exclusion list as a scaling constraint.
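Before building an inclusion or exclusion list, it can help to see which instance types exist in a family and what resources they allocate. As one illustrative approach, the AWS CLI can list them directly (assumes a configured AWS CLI with valid credentials; the a1 family is just an example):

```shell
# List instance types in the a1 family with their default vCPU count
# and memory size (MiB); requires the AWS CLI and valid credentials.
aws ec2 describe-instance-types \
  --filters 'Name=instance-type,Values=a1.*' \
  --query 'InstanceTypes[].[InstanceType,VCpuInfo.DefaultVCpus,MemoryInfo.SizeInMiB]' \
  --output table
```

The same filter pattern works for any family (for example, `m5.*` or `g4dn.*`) when deciding which types to allow in a scaling constraint.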

Scale actions for AWS VMs running EC2 GPU instance types

Currently, Turbonomic supports the following EC2 GPU instance type families with Linux AMIs.

  • G3 instance family (based on NVIDIA Tesla M60 GPUs)

  • G4dn instance family (based on NVIDIA T4 GPUs)

  • G5 instance family (based on NVIDIA A10G Tensor Core GPUs)

  • G5g instance family (based on NVIDIA T4G Tensor Core GPUs)

  • P2 instance family (based on NVIDIA Kepler K80 GPUs)

  • P3/P3dn instance family (based on NVIDIA Volta V100 GPUs)

  • P4d instance family (based on NVIDIA A100 Tensor Core GPUs)

In the Turbonomic user interface, the default virtual machine policy shows the currently supported EC2 GPU instance type families and the resources allocated to them. These instance type families are grouped under the Accelerated Computing category. See the previous section for the steps to view the default virtual machine policy.

Turbonomic collects NVIDIA GPU metrics for VMs running these instance types and then uses these metrics to generate VM scale actions that optimize performance and costs.

Note:

For additional information about these metrics, see the NVIDIA documentation.

  • GPU Count – Number of GPU cards in use

    Actions: Scale down the number of GPU cards within the same instance type.

  • GPU Memory – Amount of GPU memory in use

    Actions: Scale GPU memory up or down within the same instance type.

  • GPU Memory BW (bandwidth) – Fraction of cycles where data was sent to or received from device memory, measured in GB/s

    Actions: Scale GPU memory BW up or down. This action moves the VM from its current instance type to another instance type with the same (or higher) GPU count and GPU memory capacity.

  • GPU FP16 – Fraction of cycles the FP16 (half precision) pipe was active

    Actions: Scale GPU FP16 up or down. This action moves the VM from its current instance type to another instance type with the same (or higher) GPU count and GPU memory capacity.

  • GPU FP32 – Fraction of cycles the FP32 (single precision) pipe was active

    Actions: Scale GPU FP32 up or down. This action moves the VM from its current instance type to another instance type with the same (or higher) GPU count and GPU memory capacity.

  • GPU FP64 – Fraction of cycles the FP64 (double precision) pipe was active

    Actions: Scale GPU FP64 up or down. This action moves the VM from its current instance type to another instance type with the same (or higher) GPU count and GPU memory capacity.

  • GPU Tensor – Fraction of cycles the Tensor (mixed/multi-precision) pipe was active

    Actions: Scale GPU Tensor up or down. This action moves the VM from its current instance type to another instance type with the same (or higher) GPU count and GPU memory capacity.

Note:

For GPU VMs with accelerator cards, GPU card information is available when you set the scope to a VM and view the Entity Information chart. For example, the Accelerator Model field shows the GPU model.

To enable metrics discovery, configure AWS CloudWatch and NVIDIA Data Center GPU Manager (DCGM) as described in this topic.
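As a quick sanity check that DCGM is reporting GPU metrics on a VM, you can query it from the guest OS. This is an illustrative sketch, not the Turbonomic configuration itself; it assumes the DCGM `dcgmi` tool is installed, and the field IDs and profiling-metric availability depend on your DCGM version and GPU model:

```shell
# Verify that DCGM can see the GPUs on this instance.
dcgmi discovery -l

# Sample GPU utilization (field 203), framebuffer memory used (252),
# and tensor-pipe activity (1004) five times, once per second.
# Profiling fields (1001 and up) require a supported data-center GPU.
dcgmi dmon -e 203,252,1004 -c 5
```

If these commands return data, the metrics pipeline from DCGM to CloudWatch has a working source to collect from.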

Turbonomic can also recommend actions that scale standard VM resources (such as vCPU and vMem) to the supported GPU instance types and the G4ad instance family (based on AMD Radeon Pro V520 GPUs).

When generating or executing scale actions, Turbonomic considers the following policies and settings as constraints:

  • Read-only tier exclusion policies

    Turbonomic automatically creates these policies and displays them in the Policy Management page (Settings > Policies).

    • Cross-target policies

      Cross-target policies ensure that VMs with certain GPU types only scale to an instance type with the same (or higher) GPU card count and memory per card.

      An example policy is AWS GPU Nvidia - Cloud Compute Tier Exclusion Policy.

    • Per-target policies

      Per-target policies ensure that any VMs in GPU-supported instance families do not scale to instance families that do not support GPUs.

      An example policy is Cloud Compute Tier AWS:gpu - Cloud Compute Tier Exclusion Policy.

  • Scaling target utilization

    Turbonomic uses scaling target utilization values for GPU resources in conjunction with aggressiveness constraints to control scale actions for VMs. You can configure utilization values in automation policies for cloud VMs. For more information, see this topic.

  • Ignore NVIDIA GPU compute capability constraints

    This constraint is a setting that you can choose to turn on in automation policies for cloud VMs. When turned on, scale actions that change the GPU compute capability of VMs are allowed to execute in Turbonomic. When turned off, actions are only executable in the cloud provider console (web-based UI or CLI). For more information, see this topic.

Support for AWS EC2 accelerator instance types

Turbonomic can recommend actions to scale standard VM resources (such as vCPU and vMem) to the following AWS EC2 Accelerator instance types.

  • Inf1 instance family (based on AWS Inferentia chips)

  • Inf2 instance family (based on AWS Inferentia2 chips)

Turbonomic also creates the appropriate read-only tier exclusion policies and displays them in the Policy Management page (Settings > Policies).

  • Cross-target policies

    Cross-target policies ensure that AWS VMs with certain Accelerator types only scale to an instance type with the same Accelerator configuration (card count and memory per card). Policies include:

    • AWS ML_ACCELERATOR - Inferentia1 - Cloud Compute Tier Exclusion Policy

    • AWS ML_ACCELERATOR - Inferentia2 - Cloud Compute Tier Exclusion Policy

  • Per-target policies

    Per-target policies ensure that any VMs in Inferentia instance families do not scale to other instance families. Policies include:

    • Cloud Compute Tier AWS:inf1 - Cloud Compute Tier Exclusion Policy

    • Cloud Compute Tier AWS:inf2 - Cloud Compute Tier Exclusion Policy

Scaling prerequisites for AWS VMs

In AWS, some instance types require VMs to be configured in specific ways before they can scale to those instance types. If Turbonomic recommends scaling a VM that is not suitably configured to one of these instance types, it sets the action to Recommend and describes the reason. Turbonomic will not automate the action, even if you have set the action acceptance mode for that scope to Automatic. You can execute the action manually after you have properly configured the VM.

Note that if you have VMs that you cannot configure to support these requirements, you can set up a policy to keep Turbonomic from making these recommendations. Create a group that contains these VMs, and then create a policy for that scope. In the policy, exclude instance types by configuring the Cloud Instance Types scaling constraint. For information about excluding instance types, see Cloud Instance Types.

The instance requirements that Turbonomic recognizes are:

  • Enhanced Network Adapters

    Some VMs can run on instances that support Enhanced Networking via the Elastic Network Adapter (ENA), while others can run on instances that do not offer this support. Turbonomic can recommend scaling a VM that does not support ENA onto an instance that does. However, you must enable ENA on the VM before executing the scaling action. If you scale a non-ENA VM to an instance that requires ENA, then AWS cannot start up the VM after the scaling action.

    For information about ENA configuration, visit this page.

  • Linux AMI Virtualization Type

    An Amazon Linux AMI can use ParaVirtual (PV) or Hardware Virtual Machine (HVM) virtualization. Turbonomic can recommend scaling a PV VM to an HVM instance that does not include the necessary PV drivers.

    To check the virtualization type of an instance, open the Amazon EC2 console to the Details pane, and review the Virtualization field for that instance.

  • 64-bit vs 32-bit

    Not all AWS instance types support 32-bit VMs. Turbonomic can recommend scaling a 32-bit VM to an instance type that only supports 64-bit platforms.

  • NVMe Block

    Some instances expose EBS volumes as NVMe block devices, but not all VMs are configured with NVMe drivers. Turbonomic can recommend scaling such a VM to an instance that supports NVMe. Before executing the action, you must install the NVMe drivers on the VM.
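Before executing one of these actions manually, you can verify the prerequisites above from the AWS CLI and from inside the guest OS. The following is a minimal check sketch under stated assumptions: the instance ID is a placeholder, the AWS CLI is configured with valid credentials, and the guest is Linux:

```shell
# Placeholder instance ID; replace with your own.
INSTANCE_ID=i-0123456789abcdef0

# ENA: check whether enhanced networking is enabled on the instance.
aws ec2 describe-instances --instance-ids "$INSTANCE_ID" \
  --query 'Reservations[].Instances[].EnaSupport'

# Virtualization type: returns hvm or paravirtual.
aws ec2 describe-instances --instance-ids "$INSTANCE_ID" \
  --query 'Reservations[].Instances[].VirtualizationType'

# Architecture: returns x86_64, i386, or arm64.
aws ec2 describe-instances --instance-ids "$INSTANCE_ID" \
  --query 'Reservations[].Instances[].Architecture'

# Inside the guest OS: confirm the ENA and NVMe kernel modules exist.
modinfo ena  >/dev/null 2>&1 && echo "ENA driver present"
modinfo nvme >/dev/null 2>&1 && echo "NVMe driver present"
```

If a check fails (for example, `EnaSupport` is false or a driver is missing), configure the VM before executing the recommended scale action.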

In addition, Turbonomic recognizes processor types that you currently use for your VM. For scale actions, Turbonomic keeps your VMs on instance types with compatible processors. For example, if your VM is on an ARM-based instance, then Turbonomic will only recommend scaling to other compatible ARM-based instance types.

Scaling storage for AWS VMs

When a VM needs more storage capacity, Turbonomic recommends actions to scale its volume to an instance type that provides more storage. Note that AWS supports both Elastic Block Store (EBS) and Instance storage. Turbonomic recognizes these storage types as it recommends volume actions.

If the root storage for your VM is Instance Storage, then Turbonomic will not recommend an action. This is because Instance Storage is ephemeral, and such an action would cause the VM to lose all the stored data.

If the root storage is EBS, then Turbonomic recommends volume actions. EBS is persistent, and the data will remain after the action. However, if the VM uses Instance Storage for extra storage, then Turbonomic does not include that storage in its calculations or actions.

Nodes in AWS EMR clusters

Turbonomic treats nodes in AWS EMR clusters like regular VMs. As such, it could incorrectly generate scale actions for such nodes. After a node action executes, AWS detects the change as a defect, terminates the node, and replaces it with a new instance of the original size. To avoid this issue, disable scale actions for nodes in EMR clusters.

AWS automatically assigns system tags to EMR clusters. To disable scale actions, create a VM group that uses these tags as a filter, and then create a VM policy that disables the Cloud Scale All action type for the VM group.
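To see which instances such a group would cover, you can query by the EMR system tag first. As one illustrative approach with the AWS CLI (assumes valid credentials; AWS applies the `aws:elasticmapreduce:job-flow-id` system tag to EMR cluster nodes):

```shell
# List instance IDs that belong to any EMR cluster, together with
# the cluster (job flow) ID, by filtering on the EMR system tag.
aws ec2 describe-instances \
  --filters 'Name=tag-key,Values=aws:elasticmapreduce:job-flow-id' \
  --query 'Reservations[].Instances[].[InstanceId,Tags[?Key==`aws:elasticmapreduce:job-flow-id`]|[0].Value]' \
  --output table
```

The same tag key can then serve as the filter when you define the VM group in Turbonomic.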