Actions for Azure VMs
Turbonomic supports the following actions:
- Scale
  Change the VM instance to use a different instance type or tier to optimize performance and costs.
  See additional information for scale actions.
- Discount-related actions
  If you have a high percentage of on-demand VMs, you can reduce your monthly costs by increasing Azure reservation coverage. To increase coverage, scale VMs to instance types that have existing capacity. If you need more capacity, Turbonomic recommends actions to purchase additional reservations.
  Take purchase actions along with the related VM scaling actions. To purchase discounts for VMs at their current sizes, run a Buy VM Reservation Plan.
- Stop and start (also known as parking actions)
  Stop a VM for a given period of time to reduce your cloud expenses, and then start it at a later time. For details, see Parking: Stop or Start Cloud Resources.
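The cost effect of a parking action can be sketched as a simple estimate. This is illustrative only, not a calculation Turbonomic performs; the hourly rate, schedule, and helper names below are assumptions:

```python
# Illustrative estimate of savings from parking a VM outside business hours.
# Azure does not bill compute for a stopped (deallocated) VM, so the saving
# is the on-demand compute rate times the parked hours.

def parked_hours_per_month(hours_stopped_per_day: int, days_per_month: int = 30) -> int:
    """Total hours per month that the VM is stopped."""
    return hours_stopped_per_day * days_per_month

def monthly_savings(on_demand_rate_per_hour: float, hours_stopped_per_day: int) -> float:
    """Compute cost avoided while the VM is deallocated."""
    return on_demand_rate_per_hour * parked_hours_per_month(hours_stopped_per_day)

# Example: a VM billed at $0.20/hour, parked 12 hours per night.
print(round(monthly_savings(0.20, 12), 2))  # 72.0
```

Note that stopped VMs still accrue charges for attached disks and reserved IP addresses, so the actual saving is the compute portion only.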
Controlling scale actions for Azure VMs
You can create policies to control the scale actions that Turbonomic recommends. In those policies, choose from the following options:
- Cloud scale all – execute all scaling actions
- Cloud scale for performance – only execute scaling actions that improve performance
- Cloud scale for savings – only execute scaling actions that reduce costs
The default action acceptance mode for these actions is Manual. When you examine the pending actions, only actions that satisfy the policy are allowed to execute. All other actions are read-only.
When policy conflicts arise, Cloud scale all overrides the other two scaling options in most cases. For more information, see Default and User-defined Automation Policies.
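The gating effect of the three options can be sketched as follows. The action classification and names here are simplified illustrations, not Turbonomic's internal model or API:

```python
# Sketch of how the three policy options gate scale actions: an action that
# does not satisfy the policy stays visible but read-only.

from dataclasses import dataclass

@dataclass
class ScaleAction:
    vm: str
    improves_performance: bool  # e.g. relieves vCPU congestion
    reduces_cost: bool          # e.g. moves to a cheaper instance type

def is_executable(action: ScaleAction, policy: str) -> bool:
    """Return True if the action may execute under the given policy option."""
    if policy == "cloud_scale_all":
        return True
    if policy == "cloud_scale_for_performance":
        return action.improves_performance
    if policy == "cloud_scale_for_savings":
        return action.reduces_cost
    raise ValueError(f"unknown policy: {policy}")

scale_up = ScaleAction("vm-1", improves_performance=True, reduces_cost=False)
scale_down = ScaleAction("vm-2", improves_performance=False, reduces_cost=True)
print(is_executable(scale_up, "cloud_scale_for_savings"))    # False: read-only
print(is_executable(scale_down, "cloud_scale_for_savings"))  # True
```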
Supported instance types for Azure VM scale actions
In the user interface, you can view the instance types that Turbonomic currently supports.
1. Navigate to Settings > Policies.
2. In the Policy Management page, search for and click Virtual Machine Defaults.
3. In the Configure Virtual Machine Policy page:
   a. Scroll to the end of the page.
   b. Click Add Scaling Constraint.
   c. Choose Cloud Instance Types.
   d. Click Edit.
4. Expand an instance family to see individual instance types and the resources allocated to them. An example of an instance family is B-series.
By default, Turbonomic considers all instance types that are currently available for scaling when making scaling decisions. However, you may have set up some of your cloud VMs to only scale to certain instance types to reduce complexity and cost, improve discount utilization, or meet application demand. To limit scaling to certain instance types, create policies for the affected cloud VMs and configure an inclusion or exclusion list as a scaling constraint.
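The effect of such a constraint can be sketched as a filter over the instance-type catalog. The catalog contents and function names below are illustrative assumptions:

```python
# Sketch of an inclusion/exclusion list applied as a scaling constraint:
# before candidates are evaluated, the policy prunes the catalog.

CATALOG = ["Standard_B2s", "Standard_B4ms", "Standard_D4s_v5", "Standard_E8s_v5"]

def candidate_types(catalog, include=None, exclude=()):
    """Apply an inclusion list (if given), then remove any excluded types."""
    types = [t for t in catalog if include is None or t in include]
    return [t for t in types if t not in exclude]

# Limit scaling to the B-series family only:
print(candidate_types(CATALOG, include={"Standard_B2s", "Standard_B4ms"}))
# ['Standard_B2s', 'Standard_B4ms']
```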
Scale actions for Azure VMs running GPU instance types
Currently, Turbonomic supports the following GPU instance type series with Linux source images.
- NC A100 v4-series (based on NVIDIA A100 PCIe GPUs)
- NCads H100 v5-series (based on NVIDIA H100 NVL GPUs)
- NCasT4_v3-series (based on NVIDIA Tesla T4 GPUs)
- NCv3-series (based on NVIDIA Tesla V100 GPUs)
- NDv2-series (based on NVIDIA Tesla V100 GPUs)
- NVadsA10 v5-series (based on NVIDIA A10 GPUs)
- NVv3-series (based on NVIDIA Tesla M60 GPUs)
VMs that run NVadsA10 v5-series have partial GPUs. For these VMs, Turbonomic discovers partial GPU information and then adjusts GPU capacities (such as GPU memory) accordingly. Capacity information is available when you set the scope to an Azure GPU VM and view the Capacity and Usage chart.
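The partial-GPU adjustment can be sketched as a fraction applied to full-card capacity. The 24 GiB figure is the published memory size of a full NVIDIA A10 card; the helper itself is an illustration, not Turbonomic code:

```python
# Sketch of partial-GPU capacity adjustment. For example, a small NVadsA10 v5
# size carries 1/6 of an A10 GPU, so its visible GPU memory is 1/6 of the card.

from fractions import Fraction

A10_MEMORY_GIB = 24  # memory of one full NVIDIA A10 card

def effective_gpu_memory_gib(gpu_fraction: Fraction) -> float:
    """GPU memory capacity visible to a VM that has a partial GPU."""
    return float(gpu_fraction * A10_MEMORY_GIB)

print(effective_gpu_memory_gib(Fraction(1, 6)))  # 4.0
```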
In the Turbonomic user interface, the default virtual machine policy shows the currently supported GPU instance type series and the resources allocated to them. These instance series are grouped under the Accelerated Computing category. GPU instance series that Azure retired are grouped under the GPU (Retired) category. See the previous section for the steps to view the default virtual machine policy.
Turbonomic collects NVIDIA GPU metrics for VMs running these instance types and then uses these metrics to generate VM scale actions that optimize performance and costs.
For additional information about these metrics, see the NVIDIA documentation.
| Metric | Description | Actions |
|---|---|---|
| GPU Count | Number of GPU cards in use | Scale down the number of GPU cards within the same instance type |
| GPU Memory | Amount of GPU memory in use | Scale GPU memory up or down within the same instance type |
| GPU Memory BW (bandwidth) | Fraction of cycles where data was sent to or received from device memory, measured in GB/s | Scale GPU memory BW up or down. This action moves the VM from its current instance type to another instance type with the same (or higher) GPU count and GPU memory capacity. |
| GPU FP16 | Fraction of cycles the FP16 (half precision) pipe was active | Scale GPU FP16 up or down. This action moves the VM from its current instance type to another instance type with the same (or higher) GPU count and GPU memory capacity. |
| GPU FP32 | Fraction of cycles the FP32 (single precision) pipe was active | Scale GPU FP32 up or down. This action moves the VM from its current instance type to another instance type with the same (or higher) GPU count and GPU memory capacity. |
| GPU FP64 | Fraction of cycles the FP64 (double precision) pipe was active | Scale GPU FP64 up or down. This action moves the VM from its current instance type to another instance type with the same (or higher) GPU count and GPU memory capacity. |
| GPU Tensor | Fraction of cycles the Tensor (mixed/multi-precision) pipe was active | Scale GPU Tensor up or down. This action moves the VM from its current instance type to another instance type with the same (or higher) GPU count and GPU memory capacity. |
For GPU VMs with accelerator cards, GPU card information is available when you set the scope to a VM and view the Entity Information chart. For example, the Accelerator Model field shows the GPU model.
To enable the discovery of GPU metrics, configure NVIDIA Data Center GPU Manager (DCGM) as described in this topic.
Turbonomic can also recommend actions that scale standard VM resources (such as vCPU and vMem) to the supported GPU instance types.
When generating or executing scale actions, Turbonomic considers the following policies and settings as constraints:
- Read-only tier exclusion policies
  Turbonomic automatically creates these policies and displays them in the Policy Management page (Settings > Policies).
  - Cross-target policies
    Cross-target policies ensure that VMs with certain GPU types only scale to an instance type with the same (or higher) GPU card count and memory per card. An example policy is Azure GPU NVIDIA - Cloud Compute Tier Exclusion Policy.
  - Per-target policies
    Per-target policies ensure that VMs in GPU-supported instance families do not scale to instance families that do not support GPUs. An example policy is Cloud Compute Tier Azure:gpu - Cloud Compute Tier Exclusion Policy.
- Scaling target utilization
  Turbonomic uses scaling target utilization values for GPU resources, in conjunction with aggressiveness constraints, to control scale actions for VMs. You can configure utilization values in automation policies for cloud VMs. For more information, see this topic.
- Ignore NVIDIA GPU compute capability constraints
  This constraint is a setting that you can turn on in automation policies for cloud VMs. When turned on, scale actions that change the GPU compute capability of VMs are allowed to execute in Turbonomic. When turned off, these actions are executable only in the cloud provider console (web-based UI or CLI). For more information, see this topic.
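Two of the constraints above can be sketched together: the scaling target utilization determines the capacity a VM needs, and the cross-target tier exclusion keeps the VM on tiers with the same (or higher) GPU count and GPU memory. All tier data, the 70% target, and the function names below are illustrative assumptions:

```python
# Sketch: target utilization drives desired capacity; the GPU constraint
# prunes candidate tiers to those that do not reduce GPU count or memory.

def desired_capacity(avg_usage: float, target_utilization: float) -> float:
    """Capacity needed so that average usage sits at the target utilization."""
    return avg_usage / target_utilization

TIERS = {
    # name: (gpu_count, gpu_memory_gib) -- hypothetical figures
    "NC24ads_A100_v4": (1, 80),
    "NC48ads_A100_v4": (2, 160),
    "NC96ads_A100_v4": (4, 320),
}

def allowed_tiers(current_count: int, current_mem: int) -> list:
    """Tiers with same-or-higher GPU count and GPU memory than the current tier."""
    return [name for name, (count, mem) in TIERS.items()
            if count >= current_count and mem >= current_mem]

# 56 GiB average GPU memory usage at a 70% target needs 80 GiB of capacity:
print(round(desired_capacity(56.0, 0.70), 1))  # 80.0
print(allowed_tiers(2, 160))  # ['NC48ads_A100_v4', 'NC96ads_A100_v4']
```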
Azure resource group discovery
For Azure environments that include Resource Groups, Turbonomic discovers the Azure Resource Groups and the tags that are used to identify these groups.
In the Turbonomic user interface, to search for a specific Azure Resource Group, choose Resource Groups in the Search Page.
You can set the scope of your Turbonomic session to an Azure Resource Group by choosing a group in the Search results and clicking Scope To Selection.
You can also use Azure tags as filter criteria when you create a custom Turbonomic resource group. You can choose the Azure Resource Groups that match the tag criteria to be members of the new custom group.
To find the available tags for a specific Azure Resource Group, add the Basic Info chart configured with Related Tag Information to your view or custom dashboard. See Basic Info Charts.
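Tag-based filter criteria behave like a key/value match over each Resource Group's tags. The data and matching rule below are illustrative, not Turbonomic's implementation:

```python
# Sketch of tag-based group membership: select Azure Resource Groups whose
# tags contain every key=value pair in the filter criteria.

RESOURCE_GROUPS = {
    "rg-prod-east": {"env": "prod", "owner": "payments"},
    "rg-dev-east":  {"env": "dev",  "owner": "payments"},
    "rg-prod-west": {"env": "prod", "owner": "search"},
}

def match_by_tags(groups: dict, criteria: dict) -> list:
    """Resource Groups matching all tag criteria, in catalog order."""
    return [name for name, tags in groups.items()
            if all(tags.get(k) == v for k, v in criteria.items())]

print(match_by_tags(RESOURCE_GROUPS, {"env": "prod"}))
# ['rg-prod-east', 'rg-prod-west']
```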
When you inspect Resource Groups, Turbonomic does not currently show the billed costs for those Resource Groups.
Azure instance requirements
In Azure environments, some instance types require workloads to be configured in specific ways, and some workload configurations require instance types that support specific features. When Turbonomic generates resize actions in Azure, these actions consider the following features:
- Accelerated Networking (AN)
  In an Azure environment, not all instance types support AN, and not all workloads on AN instances actually enable AN. Turbonomic maintains a dynamic group of workloads that have AN enabled, and it assigns a policy to that group to exclude any templates that do not support AN. In this way, if a workload is on an instance that supports AN, and that workload has enabled AN, then Turbonomic will not recommend an action that would move the workload to a non-AN instance.
- Azure Premium Storage
  Turbonomic recognizes whether a workload uses Premium Storage, and will not recommend a resize to an instance that does not support Azure Premium Storage.
In addition, Turbonomic recognizes the processor types that your workloads currently use. If a workload is on a GPU-based instance, Turbonomic only recommends moves to other compatible GPU-based instance types; it does not recommend resize actions for these workloads.
IOPS-aware scaling for Azure VMs
Turbonomic considers IOPS utilization when making scaling decisions for Azure VMs. To measure utilization, Turbonomic takes into account a variety of attributes, such as per-disk IOPS utilization, whole VM IOPS utilization, cache settings, and IOPS capacity for the VMs. It also respects IOPS utilization and aggressiveness constraints that you set in VM policies. For details, see Aggressiveness and Observation Periods.
Analysis impacts VM scaling decisions in different ways. For example:
- If your instance experiences IOPS bottlenecks, Turbonomic can recommend scaling up to a larger instance type to increase IOPS capacity, even if you do not fully use the current vCPU or vMem resources.
- If your instance experiences underutilization of vMem and vCPU, but high IOPS utilization, Turbonomic might not recommend scaling down. It might keep you on the larger instance to provide sufficient IOPS capacity.
- If the instance experiences underutilization of IOPS capacity along with normal utilization of other resources, you might see an action to resize to an instance that is very similar to the current one. If you inspect the action details, you should see that you are changing to a less expensive instance with less IOPS capacity.
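The second case above can be sketched as a simple fit test. The 70% target, the IOPS figures, and the function name are illustrative assumptions, not Turbonomic defaults:

```python
# Sketch of IOPS-aware scaling: even when vCPU and vMem are underutilized,
# a scale-down is rejected if the smaller tier cannot cover observed IOPS
# at the target utilization.

def fits(tier_iops_cap: float, observed_iops: float, target_util: float = 0.7) -> bool:
    """A tier fits if observed IOPS stay at or below the target utilization."""
    return observed_iops <= tier_iops_cap * target_util

# VM observes 5,600 IOPS; a candidate scale-down tier caps IOPS at 6,400.
print(fits(6400, 5600))   # False: 5600 > 6400 * 0.7, so stay on the larger tier
print(fits(12800, 5600))  # True: the current, larger tier covers the demand
```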