Actions for Azure VMs

Turbonomic supports the following actions:

  • Scale

    Change the VM instance to use a different instance type or tier to optimize performance and costs.

    See additional information for scale actions.

  • Discount-related actions

    If you have a high percentage of on-demand VMs, you can reduce your monthly costs by increasing Azure reservations coverage. To increase coverage, scale VMs to instance types that have unused reservation capacity.

    If you need more reservation capacity, Turbonomic recommends actions to purchase additional reservations.

    Purchase actions should be taken along with the related VM scaling actions. To purchase discounts for VMs at their current sizes, run a Buy VM Reservation Plan.

  • Stop and start (also known as 'parking' actions)

    Stop a VM for a given period of time to reduce your cloud expenses, and then start it at a later time.

    For details, see Parking: Stop or Start Cloud Resources.
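The coverage arithmetic behind the discount-related actions above can be sketched as a small calculation. The numbers and helper function below are hypothetical; Turbonomic computes actual coverage from discovered billing and reservation data.

```python
def reservation_coverage(reserved_hours: float, total_vm_hours: float) -> float:
    """Fraction of VM compute hours covered by reservations (0.0 to 1.0)."""
    if total_vm_hours == 0:
        return 0.0
    # Coverage cannot exceed 100% even if reservations are over-provisioned.
    return min(reserved_hours, total_vm_hours) / total_vm_hours

# Example: 300 reserved hours against 1,000 total VM hours.
coverage = reservation_coverage(300, 1000)
print(f"Coverage: {coverage:.0%}")  # Coverage: 30%
```

The remaining 70% of hours in this example would be billed at on-demand rates, which is the gap that scale and purchase actions aim to close.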

Controlling scale actions for Azure VMs

For scale actions, you can create policies to control the scale actions that Turbonomic recommends. In those policies, choose from the following options:

  • Cloud scale all – execute all scaling actions

  • Cloud scale for performance – only execute scaling actions that improve performance

  • Cloud scale for savings – only execute scaling actions that reduce costs

The default action acceptance mode for these actions is Manual. When you examine the pending actions, only actions that satisfy the policy are allowed to execute. All other actions are read-only.

When policy conflicts arise, Cloud scale all overrides the other two scaling options in most cases. For more information, see Default and User-defined Automation Policies.
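The three policy options above act as a gate on which pending actions are executable. The sketch below illustrates that gating logic; the class, function, and mode names are illustrative, not Turbonomic API identifiers.

```python
from dataclasses import dataclass

@dataclass
class ScaleAction:
    vm_name: str
    improves_performance: bool  # action resolves a resource bottleneck
    reduces_cost: bool          # action moves the VM to a cheaper instance type

def allowed(action: ScaleAction, mode: str) -> bool:
    """Return True if the policy mode permits executing the action."""
    if mode == "cloud_scale_all":
        return True
    if mode == "cloud_scale_for_performance":
        return action.improves_performance
    if mode == "cloud_scale_for_savings":
        return action.reduces_cost
    raise ValueError(f"unknown mode: {mode}")

# A savings-only action is executable under "savings" but read-only under "performance".
action = ScaleAction("vm-01", improves_performance=False, reduces_cost=True)
print(allowed(action, "cloud_scale_for_savings"))      # True
print(allowed(action, "cloud_scale_for_performance"))  # False
```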

Supported instance types for Azure VM scale actions

In the user interface, you can view the instance types that Turbonomic currently supports.

  1. Navigate to Settings > Policies.

  2. In the Policy Management page, search for and click Virtual Machine Defaults.

  3. In the Configure Virtual Machine Policy page:

    1. Scroll to the end of the page.

    2. Click Add Scaling Constraint.

    3. Choose Cloud Instance Types.

    4. Click Edit.

Expand an instance family to see individual instance types and the resources allocated to them. An example of an instance family is B-series.

By default, Turbonomic considers all instance types that are currently available for scaling when making scaling decisions. However, you may have set up some of your cloud VMs to only scale to certain instance types to reduce complexity and cost, improve discount utilization, or meet application demand. To limit scaling to certain instance types, create policies for the affected cloud VMs and configure an inclusion or exclusion list as a scaling constraint.
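An exclusion list as a scaling constraint works like a simple filter over candidate instance types. The sketch below assumes a list of (instance type, family) pairs; the family labels are examples, and Turbonomic derives the real mapping from discovery.

```python
# Each candidate is (instance_type, family); family names are illustrative.
candidates = [
    ("Standard_B2s", "B-series"),
    ("Standard_D4s_v5", "Dsv5-series"),
    ("Standard_NC4as_T4_v3", "NCasT4_v3-series"),
]

def eligible(candidates, excluded_families):
    """Return instance types whose family is not on the exclusion list."""
    return [name for name, family in candidates if family not in excluded_families]

print(eligible(candidates, {"B-series"}))
# ['Standard_D4s_v5', 'Standard_NC4as_T4_v3']
```

An inclusion list is the inverse check: only instance types whose family appears on the list remain candidates.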

Scale actions for Azure VMs running GPU instance types

Currently, Turbonomic supports the following GPU instance type series with Linux source images.

  • NC A100 v4-series (based on NVIDIA A100 PCIe GPUs)

  • NCads H100 v5-series (based on NVIDIA H100 NVL GPUs)

  • NCasT4_v3-series (based on NVIDIA Tesla T4 GPUs)

  • NCv3-series (based on NVIDIA Tesla V100 GPUs)

  • NDv2-series (based on NVIDIA Tesla V100 GPUs)

  • NVadsA10 v5-series (based on NVIDIA A10 GPUs)

  • NVv3-series (based on NVIDIA Tesla M60 GPUs)

Note:

VMs that run NVadsA10 v5-series have partial GPUs. For these VMs, Turbonomic discovers partial GPU information and then adjusts GPU capacities (such as GPU memory) accordingly. Capacity information is available when you set the scope to an Azure GPU VM and view the Capacity and Usage chart.

In the Turbonomic user interface, the default virtual machine policy shows the currently supported GPU instance type series and the resources allocated to them. These instance series are grouped under the Accelerated Computing category. GPU instance series that Azure retired are grouped under the GPU (Retired) category. See the previous section for the steps to view the default virtual machine policy.

Turbonomic collects NVIDIA GPU metrics for VMs running these instance types and then uses these metrics to generate VM scale actions that optimize performance and costs.

Note:

For additional information about these metrics, see the NVIDIA documentation.

For each metric, Turbonomic can generate the following actions:

  • GPU Count

    Number of GPU cards in use. Action: scale down the number of GPU cards within the same instance type.

  • GPU Memory

    Amount of GPU memory in use. Action: scale GPU memory up or down within the same instance type.

  • GPU Memory BW (bandwidth)

    Fraction of cycles where data was sent to or received from device memory, measured in GB/s. Action: scale GPU memory BW up or down.

  • GPU FP16

    Fraction of cycles the FP16 (half precision) pipe was active. Action: scale GPU FP16 up or down.

  • GPU FP32

    Fraction of cycles the FP32 (single precision) pipe was active. Action: scale GPU FP32 up or down.

  • GPU FP64

    Fraction of cycles the FP64 (double precision) pipe was active. Action: scale GPU FP64 up or down.

  • GPU Tensor

    Fraction of cycles the Tensor (mixed/multi-precision) pipe was active. Action: scale GPU Tensor up or down.

For GPU Memory BW, FP16, FP32, FP64, and Tensor, the scale action moves the VM from its current instance type to another instance type with the same (or higher) GPU count and GPU memory capacity.
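The mapping from a GPU metric to a scaling direction can be sketched as a threshold check. The target utilization and tolerance band below are hypothetical; in Turbonomic these come from the scaling target utilization and aggressiveness settings in VM policies.

```python
def scale_direction(observed_util: float, target_util: float, band: float = 0.1) -> str:
    """Compare observed utilization of a GPU metric against its target.

    Values are fractions (0.0 to 1.0); the band keeps small deviations from
    triggering actions.
    """
    if observed_util > target_util + band:
        return "scale up"
    if observed_util < target_util - band:
        return "scale down"
    return "no action"

print(scale_direction(observed_util=0.92, target_util=0.70))  # scale up
print(scale_direction(observed_util=0.30, target_util=0.70))  # scale down
```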

Note:

For GPU VMs with accelerator cards, GPU card information is available when you set the scope to a VM and view the Entity Information chart. For example, the Accelerator Model field shows the GPU model.

To enable the discovery of GPU metrics, configure NVIDIA Data Center GPU Manager (DCGM) as described in this topic.

Turbonomic can also recommend actions that scale standard VM resources (such as vCPU and vMem) to the supported GPU instance types.

When generating or executing scale actions, Turbonomic considers the following policies and settings as constraints:

  • Read-only tier exclusion policies

    Turbonomic automatically creates these policies and displays them in the Policy Management page (Settings > Policies).

    • Cross-target policies

      Cross-target policies ensure that VMs with certain GPU types only scale to an instance type with the same (or higher) GPU card count and memory per card.

      An example policy is Azure GPU NVIDIA - Cloud Compute Tier Exclusion Policy.

    • Per-target policies

      Per-target policies ensure that VMs in GPU-supported instance families do not scale to instance families that do not support GPUs.

      An example policy is Cloud Compute Tier Azure:gpu - Cloud Compute Tier Exclusion Policy.

  • Scaling target utilization

    Turbonomic uses scaling target utilization values for GPU resources in conjunction with aggressiveness constraints to control scale actions for VMs. You can configure utilization values in automation policies for cloud VMs. For more information, see this topic.

  • Ignore NVIDIA GPU compute capability constraints

    This constraint is a setting that you can choose to turn on in automation policies for cloud VMs. When turned on, scale actions that change the GPU compute capability of VMs are allowed to execute in Turbonomic. When turned off, actions are only executable in the cloud provider console (web-based UI or CLI). For more information, see this topic.
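The two read-only tier exclusion rules described above can be sketched as a single compatibility check. The family set and VM specs below are illustrative examples, not the full set that Turbonomic discovers.

```python
# Illustrative subset of GPU-supported instance families.
GPU_FAMILIES = {"NCv3-series", "NDv2-series", "NVv3-series"}

def passes_tier_exclusions(vm: dict, candidate: dict) -> bool:
    """Apply the per-target and cross-target exclusion rules to a candidate tier."""
    # Per-target rule: a VM in a GPU-supported family must stay in GPU-supported families.
    if vm["family"] in GPU_FAMILIES and candidate["family"] not in GPU_FAMILIES:
        return False
    # Cross-target rule: GPU card count and GPU memory must not decrease.
    return (candidate["gpu_count"] >= vm["gpu_count"]
            and candidate["gpu_memory_gb"] >= vm["gpu_memory_gb"])

vm = {"family": "NCv3-series", "gpu_count": 1, "gpu_memory_gb": 16}
bigger = {"family": "NCv3-series", "gpu_count": 2, "gpu_memory_gb": 32}
non_gpu = {"family": "Dsv5-series", "gpu_count": 0, "gpu_memory_gb": 0}
print(passes_tier_exclusions(vm, bigger))   # True
print(passes_tier_exclusions(vm, non_gpu))  # False
```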

Azure resource group discovery

For Azure environments that include Resource Groups, Turbonomic discovers the Azure Resource Groups and the tags that are used to identify these groups.

In the Turbonomic user interface, to search for a specific Azure Resource Group, choose Resource Groups in the Search Page.

You can set the scope of your Turbonomic session to an Azure Resource Group by choosing a group in the Search results and clicking Scope To Selection.

You can also use Azure tags as filter criteria when you create a custom Turbonomic resource group. You can choose the Azure Resource Groups that match the tag criteria to be members of the new custom group.

To find the available tags for a specific Azure Resource Group, add the Basic Info chart configured with Related Tag Information to your view or custom dashboard. See Basic Info Charts.

Note:

When you inspect Resource Groups, Turbonomic does not currently show the billed costs for those Resource Groups.

Azure instance requirements

In Azure environments, some instance types require workloads to be configured in specific ways, and some workload configurations require instance types that support specific features. When Turbonomic generates resize actions in Azure, these actions consider the following features:

  • Accelerated Networking (AN)

    In an Azure environment, not all instance types support AN, and not all workloads on AN instances actually enable AN. Turbonomic maintains a dynamic group of workloads that have AN enabled, and it assigns a policy to that group to exclude any templates that do not support AN. In this way, if a workload is on an instance that supports AN, and that workload has enabled AN, then Turbonomic will not recommend an action that would move the workload to a non-AN instance.

  • Azure Premium Storage

    Turbonomic recognizes whether a workload uses Premium Storage, and will not recommend a resize to an instance that does not support Azure Premium Storage.

In addition, Turbonomic recognizes the processor types that your workloads currently use. If your workload is on a GPU-based instance, Turbonomic only recommends moves to other compatible GPU-based instance types; it does not recommend resize actions for these workloads.
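The feature-compatibility checks described in this section can be sketched as a filter over target instance types. The feature flags and instance specs below are illustrative; Turbonomic derives the real values from discovery.

```python
def template_compatible(workload: dict, instance_type: dict) -> bool:
    """A workload's enabled features must be supported by the target instance type."""
    if workload["accelerated_networking"] and not instance_type["supports_an"]:
        return False
    if workload["premium_storage"] and not instance_type["supports_premium_storage"]:
        return False
    return True

workload = {"accelerated_networking": True, "premium_storage": True}
targets = [
    {"name": "Standard_D4s_v5", "supports_an": True,  "supports_premium_storage": True},
    {"name": "Standard_A2_v2",  "supports_an": False, "supports_premium_storage": False},
]
print([t["name"] for t in targets if template_compatible(workload, t)])
# ['Standard_D4s_v5']
```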

IOPS-aware scaling for Azure VMs

Turbonomic considers IOPS utilization when making scaling decisions for Azure VMs. To measure utilization, Turbonomic takes into account a variety of attributes, such as per-disk IOPS utilization, whole VM IOPS utilization, cache settings, and IOPS capacity for the VMs. It also respects IOPS utilization and aggressiveness constraints that you set in VM policies. For details, see Aggressiveness and Observation Periods.

Analysis impacts VM scaling decisions in different ways. For example:

  • If your instance experiences IOPS bottlenecks, Turbonomic can recommend scaling up to a larger instance type to increase IOPS capacity, even if you do not fully use the current vCPU or vMem resources.

  • If your instance experiences underutilization of vMem and vCPU, but high IOPS utilization, Turbonomic might not recommend scaling down. It might keep you on the larger instance to provide sufficient IOPS capacity.

  • If the instance experiences underutilization of IOPS capacity along with normal utilization of other resources, you might see an action to resize to an instance that is very similar to the current one. If you inspect the action details, you should see that you are changing to a less expensive instance with less IOPS capacity.
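The first two cases above can be sketched as a decision that weighs IOPS pressure against vCPU and vMem utilization. The thresholds and numbers are illustrative; Turbonomic derives real values from per-disk and whole-VM IOPS data and from the utilization constraints you set in VM policies.

```python
def iops_scaling_hint(vm_iops_used: float, vm_iops_capacity: float,
                      vcpu_util: float, vmem_util: float,
                      high: float = 0.9, low: float = 0.3) -> str:
    """Sketch of how IOPS pressure can drive a decision even when
    vCPU/vMem alone would suggest the opposite (thresholds illustrative)."""
    iops_util = vm_iops_used / vm_iops_capacity
    if iops_util > high:
        # IOPS bottleneck: scale up even if compute resources are underused.
        return "scale up for IOPS capacity"
    if iops_util < low and vcpu_util < low and vmem_util < low:
        # Everything is underused: a cheaper instance is safe.
        return "scale down to a cheaper instance"
    # High IOPS with low compute blocks a scale-down.
    return "no action"

print(iops_scaling_hint(3600, 3750, vcpu_util=0.25, vmem_util=0.28))
# scale up for IOPS capacity
```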