AWS metrics collection
Enabling metrics collection in your environment is highly recommended. With these metrics, Turbonomic can generate scale actions that optimize VM resource usage. Turbonomic can collect the metrics only after you enable their collection on the VMs in your environment.
This topic describes the collection of the following metrics:
-
Standard memory for AWS VMs
-
NVIDIA GPU metrics for AWS VMs running EC2 GPU instance types
Metrics collection requirements
To enable metrics collection, you must meet the following requirements. Some requirements differ depending on whether your VM runs Linux or Windows.
-
The VM image must have an SSM agent installed.
-
Linux VMs
By default, Linux AMIs dated 2017.09 and later include an installed SSM agent.
-
Windows VMs
You must install the SSM agent on the VMs. For more information, see the AWS documentation.
-
-
Access to the CloudWatch service
Each AWS instance must have internet access or direct access to CloudWatch so that it can push metric data to the service.
-
Access from Turbonomic
For Turbonomic to access metrics, the account that it uses to connect to the AWS target must include the correct permissions. If you configured the AWS target through an AWS key (not an IAM role), include the permissions as specified in the section for configuring an AWS target.
If you use an IAM role for the Turbonomic connection, that role must include the following minimum permissions:
AmazonEC2ReadOnlyAccess
AmazonS3ReadOnlyAccess
AmazonRDSReadOnlyAccess
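The permission check for an IAM role can be sketched in Python as follows. This is a minimal sketch, not part of the product: it assumes that `iam_client` is a boto3 IAM client (for example `boto3.client("iam")`), the function names are hypothetical, and only the first page of attached managed policies is inspected. Inline policies are not considered.

```python
# Policy names taken from the list of minimum permissions in this topic.
REQUIRED_POLICIES = frozenset({
    "AmazonEC2ReadOnlyAccess",
    "AmazonS3ReadOnlyAccess",
    "AmazonRDSReadOnlyAccess",
})


def attached_policy_names(iam_client, role_name):
    """Return the names of managed policies attached to the role (first page only)."""
    # Assumption: iam_client behaves like a boto3 IAM client.
    response = iam_client.list_attached_role_policies(RoleName=role_name)
    return [policy["PolicyName"] for policy in response["AttachedPolicies"]]


def missing_policies(iam_client, role_name):
    """Return the required policy names that are not attached to the role, sorted."""
    attached = set(attached_policy_names(iam_client, role_name))
    return sorted(REQUIRED_POLICIES - attached)
```

An empty list from `missing_policies` suggests that the role carries the three managed policies. A non-empty list names the ones to review.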
Enabling metrics collection
To enable metrics collection for your VMs, perform the following steps:
-
Attach an IAM role to each VM instance.
Each EC2 instance must have an attached IAM role that grants CloudWatch access. To grant that access, include the AmazonSSMFullAccess policy in the role. Use AWS Systems Manager to attach the necessary roles to your VMs.
Note: If you want to grant the role lesser access, you can use the AmazonEC2RoleforSSM policy. This is a custom policy that allows the action ssm:GetParameter to access the resource arn:aws:ssm:*:*:parameter/*.
-
Install the CloudWatch agent on your Linux VMs.
Navigate to the AWS Systems Manager service for the account and region that you want to configure. In the service, navigate to the Run Command screen and set up the AWS-ConfigureAWSPackage command to install AmazonCloudWatchAgent on your VMs. Use the following required values to install AmazonCloudWatchAgent:
Action: Install
Name: AmazonCloudWatchAgent
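If you prefer to script the Run Command step instead of using the console, the same values can be passed programmatically. The following is a minimal sketch under stated assumptions: `ssm_client` is assumed to be a boto3 SSM client (for example `boto3.client("ssm")`), the function name is hypothetical, and the instances are assumed to be managed by the SSM agent already.

```python
def install_cloudwatch_agent(ssm_client, instance_ids):
    """Send AWS-ConfigureAWSPackage to install the CloudWatch agent.

    Returns the Run Command ID so that the caller can track the invocation.
    """
    # Assumption: ssm_client behaves like a boto3 SSM client.
    response = ssm_client.send_command(
        InstanceIds=list(instance_ids),
        DocumentName="AWS-ConfigureAWSPackage",
        # Run Command parameters are passed as lists of strings.
        Parameters={
            "action": ["Install"],
            "name": ["AmazonCloudWatchAgent"],
        },
    )
    return response["Command"]["CommandId"]
```

The function takes the client as an argument so that the account and region are chosen by the caller, which mirrors the console step of selecting the account and region first.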
-
Create a parameter store.
-
In AWS Systems Manager, navigate to Parameter Store and create a parameter.
-
Specify a parameter name, such as AmazonCloudWatch-MyMemoryParam. You can use a different name, but according to the Amazon documentation, the name must begin with AmazonCloudWatch.
-
In the Value field, paste the JSON configuration (shown in the next step) that corresponds to your VM.
-
-
Create configuration data for the CloudWatch agent.
The configuration data is a JSON object that you add as a parameter to the Parameter Store. The object must include the following configuration, depending on whether the object is for a Linux or a Windows VM instance.
-
Linux configuration for standard memory
{
  "agent": {
    "metrics_collection_interval": 60,
    "logfile": "/opt/aws/amazon-cloudwatch-agent/logs/amazon-cloudwatch-agent.log"
  },
  "metrics": {
    "metrics_collected": {
      "mem": {
        "measurement": [
          { "name": "mem_used", "rename": "MemoryUsed" },
          { "name": "mem_available_percent", "rename": "MemoryAvailablePercent" },
          { "name": "mem_used_percent", "rename": "MemoryUsedPercent" },
          { "name": "mem_available", "rename": "MemoryAvailable" }
        ]
      }
    },
    "append_dimensions": {
      "InstanceId": "${aws:InstanceId}",
      "ImageId": "${aws:ImageId}",
      "InstanceType": "${aws:InstanceType}",
      "AutoScalingGroupName": "${aws:AutoScalingGroupName}"
    },
    "aggregation_dimensions": [
      [ "AutoScalingGroupName" ]
    ]
  }
}
-
Linux configuration for standard memory and NVIDIA GPU card/memory utilization
{
  "agent": {
    "metrics_collection_interval": 60,
    "logfile": "/opt/aws/amazon-cloudwatch-agent/logs/amazon-cloudwatch-agent.log"
  },
  "metrics": {
    "namespace": "CWAgent",
    "metrics_collected": {
      "nvidia_gpu": {
        "measurement": [ "utilization_gpu", "memory_used" ]
      },
      "mem": {
        "measurement": [
          { "name": "mem_available", "rename": "MemoryAvailable", "unit": "Bytes" }
        ]
      }
    },
    "append_dimensions": {
      "AutoScalingGroupName": "${aws:AutoScalingGroupName}",
      "ImageId": "${aws:ImageId}",
      "InstanceId": "${aws:InstanceId}",
      "InstanceType": "${aws:InstanceType}"
    }
  }
}
-
Linux configuration for NVIDIA GPU metrics (DCGM)
Run the setup_aws_dcgm_exporter.py script to automate the collection of NVIDIA GPU metrics through Data Center GPU Manager (DCGM). Certain prerequisites must be met before you run the script. For more information, see this GitHub page. If you need assistance with the script, contact your Turbonomic representative.
-
Windows configuration for standard memory
{
  "metrics": {
    "namespace": "CWAgent",
    "append_dimensions": {
      "InstanceId": "${aws:InstanceId}",
      "AutoScalingGroupName": "${aws:AutoScalingGroupName}"
    },
    "aggregation_dimensions": [
      [ "InstanceId" ],
      [ "AutoScalingGroupName" ]
    ],
    "metrics_collected": {
      "Memory": {
        "measurement": [
          { "name": "Available Bytes", "rename": "MemoryAvailable", "unit": "Bytes" }
        ],
        "metrics_collection_interval": 60
      },
      "Paging File": {
        "measurement": [
          { "name": "% Usage", "rename": "paging_used" }
        ],
        "metrics_collection_interval": 60,
        "resources": [ "*" ]
      }
    }
  }
}
Note that you can configure optional parameters for the CloudWatch namespace and region. However, any additional metrics that you configure CloudWatch to collect do not affect Turbonomic analysis and do not appear in the user interface.
-
Create the parameter and record the parameter name for use in a later step.
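The parameter creation can also be scripted. The sketch below builds the Linux standard-memory configuration from this topic as a Python dictionary and stores it as a String parameter. It rests on assumptions: `ssm_client` is assumed to be a boto3 SSM client, the function names are hypothetical, and overwriting an existing parameter of the same name is a choice made for this example only.

```python
import json

NAME_PREFIX = "AmazonCloudWatch"  # required name prefix, per this topic


def linux_memory_config():
    """Build the Linux standard-memory agent configuration as a dict."""
    renames = {
        "mem_used": "MemoryUsed",
        "mem_available_percent": "MemoryAvailablePercent",
        "mem_used_percent": "MemoryUsedPercent",
        "mem_available": "MemoryAvailable",
    }
    return {
        "agent": {
            "metrics_collection_interval": 60,
            "logfile": "/opt/aws/amazon-cloudwatch-agent/logs/amazon-cloudwatch-agent.log",
        },
        "metrics": {
            "metrics_collected": {
                "mem": {
                    "measurement": [
                        {"name": name, "rename": rename}
                        for name, rename in renames.items()
                    ]
                }
            },
            "append_dimensions": {
                "InstanceId": "${aws:InstanceId}",
                "ImageId": "${aws:ImageId}",
                "InstanceType": "${aws:InstanceType}",
                "AutoScalingGroupName": "${aws:AutoScalingGroupName}",
            },
            "aggregation_dimensions": [["AutoScalingGroupName"]],
        },
    }


def store_agent_config(ssm_client, parameter_name, config):
    """Store the configuration in Parameter Store and return the parameter name."""
    if not parameter_name.startswith(NAME_PREFIX):
        raise ValueError(f"parameter name must begin with {NAME_PREFIX!r}")
    # Assumption: ssm_client behaves like a boto3 SSM client.
    ssm_client.put_parameter(
        Name=parameter_name,
        Value=json.dumps(config),
        Type="String",
        Overwrite=True,  # example choice: replace an existing parameter
    )
    return parameter_name
```

Returning the parameter name keeps it at hand for the deployment step, where it is supplied as the configuration location.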
-
Deploy the CloudWatch parameter to your VMs.
-
In AWS Systems Manager, navigate to the Run Command screen to configure and run the AmazonCloudWatch-ManageAgent command. The configuration should include the following items:
Action: configure
Mode: ec2
Optional Configuration Source: ssm
Optional Configuration Location: Specify the name of the parameter that you created earlier.
Optional Restart: yes (This configuration restarts the CloudWatch agent, not the VM instance.)
Targets: Specify the VMs to which you deploy the CloudWatch configuration.
-
Run the command to enable metrics collection for your instances.
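The deployment step can be sketched in Python as well. This example assumes that `ssm_client` is a boto3 SSM client and that `parameter_name` is the Parameter Store name that you recorded earlier. The function name is hypothetical, and instances are targeted by ID here for simplicity.

```python
def deploy_agent_config(ssm_client, instance_ids, parameter_name):
    """Run AmazonCloudWatch-ManageAgent so the agent loads the stored configuration.

    Returns the Run Command ID so that the caller can track the invocation.
    """
    # Assumption: ssm_client behaves like a boto3 SSM client.
    response = ssm_client.send_command(
        InstanceIds=list(instance_ids),
        DocumentName="AmazonCloudWatch-ManageAgent",
        Parameters={
            "action": ["configure"],
            "mode": ["ec2"],
            "optionalConfigurationSource": ["ssm"],
            "optionalConfigurationLocation": [parameter_name],
            # Restarts the CloudWatch agent, not the VM instance.
            "optionalRestart": ["yes"],
        },
    )
    return response["Command"]["CommandId"]
```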
-
-
Verify that metrics collection is enabled.
-
Navigate to the CloudWatch page and open Metrics in the CWAgent namespace.
-
Inspect the instances by ID. You should see the MemoryAvailable metric, or the utilization_gpu and memory_used metrics, if metrics collection is enabled.
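The verification can be approximated in code by listing the metrics that CloudWatch holds for one instance. This is a sketch only: `cloudwatch_client` is assumed to be a boto3 CloudWatch client, the function name is hypothetical, only the first page of results is read, and the expected names are taken verbatim from this topic. If your account publishes the metrics under different names, adjust the tuple accordingly.

```python
# Metric names as listed in the verification step of this topic.
EXPECTED_METRICS = ("MemoryAvailable", "utilization_gpu", "memory_used")


def collected_metrics(cloudwatch_client, instance_id):
    """Return the expected metric names found in the CWAgent namespace for the instance."""
    # Assumption: cloudwatch_client behaves like a boto3 CloudWatch client.
    response = cloudwatch_client.list_metrics(
        Namespace="CWAgent",
        Dimensions=[{"Name": "InstanceId", "Value": instance_id}],
    )
    names = {metric["MetricName"] for metric in response["Metrics"]}
    # Preserve the order of EXPECTED_METRICS in the result.
    return [name for name in EXPECTED_METRICS if name in names]
```

An empty result for an instance may mean that collection is not enabled, or only that the agent has not yet published its first data points.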
-
For more information about enabling metrics collection for AWS, see the following Support articles: