How to optimize AWS cloud costs
8 February 2023

Amazon Web Services (AWS) offers clients the ability to build modern, scalable applications that help drive digital business success. However, due to its complexity, achieving operational excellence in the cloud is difficult. Fundamentally, as a cloud operator, you need to ensure great end-user experiences while staying within budget.

In this post, we’ll give you a quick overview of the various methods of AWS cloud cost management—what problems they solve and how best to use them. However, regardless of what cloud cost optimization strategy you employ, achieving operational excellence at scale and taking advantage of the elasticity of the cloud requires software that optimizes your consumption simultaneously for performance and cost—and makes it easy for you to automate it, safely and confidently. Let’s see how IBM® Turbonomic® software helps clients optimize their AWS cloud costs.

Rightsize instances

The AWS operating expense (OpEx) model charges clients for the capacity provisioned for each resource, whether or not that capacity is fully utilized. Clients can purchase instances in different sizes and types, but often default to buying the largest instance available to ensure performance. Rightsizing is the process of matching instance types and sizes to workload performance and capacity requirements. To optimize costs, rightsizing must be done continuously. However, organizations often rightsize reactively, for example, after executing a “lift and shift” cloud migration or after development.

AWS clients can use the rightsizing recommendations in AWS Cost Explorer to rightsize their Amazon Elastic Compute Cloud (Amazon EC2) instances. However, these recommendations are generated from historical utilization and don’t consider important metrics, such as memory utilization, unless clients also use Amazon CloudWatch or third-party monitoring tools.
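To make utilization-based rightsizing concrete, here is a minimal Python sketch. It is not how AWS Cost Explorer or the Turbonomic software computes its recommendations. The size catalog, the choice of the 95th percentile and the 70% target utilization are all illustrative assumptions.

```python
import math

# Hypothetical size catalog: (name, vCPUs), smallest first.
# Illustrative values only, not a complete or current AWS instance list.
SIZES = [("large", 2), ("xlarge", 4), ("2xlarge", 8), ("4xlarge", 16)]


def percentile(samples, pct):
    """Return the nearest-rank percentile of a list of numbers."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def recommend_size(cpu_pct_samples, current_vcpus, pct=95, target_util=70):
    """Pick the smallest size that keeps percentile CPU at or under target_util."""
    peak = percentile(cpu_pct_samples, pct)    # e.g. 95th-percentile CPU %
    used_vcpus = peak / 100 * current_vcpus    # vCPUs actually consumed at that level
    needed = used_vcpus / (target_util / 100)  # capacity needed to leave headroom
    for name, vcpus in SIZES:
        if vcpus >= needed:
            return name
    return SIZES[-1][0]  # nothing is big enough, so stay on the largest size
```

Under these assumptions, an 8-vCPU instance whose CPU samples sit at 30% maps to the 4-vCPU size. Using a percentile rather than the maximum keeps a few short spikes from forcing an oversized instance.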

Let’s see how IBM Turbonomic software helps clients rightsize Amazon EC2 instances through percentile-based scaling. The diagrams depict the IBM Turbonomic UI. Diagram A shows the application stack. The supply chain on the left represents the resource relationships that the Turbonomic software maps out from the business application down to the cloud region and can include other components, such as container pods, storage volumes, virtual machines (VMs) and more—depending on the infrastructure that supports the application. This full-stack understanding is what makes Turbonomic recommendations trustworthy and gives cloud engineers and operations the confidence to automate. For this particular AWS account, the Turbonomic software has identified 110 pending scaling actions.

After selecting SHOW ALL, clients are brought to the Turbonomic Action Center, which can be found in Diagram B. This diagram represents all the scaling actions available for this AWS account. From viewing this dashboard, clients can see important information, including the account name, instance type, discount coverage and on-demand cost. Clients can select different actions and execute them by clicking EXECUTE ACTIONS in the top right corner.

Clients looking for more details on a particular action can select DETAILS, where the Turbonomic software shows the additional information that it considers in its recommendations. As shown in Diagram C, this instance needs to be scaled down because it has underutilized vCPU, VMem and net throughput. Other information for this action includes the cost impact of executing the action, the resulting CPU and memory utilization, uptime and net throughput.

Scale instances

Public cloud environments are ephemeral, and to meet budget and performance goals, AWS clients must scale their instances both vertically (rightsizing) and horizontally. To scale horizontally, AWS users must monitor application load and then scale out instances as demand increases. Distributing load across multiple instances through horizontal scaling increases performance and reliability, but instances must be scaled back as demand falls to avoid incurring unnecessary costs. AWS clients can use Amazon EC2 Auto Scaling to enable horizontal scaling in their environment. However, this tool scales only under the constraint of user-defined policies and only for designated collections of Amazon EC2 instances called Auto Scaling groups.
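The scale-out and scale-back logic described above can be sketched as a simple proportional rule. This is a generic illustration, not the implementation used by Amazon EC2 Auto Scaling or the Turbonomic software. The function name, the metric and the size limits are hypothetical.

```python
import math


def desired_instance_count(current, metric_value, target_value,
                           min_size=1, max_size=10):
    """Proportional rule: keep the average per-instance metric near the target."""
    if current < 1 or target_value <= 0:
        raise ValueError("current must be >= 1 and target_value must be > 0")
    # If 4 instances average 90% CPU and the target is 60%,
    # the group needs 4 * 90 / 60 = 6 instances.
    raw = math.ceil(current * metric_value / target_value)
    # Clamp to the user-defined bounds so the group never scales past its limits.
    return max(min_size, min(max_size, raw))
```

The same rule scales back in: four instances averaging 30% CPU against a 60% target yield a desired count of two. The clamping step mirrors the kind of user-defined constraint that policy-based scaling imposes.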

The only way to optimize horizontal scaling is to do it in real time through automation. IBM Turbonomic software continuously generates scaling actions so that applications get the resources they need to perform at the lowest cost. Diagram D represents an AWS account that needs to be scaled out.

The horizontal scaling action for this AWS account can be executed in the Action Center under the Provision Actions subcategory found in Diagram E. Here you can find information on the actions and the corresponding workload, such as the container cluster, the namespace and the risk posed to the workload, which in this case is transaction congestion.

In Diagram F, you can see how the Turbonomic software provides the rationale behind taking the action. In this case, a container pod is experiencing transaction congestion, and additional CPU needs to be provisioned to improve performance. The software also specifies all the container pod details, including the name, workload controller, namespace and container cluster.

Suspend instances

Another impactful way to reduce AWS cloud costs is to shut down idle instances. An organization may suspend an instance if it isn’t currently using it, such as during nonbusiness hours, but expects to resume use in the near term. When an instance is deleted, it is shut down and, by default, any data stored on its Amazon Elastic Block Store (Amazon EBS) root volume is also deleted. However, suspending an instance doesn’t delete the underlying data contained in the attached EBS volume. When the instance is started again, it boots from the same EBS volume rather than being rebuilt, resulting in faster boot time. Amazon CloudWatch can help AWS clients suspend idle EC2 instances. For example, a CloudWatch alarm can notify users when an instance has an average CPU utilization below 10% for over 24 hours. Users must define which instances to monitor and their corresponding utilization thresholds.
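As a rough sketch of the CloudWatch approach, the Python function below builds the parameters for an idle-instance alarm using the example threshold mentioned above (average CPU below 10% for 24 hours). The function name and alarm naming scheme are hypothetical. The parameter keys follow the CloudWatch PutMetricAlarm API, and the stop action shown is an assumption about how a team might choose to respond to the alarm.

```python
def idle_instance_alarm(instance_id, region, threshold_pct=10.0, hours=24):
    """Build keyword arguments for CloudWatch's PutMetricAlarm API."""
    return {
        "AlarmName": f"idle-cpu-{instance_id}",  # hypothetical naming scheme
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        "Period": 3600,              # one-hour data points
        "EvaluationPeriods": hours,  # every hour in the window must be below threshold
        "Threshold": threshold_pct,
        "ComparisonOperator": "LessThanThreshold",
        # Assumed response: the built-in EC2 action that stops (not terminates)
        # the instance. An SNS topic ARN could be used here to notify users instead.
        "AlarmActions": [f"arn:aws:automate:{region}:ec2:stop"],
    }
```

With boto3 installed and credentials configured, these parameters could be passed along as `boto3.client("cloudwatch", region_name=region).put_metric_alarm(**params)`. Each instance still needs its own alarm and threshold, which is the manual overhead noted above.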

IBM Turbonomic software automatically identifies and provides recommendations for suspending instances. To suspend an instance with the Turbonomic software, clients will need to first select an AWS account with a pending suspension action, as shown in Diagram G.

To execute a suspension action, Turbonomic clients simply need to go to the Action Center, select the corresponding action and execute it. See Diagram H. Under the Suspend Actions tab of the Action Center, clients can see the VMem, vCPU and VStorage capacity for each instance with a pending action.

If clients need additional details before executing, they can select DETAILS, as shown in Diagram I. The details provided for this action include the reasoning behind the action (in this case, to improve infrastructure efficiency), as well as the cost impact, the age of the instance, its virtual CPU and memory, and the number of clients for this instance.

Use discounting

Clients can also reduce costs through discounted pricing by optimizing reserved instance (RI) coverage and utilization. AWS Cost Explorer is also the native tool offered for RI purchasing recommendations. Cost Explorer generates its recommendations from an account’s usage over a lookback period of up to the previous 60 days. It doesn’t forecast; instead, it assumes that future usage will mirror historical usage.
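RI coverage and RI utilization are easy to confuse, so a small worked sketch may help. The Python function below uses a deliberately simplified model in which every reserved hour can apply to any running hour. Real reservations match only on attributes such as instance type and region. The function name and inputs are hypothetical.

```python
def ri_metrics(running_hours, reserved_hours_purchased):
    """Return (coverage, utilization) as fractions between 0 and 1."""
    # Hours that were both running and paid for by a reservation.
    covered = min(running_hours, reserved_hours_purchased)
    # Coverage: share of running hours billed at the reserved rate.
    coverage = covered / running_hours if running_hours else 0.0
    # Utilization: share of purchased reserved hours that were actually used.
    utilization = (covered / reserved_hours_purchased
                   if reserved_hours_purchased else 0.0)
    return coverage, utilization
```

For example, 1,000 running hours against 600 reserved hours gives 60% coverage and 100% utilization, which suggests room to buy more reservations. The reverse case, 400 running hours against 600 reserved hours, gives full coverage but only about 67% utilization, meaning some of the commitment is wasted.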

The IBM Turbonomic analytics engine automatically ingests and displays negotiated rates with public cloud providers and then generates specific RI purchasing and scaling actions so clients can take full advantage of existing RI inventory and maximize reservation-to-instance coverage. Diagram J represents an AWS account that has pending actions to increase RI utilization and coverage.

Diagram K represents the “Buy” actions that can be executed in the Action Center to increase RI coverage. Some important details listed in the Action Center here are the platform the RI will be purchased on, the term of the RI, the payment type and the region. Diagram L provides more details for this action, such as the cost impact and resulting RI coverage. All of this information can again be found in the corresponding DETAILS tab.

For longer-term commitments, the Turbonomic solution provides RI planning scenarios for added customization. On the Turbonomic platform, users simply select PLAN on the dashboard on the left and then select NEW PLAN. After selecting NEW PLAN, clients can then select the reservation planning option shown in Diagram M.

First, clients need to indicate the scope of the reservation purchasing plan. For this purpose, we’re going to select AWS. Turbonomic users can also scope by custom groups, accounts, billing families and regions, as shown in Diagram N.

After indicating the scope of the reservation plan, clients can then configure the plan to the preferred offering class, term and payment type shown in Diagram O. Clients can also customize the historical usage period they want the Turbonomic platform to analyze, as well as whether to use data from deleted virtual machines (VMs).

After running the plan, the Turbonomic software will provide a dashboard, as found in Diagram P. In this plan, clients can compare their current cloud costs with the projected costs after executing the plan. This comparison covers RI coverage, RI utilization, on-demand compute cost and reserved compute cost. The Turbonomic solution also provides insight on VM mapping and discount inventory.

Finally, the Turbonomic software summarizes all of the actions that will be executed as part of your VM reservation plan. These actions are summarized under PLAN ACTIONS and are shown in Diagram Q.

Delete unattached resources

Finally, as previously discussed, the AWS OpEx model charges clients not just for the resources that are actively in use, but for the entire pool of provisioned resources. As organizations build and deploy new releases into their environment, some resources are left unattached. An unattached resource is one that a client created but has stopped using entirely. After development, hundreds of resources of different types can be left unattached. Deleting unattached resources can significantly reduce wasted cloud expenditure. Diagram R shows an AWS account in which the Turbonomic software has identified 230 unattached resources that can be removed. As with suspending idle instances, Amazon CloudWatch can also help customers delete Amazon EC2 instances by sending an alarm that prompts users to delete an instance based on utilization thresholds.
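For EBS volumes specifically, finding unattached resources can be sketched in a few lines of Python. The input is assumed to be a list of dictionaries shaped like the `Volumes` entries returned by the EC2 DescribeVolumes API. The function name and the price per GiB-month are hypothetical placeholders, not current AWS pricing.

```python
def unattached_volumes(volumes, price_per_gb_month=0.08):
    """Return (volume_id, size_gib, est_monthly_cost) for unattached volumes."""
    results = []
    for vol in volumes:
        # An EBS volume in the "available" state is not attached to any instance.
        if vol.get("State") == "available" and not vol.get("Attachments"):
            # Rough estimate only: size multiplied by an assumed flat storage price.
            cost = round(vol["Size"] * price_per_gb_month, 2)
            results.append((vol["VolumeId"], vol["Size"], cost))
    return results
```

With boto3, the input list could come from `boto3.client("ec2").describe_volumes(Filters=[{"Name": "status", "Values": ["available"]}])["Volumes"]`. Reviewing the list before deleting anything is advisable, because an unattached volume may still hold data that someone needs.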

The delete actions for this account are listed in the Action Center in Diagram S. The information listed in the Delete category of the Action Center includes the size of the Amazon EBS volume, the storage tier, the amount of time it has been unattached and the cost impact of removing it.

For additional insight on the impact of these delete actions, again, clients can select the DETAILS tab and find more information as shown in Diagram T. The purpose of this action is to increase savings. Clients can also see additional information, such as the volume details, whether the action is disruptive, and the resource and cost impact.

Trustworthy automation is the best way to maximize business value on AWS

For cloud engineering and operations teams looking to achieve budget goals without negatively impacting customer experience, IBM Turbonomic software offers a tested path that you can trust. The Turbonomic solution can analyze your AWS environment and continuously match real-time application demand to the unprecedented number of AWS configuration options across Amazon EC2 compute, Amazon EBS volume storage, Amazon Relational Database Service (Amazon RDS) databases and existing savings inventory. Are you looking to reduce spend across your AWS environment as soon as possible? IBM Turbonomic automation can be operationalized, allowing teams to see tangible results immediately and continuously while achieving 471% ROI in less than six months.

Want to explore IBM Turbonomic cloud optimization at your own leisure? Try IBM Turbonomic software in our live sandbox environment.

Want to learn more about how the IBM Turbonomic platform supports your own specific use case?

Read the Forrester Consulting commissioned study to see what outcomes our clients have achieved with the IBM Turbonomic solution.

 
Author
Spencer Mehm Product Marketing Manager