This is the first article of a new blog series: “Mastering Cloud Cost Optimization.”

This blog series is designed to provide tactical guidance and information on decreasing cloud costs without negatively affecting application performance.

Why is cloud cost control so hard?

According to most organizations, the biggest drivers for moving to the cloud are elasticity and agility.

In other words, the cloud lets you provision and de-provision resources instantly, based on the needs of the business. You no longer have to build the church for Sunday. Yet once in the cloud, approximately 80% of companies report receiving bills two to three times what they expected. The truth is that while the promise of cloud is that you only pay for what you use, the reality is that you pay for what you provision. The gap between consumption and allocation is what causes the large and unexpected bills.
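To make the gap between consumption and allocation concrete, here is a minimal sketch in Python. The instance names, hourly rates and utilization figures are invented for illustration and do not come from any real bill or provider price list. It compares what is billed (every provisioned hour, charged in full) with what the same fleet would cost if billing tracked actual utilization.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    name: str
    hourly_rate: float        # hypothetical price per provisioned hour
    hours_provisioned: float  # hours the instance existed this month
    avg_utilization: float    # fraction of capacity actually used, 0..1


def provisioned_cost(inst: Instance) -> float:
    """What the bill shows: every provisioned hour is charged in full."""
    return inst.hourly_rate * inst.hours_provisioned


def consumed_cost(inst: Instance) -> float:
    """What 'pay for what you use' would cost if billing followed utilization."""
    return provisioned_cost(inst) * inst.avg_utilization


# Illustrative fleet: two always-on instances over a 720-hour month.
fleet = [
    Instance("web-1", 0.20, 720, 0.25),
    Instance("db-1", 0.80, 720, 0.50),
]

billed = sum(provisioned_cost(i) for i in fleet)
used = sum(consumed_cost(i) for i in fleet)
gap = billed - used

print(f"billed: {billed:.2f}  used: {used:.2f}  gap: {gap:.2f}")
```

In this made-up fleet, more than half of the bill pays for capacity that was allocated but never consumed. Real utilization is multidimensional (CPU, memory, storage, network), so a single utilization fraction is a deliberate simplification.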

Cost isn’t the only challenge. While most organizations report cost as their biggest problem in managing a public cloud environment, you cannot truly separate performance from cost; the two are tightly coupled. If an organization were optimizing for cost alone, moving all applications to the smallest instance type would be the way to go, but no one is willing to take the performance hit.

In the cloud, more than ever, cost and performance are tied together.

Digital transformation and a rush to the cloud are placing enterprise IT teams under tremendous pressure. While cloud addresses an old pain point – that infrastructure supply is static while application demand is dynamic – matching demand with supply in real-time across multiple metrics and dimensions requires more decisions than any human being can make.

Hybrid cloud estates are unbelievably complex. There are millions of configuration options for EC2 instances alone. Beyond EC2, AWS has 212 additional products and services, and Microsoft lists over 600 Azure services (as of May 2020). This is simply too much complexity for the average IT team to manage and, as a result, many organizations that kicked off digital transformation initiatives with high hopes end up watching innovation grind to a halt while the IT team struggles just to keep the lights on.

The looping cost-reduction war-room ritual

It’s the seventh of the month and your CFO just received another large cloud bill. Again — just like many times before — it has reached an all-time high much sooner than predicted or budgeted for.

Sound familiar?

We hear the same story from almost every organization to which we are introduced. Sometimes, the CIO must face the board and ask for a budget increase, but usually, before that happens, a committee is established and tasked with reducing the bill. The team usually consists of top cloud architects, a few finance personnel and, in most cases, a subset of a newly formed “cloud governance” team. These top talents are pulled away from their day jobs to fix the problem of cloud bills running amok.

They take over a conference room and establish contact with the closest pizza deliverer. They spread the hundreds of pages of their cloud bills across the desk (or open them in spreadsheets) and look for ways to reduce the bill. They then try to correlate this information with what they gather from a multitude of monitoring tools and conversations with application owners.

A few days later, a set of recommendations on how to get the bill below the targeted threshold is distributed: which components can be deleted, what should be rightsized, what can leverage cheaper storage, which RIs should be bought and so on.

Most of the recommendations are then taken and the cloud bills slowly start decreasing. Everyone is relieved and life goes back to normal. The team is dismantled, and they go back to their day jobs.

Slowly but surely, the cloud bills start creeping up and it’s the seventh of the month again. The CFO calls, the team is reassembled, a new target is set, a conference room is taken over and pizzas start getting delivered — it is an endless “break-fix” loop for cost optimization.

Breaking the loop

So how do we get out of this loop? By solving the problem, and not the symptom.

Cloud bills keep creeping up because we don’t continuously ensure the estate is optimized, and we fall back into the same patterns that caused the problem to begin with.

Cloud platforms enable elasticity and increase an organization’s ability to be agile, but how do you truly take advantage of these without drowning in overwhelming cloud bills?

By helping people focus on what they do best (develop, create, innovate) and letting software manage the complex resource and cost tradeoffs, ensuring cloud environments are constantly optimized. The pace of innovation increases, and cloud costs are always in check.

The principles of cost optimization

Keeping the environment constantly optimized requires that you capitalize on the promise of the cloud — only pay for what you use — by constantly ensuring that applications are receiving exactly the resources they need to deliver on their SLAs, as cost-effectively as possible.

These are the core principles and required capabilities to accomplish that:

  1. Multidimensional rightsizing with application awareness: Understand what applications consume what underlying resources across compute, storage and network — whether running on IaaS, containers or other services.
  2. Real-time vertical and horizontal scale decisions: Understand application SLAs and ensure resources are continuously performant as cost effectively as possible within the constraints and policies defined by the business.
  3. Identify and delete/suspend unused resources: Constantly clear unnecessary resources out of the estate.
  4. Leverage the right pricing models for your workloads: RIs, promo SKUs, spot instances, etc.
  5. Align with work cycles: Schedule suspension of workloads to ensure that when people aren’t using them, they also aren’t paying for them.
  6. Automation and workflows: Automate the actions, either in real time or on schedules that fit change windows, and integrate optimization into the deployment and daily management of the estate instead of treating it as a sporadic effort. An approval workflow is the ideal approach for workloads that are under strict change control.
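Two of the principles above, multidimensional rightsizing and aligning with work cycles, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how any particular product implements them: the three-entry catalog, its prices, the 20% headroom and the 22-weekday month are all hypothetical, and a real decision would also weigh storage, network, SLAs and business policies.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class InstanceType:
    name: str
    vcpus: float
    memory_gib: float
    hourly_rate: float  # hypothetical price, not a real provider rate


# Hypothetical catalog; real catalogs have far more types and dimensions.
CATALOG = (
    InstanceType("small", 2, 4, 0.05),
    InstanceType("medium", 4, 16, 0.20),
    InstanceType("large", 8, 32, 0.40),
)


def rightsize(
    peak_vcpus: float,
    peak_memory_gib: float,
    headroom: float = 0.2,  # assumed safety margin above observed peak
    catalog: Sequence[InstanceType] = CATALOG,
) -> Optional[InstanceType]:
    """Return the cheapest type that covers peak demand on every dimension."""
    need_cpu = peak_vcpus * (1 + headroom)
    need_mem = peak_memory_gib * (1 + headroom)
    candidates = [
        t for t in catalog
        if t.vcpus >= need_cpu and t.memory_gib >= need_mem
    ]
    # None means nothing in the catalog is large enough for this workload.
    return min(candidates, key=lambda t: t.hourly_rate) if candidates else None


def scheduled_monthly_cost(
    hourly_rate: float,
    hours_per_weekday: float,
    weekdays_per_month: int = 22,  # assumed number of working days
) -> float:
    """Monthly cost when a workload only runs during working hours."""
    return hourly_rate * hours_per_weekday * weekdays_per_month


# A workload that peaks at 1.5 vCPUs but needs 10 GiB of memory.
choice = rightsize(peak_vcpus=1.5, peak_memory_gib=10)
always_on = choice.hourly_rate * 720
working_hours_only = scheduled_monthly_cost(choice.hourly_rate, hours_per_weekday=12)
print(choice.name, round(always_on, 2), round(working_hours_only, 2))
```

The example workload shows why looking at one dimension is not enough: on CPU alone the smallest type would do, but the memory requirement rules it out. Picking the smallest type regardless would save money at the expense of performance, which is the tradeoff described earlier in this post.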

Learn more

In addition to these principles, there are a few core frameworks organizations must put in place in order to achieve optimal results. The next blog post in this series — “Mastering Cloud Cost Optimization: Frameworks for Success” — will explore these frameworks in more depth to help you become a cost optimization master.

Learn more about how IBM Turbonomic is significantly impacting businesses’ bottom line in Forrester’s TEI study: “The Total Economic Impact™ Of IBM Turbonomic”

Start your journey to assuring app performance at the lowest possible cost. Request your IBM Turbonomic demo today.

