In the rush to embrace generative AI, companies are stumbling upon an unexpected hurdle: soaring computing costs that threaten to derail innovation and business transformation efforts.
Major AI players are feeling the economic pressure, too. OpenAI is reportedly experiencing explosive revenue growth, with monthly earnings hitting USD 300 million in August 2024. In early October, the company announced it had raised USD 6.6 billion in a new funding round at a USD 157 billion valuation—an effort to keep up with its skyrocketing costs and ambitious growth plans.
A new report from IBM’s Institute for Business Value (IBV) paints a stark picture of executives’ economic challenges as they navigate the AI revolution. The report, titled “The CEO’s guide to generative AI: Cost of compute,” reveals that the average cost of computing is expected to climb 89% between 2023 and 2025. A staggering 70% of executives IBM surveyed cite generative AI as a critical driver of this increase. And the impact is already being felt across industries, with every executive reporting the cancellation or postponement of at least one generative AI initiative due to cost concerns.
“At the moment, a lot of organizations are experimenting, so these costs are not necessarily kicking in as much as they will once they start scaling AI,” says Jacob Dencik, Research Director at IBV. “The cost of computing, often reflected in cloud costs, will be a key issue to consider, as it is potentially a barrier for them to scale AI successfully.”
The AI cost equation
The economics of AI are emerging as a critical factor in determining its true business impact. As Dencik points out, “Even if something is technically feasible to do with AI, if the business case doesn’t stack up because of the cost of computing or the cost of training these models, then we’re not going to see the impact of AI on business activity that many people anticipate.”
Adnan Masood, Chief AI Architect at UST, frames this challenge in stark terms: “We’re entering a strategic inflection point, where innovation—once viewed as a competitive necessity—now carries substantial financial risk.” He adds: “The long march to AI dominance is not for the faint of heart. We’re looking at a future where companies must make strategic bets on whether to continue pushing the boundaries of AI, or risk falling behind… in the AI arms race.”
Many organizations are turning to hybrid cloud architectures to combat rising costs. “Hybrid cloud becomes a mechanism for ensuring you can manage your computing cost,” Dencik says. “Using a hybrid cloud platform that includes a common control plane and financial operations capabilities gives you the visibility you need to run data, workloads and applications in the lowest-cost environments. It will allow you to see where costs are being generated, and how to optimize.”
Increasing efficiency
Experts say the path forward is about more than just cutting costs. “You can use generative AI to improve your coding efficiency; the way you code an application can make it more or less energy-intensive in its use,” Dencik says. “Some estimates suggest you can reduce the energy consumption of using an application by as much as 50% by switching to a better coding language and more efficient code.”
Organizations are also using generative AI to optimize data center layouts and improve the design of servers. “There are various ways in which generative AI can support the efficiency of your computing and the efficiency of your compute resources,” Dencik says. “It can be part of the solution, rather than just the source of the problem.”
Masood suggests additional strategies: “There are creative ways to address these challenges, like LLM routing, or intelligently directing incoming requests to the most suitable large language model based on factors like complexity, cost and performance, ensuring efficient resource utilization and optimal results.” He also mentions “reducing the cost of running LLMs by shrinking their size and making them faster. Using quantization to reduce the memory needed for the model and efficient fine-tuning to speed up training means lower hardware costs and faster processing times, making these models more affordable to deploy and use.”
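To make the routing idea concrete, here is a minimal sketch of what an LLM router might look like: each incoming request is scored for complexity, then sent to the cheapest model whose capability covers it. The model names, per-token costs, and complexity heuristic below are illustrative assumptions, not a real API or pricing.

```python
# Hypothetical LLM router: pick the cheapest model that can handle
# the request. All model names, costs, and thresholds are invented
# for illustration only.

MODELS = [
    # (name, cost per 1K tokens in USD, max complexity level handled)
    ("small-model", 0.0002, 1),
    ("mid-model",   0.0020, 2),
    ("large-model", 0.0100, 3),
]  # sorted cheapest-first

def estimate_complexity(prompt: str) -> int:
    """Crude heuristic: longer, multi-step prompts score higher (1-3)."""
    score = 1
    if len(prompt.split()) > 100:
        score += 1
    if any(kw in prompt.lower() for kw in ("step by step", "prove", "analyze")):
        score += 1
    return min(score, 3)

def route(prompt: str) -> str:
    """Return the cheapest model whose capability meets the estimate."""
    needed = estimate_complexity(prompt)
    for name, _cost, capability in MODELS:
        if capability >= needed:
            return name
    return MODELS[-1][0]  # fall back to the most capable model
```

In production, the heuristic would typically be replaced by a small classifier, and the routing table would track real per-model pricing and latency; the pattern, however, stays the same: match request difficulty to the cheapest adequate model rather than sending everything to the largest one.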
The increasing complexity of AI models is another factor driving up costs. Dencik recommends a strategic approach: “You don’t need to use large language models for everything,” he says. “A small model trained on high-quality data can be more efficient and achieve the same results—or better—depending on the task at hand. Selecting the appropriate model is key, while reusing and fine-tuning existing models can be better than creating new models for every new task you want to use AI for.”
He advocates for a multimodal, multi-model approach to AI deployment: “To be cost-effective, you should allow your organization to move toward a multimodal, multi-model use of AI and have a platform that allows you to do that within the organization,” he says. “Although that might sound more complex, it’s a way for your organization to get the most out of AI in the most cost-efficient way.”
Sustainability concerns are also influencing the total cost of ownership for AI systems. While energy costs may be largely hidden in cloud expenses, rather than showing up directly on utility bills, there’s a growing awareness of the environmental impact of generative AI. “It’s not just an economic cost; it’s an environmental cost associated with using AI,” Dencik says. He points to emerging practices like “green ops,” which aim to optimize cloud use for reduced environmental impact.
As companies grapple with these challenges, learning how to effectively manage the cost of computing could become a key market differentiator. The report concludes that “the CEOs that best manage these costs will be able to run their business like a high-performance machine—reducing drag while using the latest technology to outpace the competition.”