October 14, 2024 | By Sascha Brodsky | 4 min read

In the rush to embrace generative AI, companies are stumbling upon an unexpected hurdle: soaring computing costs that threaten to derail innovation and business transformation efforts.

Major AI players are feeling the economic pressure, too. OpenAI is reportedly experiencing explosive revenue growth, with monthly revenue hitting USD 300 million in August 2024. In early October, the company announced it had raised USD 6.6 billion in a new funding round at a USD 157 billion valuation—an effort to keep up with its skyrocketing costs and ambitious growth plans.

A new report from IBM’s Institute for Business Value (IBV) paints a stark picture of executives’ economic challenges as they navigate the AI revolution. The report, titled “The CEO’s guide to generative AI: Cost of compute,” reveals that the average cost of computing is expected to climb 89% between 2023 and 2025. A staggering 70% of executives IBM surveyed cite generative AI as a critical driver of this increase. And the impact is already being felt across industries, with every executive surveyed reporting the cancellation or postponement of at least one generative AI initiative due to cost concerns.

“At the moment, a lot of organizations are experimenting, so these costs are not necessarily kicking in as much as they will once they start scaling AI,” says Jacob Dencik, Research Director at IBV. “The cost of computing, often reflected in cloud costs, will be a key issue to consider, as it is potentially a barrier for them to scale AI successfully.”

The AI cost equation

The economics of AI are emerging as a critical factor in determining its true business impact. As Dencik points out, “Even if something is technically feasible to do with AI, if the business case doesn’t stack up because of the cost of computing or the cost of training these models, then we’re not going to see the impact of AI on business activity that many people anticipate.”

Adnan Masood, Chief AI Architect at UST, frames this challenge in stark terms: “We’re entering a strategic inflection point, where innovation—once viewed as a competitive necessity—now carries substantial financial risk.” He adds: “The long march to AI dominance is not for the faint of heart. We’re looking at a future where companies must make strategic bets on whether to continue pushing the boundaries of AI, or risk falling behind… in the AI arms race.”

Many organizations are turning to hybrid cloud architectures to combat rising costs. “Hybrid cloud becomes a mechanism for ensuring you can manage your computing cost,” Dencik says. “Using a hybrid cloud platform that includes a common control plane and financial operations capabilities gives you the visibility you need to run data, workloads and applications in the lowest-cost environments. It will allow you to see where costs are being generated, and how to optimize.”
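To make that cost-visibility idea concrete, here is a minimal sketch assuming a hypothetical estate of one on-premises cluster and two public clouds with made-up GPU-hour rates: each workload's monthly cost is estimated per environment and the cheapest placement is flagged, which is roughly the kind of comparison a FinOps-enabled control plane automates.

```python
# Minimal FinOps-style cost-visibility sketch for a hybrid cloud estate.
# Environment names, GPU-hour rates and workloads are hypothetical.

HOURLY_RATES = {          # assumed USD per GPU-hour in each environment
    "on_prem_cluster": 1.10,
    "public_cloud_a": 2.45,
    "public_cloud_b": 2.10,
}

WORKLOADS = [
    {"name": "fraud-model-inference", "gpu_hours_per_month": 1200},
    {"name": "assistant-fine-tuning", "gpu_hours_per_month": 300},
]

def monthly_costs(workload: dict) -> dict:
    """Estimate one workload's monthly cost in every available environment."""
    hours = workload["gpu_hours_per_month"]
    return {env: rate * hours for env, rate in HOURLY_RATES.items()}

for wl in WORKLOADS:
    costs = monthly_costs(wl)
    cheapest = min(costs, key=costs.get)
    print(f"{wl['name']}: cheapest environment is {cheapest} "
          f"at about ${costs[cheapest]:,.0f}/month")
```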

Increasing efficiency

Experts say the path forward is about more than just cutting costs. “You can use generative AI to improve your coding efficiency; the way you code an application can make it more or less energy-intensive in its use,” Dencik says. “Some estimates suggest you can reduce the energy consumption of using an application by as much as 50% by switching to a better coding language and more efficient code.”

Organizations are also using generative AI to optimize data center layouts and improve the design of servers. “There are various ways in which generative AI can support the efficiency of your computing and the efficiency of your compute resources,” Dencik says. “It can be part of the solution, rather than just the source of the problem.”

Masood suggests additional strategies: “There are creative ways to address these challenges, like LLM routing, or intelligently directing incoming requests to the most suitable large language model based on factors like complexity, cost and performance, ensuring efficient resource utilization and optimal results.” He also mentions “reducing the cost of running LLMs by shrinking their size and making them faster. Using quantization to reduce the memory needed for the model and efficient fine-tuning to speed up training means lower hardware costs and faster processing times, making these models more affordable to deploy and use.”
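As an illustration of the LLM-routing idea Masood describes, the sketch below uses a crude length-and-keyword heuristic to send simple requests to a cheaper small model and harder ones to a larger model. The model names, prices and heuristic are placeholders; a production router would typically use a trained classifier rather than keyword matching.

```python
# Illustrative LLM router: pick the cheapest model likely to handle the request.
# Model names, per-token prices and the complexity heuristic are hypothetical.

MODELS = {
    "small-8b":  {"cost_per_1k_tokens": 0.0002},  # cheap, fine for simple queries
    "large-70b": {"cost_per_1k_tokens": 0.0030},  # expensive, better for hard tasks
}

def estimate_complexity(prompt: str) -> float:
    """Very rough proxy for task difficulty (a real router would use a classifier)."""
    score = len(prompt.split()) / 200             # long prompts look harder
    if any(k in prompt.lower() for k in ("prove", "refactor", "multi-step", "analyze")):
        score += 0.5
    return score

def route(prompt: str) -> str:
    """Send simple requests to the small model, complex ones to the large model."""
    return "small-8b" if estimate_complexity(prompt) < 0.5 else "large-70b"

if __name__ == "__main__":
    for p in ["What are our support hours?",
              "Analyze this contract and refactor the indemnity clause step by step."]:
        print(f"{route(p):10s} <- {p}")
```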

The increasing complexity of AI models is another factor driving up costs. Dencik recommends a strategic approach: “You don’t need to use large language models for everything,” he says. “A small model trained on high-quality data can be more efficient and achieve the same results—or better—depending on the task at hand. Selecting the appropriate model is key, while reusing and fine-tuning existing models can be better than creating new models for every new task you want to use AI for.”
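A rough back-of-the-envelope calculation (illustrative figures, not from the report) shows why right-sizing and quantization pay off: the memory needed just to hold a model's weights scales with parameter count times bytes per parameter, so an 8-billion-parameter model quantized to 4-bit weights needs a small fraction of the accelerator memory of a 70-billion-parameter model served at 16-bit precision.

```python
# Back-of-the-envelope estimate of weight memory: parameters x bytes per parameter.
# Ignores activations, KV cache and serving overhead; purely illustrative.

BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}

def weight_memory_gb(params_billion: float, precision: str) -> float:
    """Approximate GB of accelerator memory needed just for the model weights."""
    return params_billion * BYTES_PER_PARAM[precision]

print(f"70B model at fp16: ~{weight_memory_gb(70, 'fp16'):.0f} GB of weights")
print(f" 8B model at int4: ~{weight_memory_gb(8, 'int4'):.0f} GB of weights")
```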

He advocates for a multimodal, multi-model approach to AI deployment: “To be cost-effective, you should allow your organization to move toward a multimodal, multi-model use of AI and have a platform that allows you to do that within the organization,” he says. “Although that might sound more complex, it’s a way for your organization to get the most out of AI in the most cost-efficient way.”

Sustainability concerns are also influencing the total cost of ownership for AI systems. While energy costs may be largely hidden in cloud expenses, rather than showing up directly on utility bills, there’s a growing awareness of the environmental impact of generative AI. “It’s not just an economic cost; it’s an environmental cost associated with using AI,” Dencik says. He points to emerging practices like “green ops,” which aim to optimize cloud use for reduced environmental impact.

As companies grapple with these challenges, learning how to effectively manage the cost of computing could become a key market differentiator. The report concludes that “the CEOs that best manage these costs will be able to run their business like a high-performance machine—reducing drag while using the latest technology to outpace the competition.”

