DeepSeek shows there's room for more AI players

By Sascha Brodsky, Staff Writer, IBM

The AI arms race is no longer just for the billion-dollar giants.

Companies like OpenAI, Google and Microsoft have dominated the artificial intelligence conversation. However, a new wave of open-source innovation—exemplified by the recent DeepSeek model—is leveling the playing field. The model’s success underscores a growing trend: smaller firms can increasingly challenge AI’s most prominent players.

“This just reinforces things we already knew,” says David D. Cox, Vice President of AI Models at IBM Research. “We don’t think you need billions and billions of dollars to build great models. DeepSeek is proof that open-source approaches are catching up—and that’s a good thing.”

One clever memory trick

AI researchers are in a constant race to make models more powerful without driving up computational costs. With growing concerns over hardware limitations and energy consumption, innovations that improve efficiency are becoming just as important as raw performance gains.

“For too long, the AI race has been a game of scale where bigger models meant better outcomes,” wrote IBM CEO Arvind Krishna on LinkedIn. “But there is no law of physics that dictates AI models must remain big and expensive. The cost of training and inference is just another technology challenge to be solved.”

DeepSeek's breakthrough in AI efficiency comes from a technique called Multi-Head Latent Attention (MLA). This method changes how AI models handle and store information during generation. The key improvement is that MLA shrinks the KV cache, the memory where a model keeps information about the tokens it has already processed so it doesn't have to recompute that work for every new word it generates. According to Cox, this lets models use less memory and scale more easily.

“They did some really nice work here,” Cox notes. “Reducing KV cache size is crucial because it allows models to run faster and use fewer resources.”
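
To make that concrete, the sketch below illustrates the latent-compression idea behind MLA: instead of caching full per-head keys and values for every token, the model caches one small latent vector and reconstructs keys and values from it on the fly. The dimensions and weight names are illustrative assumptions, not DeepSeek's actual configuration.

```python
# A minimal sketch of MLA-style KV compression (illustrative sizes, assumed names).
import torch

hidden_dim = 4096   # model hidden size (assumption)
n_heads = 32        # attention heads (assumption)
head_dim = 128      # dimension per head (assumption)
latent_dim = 512    # compressed latent size (assumption)

# Down-projection runs once per token; its small output is all that gets cached.
W_down = torch.randn(hidden_dim, latent_dim) * 0.02
# Up-projections rebuild full keys and values from the latent at attention time.
W_up_k = torch.randn(latent_dim, n_heads * head_dim) * 0.02
W_up_v = torch.randn(latent_dim, n_heads * head_dim) * 0.02

x = torch.randn(1, hidden_dim)                 # hidden state for one new token
latent = x @ W_down                            # (1, 512): the only thing we cache
k = (latent @ W_up_k).view(n_heads, head_dim)  # keys, reconstructed on the fly
v = (latent @ W_up_v).view(n_heads, head_dim)  # values, reconstructed the same way

standard = 2 * n_heads * head_dim              # floats cached per token without MLA
print(f"standard KV cache: {standard} floats per token")
print(f"latent cache: {latent_dim} floats per token "
      f"({standard // latent_dim}x smaller)")
```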

Under DeepSeek’s hood, the breakthroughs multiplied. Prasanna Sattigeri, a Principal Research Scientist at IBM Research, points out that the company’s innovations span both efficiency gains and architectural improvements.

“They optimized communication between GPUs, which is often a bottleneck in large-scale AI training,” Sattigeri says. “This enabled them to train effectively using older hardware, a remarkable engineering feat.”
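
One common way to hide that communication cost, sketched below, is to launch a non-blocking all-reduce so gradients travel between devices while computation continues. This shows the general overlap pattern, not DeepSeek's actual training stack.

```python
# Run with: torchrun --nproc_per_node=2 overlap_demo.py
# "gloo" keeps the demo CPU-only; real multi-GPU training would use "nccl".
import torch
import torch.distributed as dist

def main():
    dist.init_process_group("gloo")
    grad = torch.ones(1 << 20)  # stand-in for one layer's gradient

    # async_op=True returns a handle immediately instead of blocking,
    # so the reduce travels over the interconnect while we keep computing.
    handle = dist.all_reduce(grad, op=dist.ReduceOp.SUM, async_op=True)

    other_work = torch.randn(1 << 20).sin().sum()  # overlapping computation

    handle.wait()  # synchronize only when the gradient is actually needed
    print(f"rank {dist.get_rank()}: reduced grad[0] = {grad[0].item():.0f}, "
          f"overlapped work = {other_work.item():.3f}")
    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```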

But like any ambitious engineering project, this leap forward came with costs. DeepSeek also utilized reinforcement learning (RL) techniques, similar to the ones used in OpenAI’s o1 inference scaling approach. This method refines the model’s performance by reinforcing successful outputs over multiple iterations. However, Cox points out that DeepSeek’s implementation led to trade-offs, such as weaker function-calling capabilities and safety alignment concerns.
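
In rough terms, "reinforcing successful outputs" can be pictured as the loop below: sample several answers, score them against the group average, and nudge the model toward the above-average ones. The `sample_fn`, `update_fn` and reward check are hypothetical placeholders, not DeepSeek's pipeline.

```python
def reward(answer: str) -> float:
    # Placeholder verifier: imagine checking a math answer or running unit tests.
    return 1.0 if answer.strip().endswith("42") else 0.0

def rl_step(prompt: str, sample_fn, update_fn, group_size: int = 4):
    """One illustrative policy-improvement step over a group of samples."""
    answers = [sample_fn(prompt) for _ in range(group_size)]
    rewards = [reward(a) for a in answers]
    baseline = sum(rewards) / len(rewards)   # group-average reward
    for answer, r in zip(answers, rewards):
        advantage = r - baseline             # above-average answers get
        update_fn(answer, advantage)         # reinforced; below-average, penalized
```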

“It’s a great step forward, but there are some rough edges,” he says. “The model is fantastic in reasoning tasks, but other areas took a hit.”

Big dreams, small budgets

Even as advancements make it easier to build large AI models, a bigger challenge remains: the enormous computing power required to stay competitive. Xia “Ben” Hu, an Associate Professor of Computer Science at Rice University, acknowledges that DeepSeek represents a genuine step forward in AI efficiency. However, he notes that it doesn’t fundamentally shift the power dynamics of AI infrastructure, where access to vast computing resources still determines who leads the race.

“DeepSeek is backed by a large venture fund in China, and has access to tens of thousands of GPUs,” Hu says. “That’s still a major barrier for many smaller startups.”

However, Hu predicts that the most significant shift will likely be in enterprise AI adoption. “Traditional industries—oil and gas, manufacturing—have been hesitant to develop their own AI solutions,” he says. “With costs dropping and open-source models improving, companies that once relied on external AI services are now considering building in-house models tailored to their specific needs.”

The implications go beyond one model. With open-source AI projects multiplying, smaller startups can now access tools that once required massive data centers and enormous budgets. Cox says that OpenAI and its counterparts have long projected an “air of inevitability”—that only those with deep pockets could lead in AI. But as DeepSeek and other models emerge, that notion is starting to crack.

“We’re seeing a shift where a much broader aperture of players can compete in this space,” Cox says. “It’s not that anybody with USD 5 million can roll up and build a top-tier model overnight. But well-funded startups and midsized companies? Absolutely.”

Researchers are also focusing on efficiency rather than raw computing power. Cox and his research team have zeroed in on the Mixture of Experts approach, which allows AI to be more selective about how it uses processing resources.

"Mixture of Experts is just one piece of the puzzle—there's a lot more coming," he says, suggesting that the future of AI may depend less on access to advanced chips and more on smarter ways of using existing hardware.

Sattigeri highlights one such innovation: the rise of synthetic data, or artificially generated information that mimics real-world data. “With models like DeepSeek, we’re seeing a shift toward using AI-generated synthetic data to refine and train models more efficiently,” he says. “This could significantly lower costs and make high-quality AI accessible to more players.”
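
The workflow Sattigeri describes can be pictured as a simple generate-and-filter loop, sketched below; `generate` and `is_valid` are hypothetical stand-ins for a teacher model and a quality check, not a specific API.

```python
def build_synthetic_dataset(prompts, generate, is_valid):
    """Generate candidate examples with a strong model, keep those that pass checks."""
    dataset = []
    for prompt in prompts:
        candidate = generate(prompt)      # e.g., a large "teacher" model's answer
        if is_valid(prompt, candidate):   # filter out low-quality generations
            dataset.append((prompt, candidate))
    return dataset                        # used to fine-tune a smaller model
```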

The increasing accessibility of AI development raises new questions about the future of competition. Will infrastructure and computing power still determine the winners, or will the ability to innovate quickly become the most valuable asset? According to Cox, it’s a mix of both.

“You still need serious infrastructure, you still need great talent, but the moat that OpenAI and Google have isn’t as deep as they’d like people to believe,” he says. “Secrets don’t stay secret in this field. Ideas spread, and people move around. We’re seeing rapid convergence.”

Hu adds that AI development still requires four critical components: “I call it the ABCD model—Algorithms, Big Data, Compute and Distribution,” he says. “The best AI companies have all four. DeepSeek is making a dent in the first two, but compute and distribution still give the major players an edge.”

AI's next chapter: Think small to win big

The growing number of AI companies enabled by more efficient techniques isn’t just about competition—it could spark a creative revolution. If more companies can develop AI without billion-dollar budgets, innovation will be driven by diverse perspectives rather than a handful of corporate agendas, Cox says. That means more tailored AI solutions and specialized models, as well as a more dynamic market.

“Innovation will happen faster, in a safer and more inclusive way,” Cox says. “If we move beyond a monoculture where a few players set the terms, we will see a flourishing of different approaches.”

Cox says that for IBM, which has committed to open-source AI, DeepSeek’s rise validates its approach. “It’s actually a good thing for us,” he says. “It proves that open models can work and that there’s demand for them. The more people contribute, the more we all benefit.”

Hu points out that while smaller firms are gaining ground, the major players are adapting. “Amazon, Meta and Microsoft won’t just sit back and let open source eat their lunch,” he says. “They are working hard to figure out how to integrate open-source models while maintaining control over infrastructure and data.”

What happens next? Cox and other experts say that AI development won’t become a free-for-all, but it’s clear that smaller firms are no longer at the mercy of tech giants. Open-source tools are accelerating progress, and it’s the companies that embrace this shift that stand to benefit the most.

“This is part of an ongoing trend,” Cox says. “It didn’t start with DeepSeek, and it won’t end with it. But it’s definitely woken some people up.”
