March 10, 2025

A new class of AI models is challenging the dominance of GPT-style systems, promising faster, cheaper and potentially more powerful alternatives.

Inception Labs, a startup founded by researchers from Stanford, recently released Mercury, a diffusion-based language model (dLLM) that refines entire phrases at once rather than predicting words one by one. Traditional large language models (LLMs) are autoregressive: they generate one word at a time, each based on the preceding text. Diffusion models instead improve the text iteratively, refining it over multiple passes.
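To make the contrast concrete, here is a toy Python sketch of the two generation styles. It is not Mercury's algorithm, which is not described here. The "model" is a hard-coded sentence with made-up confidence scores, and `TARGET`, `CONFIDENCE`, `MASK` and both function names are hypothetical. The diffusion-style loop only fills in masked slots, a simplification of iterative refinement.

```python
MASK = "<mask>"

# Toy stand-in for a trained model (assumption): a fixed sentence plus an
# invented confidence score for each position.
TARGET = ["the", "cat", "sat", "on", "the", "mat"]
CONFIDENCE = [0.9, 0.4, 0.6, 0.8, 0.7, 0.5]


def autoregressive_generate(length):
    """Generate left to right, one 'model call' per token."""
    tokens = []
    steps = 0
    while len(tokens) < length:
        # The next token depends only on the prefix produced so far.
        tokens.append(TARGET[len(tokens)])
        steps += 1
    return tokens, steps


def diffusion_style_generate(length, tokens_per_step=3):
    """Start fully masked; each pass fills the most confident masked slots."""
    if tokens_per_step < 1:
        raise ValueError("tokens_per_step must be at least 1")
    tokens = [MASK] * length
    steps = 0
    while MASK in tokens:
        masked = [i for i, tok in enumerate(tokens) if tok == MASK]
        # One pass scores every masked position at once, then commits
        # only the highest-confidence ones.
        masked.sort(key=lambda i: CONFIDENCE[i], reverse=True)
        for i in masked[:tokens_per_step]:
            tokens[i] = TARGET[i]
        steps += 1
    return tokens, steps
```

With these toy settings, the autoregressive loop takes six passes to produce the six-word sentence, while the diffusion-style loop takes two. The number of model passes, not the number of words, is what drives the speed difference in this sketch.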

“dLLMs expand the possibility frontier,” Stefano Ermon, a Stanford University computer science professor and co-founder of Inception Labs, tells IBM Think. “Mercury provides unmatched speed and efficiency, and—by leveraging more test-time compute—dLLMs will also set the bar for quality and improve overall customer satisfaction for edge and enterprise applications.”

IBM Research Engineer Benjamin Hoover sees the writing on the wall: “It’s just a matter of two or three years before most people start switching to using diffusion models,” he says. “When I saw Inception Labs’ model, I realized, ‘This is going to happen sooner rather than later.’”