Artificial intelligence companies spend vast sums of money training machines to predict the next word, but neuroscientists now say the human brain may save enormous amounts of energy by knowing when not to bother.
A recent study in Nature Neuroscience suggests the brain processes language differently from today’s large language models (LLMs). Instead of constantly trying to predict every possible upcoming word, researchers found the brain appears to ease off on prediction at the boundaries between sentences and major phrases. Scientists say the strategy may help humans process language far more efficiently than modern AI systems.
“Our core finding is that the brain sometimes sacrifices next word prediction, especially when a word starts a new sentence or a major phrase,” Nai Ding, a neuroscience professor at Zhejiang University and one of the study’s authors, told IBM Think in an interview. “The brain is optimized not just for predicting the next word, a feature it can use, but also for compressing the memory of linguistic representations.”
The findings come as companies roll out generative AI across customer service, cybersecurity, software development and research, even as they grapple with soaring computing costs, hallucinations and “context drift,” where AI systems lose track of instructions during long conversations or complex tasks.
Ding and other researchers say the human brain may avoid this problem by compressing information into larger conceptual structures, rather than constantly recalculating every possible relationship between words.
Transformers, the AI architecture behind tools like ChatGPT, learn by finding relationships between words and ideas across huge amounts of data. They gain power from their ability to connect information across broad contexts, though the process remains computationally expensive, said Nikolaus Kriegeskorte, a professor of psychology and neuroscience at Columbia University.
“The self-attention in transformers provides a powerful way to relate a set of representations,” Kriegeskorte told IBM Think in an interview. “But the computational costs of relating each element to each other element are high.”
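To make the cost Kriegeskorte describes concrete, here is a minimal Python sketch that counts how many pairwise scores standard full self-attention computes for a context of a given length. The function name `attention_score_count` is hypothetical, and the count ignores heads, layers and any optimizations real systems use. It only illustrates the quadratic growth.

```python
def attention_score_count(n_tokens: int) -> int:
    """Pairwise scores when every token is related to every token (itself included)."""
    return n_tokens * n_tokens


# Doubling the context length quadruples the number of pairwise scores.
for n_tokens in (1_000, 2_000, 4_000):
    print(f"{n_tokens} tokens: {attention_score_count(n_tokens)} scores")
```

Under this simplified count, a context twice as long costs four times as much to relate, which is why long conversations and documents become expensive.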
Scientists have long known that human working memory remains sharply constrained. Most people can actively hold only a small number of items in mind at once. Yet humans still navigate conversations, stories and abstract reasoning with remarkable speed and flexibility.
Researchers behind the study argue that the brain compensates by compressing completed language units into higher-order conceptual representations. Instead of preserving every individual word with equal importance, the brain may summarize completed linguistic structures into a broader meaning.
“Compressing a sentence, turning many individual word representations into a single higher-level constituent representation, takes computational resources,” Ding said. “Because of that cost, the brain can deprioritize predicting the next word at constituent boundaries.”
Researchers say today’s AI systems handle language very differently. Current LLMs consume vast computing power partly because they try to evaluate too many possible relationships at once, said Stanislaw Wozniak, a research scientist at IBM Research.
“LLMs tend to extensively analyze all possible interactions over the entire context, which is the main driver for their intensive computational resources usage,” Wozniak told IBM Think in an interview. “Constraining this based on some criteria, for example, clear constituent boundaries, can thus improve efficiency.”
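A toy calculation can illustrate the kind of saving Wozniak has in mind. The Python sketch below is an assumption-laden illustration, not the study's model or an IBM method: it supposes each word is related only to the words in its own constituent, plus one compressed summary for each constituent already completed. The function names and the example lengths are hypothetical.

```python
def full_attention_pairs(n_tokens: int) -> int:
    """Every token is related to every token in the context."""
    return n_tokens * n_tokens


def boundary_constrained_pairs(constituent_lengths: list[int]) -> int:
    """Tokens see their own constituent plus one summary per completed constituent.

    Assumption for illustration only: a finished constituent collapses
    into a single summary representation.
    """
    total = 0
    for completed, length in enumerate(constituent_lengths):
        total += length * (length + completed)
    return total


# Hypothetical text: 100 constituents of 10 words each (1,000 words in total).
lengths = [10] * 100
print(full_attention_pairs(sum(lengths)))
print(boundary_constrained_pairs(lengths))
```

For this made-up example the full comparison needs 1,000,000 pairwise relations, while the boundary-constrained version needs 59,500. The figures show only how the counts scale under these assumptions and say nothing about the cost of producing the summaries, which Ding notes is not free.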
Researchers believe brain-inspired architectures could help models decide which information deserves continued attention and which information can collapse into compressed abstractions.
“If such a hierarchical representation can be built, we believe the model size can be significantly reduced,” Ding said.
Still, not all researchers agree with the broader idea that language and thought can be fully explained as computation. Randy Harris, a professor of rhetoric and communication design at the University of Waterloo, said he remains unconvinced that human language activity should be viewed through a purely computational lens.
“I am deeply suspicious of framing human language activity as computational,” Harris told IBM Think in an interview.
While the researchers behind the study argue that the findings reveal something meaningful about how the brain manages limited memory resources during language processing, Harris suggested the observed effects may instead reflect longstanding psychological principles involving linguistic structure and closure. “Once we are out of a constituent, we are more open about what the next one might be,” Harris said.
Even so, Harris said humans appear to possess broad neurocognitive abilities that current machine learning systems still struggle to replicate.
The enormous computational demands of current AI systems have led researchers at IBM Research and elsewhere to investigate ways to limit how models process relationships, so they focus only on the most relevant information rather than evaluating every possible connection simultaneously.
Wozniak said the brain appears remarkably efficient at narrowing computation toward the information that matters most while adapting rapidly to changing context.
“We don’t really know how the brain works, but it seems that it is very efficient at quickly adapting to the context and constraining the computations to the key important aspects,” Wozniak said. “This enables solving efficiently even very complex challenges, which may be difficult for LLMs.”