Gerstein’s reaction comes with an asterisk: he called the results promising, but stressed that performance on curated benchmarks doesn’t always translate to messy real-world biology.

AlphaGenome, as he sees it, is powerful at describing what a single change might do within a genome model. But real genomes do not change one letter at a time. They come as whole, inherited packages, full of variants that shape one another’s effects. “In terms of limitations, one major issue is that the model predicts the effect of only a single variant and does not take into account the full genetic background of an individual’s personal genome,” he said. “Background genetics can substantially influence the impact of a particular variant, particularly by strongly affecting how a gene is expressed in response to a mutation.”

He thinks the next step is imaginable, even if it is harder: a future version of this kind of work could move beyond scoring a single mutation in isolation and instead operate directly on personal genomes. “One could imagine extending AlphaGenome by building large models that operate directly on personal genomes,” he said.

Medicine demands forms of evidence that many model developers simply do not have access to, Gerstein noted.

“With respect to translation into clinical practice, the main requirement is the accumulation of many use cases in which the effects of particular mutations are documented, followed by downstream validation showing that the predictions are accurate and clinically useful,” he said. “There is no substitute in the medical world for experimental data and actual clinical validation, and this will be necessary before outputs from tools like this are accepted.”

He also stressed what AlphaGenome does not claim to do: “It is important to remember that this tool provides the molecular consequences of specific mutations, not downstream phenotypic or disease-level effects,” he said. “As a result, additional work would be required to bridge that gap.”

Computational advances like AlphaFold build on a foundation that took decades to establish. AlphaFold itself relied on massive protein structure databases built through painstaking crystallography and other experimental techniques. Similarly, the genomic datasets used to train AlphaGenome came from large-scale efforts such as ENCODE, which spent years mapping functional elements across the genome.

Whether AI models will compress the timeline from genetic discovery to approved therapies remains an open question. Drug development still requires navigating the complexities of human biology, designing careful clinical studies and conducting long, rigorous trials to establish safety and efficacy. According to a January 2026 World Economic Forum analysis by Novartis, AI doesn’t allow researchers to circumvent those complexities, but it does offer a way to navigate them more intelligently. By enhancing how scientists choose targets, design molecules and avoid safety risks, AI is helping them make better decisions faster, the analysis states.

Rosen-Zvi framed the moment in sweeping terms. “We have already seen how AI has transformed text, images and code,” she said. “Biology and chemistry are next, and we are only at the beginning of that curve.”

Biomedical foundation models have the potential to fundamentally change how experiments are designed, prioritized and interpreted, shifting from slow, iterative wet-lab cycles to AI-guided hypothesis generation and decision-making, she said. “For enterprises, this means faster discovery, lower R&D risk and the ability to explore biological space that was previously inaccessible,” she added. “Organizations that engage now will help shape how these models are applied, validated and integrated into real workflows, rather than reacting once the transformation is already underway.”