The most effective use of these advanced AI models will likely involve a partnership between human expertise and machine capability. “The human will always have to provide input, be okay with the planning, and verify these things,” Hay said.

Hay cautioned against overestimating the models’ capabilities: “I think you can get great outputs. I think when people hear the words AGI, they’re thinking of this big pulsating head in the clouds… actually, if I think about it, the models, as they are, with their next token prediction and good training data and their planning, etc., they do a pretty good job—better than humans in quite a lot of tasks.”

The development of these models raises questions about the nature of artificial intelligence and how it compares to human cognition. The new models outperform humans on standardized tests such as the bar exam and the SAT, yet they still struggle with tasks that most people find intuitive.

Hay described where the models fall short: “The model excels at specific, individual tasks. However, it currently has difficulty distinguishing between different parts of a conversation. This leads to confusion in its ability to handle multiple concepts simultaneously. The model overemphasizes context, often considering too much irrelevant information when processing requests.”

Baracaldo added a note of caution: “Even though this model is super impressive, sometimes it makes mistakes. And if you read the technical report, sometimes it creates solutions that a real expert, a human being, will think are not feasible, but the model does not know all the assumptions.”

The implications of these advancements extend beyond the tech industry. In research and academia, they might accelerate the pace of discovery by assisting in complex data analysis and hypothesis generation. In fields like medicine and law, they could serve as tools to augment human expertise, potentially leading to more accurate diagnoses or more comprehensive legal analyses.

Hay summarized the practical value of the new models for enterprises: “They are a lot better coders than they were before.”