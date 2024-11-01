The release of ChatGPT two years ago opened a new chapter in AI, driven by large language models of unprecedented size and complexity. These models are now a leading force in research and business, but many of their developers don't release the data, the full training recipe or the checkpoints. That’s where the nonprofit Allen Institute for Artificial Intelligence (Ai2) comes in. Founded in 2014 by Microsoft co-founder Paul Allen, the research group develops language models, multimodal models and evaluation frameworks as open source.

Recently, Ai2 released Molmo, a family of state-of-the-art multimodal AI models aiming to significantly close the gap between open and proprietary systems. “Even our smaller models outperform competitors 10x their size,” says Ai2.

Earlier, in September, Ai2 released OlmoE, a mixture-of-experts model with 1 billion active and 7 billion total parameters that was developed jointly with Contextual AI. It was trained on 5 trillion tokens and built on a new data mix incorporating lessons from Ai2’s Dolma.

We spoke with Hanna Hajishirzi, Senior Director of NLP Research at Ai2, after her keynote at the PyTorch Conference in San Francisco to discuss open source models and AI literacy.