Modern AI (especially generative AI) relies on large volumes of data to deliver real value. Fortunately, data generation isn’t limited to large enterprises. Organizations of all sizes produce substantial volumes of data each year through their websites, social media, internal systems and customer interactions.
Yet most organizations are underutilizing their data. Estimates suggest that only around 1% of enterprise data is leveraged in traditional large language models (LLMs).2
Why let such valuable AI fuel go to waste? Because most enterprise data is unstructured. It lacks a predefined format and comes from diverse data sources such as PDFs, social media posts, images, instant messages and emails. Less than 1% of this unstructured data is in a format suitable for direct AI consumption.3 In other words, the vast majority of enterprise data is not AI-ready.
While structured data remains immensely valuable, failing to tap into the potential of unstructured data, which is diverse, flexible and rich with insights, is a strategic misstep and a significant barrier to scaling enterprise AI.
This challenge is reflected in grim AI outcomes: according to the 2025 CEO Study from the IBM Institute for Business Value (IBV), just 16% of AI initiatives have reached enterprise scale.
Now is a critical moment for businesses. The success or failure of AI initiatives depends on how effectively organizations manage and prepare high-quality data—both structured and unstructured—for AI.