True to their name, generative AI models generate text, images, code, or other responses based on a user’s prompt. Organizations that utilize them correctly can see a myriad of benefits—from increased operational efficiency and improved decision-making to the rapid creation of marketing content. But what makes the generative functionality of these models—and, ultimately, their benefits to the organization—possible?
That’s where the foundation model enters the picture. It’s the underlying engine that gives generative models the enhanced reasoning and deep learning capabilities that traditional machine learning models lack. Together with data stores, foundation models make it possible to create and customize generative AI tools for organizations across industries that are looking to optimize customer care, marketing, HR (including talent acquisition), and IT functions.
Often built on the transformer architecture, a foundation model is an AI model trained on vast amounts of broad data. The term “foundation model” was coined by the Stanford Institute for Human-Centered Artificial Intelligence in 2021.
A foundation model is built on a neural network architecture that processes information much as the human brain does. Foundation models can be trained to perform tasks such as data classification, identifying objects within images (computer vision) and understanding and generating text (natural language processing, or NLP) with a high degree of accuracy. They can also perform self-supervised learning to generalize and apply their knowledge to new tasks.
Instead of spending time and effort on training a model from scratch, data scientists can use pretrained foundation models as starting points to create or customize generative AI models for a specific use case. For example, a foundation model might be used as the basis for a generative AI model that is then fine-tuned with additional manufacturing datasets to assist in the discovery of safer and faster ways to manufacture a type of product.
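The fine-tuning idea above can be sketched in miniature: keep a pretrained representation frozen and train only a small task-specific head on new data. Everything here is illustrative, not a real foundation-model workflow; the two-number feature extractor stands in for a model’s learned representation, and the manufacturing quality reports are hypothetical.

```python
# Minimal sketch of fine-tuning: reuse a frozen "pretrained" feature
# extractor and train only a small task-specific head on new data.

import math

def pretrained_features(text):
    # Stand-in for a foundation model's learned representation:
    # two crude features derived from the raw text.
    return [len(text) / 10.0, float(text.count("defect"))]

def train_head(examples, epochs=200, lr=0.1):
    # Train a tiny logistic-regression head on top of the frozen features.
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for text, label in examples:
            x = pretrained_features(text)
            z = w[0] * x[0] + w[1] * x[1] + b
            p = 1.0 / (1.0 + math.exp(-z))
            g = p - label                      # gradient of the log-loss
            w = [w[i] - lr * g * x[i] for i in range(2)]
            b -= lr * g
    return w, b

def predict(w, b, text):
    x = pretrained_features(text)
    return 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0

# Hypothetical manufacturing QA reports: 1 = report describes a defect.
data = [("minor surface defect found on unit", 1),
        ("assembly passed all checks", 0),
        ("defect in weld seam, line halted", 1),
        ("batch shipped without issues", 0)]
w, b = train_head(data)
print(predict(w, b, "defect detected during inspection"))  # prints 1
```

Only the head’s weights are updated; in a real workflow the frozen part would be the foundation model’s billions of parameters, which is what makes this far cheaper than training from scratch.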
A specific kind of foundation model known as a large language model (LLM) is trained on vast amounts of text data for NLP tasks. BERT (Bidirectional Encoder Representations from Transformers) is one of the earliest LLM foundation models. Google released it as an open-source model in 2018. It was pretrained on a large corpus of English-language data with self-supervision and can be used for a variety of tasks, such as text classification, named-entity recognition and question answering.
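The self-supervised pretraining mentioned above can be illustrated with a toy version of the masked-language-model objective used to pretrain BERT-style models: raw, unlabeled text is turned into training pairs by hiding one token at a time and asking the model to predict it. This is a conceptual sketch, not how a real tokenizer or masking scheme works.

```python
# Sketch of a masked-language-model objective: each training pair hides
# one token of a raw sentence and uses the hidden token as the target.

def make_mlm_pairs(sentence, mask_token="[MASK]"):
    tokens = sentence.split()
    pairs = []
    for i, target in enumerate(tokens):
        masked = tokens[:i] + [mask_token] + tokens[i + 1:]
        pairs.append((" ".join(masked), target))
    return pairs

pairs = make_mlm_pairs("foundation models learn from unlabeled text")
print(pairs[0])
# The targets come from the text itself, which is why no human
# labeling is needed: this is what "self-supervision" refers to.
```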
A foundation model used for generative AI differs from a traditional machine learning model because it can be trained on large quantities of unlabeled data to support applications that generate content or perform tasks.
Meanwhile, a traditional machine learning model is typically trained to perform a single task using labeled data, such as using labeled images of cars to train the model to then recognize cars in unlabeled images.
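The contrast between the two training regimes can be shown side by side: a traditional supervised model needs a human-provided label for every example, while a generative foundation model can derive its training signal, such as next-word prediction, from raw text alone. The data below is invented purely to show the two shapes.

```python
# Supervised training data: every example carries an explicit label,
# like the labeled car images described above.
labeled = [("photo_001.jpg", "car"), ("photo_002.jpg", "not_car")]

# Self-supervised training data: next-word prediction pairs built
# from raw, unlabeled text with no human annotation.
def next_word_pairs(text):
    tokens = text.split()
    return [(" ".join(tokens[:i]), tokens[i]) for i in range(1, len(tokens))]

pairs = next_word_pairs("foundation models learn from raw text")
print(pairs[0])  # prints ('foundation', 'models')
```

Because the second kind of data requires no labeling effort, it can be collected at the enormous scale foundation models need.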
IBM’s watsonx.ai studio offers a suite of language and code foundation models, each with a geology-themed code name, that can be customized for a range of enterprise tasks. All watsonx.ai models are trained on IBM’s curated, enterprise-focused data lake.
Slate refers to a family of encoder-only models, which, while not generative, are fast and effective for many enterprise NLP tasks.
Granite models are based on a decoder-only, GPT-like architecture for generative tasks.
Sandstone models use an encoder-decoder architecture and are well suited for fine-tuning on specific tasks.
Obsidian models utilize a new modular architecture developed by IBM Research, providing high inference efficiency along with strong performance across a variety of tasks.
Without secure access to trustworthy and domain-specific knowledge, foundation models would be far less reliable and beneficial for enterprise AI applications. Fortunately, data stores serve as secure data repositories and enable foundation models to scale in terms of both their size and their training data.
Data stores suitable for business-focused generative AI are built on an open lakehouse architecture, which combines the qualities of a data lake and a data warehouse. This architecture delivers savings from low-cost object storage and enables the sharing of large volumes of data through open table formats such as Apache Iceberg, built for high-performance analytics and large-scale data processing.
Foundation models can query very large volumes of domain-specific data in a scalable, cost-effective repository. And because these data stores, combined with cloud infrastructure, allow virtually unlimited scalability, a foundation model’s knowledge gaps can be narrowed or even eliminated over time with the addition of more data. The more gaps that are closed, the more reliable a foundation model becomes and the greater its scope.
Data stores provide data scientists with a repository they can use to gather and cleanse the data used to train and fine-tune foundation models. And data stores that take advantage of third-party providers’ cloud and hybrid cloud infrastructures for processing a vast amount of data are critical to generative AI cost-efficiency.
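The gather-and-cleanse step described above can be sketched with a toy cleansing pass over text records before they are used for fine-tuning: normalize whitespace, drop fragments too short to be useful and remove exact duplicates. Real pipelines do far more (deduplication across near-matches, PII filtering, quality scoring), so treat this as a minimal illustration.

```python
# Minimal sketch of a data-cleansing pass over text records pulled
# from a data store, prior to fine-tuning a foundation model.

def cleanse(records, min_words=3):
    seen, cleaned = set(), []
    for text in records:
        text = " ".join(text.split())      # normalize whitespace
        if len(text.split()) < min_words:  # drop fragments and noise
            continue
        key = text.lower()
        if key in seen:                    # drop exact duplicates
            continue
        seen.add(key)
        cleaned.append(text)
    return cleaned

raw = ["  Quarterly defect rate fell 4%.  ",
       "Quarterly defect rate fell 4%.",
       "ok",
       "New supplier passed the audit."]
print(cleanse(raw))
```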
When foundation models access information across data stores and are fine-tuned in how they use this information to perform different tasks and generate responses, the resulting generative AI tools can help organizations achieve benefits such as:
Data scientists can use pretrained models to efficiently deploy AI tools across a range of mission-critical situations.
Developers can write, test and document faster using AI tools that generate custom snippets of code.
Executives can receive AI-generated summaries of lengthy reports, while new employees receive concise versions of onboarding material and other collateral.
Organizations can use generative AI tools to automate a variety of routine tasks.
Marketing teams can use generative AI tools to help create content on a wide range of topics. They can also quickly and accurately translate marketing collateral into multiple languages.
Business leaders and other stakeholders can perform AI-assisted analyses to interpret large amounts of unstructured data, giving them a better understanding of the market, reputational sentiment and more.
To help organizations multiply the impact of AI across their businesses, IBM offers watsonx, its enterprise-ready AI and data platform. The platform comprises three powerful products: watsonx.ai, watsonx.data and watsonx.governance.