June 14, 2023 By David D. Cox
Kate Blair
5 min read

Sometimes the problem with artificial intelligence (AI) and automation is that they are too labor intensive. That sounds like a joke, but we’re quite serious. Traditional AI tools, especially deep learning-based ones, require huge amounts of effort to use. You need to collect, curate, and annotate data for each specific task you want to perform, which is often a cumbersome exercise, and it can take a significant amount of time to field an AI solution that yields business value. Then you need highly specialized, expensive, and hard-to-find skills to work the magic of training an AI model. If you want to take on a different task or solve a new problem, you often must start the whole process over again—it’s a recurring cost.

But that’s all changing thanks to pre-trained, open-source foundation models. A foundation model, often built on a kind of neural network called a “transformer,” uses a technique called self-supervised learning to pre-train on vast amounts of unlabeled data. The model learns the structure of its domain before you even start thinking about the problem you’re trying to solve. That domain is usually text, but it can also be code, IT events, time series, geospatial data, or even molecules.

Starting from this foundation model, you can solve automation problems with AI using very little data—in some cases, called few-shot learning, just a few examples. In other cases, it’s sufficient simply to describe the task you’re trying to solve.


Solving the risks of massive datasets and re-establishing trust for generative AI

Some foundation models for natural language processing (NLP), for instance, are pre-trained on massive amounts of data from the internet. Sometimes you don’t know what data a model was trained on, because the creators of those models won’t tell you. And those large-scale datasets contain some of the darker corners of the internet, which makes it difficult to ensure that a model’s outputs aren’t biased, or even toxic. This is an open, hard problem for the entire field of AI applications. At IBM, we want to infuse trust into everything we do, and we’re building our own foundation models with transparency at their core for clients to use.

As a first step, we’re carefully curating an enterprise-ready dataset using our data lake tooling to serve as a foundation for our, well, foundation models. We’re removing problematic datasets, and we’re applying AI-based hate and profanity filters to strip objectionable content. That’s an example of negative curation—removing things.

We also do positive curation—adding things we know our clients care about. We’ve curated a rich set of data from enterprise-relevant domains: finance, legal and regulatory, cybersecurity, and sustainability. Datasets like these are measured in how many “tokens”—think of them as words or word parts—they include. We’re targeting a 2-trillion-token dataset, which would make it among the largest anyone has assembled.

Next, we’re training the models, bringing together best-in-class innovations from the open community and those developed by IBM Research. Over the next few months, we’ll be making these models available for clients, alongside the open-source model catalog mentioned earlier.

Harnessing the power of foundation models at scale

Foundation models represent a paradigm shift in AI, one that requires not only a new technical stack that lets hybrid cloud environments flourish, but also fundamentally new user interactions that harness the power of these models for the enterprise. Coming soon, watsonx.ai, our enterprise-ready, next-generation AI studio for AI builders, offers two tools for generative AI capabilities powered by foundation models to help bridge this gap for clients: the Prompt Lab and the Tuning Studio.

Watsonx.ai homepage

The Prompt Lab

The Prompt Lab enables users to rapidly explore and build solutions with large language and code models by experimenting with prompts. Prompts are simple text inputs that effectively nudge the model to do your bidding with direct instructions. Prompts can also include a few examples to guide the model towards the exact behavior you’re looking for.

With language models, all you have to do is write the instructions in natural language. It usually takes some trial and error to craft a prompt that enables the model to generate the desired result, a practice that has given rise to the new field of prompt engineering. For instance, within the Prompt Lab, users can leverage different prompts for both zero-shot prompting and few-shot prompting to accomplish tasks such as:

  • Generate text for marketing campaigns: Create high-quality content for marketing campaigns given target audiences, campaign parameters, and other keywords.
  • Extract facts from SEC 10-K filings: Extract details, such as Maximum Borrowing Capacity, from dense financial filings.
  • Summarize meeting transcripts: Summarize a meeting transcript to capture key takeaways without having to read through the entire conversation.
  • Answer questions about an article or dynamic content: Build a question-answering interface grounded on specific content, or recommend optimal next steps for customer service assistance.
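The difference between zero-shot and few-shot prompting can be sketched in plain Python. The helper functions and the sentiment task below are illustrative only, not part of the watsonx.ai API; in the Prompt Lab you would type prompts like these directly into the editor.

```python
# A minimal sketch of zero-shot vs. few-shot prompt construction.
# The helper names and the example task are assumptions for illustration.

def zero_shot_prompt(instruction: str, text: str) -> str:
    """Zero-shot: the instruction alone tells the model what to do."""
    return f"{instruction}\n\nInput: {text}\nOutput:"

def few_shot_prompt(instruction: str, examples, text: str) -> str:
    """Few-shot: a handful of worked examples steer the model toward
    the exact output format and behavior you want."""
    shots = "\n\n".join(f"Input: {inp}\nOutput: {out}" for inp, out in examples)
    return f"{instruction}\n\n{shots}\n\nInput: {text}\nOutput:"

prompt = few_shot_prompt(
    "Classify the sentiment of each customer review as Positive or Negative.",
    [("The checkout process was effortless.", "Positive"),
     ("My order arrived two weeks late.", "Negative")],
    "Support resolved my issue in minutes.",
)
```

The model completes the text after the final `Output:`, and the worked examples nudge it to answer with exactly `Positive` or `Negative` rather than a free-form sentence.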
Prompt Lab

With the Prompt Lab, practically anyone can harness the power of foundation models for enterprise use cases. Engineers and developers can also use our APIs to embed these capabilities into external and internal applications. We’re actively working on an enhanced developer experience that offers useful libraries and code samples.
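For developers embedding these capabilities into an application, the shape of a generation request can be sketched as below. The endpoint URL, credential header, and payload field names are placeholders invented for illustration, not the documented watsonx.ai REST contract; consult the official API reference for the real endpoint and schema.

```python
# A hedged sketch of assembling a text-generation API request.
# All identifiers here are placeholders, not the real watsonx.ai API.
import json

def build_generation_request(prompt: str, model_id: str, max_tokens: int = 200):
    """Assemble the URL, headers, and JSON body an HTTP client
    (e.g. requests.post) would send to a generation endpoint."""
    url = "https://example.cloud.ibm.com/v1/generate"  # placeholder endpoint
    headers = {
        "Authorization": "Bearer <YOUR_API_KEY>",  # placeholder credential
        "Content-Type": "application/json",
    }
    payload = {
        "model_id": model_id,
        "input": prompt,
        "parameters": {"max_new_tokens": max_tokens},
    }
    return url, headers, json.dumps(payload)

url, headers, body = build_generation_request(
    "Summarize the following meeting transcript: ...", "example-model"
)
```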

The Tuning Studio

With the watsonx.ai Tuning Studio, users can further customize foundation model behavior using a state-of-the-art method that requires as few as 100 to 1,000 examples. By using advanced prompt tuning within watsonx.ai, you can efficiently create and deploy a foundation model that is customized to your data.

Tuning is useful for adapting existing models to domain-specific tasks (i.e., teaching them new tasks). It also allows enterprises to harness their proprietary data to differentiate their applications.

In the Tuning Studio, all you have to do is specify your task and provide labeled examples in the required format. Once tuning is complete, you can deploy the model and use it both in the Prompt Lab and via an API.
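As one hedged illustration of what labeled examples might look like, the snippet below writes input/output pairs as JSON Lines. The field names, labels, and the JSONL format itself are assumptions for illustration; the Tuning Studio documents the exact schema it expects.

```python
# Illustrative only: preparing labeled examples as JSON Lines, one
# {"input": ..., "output": ...} record per line. The schema is an
# assumption, not the documented Tuning Studio format.
import json

examples = [
    {"input": "Invoice INV-1043 is 30 days past due.", "output": "collections"},
    {"input": "Please update my billing address.", "output": "account_change"},
]

with open("tuning_data.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")
```

A few hundred such pairs, drawn from your own proprietary data, is the scale of input that prompt tuning needs.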

Tuning Studio (mockup preview)

What are we doing ahead of the release?

As we gear up for our broader watsonx.ai release in July, we’re seeing new use cases being built through our Tech Preview program. We are investing in a roadmap of state-of-the-art tooling to efficiently customize models with proprietary data. We’re also improving the Prompt Lab with interfaces that help novice users construct better prompts and guide the models to the right answers more quickly.

In addition, we recently open-sourced a preview of our Python SDK and announced a partnership with Hugging Face to integrate their open-source libraries into watsonx.ai. The foundation model capabilities within watsonx.ai fit into a greater data and AI platform, watsonx, alongside two other key pillars: watsonx.data and watsonx.governance. Together, watsonx offers organizations the ability to:

  • Train, tune and deploy AI across your business with watsonx.ai
  • Scale AI workloads, for all your data, anywhere with watsonx.data
  • Accelerate responsible, transparent and explainable AI workflows with watsonx.governance

You can learn more about what watsonx has to offer, and how watsonx.ai works alongside the platform’s other capabilities, on the IBM watsonx website.

