21 October 2024
Developers and AI builders, now you can accelerate your AI development process with watsonx.ai™ by taking advantage of a curated set of tools, including application programming interfaces (APIs), models, runtimes and more. An extension for your integrated development environment (IDE) of choice is included, along with our front-end UI for a low-code friendly experience.
Visit the new watsonx developer hub to access our library of APIs, software development kits (SDKs), templates and guides to help developers get started building AI applications.
Research from the IBM Institute for Business Value (IBV) suggests the market has reached peak expectations for generative AI (gen AI). The IBV finds that gen AI can help “cut coding time from days to minutes, personalize products down to the tiniest detail, and spot security vulnerabilities.” While this jump largely reflects the success of pilots, sandbox experimentation and other small-scale investments, these early results have business leaders rethinking what’s possible, as the IBV’s report on the ingenuity of generative AI found.
Despite these positive early returns from gen AI for business, Gartner® recently predicted that "at least 30% of generative AI (GenAI) projects will be abandoned after proof of concept by the end of 2025, due to poor data quality, inadequate risk controls, escalating costs or unclear business value."1
And yet, as many as 2 out of 3 of those gen AI projects will continue beyond proof of concept. As the technology continues to blaze forward, what will it take for businesses to achieve return on investment (ROI) and shorten the time to value with generative AI?
A key to unlocking the full potential of generative AI lies in developers’ capacity to build and deploy AI applications. The number of highly trained AI builders remains small today, so organizations often rely on the larger pool of traditional app developers to bridge the gap. These developers bring varying levels of AI knowledge and experience to building gen AI systems. For them, building and deploying AI services effectively comes down to having the right tools that reduce time to value.
The skills required to build AI applications are complex and vast. And with new technologies, tools and frameworks being released daily, it’s difficult for many developers to keep up.
With watsonx, we’re meeting developers where they are by creating an enhanced developer toolkit to scale AI development and adoption of specific AI use cases. With IBM watsonx.ai, we’re providing AI developers and model builders an intuitive and collaborative development experience. Automation capabilities with prebuilt patterns, access to third-party AI frameworks and models, and integrations with the broader IT stack help accelerate time to value for real business impact and returns. We provide developers with the tools to begin building valuable generative AI applications for your business today.
Foundation models power the engine of your developers’ AI toolkit. To achieve the outcomes your business needs, developers might need access to various models that perform better for specific tasks or use cases. IBM watsonx™ offers a selection of cost-effective, enterprise-ready foundation models, including the Granite™ model series developed by IBM Research, open source models and models sourced from third-party providers. All are suitable for use in a variety of AI applications, including agents and RAG-based solutions. Your developers can begin building with these models using our low-code or no-code visual interface or via API connections, all in watsonx.ai.
Having access to enterprise-ready foundation models is often not enough. Out of the box, many models will lack knowledge specific to your business and your preferred end-applications. Though training models is a way around this challenge, it can be expensive and time consuming. Retrieval-augmented generation (RAG) provides a cost-effective way to supplement a model’s knowledge without retraining it. Using RAG and other customization tools (such as agents, which we’ll cover later) and APIs helps developers build and deploy AI-powered applications that ground model responses in internal company data when needed, fulfilling user requests more efficiently and effectively.
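To make the RAG pattern concrete, here is a minimal, self-contained sketch: chunk documents, retrieve the most relevant chunk and ground the prompt in it. This is illustrative only; a production pipeline would use an embedding model and a vector database rather than simple term overlap, and the document text and query are invented for the example.

```python
def chunk(text: str, size: int = 8) -> list[str]:
    """Split a document into fixed-size word chunks."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def retrieve(query: str, chunks: list[str], k: int = 1) -> list[str]:
    """Rank chunks by term overlap with the query and return the top k."""
    terms = set(query.lower().split())
    ranked = sorted(chunks, key=lambda c: len(terms & set(c.lower().split())), reverse=True)
    return ranked[:k]

def grounded_prompt(query: str, context: list[str]) -> str:
    """Build a prompt that grounds the model's answer in the retrieved context."""
    return "Answer using only this context:\n" + "\n".join(context) + f"\n\nQuestion: {query}"

# Invented sample data for illustration.
docs = ["watsonx.ai hosts the Granite model series for enterprise use. "
        "Retrieval-augmented generation grounds model responses in company data without retraining."]
chunks = [c for d in docs for c in chunk(d)]
query = "which models does watsonx.ai host"
prompt = grounded_prompt(query, retrieve(query, chunks))
```

The grounded prompt, rather than the bare question, is what gets sent to the LLM, which is what ties the model's answer back to company data.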
Building a performant RAG application is currently a notoriously challenging technical process. It requires developers to implement a performant data ingestion and retrieval pipeline that provides large language models (LLMs) with accurate context to inform and ground responses. In practice, this involves a complex series of steps: efficiently extracting text and images from large collections of documents, storing that data in performant and scalable vector databases (such as Milvus in watsonx.data), and retrieving and reranking it at run time.
To help reduce the time and resources needed to build this functionality, watsonx.ai offers a comprehensive suite of APIs that enable each stage of the RAG pipeline, allowing RAG applications to be built quickly and effectively.
Each stage of a RAG pipeline must also be optimized for a specific use case and set of documents to enable the most performant, contextual and accurate responses. This optimization requires tuning parameters such as chunk size, embedding and generation models, and retrieval strategy, often through a manual and time-consuming process.
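As a toy illustration of what this kind of optimization involves, the sketch below sweeps a single parameter, chunk size, and scores each setting against a small evaluation set. The hit-rate metric (does the retrieved chunk contain the expected answer?) is a deliberately simple stand-in for real RAG evaluation, and the document and questions are invented.

```python
def chunk(text, size):
    """Split a document into fixed-size word chunks."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def retrieve(query, chunks):
    """Return the chunk with the highest term overlap with the query."""
    terms = set(query.lower().split())
    return max(chunks, key=lambda c: len(terms & set(c.lower().split())))

# Invented document and evaluation set for illustration.
doc = ("Granite models support function calling. The watsonx platform runs in "
       "IBM Cloud. RAG grounds model answers in enterprise documents.")
eval_set = [
    ("which models support function calling", "Granite"),
    ("what grounds model answers", "RAG"),
]

def hit_rate(size):
    """Fraction of questions whose retrieved chunk contains the expected answer."""
    chunks = chunk(doc, size)
    return sum(answer in retrieve(q, chunks) for q, answer in eval_set) / len(eval_set)

best_size = max([2, 4, 8], key=hit_rate)  # chunk size with the highest hit rate
```

Scaling this search across chunk size, embedding model, retrieval strategy and more is exactly the combinatorial work that makes manual tuning slow.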
Instead of relying on trial and error, the watsonx.ai AutoAI for RAG service, available in technology preview, can help evaluate the performance of multiple configuration options. It can identify the optimal set of parameters for a specific set of documents, allowing more performant RAG applications to be developed in less time.
Developers can also improve the performance of RAG applications by fine-tuning an LLM on a specific area of knowledge or set of skills. This process often requires collecting large sets of human-generated data, which can be time-consuming and costly to obtain, and the training itself is resource-intensive.
To address the data challenge, IBM Research and Red Hat developed the Large-scale Alignment for chatBots (LAB) method, an innovative approach that enables efficient fine-tuning of pretrained base models to specific business needs. InstructLab is an example of an alignment tuning method that facilitates contributions to LLMs, enabling the encoding of new skills and knowledge into models using far fewer computing resources than are typically required. IBM intends to release an enhanced InstructLab experience* in watsonx.ai in the future. The experience will be designed to help enable AI developers, gen AI app developers, data engineers and data scientists to customize small and large language models with enterprise data.
Agents are emerging as powerful tools for improving productivity in enterprise environments. These intelligent systems can automate complex tasks, inform decision-making processes and streamline workflows. To support the development process, watsonx.ai offers a selection of models and AI middleware services that allow developers to build, deploy and monitor agents as a set of production-ready APIs.
The performance of large language models in agentic applications is determined by the model's function calling abilities, advanced reasoning skills and the size of its context window. Function calling refers to the model's ability to identify which tools are required to complete a task and to generate the appropriate inputs to those tools. Low performance in any of these dimensions limits the complexity of the workflows an agent can automate. That’s why IBM is introducing the latest Granite 2B and 8B models from IBM Research, which deliver strong performance on function calling benchmarks and the advanced reasoning needed to support agentic development.
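To make function calling concrete, the sketch below mocks a model's tool-call output as a JSON string and dispatches it to a stub tool. The tool name, arguments and return value are invented for illustration; in practice a function-calling model emits the tool name and arguments, and the application executes the matching tool.

```python
import json

TOOLS = {
    # Stub tool with hard-coded data; a real tool would call an external API.
    "get_stock_price": lambda symbol: {"symbol": symbol, "price": 187.42},
}

def dispatch(model_output: str) -> dict:
    """Parse the model's tool call and execute the matching tool."""
    call = json.loads(model_output)
    return TOOLS[call["name"]](**call["arguments"])

# A capable model identifies the right tool and generates valid inputs:
mock_model_output = '{"name": "get_stock_price", "arguments": {"symbol": "IBM"}}'
result = dispatch(mock_model_output)
```

Weak function calling shows up at exactly these two points: the model names the wrong tool, or it generates arguments the tool cannot accept.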
Many open source frameworks for building single-agent or multiagent applications are emerging, with established communities and extensive libraries of resources and learning. Each framework offers a unique approach and set of capabilities that enterprises are experimenting with to implement and automate workflows for specific use cases.
IBM watsonx.ai is focused on providing seamless integration with these frameworks, such as CrewAI and LangChain, through full support for industry-standard APIs. This allows developers to easily power these frameworks with the many models hosted in watsonx.ai. IBM has also open-sourced its own experimental agentic framework, Bee. This agent framework focuses on supporting various models and offering the tools that application developers need to accelerate the next wave of AI adoption in the enterprise.
To develop agentic applications, developers must also build interfaces, commonly referred to as tools, which extend an LLM's base capability by allowing it to interact with external systems. These tools can be utility-based, such as giving an LLM the ability to execute code, or data-based, such as retrieving and updating information in a third-party enterprise system via an API. Building and orchestrating these tools effectively is still a significant technical challenge. Automating complex workflows requires agentic systems to interact with tens or even hundreds of independent tools. Because of this, developers must help ensure that the inputs to and responses from the tools can be understood by an LLM, that appropriate business rules are followed and that required security constraints are implemented.
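As a small sketch of why tool inputs need guardrails, the snippet below validates a proposed tool call against a hand-written schema before execution. The tool name, fields and rules here are hypothetical, not the Flows Engine format; they stand in for the business rules and security constraints described above.

```python
# Hypothetical schema for an "update_ticket" tool; every field is invented.
TOOL_SCHEMA = {
    "name": "update_ticket",
    "required": ["ticket_id", "status"],
    "allowed_status": {"open", "closed"},
}

def validate(args: dict) -> list[str]:
    """Return validation errors for a proposed tool call (empty list if valid)."""
    errors = [f"missing: {field}" for field in TOOL_SCHEMA["required"] if field not in args]
    if args.get("status") not in TOOL_SCHEMA["allowed_status"]:
        errors.append(f"invalid status: {args.get('status')}")
    return errors
```

Rejecting a call before it reaches the external system is what keeps a model-generated argument from, for example, setting a ticket to a state the business process does not allow.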
To help address these challenges, watsonx.ai is extending the Flows Engine framework. Currently in technology preview, the framework supports the definition and orchestration of tools as a set of API schemas that can be integrated with any agentic framework.
Once an enterprise has built and optimized an agentic service, the service must be deployed and monitored in a scalable, performant and observable way. To implement this, developers must instrument the service with appropriate observability tools, host the service as a set of production-ready APIs and integrate it into the end-user application.
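The instrumentation step can be sketched with a simple decorator that records call counts and latency. In production these metrics would flow to a real observability backend; the in-memory store and the `answer` endpoint below are stand-ins.

```python
import time
from functools import wraps

# In-memory metrics store; a real deployment would export to an
# observability backend instead.
METRICS = {"calls": 0, "total_seconds": 0.0}

def observed(fn):
    """Instrument a service function with call-count and latency metrics."""
    @wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return fn(*args, **kwargs)
        finally:
            METRICS["calls"] += 1
            METRICS["total_seconds"] += time.perf_counter() - start
    return wrapper

@observed
def answer(query: str) -> str:
    # Stand-in for the agentic workflow behind a production API.
    return f"response to: {query}"
```

Because the metrics are recorded in a `finally` block, failed calls are counted too, which matters when monitoring an agent that depends on external tools.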
To help accelerate this stage of the development lifecycle, AI Services provide developers with a deployment and monitoring platform for agentic services.
The development lifecycle, from building an agentic workflow to integrating it with a selection of enterprise-ready tools to deploying it and monitoring it as a set of production-ready APIs, can require significant time and resource investment.
To help accelerate time to market for simple use cases, IBM plans to release the watsonx.ai Agent Builder*, a developer-focused low-code tool for building and deploying enterprise agents, in the future.
The rapid progression of gen AI technology can make it difficult for developers to adopt the most recent and valuable tools. There are learning curves to using technologies such as RAG or beginning to build agentic workflows. That’s why we built the watsonx Developer Hub: the central place for developers to access what they need to begin building today. With various resources, quick starts and guides for our APIs and SDKs, developers can quickly go from signing up to achieving their first use case.
Generative AI continues to develop at a pace so rapid that it is difficult to keep up. To find value in these emerging technologies, developers are essential for customizing models and building applications for your enterprise. But unless your developers have an easy-to-use, enterprise-ready toolkit, success can be elusive. At IBM, we provide your developers with enterprise-ready generative AI tools to help your team build applications and solutions for your business.
Statements regarding IBM’s future direction and intent are subject to change or withdrawal without notice, and represent goals and objectives only.
Gartner Press Release, Gartner Predicts 30% of Generative AI Projects Will Be Abandoned After Proof of Concept By End of 2025, July 29, 2024. GARTNER is a registered trademark and service mark of Gartner, Inc. and/or its affiliates in the U.S. and internationally and is used herein with permission. All rights reserved.