November 8, 2023 By Ayhan Sebin 4 min read

Generative AI-powered assistants are transforming businesses through intelligent conversational interfaces. Capable of understanding and generating human-like responses and content, these assistants are revolutionizing the way humans and machines collaborate. Large Language Models (LLMs) are at the heart of this new disruption. LLMs are trained on vast amounts of data and can be used across endless applications. They can be easily tuned for specific enterprise use cases with a few training examples.

We are witnessing a new phase of evolution as AI assistants go beyond conversation and learn to harness tools through agents that can invoke Application Programming Interfaces (APIs) to achieve specific business goals. Tasks that used to take hours can now be completed in minutes by orchestrating a large catalog of reusable agents. Moreover, these agents can be composed to automate complex workflows.

AI assistants can use API-based agents to help knowledge workers with mundane tasks such as creating job descriptions, pulling reports in HR systems, sourcing candidates and more. For instance, an HR manager can ask an AI assistant to create a job description for a new role, and the assistant can generate a detailed job description that meets the company’s requirements. Similarly, a recruiter can ask an AI assistant to source candidates for a job opening, and the assistant can provide a list of qualified candidates from various sources. With AI assistants, knowledge workers can save time and focus on more complex and creative problems.

Automation builders can also harness the power of AI assistants to create automations quickly and easily. While it may sound like a riddle, AI assistants employ generative AI to automate the very process of automation. This makes building agents easier and faster. There are two essential steps in building agents for business automation: training and enriching agents for target use cases, and orchestrating a catalog of multiple agents.

Training and enriching API-based agents for target use cases

APIs are the backbone of AI agents. Building API-based agents is a complex task: it involves interacting with a user conversationally, identifying the APIs needed to achieve the user's goal, asking questions to gather the required API arguments, detecting which of the information the user provides is needed to invoke the API, enriching the APIs with sample utterances and generating responses based on API return values. This process can take hours even for an experienced developer. However, LLMs can automate these steps, enabling builders to train and enrich APIs more quickly for specific tasks.
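To make the argument-gathering step concrete, here is a minimal sketch of how an assistant might keep asking the user for an API's required arguments until all of them are collected. The parameter names and the ask_user callback are illustrative placeholders, not part of any specific product API.

```python
# Hypothetical sketch: gather the required arguments for an API operation
# through conversation before invoking it. Parameter names and the ask_user
# callback are illustrative, not taken from any real product API.

def missing_arguments(required_params: list[str], collected: dict) -> list[str]:
    """Return the required parameters the user has not provided yet."""
    return [p for p in required_params if p not in collected]

def gather_arguments(required_params: list[str], ask_user) -> dict:
    """Slot-filling loop: keep asking until every required argument is known."""
    collected: dict = {}
    while missing := missing_arguments(required_params, collected):
        slot = missing[0]
        collected[slot] = ask_user(f"What value should I use for '{slot}'?")
    return collected

if __name__ == "__main__":
    # Simulate a user answering the assistant's questions.
    answers = iter(["EMEA", "active"])
    args = gather_arguments(["region", "status"], lambda question: next(answers))
    print(args)  # {'region': 'EMEA', 'status': 'active'}
```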

Assume Bob, an automation builder, wants to create API-based agents that help company sellers retrieve a list of target customers. The first step is to import the “Retrieve My Customers” API into the AI assistant. However, to make this automation available as an agent, Bob needs to take several manual and tedious steps, which include training the natural language classifier with sample utterances. With the help of LLMs, AI assistants can automatically generate sample training utterances from OpenAPI specifications, significantly reducing the required manual effort. Once the foundation model is fine-tuned for semantic understanding, it can better interpret business users’ prompts and intents. Bob can still review and refine the generated utterances using a human-in-the-loop approach.
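As an illustration of utterance generation, the following sketch builds an LLM prompt from an OpenAPI operation's summary and description and parses the model's reply into candidate utterances. The prompt wording and the generate callable are assumptions for the example; any chat-completion client could stand in for the stubbed LLM.

```python
# Hypothetical sketch: ask an LLM for sample training utterances based on an
# OpenAPI operation. The `generate` callable is a placeholder for whatever LLM
# client is available; it is not a real product API.

def utterance_prompt(operation: dict, n: int = 5) -> str:
    """Build a prompt that requests n sample utterances for one operation."""
    return (
        "You are helping train an intent classifier.\n"
        f"API operation: {operation['summary']}\n"
        f"Description: {operation.get('description', '')}\n"
        f"Write {n} short, varied ways a business user might ask for this."
    )

def generate_utterances(operation: dict, generate, n: int = 5) -> list[str]:
    """Call the LLM and turn its reply into a clean list of utterances."""
    raw = generate(utterance_prompt(operation, n))
    return [line.strip("- ").strip() for line in raw.splitlines() if line.strip()]

if __name__ == "__main__":
    op = {"summary": "Retrieve My Customers",
          "description": "Returns the list of customers assigned to a seller."}
    # Stand-in for an LLM call so the sketch runs end to end.
    fake_llm = lambda prompt: "- Show me my customers\n- Who are my accounts?"
    print(generate_utterances(op, fake_llm))
```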

Soon, the process of building agents will be fully automated by identifying APIs, filling slots and enriching APIs. This will reduce the time it takes to create automations, lower technical barriers and grow catalogs of reusable agents.

Orchestrating multiple agents to automate complex workflows

Building automation flows that use multiple APIs can be technically complex and time-consuming. To connect multiple APIs, it’s important to identify, sequence and invoke the right set of APIs to achieve a specific business goal. AI assistants use LLMs and planning techniques to simplify this process and reduce technical barriers. LLMs can work as a powerful recommendation system, suggesting the most suitable APIs based on usage, similarities and descriptions.
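A rough sketch of that recommendation step is shown below. In practice an assistant would compare LLM embeddings of the builder's goal and the API descriptions; a simple bag-of-words cosine similarity stands in here so the example runs with only the standard library, and the catalog entries are made up.

```python
# Hypothetical sketch: rank catalog APIs against a builder's goal by comparing
# descriptions. A real system would use LLM embeddings; bag-of-words cosine
# similarity is used here only so the example is self-contained.

import math
from collections import Counter

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def recommend(goal: str, catalog: dict[str, str], top_k: int = 3) -> list[str]:
    """Return the top_k API names whose descriptions best match the goal."""
    goal_vec = Counter(goal.lower().split())
    scored = [(cosine(goal_vec, Counter(desc.lower().split())), name)
              for name, desc in catalog.items()]
    return [name for _, name in sorted(scored, reverse=True)[:top_k]]

if __name__ == "__main__":
    catalog = {
        "Retrieve My Customers": "returns the customers assigned to a seller",
        "Generate Product Recommendations": "suggests products for a list of customers",
        "Create Job Description": "drafts a job description for a new role",
    }
    print(recommend("recommend products for my customers", catalog, top_k=2))
```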

Builders must align the inputs and outputs of multiple APIs to compose multi-agent automations, which is a tedious and error-prone process. LLM-driven API mapping automates this alignment process based on API attributes and documentation. This makes it easier for automation builders to reuse existing APIs from large catalogs without manual intervention.

Now, suppose our automation builder, Bob, wants to create a more complex multi-API automation that allows sellers to retrieve a list of customers and subsequently generate a list of personalized product recommendations. After importing and enriching the “Retrieve My Customers” API agent, the LLM-infused sequencing feature can automatically recommend the “Generate Product Recommendations” API. This means Bob does not have to sift through each API individually to discover the most suitable one from the extensive catalog of agents.

In addition, each API contains fields of varying data types. The source API provides output fields that represent information about a set of customers. The target API presents input fields that also represent customer information. Typically, Bob would have to spend time manually mapping each field in the target APIs to a corresponding field in the source API. This tedious effort would be exacerbated as the number of source APIs and target fields increases. The API mapping service can generate a set of alignment suggestions that Bob can quickly review, edit and save.
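The sketch below illustrates the idea behind such mapping suggestions: it proposes a source field for each target field by name similarity and leaves the final decision to the builder. The field names are invented for the example, and a production mapper would also draw on data types, descriptions and an LLM rather than string matching alone.

```python
# Hypothetical sketch: suggest mappings from a source API's output fields to a
# target API's input fields by name similarity, then let the builder review.
# Field names are illustrative, not taken from any real API specification.

from difflib import SequenceMatcher

def suggest_mappings(source_fields: list[str], target_fields: list[str]) -> dict[str, str]:
    """For each target field, propose the most similar source field."""
    suggestions = {}
    for target in target_fields:
        best = max(source_fields,
                   key=lambda s: SequenceMatcher(None, s.lower(), target.lower()).ratio())
        suggestions[target] = best
    return suggestions

if __name__ == "__main__":
    source = ["customer_id", "customer_name", "customer_email"]
    target = ["customerId", "contactEmail"]
    for tgt, src in suggest_mappings(source, target).items():
        print(f"{src} -> {tgt}  (review before saving)")
```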

IBM® watsonx Orchestrate™ uses a combination of AI models (including LLMs) to simplify the process of building AI agents through API enrichment, sequencing and mapping recommendations. In the new phase of evolution, AI assistants will be able to sequence multiple APIs at runtime to achieve business goals defined by non-technical knowledge workers, which further democratizes automation. By leveraging AI assistants, enterprises can accelerate their automation initiatives and redeploy significant resources toward more value-generating areas.

Learn how to automate and reclaim valuable time with generative AI-powered assistants