Integrate Watson Virtual Agent, Conversation service, Retrieve and Rank, and other IBM Watson Developer Cloud services APIs
This content is part 1 of 6 in the series: Build advanced cognitive applications
Part 1 of this tutorial series guides you through the process of combining several IBM Watson Developer Cloud service APIs to create your own cognitive computing app. In this tutorial, I discuss:
- Giving your app a cognitive user interface with Watson Virtual Agent
- Integrating multiple cognitive services within your app
- Creating a user-experience flow with Watson Conversation service
- Enhancing the relevancy of your app's behavior with machine learning
- Optimizing ingestion of large collections of unstructured content
- Integrating advanced content analytics into your app
- Tailoring Watson to suit your special requirements with Watson Knowledge Studio
This is an advanced tutorial, and it assumes that you have basic knowledge of IBM Bluemix®. If you need introductory Bluemix information, take a look at this quick-start information before you proceed.
What you'll need to build your application
- A Bluemix account and a Watson Virtual Agent. (You can request free trials of both Bluemix and Watson Virtual Agent.)
- A small collection of text documents from your own subject area to upload. Watson uses this content to learn about the domain of your application scenario.
A design pattern for integrating Watson cognitive computing services
This tutorial series focuses on a design pattern that integrates several Watson Cloud services APIs in a way that addresses the most frequently encountered needs that I have seen in many organizations. The basic scenario is that users need help dealing with a large amount of unstructured data. We need to make well-focused information accessible in an intuitive way so that users can immediately use it for their specific needs.
To fully grasp where we are going, consider where we came from. Simple search indexing was built on keywords, which did not reliably indicate actual informational focus, so commonly used search indexes produced long lists of possible locations to look for the desired information. Document retrieval systems compounded the problem by presenting the full content in coarse granularity results derived from a keyword search index, leaving you to dig through large volumes of irrelevant information to find what you needed.
The prior generation of conventional solutions fell short of delivering fast, relevant information because they fundamentally failed to determine the intent of the user and to operate upon information in alignment with the users' ways of understanding it. Keep these two key thoughts in mind, and you will quickly discover which features of cognitive technology will bring the greatest advantages to your particular situation. Addressing those two fundamental failings of prior-generation technology is exactly what drove the initiative to develop cognitive computing.
Cognitive computing operates at the level of conceptual meanings (not keywords) and with the implicit context of users' perspectives in understanding information. This is how a cognitive system effectively provides information in a way that is attuned to the users' ways of thinking for the work they need to accomplish. Operating at the level of abstracted conceptual meaning, the limitations of term-indexing fall away, as the system is no longer dealing with words, but rather working with the meaningful concepts represented. The interweaving of many layers of natural language processing (NLP) and machine learning becomes the most significant distinguishing characteristic that defines cognitive technology.
Creating a user interface with Watson Virtual Agent
The IBM Watson Virtual Agent is the highest-level service API in the design pattern this tutorial discusses, and it brings the intuitive interactions of a virtual agent (also known as a chatbot) interface, layered over the Watson Conversation service. In turn, the Conversation service is built on the Dialog and Natural Language Classifier services, plus some handy tooling to support the configuration and training of cognitive behavior. You can access the APIs of the subordinate services directly, but that is seldom necessary except in unusual situations that require special functionality. In most application scenarios, you can use the Watson Virtual Agent service for the user interface of your application. The Watson Virtual Agent service provides pretrained, ready-to-use agents for a number of the most commonly encountered types of business applications. For special adaptations with customized dialogs, you can work with the tooling provided in the Conversation service. The inherent flow of a dialog-based interface lends itself well to integrating a variety of cognitive services and analytics applications.
Practicing safe application design
There is a fairly specific application design pattern that you should adhere to for handling the flow of dialog between the agent, the UI, the customized dialogs, and the integration with back-end enterprise systems of record. There are strong provisions within the Watson Cloud services architecture for handling sensitive data securely, but you must utilize those features to gain full benefit of that built-in security. You can avoid introducing vulnerabilities in your code by using these interfaces according to the documentation; improper shortcut programming hacks circumvent the provided mechanisms of secure data handling. For example, adding an ampersand in front of a variable denotes to the Watson Virtual Agent service that it requires special handling. There are other important details to observe, so if you are building a custom UI, be sure to read that portion of the documentation closely. The documentation provides a clear detailed example that shows how to properly handle a dialog involving credit card transactions being passed safely through the Watson Virtual Agent dialog to a back-end system of record. Work through that example and make sure that you understand how it accomplishes the secure transaction.
User interface and client SDK
Watson Conversation service
The Conversation service handles interactions between virtual agents and users. One of the main advantages of a cognitive interface is the adaptive flexibility with which it understands and responds to the user. Watson learns with statistical models of verbal interaction in context, which effectively operates at an abstracted conceptual level above specific terminology. This allows the creation of systems that behave in ways that are more closely aligned with users' meanings and intentions so that they can interact with Watson in an intuitive fashion.
The technology is not based on a grammatical parsing of language. The Conversation service learns linguistic behaviors by example. In most cases, you can initially train it to operate in a way that is generally applicable to minimize the up-front effort. Then, while in use, you can customize with further training to understand the typical jargon and other verbal idiosyncrasies of the user-group. With this approach, you can deliver some immediate results, and then capture user interactions that you can use to train the system more fully. It is important to get dialog examples from real users.
To condition the service for a particular application, categories of conversational patterns are designed using built-in tools to train machine learning models of intentions, entities, and dialog flow. Watson learns each of these by analyzing examples of a few different ways that users might typically express common notions. It is not necessary to predetermine all the various ways something might be said. But it is important to use real examples captured from the end user subject matter experts. Watson can learn the desired conversational behaviors from just a few different examples.
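As a rough illustration of training by example, the sketch below groups captured user utterances into intent definitions. The output layout (a list of entries, each holding an `intent` name and a list of `examples` with a `text` field) is modeled on the workspace JSON that the Conversation tooling could import and export; treat that layout as an assumption to verify against the service documentation. The intent names and utterances are hypothetical.

```python
from collections import defaultdict


def build_intents(labeled_utterances):
    """Group (intent_name, utterance) pairs into intent definitions.

    The returned layout is an assumption modeled on Conversation workspace JSON.
    """
    grouped = defaultdict(list)
    for intent_name, utterance in labeled_utterances:
        text = utterance.strip()
        if not text:
            continue  # blank lines carry no training signal
        if text not in grouped[intent_name]:
            grouped[intent_name].append(text)  # keep each example once
    return [
        {"intent": name, "examples": [{"text": t} for t in texts]}
        for name, texts in sorted(grouped.items())
    ]


# Hypothetical utterances captured from real users, with one exact duplicate.
captured = [
    ("reset_password", "I forgot my password"),
    ("reset_password", "cant log in, need a new password"),
    ("store_hours", "when do you open?"),
    ("reset_password", "I forgot my password"),
]
intents = build_intents(captured)
```

Keeping the raw wording of real users, including misspellings such as "cant", is deliberate here: the point of the examples is to reflect how people actually phrase things.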
You use the service's REST API to pass user input to Watson and receive responses. The programmatic interface provides categorically driven functionality, which enables your application to respond with different actions depending on the intended purpose expressed by the user. In the simplest case, your app might just return Watson's response as an 'answer' if that is all that is required, but other behaviors are also possible. Because user input is categorized according to the examples the service was trained on, you can use the detected category to control application flow in a way that is intuitive to the user. This categorical aspect of the service's behavior enables an application to perform any of several different actions depending on what type of user input is detected.
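A minimal sketch of such a REST call, using only the Python standard library, might look like the following. The gateway URL, the `version` date, the basic-authentication scheme, and the request body layout are assumptions based on how the Conversation service was documented around the time of this tutorial; the workspace ID and credentials are placeholders. The function only builds the request, so nothing is sent until you choose to send it.

```python
import base64
import json
import urllib.request

# Assumed values: check the service documentation for the current endpoint.
BASE_URL = "https://gateway.watsonplatform.net/conversation/api"
VERSION = "2017-02-03"


def build_message_request(workspace_id, username, password, text, context=None):
    """Build (but do not send) a POST request carrying one user utterance."""
    url = f"{BASE_URL}/v1/workspaces/{workspace_id}/message?version={VERSION}"
    body = {"input": {"text": text}}
    if context is not None:
        body["context"] = context  # state returned by the previous turn
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


request = build_message_request("WORKSPACE_ID", "USER", "PASS", "What are your hours?")
# Sending is left to the caller, for example:
# with urllib.request.urlopen(request) as response:
#     reply = json.load(response)
```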
Consider the following example of a single app that behaves differently depending on the category of user input:
- In some cases, the app returns a direct answer when the user asks a simple question.
- When additional contextual input is needed, the service conducts a short interchange with the user to elicit further input for clarification.
- In other cases, the result returned by the Conversation service is passed to another Watson service such as Retrieve and Rank, which utilizes deep learning algorithms applied over a large corpus of unstructured information to return concisely focused passages relevant to the user's immediate working context.
- Depending on how the Conversation service categorizes the user's input, this same app issues API calls to Watson Explorer to perform some type of content analytics on a particular collection of unstructured data and presents a graphical summary that the user has requested.
- Alternatively, depending on the category of user input, the same app launches another workflow, passing the user context forward so as to provide a seamless experience. For instance, the Watson Tradeoff Analytics service could be used in another screen to help the user quickly evaluate options for a complex decision, and then the main application flow continues in the Conversation service still maintaining the user's working context.
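The behaviors listed above can be sketched as a small dispatcher keyed on the detected intent. This is an illustration under stated assumptions, not the tutorial's application code: the response layout (`intents` with `intent` and `confidence`, `output.text`, `input.text`) follows the Conversation message response as I understand it, and the intent names and handler functions are hypothetical stand-ins for calls to other services.

```python
def top_intent(response, default="unknown"):
    """Return the name of the highest-confidence intent in a response dict."""
    intents = response.get("intents") or []
    if not intents:
        return default
    best = max(intents, key=lambda item: item.get("confidence", 0.0))
    return best["intent"]


# Hypothetical handlers; each stands in for a call to another service.
def answer_directly(response):
    return " ".join(response.get("output", {}).get("text", []))


def search_corpus(response):
    return "search corpus for: " + response["input"]["text"]


def show_analytics(response):
    return "render analytics summary"


HANDLERS = {
    "simple_question": answer_directly,
    "deep_question": search_corpus,
    "show_summary": show_analytics,
}


def dispatch(response):
    """Pick a handler by intent; fall back to the direct answer."""
    handler = HANDLERS.get(top_intent(response), answer_directly)
    return handler(response)


sample = {
    "input": {"text": "How does the warranty cover water damage?"},
    "intents": [
        {"intent": "deep_question", "confidence": 0.87},
        {"intent": "simple_question", "confidence": 0.09},
    ],
    "output": {"text": ["Let me look that up."]},
}
result = dispatch(sample)
```

A table of handlers keeps the routing in one place, so adding a new category of behavior means adding one entry rather than another branch.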
Even if you only have basic skills, you can easily integrate multiple cognitive services in an application with an agent interface that responds in a variety of ways according to the context of use. The Watson Conversation service handles the front-end cognitive tasks with its layered combination of natural language processing and deep learning algorithms and provides a simple API that lets you take full advantage of them without the need for extensive technical expertise. The Watson Cloud services architecture provides this cognitive technology as an accessible commodity for enterprise application developers and system integrators.
You can use the Watson Virtual Agent and underlying Watson Conversation service APIs to provide extremely responsive user control to your application flow, bringing the users immediately to desired functionality directly as they ask for it. When the application requires context-sensitive input from the user, the Conversation service can trigger your app to display a pop-up to request any necessary details, and all of that contextual information can be handled and maintained within the Conversation service API. The big leap forward in application design is that now any application or system integration can easily be made to behave according to the user's ordinary verbal input and respond in any of several different ways as appropriate to the task at hand.
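To make the idea of maintained context concrete, here is a small sketch in which the client echoes back whatever `context` object the service returned on the previous turn. The `Session` class and the `fake_service` function are hypothetical; the assumption is only that each response carries a `context` dictionary that the client is expected to return with the next request.

```python
class Session:
    """Carry conversation context from one turn to the next."""

    def __init__(self, send):
        self._send = send  # callable taking a request body, returning a response dict
        self.context = {}

    def say(self, text):
        body = {"input": {"text": text}, "context": self.context}
        response = self._send(body)
        # Keep whatever state the service handed back for the next turn.
        self.context = response.get("context", self.context)
        return response.get("output", {}).get("text", [])


def fake_service(body):
    """Stand-in for the real service: counts turns inside the context."""
    context = dict(body["context"])
    context["turns"] = context.get("turns", 0) + 1
    return {"output": {"text": [f"turn {context['turns']}"]}, "context": context}


session = Session(fake_service)
first = session.say("I want to change my address")
second = session.say("It is 12 Main Street")
```

Because the transport is injected as a callable, the same class works with a stub during development and with a real HTTP call later.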
Use Watson Cognitive services for large collections of information content
There are a number of services that deal with helping users with extremely large bodies of diverse free text information, such as tutorials and publications, unstructured verbal reports, historical narrative information, transcripts of communications, large volumes of public commentary, complex governmental and regulatory legal documents, and more. Any large collection of unstructured text information written in natural human language can be ingested into what is called a corpus. You can incorporate several cognitive services into applications to enable functionality that operates at the level of conceptual understanding upon such a corpus, and you can attune an application's responsiveness to the particular domain lingo and jargon of the users. Different services can come into play, depending on what you need to accomplish.
Document Conversion service
The Document Conversion service is essentially a preprocessing stage to condition content prior to ingestion into a corpus of information. It is necessary because of the diversity of document formats. Some proprietary file types, such as PDF documents, contain quite a lot of hidden markup that must be interpreted and cleaned up, so that the input to NLP processes is actually only natural human language. If you configure it properly, the Document Conversion service can greatly enhance the qualitative characteristics of results that are ultimately returned by the Retrieve and Rank service.
Conversely, if you do not configure the Document Conversion service to properly interpret the formatting of the documents, the results that emerge later can be segments of information that are either too large or too small. The results that Watson ultimately returns come from this service's segmentation of content into potential answer units. These chunks of meaningful text can be delineated from the original content headings and/or paragraphs, as well as several other configurable parameters used to help distinguish a useful balance between terse factoids versus broad passages.
It is crucial to have an optimal configuration driving the Document Conversion service, before ingesting input into the corpus of content that Watson draws its information from. The training of results ranking algorithms is built on top of the output of the Document Conversion service, and so you should carefully avoid readjustments after ingestion. In practice, the better this service carves up concisely meaningful chunks of text, the greater the impact on the subsequent utility of the application and the quality of the user experiences.
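The effect of segmentation choices can be illustrated locally. The sketch below is not the Document Conversion service's algorithm; it is a simplified stand-in that splits a flat list of (tag, text) elements into units at whichever heading tags you select, which is enough to show how one setting changes the granularity of the resulting answer units. The tag names and sample content are assumptions.

```python
HEADING_TAGS = {"h1", "h2", "h3"}  # assumed default: tags that start a new unit


def split_answer_units(elements, heading_tags=HEADING_TAGS):
    """Split a flat list of (tag, text) elements into answer units.

    A new unit starts at each selected heading; everything that follows,
    up to the next selected heading, becomes that unit's body.
    """
    units = []
    current = None
    for tag, text in elements:
        if tag in heading_tags:
            current = {"title": text, "body": []}
            units.append(current)
        elif current is not None:
            current["body"].append(text)
        else:
            # Text that appears before any heading gets an untitled unit.
            current = {"title": "", "body": [text]}
            units.append(current)
    return units


document = [
    ("h1", "Returns"),
    ("p", "Items can be returned within 30 days."),
    ("h2", "Refunds"),
    ("p", "Refunds are issued to the original payment method."),
    ("p", "Allow five business days for processing."),
]
fine_grained = split_answer_units(document)
coarse = split_answer_units(document, heading_tags={"h1"})
```

Splitting on both heading levels yields two focused units, while splitting on `h1` alone yields one broad unit; the right balance depends on the content and on how users ask their questions.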
Retrieve and Rank service
The Retrieve and Rank service combines two information technologies in a single service: the fundamental faceting mechanisms of Apache Solr plus several layers of Watson's deep learning algorithms. This combination provides users with results that are more relevant to their intent per the way they are using the information. This goes well beyond simple metrics of selectivity. It is intentional alignment with the perspective of the user's purpose in accomplishing their work. The Retrieve and Rank service also supports the standard APIs of Apache Solr, so that all the power of configurable faceted applications is available. IBM provides the supporting framework that Solr requires in order to be made enterprise-ready, with the additional layers of service implementation, robust error handling, and resiliency that generally become necessary when using Solr at an enterprise scale. Therefore, you do not need to build up all of that common supporting software infrastructure.
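Because the standard Solr query parameters apply, a faceted query string can be assembled with nothing more than the standard library. The parameters used below (`q`, `wt`, `rows`, `facet`, `facet.field`) are ordinary Solr parameters; the field names are hypothetical, and the service-specific URL path that the query string would be appended to is deliberately left out.

```python
from urllib.parse import parse_qs, urlencode


def build_solr_query(text, facet_fields=(), rows=5):
    """Return a URL-encoded Solr query string with optional field facets."""
    params = [("q", text), ("wt", "json"), ("rows", str(rows))]
    if facet_fields:
        params.append(("facet", "true"))
        # Solr accepts the facet.field parameter once per field.
        params.extend(("facet.field", field) for field in facet_fields)
    return urlencode(params)


# "product" and "doc_type" are hypothetical fields in the collection schema.
query = build_solr_query("water damage warranty", facet_fields=["product", "doc_type"])
decoded = parse_qs(query)
```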
The ranking algorithms use machine-learning models, which are trained with examples via an operator-friendly utility. The service walks you through the training process with recommendations to guide you to the quickest path, and automated testing with progress metrics is shown along the way. The service uses several different kinds of advanced ranking algorithms, normalizing and balancing the weights of their ranking factors to optimize performance. The relevancy of results is typically fairly good even before any training of the ranking algorithms, so you can put the system to use right away and improve it over time.
It is imperative to train the ranking algorithms on questions from real end-user subject matter experts and to rate the relevancy of the candidate responses from their perspective. If you try to devise training data for Watson without real user input, the result will be a system that might respond well to the engineers but is poorly attuned to the end users. Never use simplistic randomizing or reordering techniques to "generate" training data, because training machine learning algorithms with machine-generated examples only creates a rigid automaton. Watson learns to understand humans from analyzing all the noisy complexity of real human interactions with all of its inherent gaps and inconsistencies, because that is where the tacit contexts of our understandings are hidden. This is absolutely crucial data for a cognitive system.
When you instantiate the Retrieve and Rank service, you will find a button to launch tooling, which takes you to the utilities with self-explanatory step-by-step instructions that guide you through setting up a corpus of information and training the ranking algorithms. You upload sets of example questions, and click a simple five-star rating of the relevancy for the top four or five answers for each example. Along the way, Watson is continuously optimizing and recommending the best sequence of tasks to help you minimize the effort of training. If you follow the path of these tools, you can typically achieve well-trained ranking algorithms in a short time with marginal effort.
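If you collect ratings outside the tooling, they can be serialized as relevance-labeled rows. The sketch below assumes a CSV layout of one question per row followed by alternating answer ID and relevance label; that layout matches my recollection of the ranker training data format, but treat it, and the direct use of the star rating as the label, as assumptions to confirm in the service documentation. The question and document IDs are hypothetical.

```python
import csv
import io


def ground_truth_rows(rated_questions):
    """Yield one row per question: question, then answer_id, rating pairs."""
    for question, ratings in rated_questions:
        row = [question]
        for answer_id, stars in ratings:
            if not 1 <= stars <= 5:
                raise ValueError(f"rating out of range for {answer_id}: {stars}")
            row.extend([answer_id, stars])
        yield row


def to_csv(rated_questions):
    """Serialize the rated questions as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerows(ground_truth_rows(rated_questions))
    return buffer.getvalue()


# Hypothetical ratings from a subject matter expert.
ratings = [
    ("How do I reset my password?", [("doc_12", 5), ("doc_40", 2)]),
]
training_csv = to_csv(ratings)
```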
It is best to ingest content into the corpus in moderately sized segments, not all at once. Uploading batches of content and training as you go keeps the training cycles much shorter and ultimately reduces the overall length of the training process by a significant fraction. Upload some segment of the corpus that is generally concerned with one broad subject area, and then do a cycle of training. Then, upload another subset of content and train on that, building the corpus and the ranking models together so that learning and information accumulate in balance. Keep in mind that what Watson ultimately operates on is not the information content itself, but more so the models of our understanding of it. This is not at all like loading a database with several documents. You are essentially uploading cognitive patterns of understanding into what becomes a representation of the perspectives of end users.
A typical mistake is to load all the content for the entire corpus all at once up front, before beginning training the ranking algorithms. If you do that, you will regret it about halfway through when you realize that the overall training effort becomes much longer and more laborious as a result. Instead, just train as you upload, in batches, and it will take less time and effort.
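The upload-then-train rhythm can be sketched as a simple loop. The `upload` and `train` callables here are hypothetical placeholders for whatever mechanism you use (the tooling, scripts, or API calls); the sketch only shows the ordering, with one training cycle after each subject-area batch.

```python
def ingest_in_batches(batches, upload, train):
    """Alternate upload and training, one subject-area batch at a time."""
    history = []
    for subject, documents in batches:
        upload(documents)
        history.append((subject, len(documents), train(subject)))
    return history


# Stubs that record the order of operations instead of calling a service.
calls = []


def fake_upload(documents):
    calls.append(("upload", len(documents)))


def fake_train(subject):
    calls.append(("train", subject))
    return {"subject": subject}


batches = [
    ("billing", ["b1.txt", "b2.txt"]),
    ("shipping", ["s1.txt"]),
]
history = ingest_in_batches(batches, fake_upload, fake_train)
```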
Bring information from very large corpora into the conversation
Combining the Virtual Agent with the capabilities of Watson to pull up narrowly focused passages from mountains of unstructured text yields a cognitive system with an extensive reach. You can configure the Conversation service to categorically recognize entirely different kinds of user requests and conditionally pass the application program flow to any of a variety of other services. So, you can use the dialog between the user and the agent to determine when to switch from simple direct responses in some cases, to a greater depth of focused information brought from a large corpus in other cases. One way to do this is to have the Conversation/Dialog service handle all the simple inquiries with direct answers, but when it encounters the category of anything else for which it has no response, then the application program flow is passed to the Retrieve and Rank service. That could be a sort of default behavior.
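One way to sketch that default behavior: answer from the Conversation response when an intent was recognized with reasonable confidence and a reply is present, and otherwise hand the user's original text to a corpus search. The confidence cutoff, the response layout, and the `search` callable are assumptions made for illustration; `search` stands in for a query against the Retrieve and Rank service.

```python
CONFIDENCE_THRESHOLD = 0.5  # assumed cutoff; tune against real user traffic


def route(response, search, threshold=CONFIDENCE_THRESHOLD):
    """Use the direct answer when confident; otherwise fall back to search."""
    intents = response.get("intents") or []
    best = max((item.get("confidence", 0.0) for item in intents), default=0.0)
    answer = response.get("output", {}).get("text") or []
    if best >= threshold and answer:
        return {"source": "conversation", "results": answer}
    return {"source": "retrieve_and_rank", "results": search(response["input"]["text"])}


def fake_search(text):
    """Stand-in for a corpus query; returns a labeled placeholder passage."""
    return [f"passage about: {text}"]


confident = route(
    {
        "input": {"text": "What are your hours?"},
        "intents": [{"intent": "store_hours", "confidence": 0.93}],
        "output": {"text": ["We are open 9 to 5."]},
    },
    fake_search,
)
unsure = route(
    {
        "input": {"text": "Explain clause 14 of the service agreement"},
        "intents": [{"intent": "store_hours", "confidence": 0.12}],
        "output": {"text": []},
    },
    fake_search,
)
```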
Another approach is to train the Conversation service with one or more categories of question types in which the user is implicitly asking for greater depth of understanding. This gives the user more control of the application and also presents the possibility of routing certain inquiries to different instances of corpora containing entirely different sorts of information. You can accomplish most of this with configuration parameters and minimal code.
This application shows how to combine the Conversation service with the Retrieve and Rank service in an application as described previously, with a demo and complete documented code.
The fastest way to get started is to click the Deploy to Bluemix button (located under Deploy the app), which clones the project into your own Bluemix account. Then examine the details of the application code, and use it as a starting point to modify and expand as a head-start in creating your own application, addressing the particular needs of your use case.
Use Alchemy to extract elements of natural language for metadata enrichment
The AlchemyLanguage service API provides the generic functions of natural language processing to extract named entities and their implicitly associated relationships from any unstructured text communication. The pretrained model-based approach this service is built on allows it to recognize and extract many different types of terminology and accurately identify those types with appropriate metadata tags. Being model-based gives it the flexibility to handle differing languages. The list of all languages that it can recognize is quite long.
The same approach also means that custom models can be trained as well, to handle domain specific jargon and slang. (More about that later in this tutorial.) The Alchemy API provides functions for Entity Extraction, Sentiment Analysis, Emotion Analysis, Concept Tagging, Relation Extraction, Taxonomy Classification, and other NLP operations. Combining Alchemy with other cognitive services enables a far greater degree of qualitative performance. It can enhance the effectiveness of the Retrieve and Rank service with a higher dimensionality of NLP concept-space. You can also use it in conjunction with Watson Explorer Advanced Content Analytics. You can take advantage of the Alchemy Language SDK to handle the pragmatics of working with this API.
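For metadata enrichment, the extracted entities typically need to be folded into tags on a document. The sketch below works on an already-parsed response and makes no network call. The response layout (an `entities` list whose items carry `type`, `text`, and a `relevance` value that may arrive as a string) is an assumption based on my recollection of the entity extraction output, and the sample data is invented for illustration.

```python
def entities_to_metadata(response, min_relevance=0.5):
    """Collect entity texts by type, keeping only sufficiently relevant ones."""
    tags = {}
    for entity in response.get("entities", []):
        # Relevance may be serialized as a string, so convert before comparing.
        if float(entity.get("relevance", 0)) < min_relevance:
            continue
        tags.setdefault(entity["type"], []).append(entity["text"])
    return tags


# Invented sample in the assumed response layout.
sample_response = {
    "status": "OK",
    "entities": [
        {"type": "Company", "text": "IBM", "relevance": "0.91"},
        {"type": "City", "text": "Armonk", "relevance": "0.64"},
        {"type": "Person", "text": "Someone", "relevance": "0.21"},
    ],
}
metadata = entities_to_metadata(sample_response)
```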
This project in GitHub shows how to use Alchemy to extract a variety of elements and metadata from free text, with complete documented code and a working demo. The fastest way to get started with it is to click the Deploy to Bluemix button, which clones the project into your own Bluemix account, where you can examine in detail how the Alchemy API calls work and grab code snippets to save time in developing your own application.
Watson Explorer content analytics
This tutorial focuses on a design pattern that combines various cognitive services, so Watson Explorer is considered from the perspective of using its analytic capabilities in concert with other Watson Cloud services. Watson Explorer is a fairly robust application service that is also often used as a stand-alone solution in and of itself, but that is beyond the scope of this tutorial. You can find in-depth information about Watson Explorer content analytics as a stand-alone solution.
Watson Explorer content analytics brings advanced capabilities built on faceted indexing of natural language elements in unstructured text, and deriving statistics about those elements in order to perform analytic functions across linguistic feature-sets. Recognition of linguistic elements is performed via an interface called an Annotator. There are a number of generic annotator implementations used in a few different areas of Watson technology. There are some annotators built on machine learning that have very broad applicability. Some annotators, used within Watson Knowledge Studio, can be built with Alchemy or can consist of dictionaries of terms specific to a particular niche application.
Watson Knowledge Studio
When applications require elements of domain-specific content to be annotated, Watson Knowledge Studio is used to create those custom annotators. The most common type of annotator uses machine learning algorithms to categorically recognize the desired elements according to examples it has been trained on. This enables the cognitive capability of correctly recognizing new items that are of the desired categorical type, including instances that have not been explicitly seen before.
In many cases, it can be useful to create dictionary annotators, which are used within Watson Knowledge Studio to annotate particular conventions in the terminology of a specialized knowledge domain, linguistic idiosyncrasies, jargon, acronyms, or other important elements that might be peculiar to the implicit lingo of the content and/or the application users.
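The idea behind a dictionary annotator can be shown with a few lines of plain Python. This is a conceptual stand-in, not how Watson Knowledge Studio implements or exports annotators: it looks up each dictionary term in the text, ignoring case, and records the matched span with its type. The dictionary entries and the sample sentence are hypothetical.

```python
import re


def annotate(text, dictionary):
    """Return span annotations for every dictionary term found in the text.

    dictionary maps a term to its annotation type. Matching is
    case-insensitive and limited to whole words.
    """
    annotations = []
    for term, label in dictionary.items():
        pattern = r"\b" + re.escape(term) + r"\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            annotations.append(
                {
                    "begin": match.start(),
                    "end": match.end(),
                    "text": match.group(0),
                    "type": label,
                }
            )
    return sorted(annotations, key=lambda item: item["begin"])


# Hypothetical domain dictionary and sample text.
domain_terms = {"MRI": "Procedure", "SOP": "Document"}
note = "The MRI showed no change; repeat mri in 6 months per SOP."
found = annotate(note, domain_terms)
```

Whole-word matching with `\b` works for alphanumeric terms like these; terms that begin or end with punctuation would need a different boundary rule.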
The machine learning annotators created in Watson Knowledge Studio, once trained and ready to deploy, can then be used via Alchemy in Watson Explorer content analytics to perform statistical operations upon the occurrence of the elements they recognize, to bring those elements up as facets of an index across a collection of documents, or to support other advanced analytic features.
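As a simplified picture of what "statistical operations upon the occurrence of elements" and "facets across a collection" can mean, the sketch below counts annotated elements in two ways: total mentions, and the number of documents in which each element appears. It is a conceptual illustration with invented data, not the Watson Explorer implementation.

```python
from collections import Counter


def facet_counts(annotated_docs):
    """Count annotated elements across a collection.

    annotated_docs maps a document ID to a list of annotations, each a dict
    with "type" and "text". Returns (mention counts, document counts).
    """
    mentions = Counter()
    documents = Counter()
    for annotations in annotated_docs.values():
        seen = set()
        for annotation in annotations:
            key = (annotation["type"], annotation["text"].lower())
            mentions[key] += 1
            seen.add(key)
        documents.update(seen)  # each element counts once per document
    return mentions, documents


# Invented annotations for two documents.
collection = {
    "doc_1": [
        {"type": "Procedure", "text": "MRI"},
        {"type": "Procedure", "text": "mri"},
    ],
    "doc_2": [
        {"type": "Procedure", "text": "MRI"},
        {"type": "Document", "text": "SOP"},
    ],
}
mention_counts, document_counts = facet_counts(collection)
```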
The integration point that brings Watson Explorer content analytics into this complete design pattern is like a small doorway that leads into a very large building. A branch in the agent's dialog passes program flow over the threshold via a few lines of code, and then quite a lot of functionality suddenly unfolds, far more than the scope of this tutorial can touch upon. The next installment in this series will go into all of that in much greater depth.