Before Christmas, I gave a brief overview of the new content-related functionality in the IBM Business Process Manager (BPM) V8.5.7 CF2016.12 cumulative fix (see Sending and Receiving Files to/from REST Services and Web-based Content Integration Tooling are available in IBM BPM V8.5.7 cumulative fix 2016.12). One of the new features extends the REST invocation capability that was introduced in the CF2016.09 cumulative fix: support for file parameters.
Today, we will call a REST service on the IBM Bluemix platform from an IBM BPM process to recognize the text in a voice recording. We start by looking at Bluemix in general: what it offers and how you can get a test account. We then browse its available services, look for one that can solve our problem, and perform the steps required in Bluemix to use it. Then we continue in IBM BPM, where we discover the REST service so that we can use it in a process. In this process, an end user uploads a file that is sent to the REST service to recognize the text. Finally, we show the recognized text to the end user.
Bluemix is IBM’s cloud platform that offers more than 100 services that you can invoke from applications that run outside of Bluemix, or from applications that you run in the Bluemix cloud.
If you do not yet have a Bluemix account, then you can register for a trial account that is free for 30 days here: Sign up for an IBMId and create your Bluemix account.
After you have signed up, you can sign in to Bluemix with your IBMId and browse the catalog to see the available offerings. For this example, we use a service that recognizes text from an audio file. The growing number of services has made the search for a specific service a little more complicated than it was some time ago, unless you already know the service name. To save you time: we need the Speech to Text service. You can find it in the Watson category or directly by typing the service name in the Search field. Watson is IBM’s cognitive technology that is able to interact with users in their human language, understands a user’s personality, tone and emotion, and uses machine learning to gain expertise to give recommendations.
After clicking on the Speech to Text service in the catalog, you can create your service instance. You may change the Service name and Credential name to values of your choice. You can leave the Connect to field as Leave unbound because we do not want to consume the service from a Bluemix app, but from IBM BPM, which runs outside of Bluemix. The Pricing Plans that you can select depend on the country where you are. In my region, Germany, I choose the Standard plan, which includes 5,000 free minutes of speech recognition. This is more than enough for our test. Click Create to create your service instance.
After the service has been created, you can look at its details. What we need first are the Service Credentials that were created for your service instance.
Note the URL, username and password as we will need them later to invoke the service from IBM BPM.
Service API Specification
IBM BPM requires an API specification with the available operations and the request and response data so that it can invoke a REST service. You may know the Web Services Description Language (WSDL) that is used to define web services. For REST services, the specification language that is increasingly being adopted is the OpenAPI specification, which was recently renamed from Swagger. With OpenAPI you can describe and document RESTful APIs in a JSON or YAML document. IBM BPM supports the current OpenAPI V2.0 specification.
All Watson services on Bluemix have OpenAPI documentation. Unfortunately, the URLs of these OpenAPI documents are very well hidden. The approach that I use to get them is the following:
I open the Watson API Explorer, which lists all Watson services. Next, I click on the service I want to explore, for example Speech to Text. The Watson API Explorer now lists all the available operations; you can click on them to see their parameters. You can even fill out these parameters and try an operation, and you will then see the response of the service invocation. Back to how we get the OpenAPI documentation of the service: at the bottom of the page, there is a button that shows the validation result of the service. It is either green or red.
Click on it. A new browser tab opens that validates the document. For Speech to Text, you can see this address in the address bar:
There you can see the URL to the OpenAPI documentation. Remove the Swagger validator prefix so that you now open this URL:
Done. You now have the OpenAPI documentation for the service. Save the file to your local disk.
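Stripping the validator prefix by hand is easy to get wrong when the address is long. As a minimal sketch, assuming the validator receives the document location in a `url` query parameter (check what your own address bar shows), the following snippet extracts it. It is meant to run in Node.js or a browser console, not inside IBM BPM, and the example address is hypothetical, not the real Watson one.

```javascript
// Sketch: recover the OpenAPI document URL from a validator address.
// Assumption: the validator is called with the document location in a
// "url" query parameter. Verify this against your own address bar.
function extractSpecUrl(validatorAddress) {
  var parsed = new URL(validatorAddress);
  var specUrl = parsed.searchParams.get("url");
  if (!specUrl) {
    throw new Error("No url parameter found in " + validatorAddress);
  }
  return specUrl;
}

// Hypothetical address, for illustration only.
var exampleAddress =
  "http://validator.example.com/debug?url=https://api.example.com/speech-to-text/listing.json";

console.log(extractSpecUrl(exampleAddress));
```

If your address has a different shape, simply copy the part that starts with the second `http` by hand.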
Discovering the REST service in IBM BPM
Now we are ready to switch to IBM BPM. We start by discovering the REST service. In Process Center, create a process application and open it in the web Process Designer.
From the Library, create a new External Service.
A wizard with multiple pages now opens. On the first page, keep the selection on Java, REST or Web service and click Next.
On the second page, you can select the type of service you want to discover. For a REST service, you need to keep the selection on Browse local files (Swagger) (that’s the label today, I expect it to change to be called OpenAPI instead of Swagger at some point). Select the OpenAPI documentation file that you previously downloaded. Optionally, specify a custom name for the external service, for example Speech to Text Service. Click Next.
IBM BPM will now look at the operations defined in the specification and display the operations that must be invoked in a Script task and those that have warnings. Do not be alarmed by the long list. In IBM BPM, you can invoke REST services in two ways: visually in a Service task in a service flow, or in a Script task in a service flow. Operations with files can only be invoked from a Script task. That’s the reason for the long list. Click Next.
The next page shows the operations that were successfully discovered without any warnings and can therefore be invoked from a Service task. Theoretically, we could deselect all operations here, as we will use only one operation, and that one can only be invoked from a Script task. We will leave the selection as it is for now and click Next.
On the last page, you need to create or select a server for the external service. The server will contain the binding information of the service, for example the host name, port, authentication and timeout settings. Click Finish.
Several artifacts are now created:
- The external service that represents the external REST service within IBM BPM.
- A REST server in the process application settings that contains the connection information.
- Business object types that are needed for the invocation of operations that were successfully discovered.
Next, we are going to configure the REST server properly. Go to the Process App Settings, and there to the Servers tab. You can find the REST server that was automatically created. You now need to enter the connection information. Take the Host name from the URL in the service credentials of your Speech to Text service instance in Bluemix. The Port can stay empty unless a non-default port is used. Enable Secure server to indicate that HTTPS is to be used. The SSL configuration is defined in the WebSphere admin console. It needs to reference a trust store with the public certificate that is used by the REST service endpoint. As we are using the Speech to Text service from Bluemix here, we can use PublicInternetSSLSettings. This is a predefined SSL configuration that references a trust store with the root certificates of the global certificate authorities (CA) that typically sign the certificates used on the internet.
In the Authentication section, we need to provide the user name and password that the service credentials of your Speech to Text service instance in Bluemix contain. You can provide them in two ways:
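To make the mapping from the Bluemix credentials to the server settings concrete, here is a small sketch. It assumes the credentials contain a `url`, `username` and `password`, as noted earlier; the values shown and the result field names (`hostName`, `port`, `secure`) are hypothetical and only mirror the fields on the Servers tab. It is meant to run in Node.js, outside IBM BPM.

```javascript
// Sketch: derive the REST server settings from the service credentials.
// All values are hypothetical; use the ones from your own service instance.
var credentials = {
  url: "https://stt.example.com/speech-to-text/api",
  username: "my-user",
  password: "my-password"
};

function toServerSettings(creds) {
  var parsed = new URL(creds.url);
  return {
    hostName: parsed.hostname,           // goes into the Host name field
    port: parsed.port,                   // empty string when the default port is used
    secure: parsed.protocol === "https:" // enable Secure server for HTTPS
  };
}

console.log(toServerSettings(credentials));
```

Only the host name goes into the Host name field; the path part of the URL is not entered there.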
- Directly in the process app settings. This is good for a quick try, but lacks some flexibility if you need to change the credentials.
- Using an authentication credential. There you can enter the name of an authentication alias that you create in the WebSphere admin console (see Java 2 Connector authentication data entry settings) or using wsadmin scripting. An example script to create the authentication alias using scripting can be found here: createAuthenticationAlias.py.
Using the REST service in IBM BPM
We are now ready to invoke the REST service. So, what do we want to achieve? We want to pass in an audio file in which somebody is speaking and get back the text of what he or she said. Files in IBM BPM are represented by documents. A REST parameter of type file therefore needs to be mapped to a document reference. The business object type representing a file is ECMDocumentInfo from the Content Management toolkit. You need to add a dependency on that toolkit to your process application or toolkit to use the business object type.
To invoke a REST service with a file you need to create an instance of this business object type and fill the objectId and serverName properties. So far, only local documents are supported.
So, let’s create a service flow in web Process Designer that does this. We define one input variable for the document identifier and one output variable for the recognized text.
The Speech to Text operation that we want to invoke is recognizeSessionless. The operation is a little bit complicated because it allows you to pass in the audio file in two ways:
- Directly in the request body. If so, then several other parameters can be passed in as query parameters.
- As a part of a multipart/form-data request. In this case, the query parameters are not used; their values are instead passed in a separate part, serialized as a JSON object.
With IBM BPM, we can only use the multipart/form-data approach. You can check the specification of the whole operation in the OpenAPI documentation of the service; here is only the part that is relevant for us:
To invoke it, let’s go to the Diagram tab of the service flow. Add one Script task with this code:
// Create ECMDocumentInfo object
The script starts by creating an ECMDocumentInfo object. The object identifier is taken from the service input. The server name comes from a constant and points to the IBM BPM document store where local documents are persisted.
The script continues to create a BPMRESTRequest object to prepare the REST invocation. It sets the external service name to the name of the external service that we previously discovered. The operation name is set to recognizeSessionless. The value here needs to match the operationId as it is specified for the operation in the OpenAPI documentation of the REST service. For services without operationId (it is optional per the OpenAPI specification), you build it by concatenating the path and method (for example "/myResource POST").
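To see which operation names a specification yields under this rule, the following sketch walks an OpenAPI V2.0 document and returns the operationId where present, or the path and upper-case method otherwise. The tiny `spec` object is a hand-made illustration, not an excerpt from the real Speech to Text document, and the function name is my own.

```javascript
// Sketch: list the operation names for an OpenAPI V2.0 (Swagger) document.
// Rule from the text: use operationId if present, else "<path> <METHOD>".
var HTTP_METHODS = ["get", "put", "post", "delete", "options", "head", "patch"];

function listOperationNames(spec) {
  var names = [];
  Object.keys(spec.paths || {}).forEach(function (path) {
    var pathItem = spec.paths[path];
    HTTP_METHODS.forEach(function (method) {
      var operation = pathItem[method];
      if (!operation) {
        return; // this path does not define the method
      }
      names.push(operation.operationId || (path + " " + method.toUpperCase()));
    });
  });
  return names;
}

// Hand-made illustration, not the real service specification.
var spec = {
  swagger: "2.0",
  paths: {
    "/v1/recognize": { post: { operationId: "recognizeSessionless" } },
    "/myResource": { post: {} }
  }
};

console.log(listOperationNames(spec));
// prints [ 'recognizeSessionless', '/myResource POST' ]
```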
As part of the HTTP headers, the Content-Type is set. This is not generally required to invoke a service with a file, but the Speech to Text service definition makes it mandatory because it specifies the Content-Type explicitly within the request parameters.
In the body parameters, we pass in documentInfo as the value for the upload parameter. In addition, we need to specify the mandatory metadata parameter. It needs to be a serialized JSON object and can have several properties, of which one is mandatory: part_content_type. The Speech to Text service does not look at the content type of the upload part; it needs the content type to be passed separately in this property.
This now causes additional work for us, because we only have the document identifier. Within the IBM BPM document store, the MIME type is known. So, let’s declare an additional mimeType variable as String and searchResult as ECMSearchResult. Add a Content Integration task after the Start node. In its Implementation, use IBM BPM document store as Server and Search as Operation Name. In the Data Mapping, map the following input:
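The shape of these two body parameters can be sketched in plain JavaScript. The parameter names upload and metadata and the property part_content_type come from the description above; the helper name and the placeholder document values are hypothetical, and inside IBM BPM the documentInfo would be a real ECMDocumentInfo instance rather than a plain object.

```javascript
// Sketch: assemble the two multipart body parameters described above.
// documentInfo stands in for an ECMDocumentInfo object; the values are placeholders.
function buildBodyParameters(documentInfo, mimeType) {
  return {
    upload: documentInfo,
    // metadata must be a serialized JSON object, not a nested object
    metadata: JSON.stringify({ part_content_type: mimeType })
  };
}

var documentInfo = {
  objectId: "hypothetical-document-id",
  serverName: "hypothetical-server-name"
};

var bodyParameters = buildBodyParameters(documentInfo, "audio/wav");
console.log(bodyParameters.metadata);
// prints {"part_content_type":"audio/wav"}
```

Serializing the metadata yourself keeps it a single string part, which is what the multipart variant of the operation expects.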
- CMIS query: "SELECT cmis:contentStreamMimeType FROM IBM_BPM_Document WHERE cmis:objectId='" + tw.local.documentId + "'"
- Return type: null
- Search all versions: true
- Max items: 1
- Skip count: 0
- Include allowable actions: false
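The CMIS query above is built by plain string concatenation. As a hedged sketch, the same query can be produced by a small helper that also escapes quotes in the identifier. The escaping rule (backslash before a single quote or backslash) is my assumption about CMIS string literals, and the helper name is hypothetical. Document identifiers generated by the document store will normally not contain such characters.

```javascript
// Sketch: build the CMIS query that looks up the MIME type of one document.
// Assumption: CMIS string literals escape ' and \ with a preceding backslash.
function buildMimeTypeQuery(documentId) {
  var escaped = String(documentId)
    .replace(/\\/g, "\\\\")
    .replace(/'/g, "\\'");
  return "SELECT cmis:contentStreamMimeType FROM IBM_BPM_Document " +
         "WHERE cmis:objectId='" + escaped + "'";
}

// Placeholder identifier, for illustration only.
console.log(buildMimeTypeQuery("hypothetical-document-id"));
```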
Map the output to your searchResult variable. In a script after the Content Integration task, parse the result:
// check if there is any search result
We can pass this MIME type into the Speech to Text operation. So, back to the script that invokes the REST operation. After having set the request parameters, we are ready to invoke the service. Then we can parse the response. The sample code simply looks for the first result and returns it as text.
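The response parsing can be sketched independently of IBM BPM. The snippet assumes that the service returns a JSON object with a results array whose entries carry an alternatives array with a transcript property; please compare this with the response model in the OpenAPI documentation of the service. The sample response and the function name are made up for illustration.

```javascript
// Sketch: return the transcript of the first alternative of the first result.
// Assumed response shape: { results: [ { alternatives: [ { transcript: "..." } ] } ] }
function firstTranscript(response) {
  var results = (response && response.results) || [];
  if (results.length === 0) {
    return ""; // nothing was recognized
  }
  var alternatives = results[0].alternatives || [];
  if (alternatives.length === 0) {
    return "";
  }
  return String(alternatives[0].transcript || "").trim();
}

// Made-up sample response, for illustration only.
var sampleResponse = {
  results: [
    { alternatives: [{ transcript: "ladies and gentlemen ", confidence: 0.9 }], final: true }
  ],
  result_index: 0
};

console.log(firstTranscript(sampleResponse));
// prints ladies and gentlemen
```

Returning an empty string for an empty result keeps the service flow simple; a real flow might raise an error instead so that the process can react to it.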
And that’s it. The service flow is ready to be included in an application. With some more error handling added, the service flow looks like this:
You can now build a process for the upload of the file, the service invocation, and the display of the result. Or you can take the shortcut and directly import this process application: BPM_ECM_blog_-_REST_service_invocation_with_file_parameter - V1.twx. After importing it, you need to update the process application settings to specify the appropriate username and password from your Speech to Text service instance.
Then, you can run the Speech to Text Test process.
It starts with an Add audio document user task that uses the Document Explorer coach view to let you upload a file. As an audio file, you can take this sample: test.wav. It is an audio file with a speaker saying "Ladies and Gentlemen, have fun using the Speech to Text service with IBM Business Process Manager."
Once you have finished this user task, the process continues with two system tasks:
- The first one performs a Search for the uploaded document and returns the document identifier.
- The second one calls the service we defined above to call Speech to Text.
If everything works, another user task called Show result appears, which displays the text that was recognized by Speech to Text.
Note: if you try to run this sample with your own recordings, be aware that they need to be of high quality because the narrow-band models for text recognition are not available for free. Also, Watson can only consume a few audio formats.
We have invoked a REST service that required a file as input parameter. In our example, we sent an audio file to the Speech to Text service on Bluemix to recognize the text.
You can further extend this example if you want:
- You can use a toolkit to discover the external services and to define the service flows that call the service. You can use this toolkit from multiple process applications.
- You can run Speech to Text with a different language to recognize human voices other than English.
- You can call the Tone Analyzer service from Bluemix to determine if the customer is happy or angry and adjust user task priorities accordingly.
- You can use the Language Translator service on Bluemix to translate the text into a different language so that your most experienced staff can work on the customer’s question.
Next time (see Calling a Watson REST service on Bluemix that returns a file as response) we are going to look at REST services that return a file in their response and see how this can be invoked in IBM BPM.