Testing Watson Conversation
Hickmat
From my previous post you will have deduced that I am currently working on a Watson project. As part of this project a requirement emerged: to run a batch of test questions against the root dialog node of a Conversation workspace, to validate that the correct intent was being detected and that the confidence level was suitably high. Out of the box there are a number of ways to run individual test questions against Watson Conversation, but nothing I could find to support a batch run, so I set about building a simple test tool. The approach I took was to support uploading test data via CSV, then store it in a Cloudant database so the test could be re-run later. Once uploaded, the test data would be displayed in JSON format and could be edited if necessary before being submitted. Finally, the results would be displayed in a table and compared against the expected results; in addition, the results could be saved to Cloudant so that historical test runs could be reviewed. Overall the architecture looks as follows:
The approach I took around the CSV upload was based on some work
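To give a feel for the CSV upload step, here is a minimal sketch of turning an uploaded CSV of test questions into the JSON structure that gets stored in Cloudant. This is not the tool's actual code: the column names (`question`, `expectedIntent`) are illustrative, and it assumes a simple CSV with no quoted commas.

```javascript
// Sketch: convert an uploaded CSV of test questions into JSON test data.
// Column names ("question", "expectedIntent") are assumptions, not the
// tool's exact schema; a real parser should also handle quoted fields.
function csvToTestData(csvText) {
  const lines = csvText.trim().split(/\r?\n/);
  const headers = lines[0].split(',').map((h) => h.trim());
  return lines.slice(1).map((line) => {
    const cells = line.split(',').map((c) => c.trim());
    const row = {};
    headers.forEach((h, i) => { row[h] = cells[i]; });
    return row;
  });
}
```

With a two-column file such as `question,expectedIntent`, each data row becomes one JSON object keyed by the header names, ready to display and edit before the run.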
As you can see, I put in some basic button management to enable/disable the buttons based on the state of the UI. When a set of test data is loaded into the UI the "Run Training" button becomes active. Clicking it calls a REST interface on my microservice, which walks the JSON test data and, for each test item, makes a call to the Watson Conversation service, storing the result back into the JSON object ready to be displayed on the results screen. The data load screen blocks until all the tests have completed and then displays the results screen, which looks like:
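The batch-run step in the microservice can be sketched roughly as follows. `sendToConversation` here is a stand-in for the real Watson Conversation SDK call (which would pass the question as message input and get back a ranked `intents` array); it is injected as a parameter so the walker itself stays service-agnostic.

```javascript
// Sketch of the batch run: for each test item, ask the Conversation
// service for its top intent and record it on the item. The real tool
// calls the Watson Conversation REST API; `sendToConversation` is a
// placeholder for that call.
async function runBatch(testData, sendToConversation) {
  for (const item of testData) {
    const response = await sendToConversation(item.question);
    const top = response.intents[0] || { intent: null, confidence: 0 };
    item.actualIntent = top.intent;    // intent Watson detected
    item.confidence = top.confidence;  // Watson's confidence in it
  }
  return testData;
}
```

Running the calls sequentially keeps things simple and avoids hammering the service; a production version might batch or rate-limit the requests instead.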
In this table the "Ok?" icon is set by comparing the expected intent against the intent returned by Watson: if they match the row is flagged as OK, otherwise as ERROR. In addition I've provided the ability to sort the responses by confidence, and the results can be filtered by confidence level via a slider. Finally, the results can either be saved to Cloudant or exported as a CSV file. If saved to Cloudant, an overview document is stored for the overall test run, and each test result is stored as a separate document linked to from the overview document.
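The results logic above boils down to a few small pure functions, sketched here under the same assumed field names as before (`expectedIntent`, `actualIntent`, `confidence`); the `_id` scheme for the Cloudant documents is likewise illustrative, not the tool's actual one.

```javascript
// Flag each result row OK/ERROR by comparing expected vs actual intent.
function flagResults(results) {
  return results.map((r) => ({
    ...r,
    ok: r.expectedIntent === r.actualIntent,
  }));
}

// Keep only rows at or above the slider's minimum-confidence threshold.
function filterByConfidence(results, minConfidence) {
  return results.filter((r) => r.confidence >= minConfidence);
}

// Build the Cloudant documents: one overview document for the run,
// linking by _id to one document per individual result.
function toCloudantDocs(runName, results) {
  const resultDocs = results.map((r, i) => ({ _id: `${runName}-result-${i}`, ...r }));
  const overview = { _id: `${runName}-overview`, resultIds: resultDocs.map((d) => d._id) };
  return [overview, ...resultDocs];
}
```

Keeping each result as its own document (rather than one giant array in the overview) keeps individual documents small and lets later test runs be compared result by result.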
There are still areas I can improve in this tool (the UI, for example, could be a little prettier) but for now it's working and delivering value.