See my article on LinkedIn, https://www.linkedin.com/pulse/new-approach-visualizing-gradient-variations-images-marc-fiammante/, where I describe a new way of detecting edges in images with an innovative filter that I created.
Using it in an object recognition project significantly improved the results. Python code is included in the article.
I needed a small generic utility to create a CSV file from a JSON document but did not find a generic Java one.
So here is a simple one which does not require a keymap, unlike the Python json2csv https://github.com/evidens/json2csv/blob/master/json2csv.py .
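The full utility is not reproduced in this extract. As a rough illustration of the approach rather than the exact code, the sketch below flattens a top-level JSON array of objects into CSV rows using Jackson; class, method and column naming are my own assumptions.

    import com.fasterxml.jackson.databind.JsonNode;
    import com.fasterxml.jackson.databind.ObjectMapper;
    import java.io.File;
    import java.util.*;

    // Sketch: flatten an array of JSON objects into CSV rows, deriving the header
    // from the union of all flattened key paths across the records.
    public class Json2Csv {

        public static void main(String[] args) throws Exception {
            JsonNode root = new ObjectMapper().readTree(new File(args[0]));
            List<Map<String, String>> rows = new ArrayList<>();
            LinkedHashSet<String> columns = new LinkedHashSet<>();
            for (JsonNode element : root) {                 // assumes the top level is a JSON array
                Map<String, String> row = new LinkedHashMap<>();
                flatten("", element, row);
                columns.addAll(row.keySet());
                rows.add(row);
            }
            StringBuilder csv = new StringBuilder(String.join(",", columns)).append('\n');
            for (Map<String, String> row : rows) {
                List<String> cells = new ArrayList<>();
                for (String column : columns) {
                    cells.add('"' + row.getOrDefault(column, "").replace("\"", "\"\"") + '"');
                }
                csv.append(String.join(",", cells)).append('\n');
            }
            System.out.print(csv);
        }

        // Recursively flatten nested objects/arrays into dotted key paths, e.g. "address.city".
        private static void flatten(String prefix, JsonNode node, Map<String, String> row) {
            if (node.isObject()) {
                node.fields().forEachRemaining(e ->
                    flatten(prefix.isEmpty() ? e.getKey() : prefix + "." + e.getKey(), e.getValue(), row));
            } else if (node.isArray()) {
                for (int i = 0; i < node.size(); i++) {
                    flatten(prefix + "[" + i + "]", node.get(i), row);
                }
            } else {
                row.put(prefix, node.asText());
            }
        }
    }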
In some conversations my customer needs to pass form data to a Watson conversation, the form and the conversation being part of two different widgets in the same browser environment.
The form data can then be used in conversation conditions.
Then, when calling the Watson Conversation API, we just need to retrieve the form context and add it to the conversation context as follows.
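The original snippet is not included in this extract. As a minimal sketch of the idea only, assuming the form data has been saved as a JSON object and using Jackson (field names such as accountType are hypothetical), the merge could look like this before the message API call:

    import com.fasterxml.jackson.databind.ObjectMapper;
    import com.fasterxml.jackson.databind.node.ObjectNode;

    // Sketch only: merge the saved form fields into the conversation context before
    // the next call to the Watson Conversation message API. Field names are hypothetical.
    public class ContextMerge {

        public static ObjectNode addFormContext(ObjectNode conversationContext, ObjectNode formContext) {
            // Copy every form field (e.g. "accountType", "amount") into the context,
            // so it can be tested in conversation conditions such as $accountType == "savings".
            formContext.fields().forEachRemaining(e -> conversationContext.set(e.getKey(), e.getValue()));
            return conversationContext;
        }

        public static void main(String[] args) throws Exception {
            ObjectMapper mapper = new ObjectMapper();
            ObjectNode context = (ObjectNode) mapper.readTree("{\"conversation_id\":\"123\"}");
            ObjectNode form = (ObjectNode) mapper.readTree("{\"accountType\":\"savings\",\"amount\":1000}");
            System.out.println(addFormContext(context, form));
            // The merged object is then sent as the "context" field of the message request.
        }
    }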
The Open Group has standardized the Open Group Service Integration Maturity Model (OSIMM), which provides a means to assess an organization's SOA maturity level and to define the desired target maturity for an SOA roadmap.
The Open Group site also hosts a Microservices Architecture document which states that a Microservices Architecture is a subset of SOA. However, the SOA maturity model has not been adapted for microservices.
Here are my first thoughts on a Microservices Maturity Model that incorporates recent industry practices.
First I reduce the levels to 5, as in my OSIMM practice I never encountered maturities above level 5. Then I replace levels 4 and 5 with a microservices version, where level 5 is able to bill for microservices or has explicit funding, uses a cloud environment, and is based on agile approaches.
On the business dimension, the end goal is to make direct or indirect profit from microservices.
The governance dimension is also lightened for each service; however, level 5 maturity requires an enterprise catalog that is managed with a common approach, including billing and documentation.
The method dimension is adapted to the short microservices cycles, with agile methods and continuous delivery and improvement following the latest DevOps approaches.
The applications use flexible protocols such as REST. Packaging evolves towards container technologies like Docker, which are constructed to be parameterized and can be integrated with other services' containers.
On the information side, flexibility and a common taxonomy become the goal. A schema-less approach with JSON provides variability and limits the propagation of changes and the need for versioning.
Of course the architecture targets a cloud reference architecture such as the one from NIST or its extension in the future ISO cloud interoperability and portability standard.
Finally, for the infrastructure, services are ultimately hosted on PaaS-enabled platforms like Cloud Foundry.
This is just a first draft, comments and feedback welcome.
This code sample demonstrates how you can combine the Conversation service and Watson Explorer to allow customers, employees or the public to get answers to a wide range of questions about a product, service or other topic using plain English or other supported languages (we currently do it in French). Users pose questions to the Conversation service and, in parallel, the application in Watson Explorer Application Builder executes a call to Watson Explorer search, which provides a list of helpful answers from documents in collections.
The following picture shows a sample search page capture from a fictitious domain. The center widget is a chat box which calls an endpoint that interacts with Watson Conversation.
The reason for having an endpoint is to avoid 'access-control-allow-origin' errors (cross-site scripting protection) and to protect the called API credentials, as endpoints are executed on the server side. Endpoints act as relays from the Watson Explorer server and are written in the Ruby language.
Ruby is processed on the server side, so when used in widgets it is processed before the page is sent to the browser.
In this article I provide the source code for the widget and for the endpoint.
To use this code, open the Application Builder layouts in edit mode, select search-pages and create a widget with a name such as "conversation". Copy and paste the text in the box below. Paste only the plain text without formatting, as I had to add some bold to avoid blog entry edit conflict errors. Then create an endpoint named "get_query_intent" with the code at the bottom. Don't forget to replace the conversation classifier together with the username and password from the credentials provided by Watson Conversation in the endpoint code.
If you copy and paste from the code below, make sure to replace all HTML "&lt;" entities that get copied with "<".
The endpoint in Application Builder is quite straightforward.
In certain circumstances you may need the immediate annotation of a text. One example of this is when the Watson Conversation intent confidence score is too low and one wants to provide an alternate way of understanding the question using a WCA annotation, and then use the annotations for an intelligent document search.
WCA is based on UIMA and LanguageWare and thus it is possible to use the UIMA APIs.
After creating your annotator, right click and export it as a UIMA PEAR as shown in the picture below. This embeds in the PEAR file all the dictionaries, rules and jars that are required for the PEAR file to be executed.
Get the uimaj-core-2.7.0.jar from the WCA Studio installation and add it to the classpath (that jar is packaged within another jar).
Then use the following code to install the Pear on the file system.
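That code is not reproduced here; a minimal sketch using the standard UIMA PEAR tools API from uimaj-core (the install directory and PEAR file paths are placeholders) would be:

    import java.io.File;
    import org.apache.uima.pear.tools.PackageBrowser;
    import org.apache.uima.pear.tools.PackageInstaller;

    public class InstallPear {
        public static void main(String[] args) throws Exception {
            // Sketch: install the exported PEAR into a local directory (paths are placeholders).
            File installDir = new File("/tmp/pear-install");
            File pearFile = new File("/tmp/MyAnnotator.pear");
            // The third argument asks the installer to verify the installation after unpacking.
            PackageBrowser installedPear = PackageInstaller.installPackage(installDir, pearFile, true);
            System.out.println("Installed descriptor: " + installedPear.getComponentPearDescPath());
        }
    }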
Now that the pear is installed we can use it.
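Again as a sketch rather than the exact code of the original entry, the installed PEAR can be loaded and run on a text with the standard UIMA APIs; the descriptor path is a placeholder for the one returned by getComponentPearDescPath() above:

    import java.io.File;
    import org.apache.uima.UIMAFramework;
    import org.apache.uima.analysis_engine.AnalysisEngine;
    import org.apache.uima.cas.FSIterator;
    import org.apache.uima.jcas.JCas;
    import org.apache.uima.jcas.tcas.Annotation;
    import org.apache.uima.resource.ResourceSpecifier;
    import org.apache.uima.util.XMLInputSource;

    public class RunPear {
        public static void main(String[] args) throws Exception {
            // Load the analysis engine from the installed PEAR descriptor (placeholder path).
            File descriptor = new File("/tmp/pear-install/MyAnnotator/MyAnnotator_pear.xml");
            ResourceSpecifier specifier =
                UIMAFramework.getXMLParser().parseResourceSpecifier(new XMLInputSource(descriptor));
            AnalysisEngine engine = UIMAFramework.produceAnalysisEngine(specifier);

            // Annotate a sample text and print the annotations that were produced.
            JCas jcas = engine.newJCas();
            jcas.setDocumentText("Texte français d'exemple à annoter.");
            engine.process(jcas);
            FSIterator<Annotation> it = jcas.getAnnotationIndex().iterator();
            while (it.hasNext()) {
                Annotation annotation = it.next();
                System.out.println(annotation.getType().getName() + " -> " + annotation.getCoveredText());
            }
            engine.destroy();
        }
    }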
Here is the result for my sample French text
In recent large projects I have been involved in defining an approach for capturing Non Functional Requirements, which are all the quality aspects and constraints that must be satisfied in order to deliver the business goals, objectives or capabilities. These requirements are also necessary in a project scoping phase to evaluate project feasibility, costs and duration.
The “non functional requirements” term is often misunderstood by business users, as they do not see why something would be “non-functional” in their requirements. This is why we now rather use qualities (and constraints) to define these requirements.
In addition, this converges with ISO 25010, which defines a model for Systems and software Quality Requirements and Evaluation (SQuaRE).
The divisions within the SQuaRE series are: ISO/IEC 2500n (Quality Management), ISO/IEC 2501n (Quality Model), ISO/IEC 2502n (Quality Measurement), ISO/IEC 2503n (Quality Requirements) and ISO/IEC 2504n (Quality Evaluation).
The ISO 25010 model defines the quality in use model (quality viewed from users of the system) and the product quality (intrinsic product viewpoint) models.
The “quality in use” model defines 14 characteristics and sub characteristics while the “product quality” model has 39 characteristics and sub characteristics shown in the table further below.
However, not all quality requirements will be the same for all of the architecture elements. For example the “Time behavior” quality, which could be a response time, can be different for an online account balance query versus a loan acceptance. On the other hand, developing very detailed quality requirements consumes time and effort, and there needs to be a tradeoff between having enough requirements to size and cost the project, and spending too much time trying to capture target measurements such as numbers of users, queries, locations etc.
Each quality requirement then needs to be captured with a precise measurement, such as gigabytes per user, screen size, or response time in seconds, if necessary using examples from ISO/IEC 2502n - Quality Measurement Division. A verification method must also be defined, so that users of the resulting system can check that the requirement is delivered.
Some requirements are cumulative (storage, seats, physical space), so all of them need to be evaluated; for others the most constraining requirement is the one to capture (for example, the most demanding user transaction on an ATM, since only one can be performed at a time).
Using functional decomposition to group requirements
In my projects we needed a way to group functions with similar quality requirements and we used functional decomposition to find such groupings.
To provide a public example of how we made these groupings, I’ll use the BIAN service landscape, see https://bian.org/servicelandscape/ (for telecommunication operators it could be the TAM https://www.tmforum.org/application-framework/ or a process classification such as APQC https://www.apqc.org/pcf).
In the following picture a subset of the BIAN service landscape decomposition is on the left, and the ISO 25010 qualities are on the column headers.
As a requirement grouping, one can consider that all the quality requirements that relate to the "collateral services" domain could be identical and captured in a single work product.
On the other hand, it can make sense to have security requirements that are common to the whole enterprise.
Finally, that does not prevent having a specific set of requirements for a particular service; as an example, fraud detection could be a highly restricted service and have its own requirements.
As a conclusion, the approach is to be as exhaustive as possible to avoid missing a cost factor, but also to limit the requirement capture effort by grouping as much as possible, and by only capturing requirements that can be measured and verified.
I needed to check if a String contains another String with a fuzzy match. For this purpose I reuse the Levenshtein distance which is a string metric for measuring the difference between two sequences.
Since the Levenshtein distance measures the insertions, deletions or substitutions needed to get from one string to the other, the difference between the container string length and the contained string length gives the number of deletions.
Subtracting this number of deletions from the Levenshtein distance between the two Strings gives the distance of the contained element from similar parts of the container String.
Below is the Java code for the following result: String "street of the smoking dog" contains "the snoring dog" with a 15.0 % approximation.
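The original code is not reproduced in this extract. The sketch below implements the idea described above with a plain dynamic-programming Levenshtein distance; the exact percentage formula of the original may differ slightly, so the printed value is not guaranteed to match 15.0 % exactly:

    // Sketch of the fuzzy "contains" check described above: subtract the unavoidable
    // deletions (length difference) from the Levenshtein distance, then express the
    // remainder as a percentage of the contained string length.
    public class FuzzyContains {

        // Classic dynamic-programming Levenshtein distance.
        static int levenshtein(String a, String b) {
            int[][] d = new int[a.length() + 1][b.length() + 1];
            for (int i = 0; i <= a.length(); i++) d[i][0] = i;
            for (int j = 0; j <= b.length(); j++) d[0][j] = j;
            for (int i = 1; i <= a.length(); i++) {
                for (int j = 1; j <= b.length(); j++) {
                    int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                    d[i][j] = Math.min(Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1), d[i - 1][j - 1] + cost);
                }
            }
            return d[a.length()][b.length()];
        }

        // Percentage of approximation with which 'container' fuzzily contains 'contained'.
        static double containsApproximation(String container, String contained) {
            int deletions = container.length() - contained.length();
            int residual = levenshtein(container, contained) - deletions;
            return 100.0 * residual / contained.length();
        }

        public static void main(String[] args) {
            String container = "street of the smoking dog";
            String contained = "the snoring dog";
            System.out.printf("\"%s\" contains \"%s\" with a %.1f %% approximation%n",
                    container, contained, containsApproximation(container, contained));
        }
    }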
Navigating indirect documents relationships with Watson Explorer, Content Analytics, UIMA and recursive SQL
In a project we use Watson Explorer and the Content Analytics Studio to create annotations such as Person, Location, Organisation, Account etc.
Watson Content Analytics is based on Apache UIMA components, in which annotation attributes like name, birth date, address, car plate, account number etc. are called features.
Let's suppose that you have a first document with a person name and a car plate, in a second document a car plate and another person name, and finally a third document with that other person name and a location.
There is an indirect relationship between the first document and the third document; however, using the usual queries over the indexed features you cannot find the first and third document because they have no value in common. In addition, as shown in the following picture, the addresses do not have an exact match.
Luckily, in the Content Analytics UIMA pipeline you can insert custom components written in Java that add this indirect relationship capability, which I describe further down in this entry.
Planning for the customization
To develop such a custom Java annotator you can install the UIMA Eclipse tooling via the Eclipse update site: http://www.apache.org/dist/uima/eclipse-update-site/. This Eclipse plugin adds the "Add UIMA nature" menu option to Java projects so that the UIMA descriptor can be modified with a specific UIMA editor.
The following picture shows a UIMA custom component and some of its properties, inserted after the parsing rules which, in this article's example, extract annotations such as name, location and organization.
Once the UIMA Java component is exported it can be added as a custom step in the pipeline displayed in Watson Content Analytics Studio, as shown in the following picture.
Designing the resolution of the problem
In Watson Explorer the documents are crawled from wherever the sources have been configured (files, web, etc.), the extracted text then feeds the annotation pipeline, and then indexing and the mapping of annotation features to facets occur. You can add custom application widgets to the Watson Explorer GUI to display additional aspects as needed, as described in the following link: Introduction to the IBM Watson Explorer Application Builder.
A custom application widget can display a tabular list of source documents and the documents they are related to, directly or indirectly, with links to these documents. It can include fields for further filtering.
But this widget requires the indirect relationship information, including fuzzy relations, to be computed before it can be displayed.
A rather easy way to navigate indirect relationships in any structured information is to use Recursive SQL (click to see Wikipedia article) where you look for a direct relationship and then recurse from the related documents to their own related documents.
In addition, similarity matching between texts is commonly achieved using a trigram algorithm. Luckily some databases such as PostgreSQL implement this algorithm and even provide the "%" boolean operator: in a where clause, text1 % text2 returns true if they are similar and false otherwise. Click to see the PostgreSQL trigram article, and click here to see a Java version of the algorithm; if a database lacks the trigram function, it can be added as a User Defined Function (UDF).
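For illustration, here is a simple Java trigram similarity sketch (not the code referenced by the link above) that could serve as the body of such a UDF; the padding convention and the 0.3 threshold mirror pg_trgm defaults and are assumptions for any other database:

    import java.util.HashSet;
    import java.util.Set;

    // Sketch of a trigram similarity measure (shared 3-grams over the union of 3-grams),
    // usable as the body of a database UDF when the pg_trgm "%" operator is not available.
    public class Trigram {

        static Set<String> trigrams(String text) {
            // Pad the lowercased text so that leading/trailing characters also form trigrams.
            String padded = "  " + text.toLowerCase() + " ";
            Set<String> grams = new HashSet<>();
            for (int i = 0; i + 3 <= padded.length(); i++) {
                grams.add(padded.substring(i, i + 3));
            }
            return grams;
        }

        // Similarity between 0.0 and 1.0; PostgreSQL considers strings similar above
        // a configurable threshold (0.3 by default).
        public static double similarity(String a, String b) {
            Set<String> ta = trigrams(a), tb = trigrams(b);
            Set<String> intersection = new HashSet<>(ta);
            intersection.retainAll(tb);
            Set<String> union = new HashSet<>(ta);
            union.addAll(tb);
            return union.isEmpty() ? 0.0 : (double) intersection.size() / union.size();
        }
    }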
Now we just have to implement a custom annotator that stores the documents, annotations and features in a database, and creates the appropriate views and queries serving as source for the Widget above.
Implementing the schema and the Java code
In the database schema we need a document table, with documents containing annotations in the annotation table, and each annotation containing features in the feature table.
All annotations have a type, e.g. Person, and a begin and end location giving the position of the text in the document. All features have a type, e.g. Name, and a text value.
The following picture contains the resulting schema.
The view LINKEDDOCS returns the pairs of documents that have a similar feature value in common for the same feature names and annotation types.
CREATE VIEW LINKEDDOCS (PARENT, CHILD, VALUE) AS
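-- The SELECT body of the view is not reproduced in this entry. One possible formulation is
-- sketched below: the join columns (ID, ANNOTATIONID, DOCID, NAME, VALUE, TYPE) are assumptions
-- about the schema pictured above, and "%" is the PostgreSQL pg_trgm similarity operator.
SELECT A1.DOCID, A2.DOCID, F1.VALUE
  FROM ANNOTATION A1
  JOIN FEATURE F1 ON F1.ANNOTATIONID = A1.ID
  JOIN FEATURE F2 ON F2.NAME = F1.NAME AND F1.VALUE % F2.VALUE
  JOIN ANNOTATION A2 ON F2.ANNOTATIONID = A2.ID AND A2.TYPE = A1.TYPE
 WHERE A1.DOCID <> A2.DOCID;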
The following SQL Query recurses the pairs of documents in the LINKEDDOCS view, limited to a depth of 10 to avoid infinite recursion.
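That query is not reproduced in this extract. One possible formulation, wrapped in a small JDBC program for illustration, is sketched below; only the PARENT and CHILD columns come from the view above, while the depth column, table casing and connection details are my assumptions:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;

    public class RelatedDocsQuery {
        // Sketch of a recursive query over the LINKEDDOCS view, limited to depth 10
        // to avoid infinite recursion when relationship cycles exist.
        private static final String RELATED_DOCS =
            "WITH RECURSIVE related(parent, child, depth) AS ("
            + "  SELECT parent, child, 1 FROM linkeddocs"
            + "  UNION ALL"
            + "  SELECT r.parent, l.child, r.depth + 1"
            + "  FROM related r JOIN linkeddocs l ON r.child = l.parent"
            + "  WHERE r.depth < 10"
            + ") SELECT DISTINCT parent, child, depth FROM related WHERE parent = ?";

        public static void main(String[] args) throws Exception {
            // Connection parameters are placeholders.
            try (Connection connection =
                     DriverManager.getConnection("jdbc:postgresql://localhost/wexdb", "user", "password");
                 PreparedStatement statement = connection.prepareStatement(RELATED_DOCS)) {
                statement.setString(1, args[0]); // the source document id
                try (ResultSet results = statement.executeQuery()) {
                    while (results.next()) {
                        System.out.println(results.getString("parent") + " -> "
                                + results.getString("child") + " (depth " + results.getInt("depth") + ")");
                    }
                }
            }
        }
    }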
Now we need to feed the database tables from the custom Java UIMA annotator that we add in the pipeline.
A Java UIMA annotator extends the provided UIMA base implementation, e.g. public class Annotation2SQL extends JCasAnnotator_ImplBase.
Two methods need to be implemented: "public void initialize(UimaContext aContext)" and "public void process(JCas aJCas)".
The initialize method is called when the annotator is loaded and reads all the required parameters, e.g. String dbName = (String) aContext.getConfigParameterValue("JDBC","DBName"); that method also initializes the JDBC session.
The process method is called for each document flowing into the pipeline, and a view of the document content with all previously created annotations is passed as the JCas parameter. There are several examples in the Apache UIMA tutorial or on the web. The first thing in the process method is to locate the appropriate view, as in Watson Content Analytics the name of the associated CAS view is "lrw-view".
This is done by iterating over the views with "Iterator<JCas> viewI = aJCas.getViewIterator();" until the "lrw-view" is found: if (lrw_view.getViewName().equals("lrw-view")) break;
We store the document information in the database using UIMA SourceDocumentInformation Java API (see link and source examples on the Web).
Then we need to iterate over the annotations and store them in the database using the following code.
Finally, for each annotation we iterate over the features, convert them to String and store them in the database.
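Those two snippets are not reproduced in this extract. A combined sketch of both steps is shown below; it assumes the JDBC connection opened in initialize() and hypothetical table and column names loosely matching the schema described earlier:

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import org.apache.uima.cas.FSIterator;
    import org.apache.uima.cas.Feature;
    import org.apache.uima.jcas.JCas;
    import org.apache.uima.jcas.tcas.Annotation;

    public class AnnotationStore {

        // Sketch of the two steps described above, called from the annotator's process() method.
        // 'connection' is the JDBC connection opened in initialize(); 'docId' identifies the
        // current document; SQL statements and column names are assumptions.
        void storeAnnotations(JCas lrwView, Connection connection, String docId) throws Exception {
            PreparedStatement insertAnnotation = connection.prepareStatement(
                    "INSERT INTO annotation (docid, type, begin_pos, end_pos) VALUES (?, ?, ?, ?)");
            PreparedStatement insertFeature = connection.prepareStatement(
                    "INSERT INTO feature (docid, annotation_type, name, value) VALUES (?, ?, ?, ?)");

            FSIterator<Annotation> annotations = lrwView.getAnnotationIndex().iterator();
            while (annotations.hasNext()) {
                Annotation annotation = annotations.next();
                // Store the annotation itself (type plus position in the document text).
                insertAnnotation.setString(1, docId);
                insertAnnotation.setString(2, annotation.getType().getName());
                insertAnnotation.setInt(3, annotation.getBegin());
                insertAnnotation.setInt(4, annotation.getEnd());
                insertAnnotation.executeUpdate();

                // Store every primitive-valued feature of the annotation as a string.
                for (Feature feature : annotation.getType().getFeatures()) {
                    if (!feature.getRange().isPrimitive()) continue; // skip structured features in this sketch
                    String value = annotation.getFeatureValueAsString(feature);
                    if (value == null) continue;
                    insertFeature.setString(1, docId);
                    insertFeature.setString(2, annotation.getType().getName());
                    insertFeature.setString(3, feature.getShortName());
                    insertFeature.setString(4, value);
                    insertFeature.executeUpdate();
                }
            }
            insertAnnotation.close();
            insertFeature.close();
        }
    }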
After the pipeline has processed the documents, all of the above tables contain the information and the recursive query returns the indirect relationships.