I am analyzing some documents related to healthcare.
An example of the text is: "rustig voorste oogsegment".
For some reason, the word "oogsegment" is not recognized as a noun. Instead, "oog" and "segment" are each recognized as separate nouns. When I search for "oogsegment", no documents are returned. In Content Analytics Studio, I can see that the words "oog" and "segment" are recognized, but the word "oogsegment" is not.
How can I solve this?
Nouns are not correctly recognized in Dutch
Re: Nouns are not correctly recognized in Dutch (2013-03-20T14:54:11Z)
This is due to the decomposition paradigm, which decomposes "oogsegment" into "oog" and "segment".
You can add this type of compound word to a custom dictionary and use it in your model if you need to.
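To make the effect concrete, here is a toy Python sketch of compound decomposition and of a custom dictionary that takes precedence over it. This is not the ICA API or its actual algorithm; the names `BASE_NOUNS`, `decompose`, and `analyze` are hypothetical and exist only to illustrate the idea, assuming a lexicon that knows "oog" and "segment" but not "oogsegment".

```python
# Toy illustration only: NOT the ICA API or its real decomposition algorithm.
# BASE_NOUNS is a hypothetical lexicon that lacks the compound "oogsegment".
BASE_NOUNS = {"oog", "segment"}


def decompose(word, lexicon):
    """Split word into a sequence of lexicon entries, or return None if impossible."""
    if not word:
        return []
    # Try the longest prefix first, then recurse on the remainder.
    for i in range(len(word), 0, -1):
        head = word[:i]
        if head in lexicon:
            rest = decompose(word[i:], lexicon)
            if rest is not None:
                return [head] + rest
    return None


def analyze(text, custom_dictionary=frozenset()):
    """Return index tokens; custom dictionary entries are kept whole."""
    tokens = []
    for word in text.lower().split():
        if word in custom_dictionary:
            # A custom entry wins, so the compound is indexed as one token.
            tokens.append(word)
            continue
        parts = decompose(word, BASE_NOUNS)
        # Words that cannot be decomposed are kept unchanged.
        tokens.extend(parts if parts else [word])
    return tokens


default_tokens = analyze("rustig voorste oogsegment")
custom_tokens = analyze("rustig voorste oogsegment", {"oogsegment"})
```

In this sketch, `default_tokens` contains "oog" and "segment" but not "oogsegment", which mirrors why a search for the full compound finds nothing, while `custom_tokens` keeps "oogsegment" as a single token.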
Re: Nouns are not correctly recognized in Dutch (2013-03-25T15:00:43Z)
Re: Nouns are not correctly recognized in Dutch (2013-03-26T09:59:35Z)
- SystemAdmin 110000D4XK
If you really want to turn off decomposition, be aware that this is advanced usage, and you need to contact IBM via the support channel from which you bought the ICA license.
It is possible to turn off decomposition, but please be aware that there are side effects, mainly on part-of-speech tagging precision, which may degrade.