Add Missing Fields Crawler Plugin
Date submitted: March 01, 2014 Version: 1.0
Language: English, Category: Tools
Cat No: P003-A, Type: Crawler Plugin
When you configure a Watson Content Analytics (WCA) crawler, one of your tasks is to map the native fields of the source to your collection's index fields. Depending on the source, some native fields may not exist for some of the crawled documents. When that happens, there is nothing to map to the WCA index field, so the index field is “missing” for that particular document. These “missing” fields can have unintended consequences for facet counting and for other WCA features that have been configured to depend on the field. You may have tried WCA's Field Filters feature to set the field to a default value when it is empty. But an empty field is a field that exists, and a missing field does not exist, so a Field Filter is not the solution. This crawler plugin is designed to solve this problem. You configure the plugin with a list of the fields you want to check in each crawled document. If a field does not exist, the plugin creates it with a default value you specify and adds it to the document.
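The check described above can be sketched in plain Java. This is only an illustration of the logic, not the plugin's source and not the WCA crawler plugin API: the class name AddMissingFieldsSketch, the method addMissingFields, and the use of a Map to stand in for a crawled document's fields are all assumptions made for the example. It shows the distinction that matters here: a field that exists with an empty value is left alone, while a field that does not exist is created with its configured default.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: a Map stands in for a crawled document's fields.
// None of these names come from the WCA crawler plugin API.
class AddMissingFieldsSketch {

    /**
     * Returns a copy of the document's fields in which every configured
     * field that is absent has been added with its default value.
     * Fields that already exist, even with an empty value, are unchanged.
     */
    static Map<String, String> addMissingFields(Map<String, String> documentFields,
                                                Map<String, String> defaults) {
        Map<String, String> result = new LinkedHashMap<>(documentFields);
        for (Map.Entry<String, String> entry : defaults.entrySet()) {
            // containsKey distinguishes "missing" from "present but empty".
            if (!result.containsKey(entry.getKey())) {
                result.put(entry.getKey(), entry.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> document = new LinkedHashMap<>();
        document.put("title", "Quarterly report");
        document.put("author", "");            // exists but is empty

        // Assumed configuration: field name mapped to its default value.
        Map<String, String> defaults = new LinkedHashMap<>();
        defaults.put("author", "unknown");
        defaults.put("department", "unknown"); // not present in the document

        System.out.println(addMissingFields(document, defaults));
    }
}
```

In this sketch the empty author field keeps its empty value, which is the case a Field Filter would handle, and only department is added. In a real plugin the same check would run against the document's metadata as the crawler plugin interface exposes it.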