You can export a UIMA pipeline for domain adaptive search
to generate queries according to query context and domain knowledge.
Based on rules that you specify in the UIMA pipeline, the Watson Explorer Content Analytics search processes can
alter the original query terms, generate suggested queries, and
group results.
Before you begin
Before you can export a UIMA pipeline for domain adaptive
search to Watson Explorer Content Analytics, you
must configure a Watson Explorer Content Analytics server
connection file.
About this task
After you develop and export your UIMA pipeline for
domain adaptive search, you can apply it to a result group when
you configure search quality management options for an enterprise
search collection in the Watson Explorer Content Analytics administration
console. At runtime, if an annotation that is created by the pipeline
is detected, any domain adaptive search rules that are defined
for the annotation are applied. For example, you might specify
domain adaptive search rules so that a query for IBM® runs a group query for “Big
Blue” OR “International Business Machines” in addition to the
original query, and that the results from the group query take
priority over the results from the original query.
You can configure dynamic rules by using feature values as the query to
run or suggest. For example, you might want to run the query IBM
OR AIX if the original query is IBM,
and run the query Microsoft OR Windows if
the original query is Microsoft. To do so,
you create a dictionary with the entries (surface:IBM,
OS:AIX) and (surface:Microsoft, OS:Windows). You then create a parsing
rule that, when the text contains a surface string such as IBM,
concatenates the covered text and the OS value of the dictionary
entry with OR and saves the result as a feature.
When you configure the domain adaptive search parameters for
the corresponding annotation, you select this feature as the
search query to run.
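The dynamic rule in this example can be sketched in Python as follows.
This is an illustration only and not the Content Analytics Studio
dictionary or parsing rule format. The OS_DICTIONARY mapping and the
build_dynamic_query function are hypothetical names, and simple
whitespace tokenization stands in for the surface-string matching that
the annotator performs.

```python
from typing import Optional

# Hypothetical dictionary: surface string -> value of the OS feature.
OS_DICTIONARY = {"IBM": "AIX", "Microsoft": "Windows"}


def build_dynamic_query(original_query: str) -> Optional[str]:
    """Return '<covered text> OR <OS value>' for the first token that
    matches a dictionary surface string, or None if nothing matches."""
    for token in original_query.split():
        os_value = OS_DICTIONARY.get(token)
        if os_value is not None:
            # Concatenate the covered text and the OS value with OR,
            # as the parsing rule in the example does.
            return f"{token} OR {os_value}"
    return None


print(build_dynamic_query("IBM"))        # IBM OR AIX
print(build_dynamic_query("Microsoft"))  # Microsoft OR Windows
```

In the product, the concatenated value is stored as a feature of the
annotation, and that feature is then selected as the search query to run.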
Tip: Because UIMA pipelines
that you use for domain adaptive search run every time that
a query is submitted, search runtime performance can degrade
if the annotator takes a long time to process the query text.
Before you export your pipeline to Watson Explorer Content Analytics, verify the performance
of the pipeline in Content Analytics Studio.
You can specify rules to display a suggested query instead of
or in addition to running a modified search query. For example,
for an original query cut wood that has
multiple contexts, you might configure the system to run queries for jigsaw and
chisel, and display the query suggestions how
to create a dog house and buy firewood.
Procedure
To export a UIMA pipeline for domain adaptive search
to Watson Explorer Content Analytics:
- From the Configuration/Annotators directory
of your project, right-click your ANNOCONFIG pipeline
configuration file, click Export, and
click .
- Specify a name and temporary location for the PEAR file
that is created on the file system before it is uploaded to
the Watson Explorer Content Analytics server. By default, the PEAR file is exported to the Content Analytics Studio workspace directory.
- Select the Watson Explorer Content Analytics connection
file that defines the server to which you want to export the
pipeline.
- Configure the domain adaptive search parameters for one
or more UIMA types that are generated by the UIMA pipeline. Click Add to select an annotation type,
and then specify the parameters for that type. You can specify
a literal string or select a UIMA feature value for each parameter,
except for the group ID, which must be a string that does not contain
any special characters. For each annotation type, you must set
a group ID and at least one of a search query or a suggestion query.
Tip: The UIMA pipeline can contain one or more result groups,
and each group can contain one or more UIMA annotation types.
If you want to include multiple annotations in the same result
group, ensure that you specify the same group ID for each annotation
type. Otherwise, different groups are created.
- Search query parameters
When you specify a search query to run if the annotation
is found in the original query text, you can also specify
a description for the result group. When the group results are
returned in the user application, a More results
from the same group link is displayed, and the specified
description of the result group appears when you hover
over the link.
- Suggestion query parameters
When you specify a query to suggest if the annotation
is found in the original query text, you can also specify
the label to display for the suggestion. For example, for the
original query dog, you specify pets as
the suggestion label and cats OR fish as
the suggestion query. The label text is displayed as a suggestion
under the query entry field in the search application. If you
click the link for that suggestion label, the suggestion query
text is displayed in the query entry field and the query
is run.
For query suggestions, you can also specify
values that can be returned in a REST API response to custom
applications. You can specify different suggestion types such as
Suggestion A and Suggestion B to distinguish between suggestions
and select which type of suggestion is used in the custom
application. You can also specify the origin of the suggestion
to indicate what term in the original query produced a specific suggestion.
- Priority parameters
You can specify the priority of a query within its
group, and the priority of this group in relation to other
groups. Queries and groups with higher priorities are processed
earlier and their results are returned first. If multiple groups
have the same priority, the group ID is used as the second
sort key. For example, if the ID of group 1 is ab and
the ID of group 2 is aa and the priority
of both groups is 1, then group 2 is processed
first. The group priority value can be from -1000 to 1000.
The group priority that you set for an annotation is a dynamic
priority and overrides the priority that is set in the Watson Explorer Content Analytics administration console.
If you do not set a dynamic group priority, the static priority
that is set in the administration console is used.
By
default, the priority of the original query is 0.
If you want group query results to be returned before results
from the original query, you must set a group priority that
is higher than 0.
- Other parameters
For search and suggestion queries, you can specify
properties to configure the queries. For search queries,
you can specify ExactHighlighting or
GreedyHighlighting to configure how much
text is highlighted in the results. You can specify EnableStopword or
DisableStopWord to configure whether
stop words are removed from the queries. To specify multiple
properties, separate them with a space character. For suggestion
queries, you can specify the property DisableResultGroup
to disable result group options for the suggestion query.
You can also specify whether queries from the same UIMA
type are merged to a single query, and whether the UIMA type
is enabled only when the original query is a plain text query
that does not contain any special characters.
- Specify a display name to use for the text analysis engine
in the Watson Explorer Content Analytics administration
console.
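The ordering that the priority parameters describe can be sketched in
Python as follows. This is a minimal illustration under stated
assumptions, not the product's implementation. The ResultGroup class and
the function names are hypothetical, and the ascending order of the
group ID tie-breaker is inferred from the aa and ab example.

```python
from dataclasses import dataclass
from typing import List

# Default priority of the original query, as stated in the text.
ORIGINAL_QUERY_PRIORITY = 0


@dataclass(frozen=True)
class ResultGroup:
    """Hypothetical record for a result group and its group priority."""
    group_id: str
    priority: int

    def __post_init__(self) -> None:
        # The group priority value can be from -1000 to 1000.
        if not -1000 <= self.priority <= 1000:
            raise ValueError(
                f"group priority {self.priority} is outside -1000..1000"
            )


def processing_order(groups: List[ResultGroup]) -> List[ResultGroup]:
    """Higher priority first; ties broken by group ID
    (ascending order is an assumption based on the example)."""
    return sorted(groups, key=lambda g: (-g.priority, g.group_id))


def returned_before_original(group: ResultGroup) -> bool:
    """A group precedes the original query only if its priority
    is higher than the original query's default of 0."""
    return group.priority > ORIGINAL_QUERY_PRIORITY


groups = [ResultGroup("ab", 1), ResultGroup("aa", 1), ResultGroup("zz", 5)]
print([g.group_id for g in processing_order(groups)])  # ['zz', 'aa', 'ab']
```

With the two groups from the example (ab and aa, both with priority 1),
the sketch places aa before ab, and both are returned before the
original query because their priority is higher than 0.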
What to do next
After the PEAR file is installed in Watson Explorer Content Analytics, you can apply the
domain adaptive search annotator to a result group on the Search
Quality Management page of the Watson Explorer Content Analytics administration console.
If you want to reinstall the pipeline after you modify the linguistic
resources in Content Analytics Studio, you
must specify a different name in the Text Analysis
Engine Name field when you install the updated pipeline.
If you want to use the same name when you install the updated
pipeline, you must first manually disassociate the existing version
of the text analysis engine from the Watson Explorer Content Analytics collections and delete
that version of the text analysis engine from Watson Explorer Content Analytics.