Defining custom stop words
User-defined stop word lists typically include common filler
terms that are deemed irrelevant to a query result, for example, "the" or "an". A stop word list can also be
used to remove specific names, for example, product names like WebSphere Application Server, from the query. Multiple-word
terms are correctly identified in user queries and do not have to
appear between quotation marks.
You do not need to enumerate normalizations of the term,
such as the removal of accents or umlauts, because normalization is
handled automatically. For example, if you want to include the term météo as a stop word, you do not need to include the term METEO, too.
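The effect of this normalization can be illustrated with a minimal Python sketch. It is not the product's own normalization code, whose exact rules are not described here; the function name normalize_term is hypothetical, and the sketch assumes that normalization amounts to removing combining accent marks and ignoring case.

```python
import unicodedata


def normalize_term(term: str) -> str:
    """Illustrative only: strip combining accent marks and fold case."""
    # Decompose accented characters into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", term)
    # Drop the combining marks (accents, umlauts) and keep the base letters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Fold case so that upper- and lowercase variants compare equal.
    return stripped.casefold()


# Both spellings reduce to the same normalized form, which is why
# listing only one of them as a stop word is sufficient.
print(normalize_term("météo") == normalize_term("METEO"))
```

Under these assumptions, both météo and METEO reduce to the same form, so a single entry covers both.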
A stop word can include white space characters, but it cannot include punctuation characters, such as a comma or vertical bar.
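A simple pre-check can catch invalid entries before they are added to a stop word file. The following Python sketch is an assumption about how such a check might look, not part of the product. The function name is_valid_stop_word is hypothetical, and because the exact set of disallowed characters is not listed here, the sketch rejects every ASCII punctuation character and every character in a Unicode punctuation category, which may be stricter than the product requires.

```python
import string
import unicodedata


def is_valid_stop_word(term: str) -> bool:
    """Illustrative check: white space is allowed, punctuation is not."""
    if not term.strip():
        # An empty or white-space-only entry is not a usable stop word.
        return False
    for ch in term:
        # string.punctuation covers ASCII characters such as ',' and '|'.
        if ch in string.punctuation:
            return False
        # Unicode categories starting with "P" are punctuation characters.
        if unicodedata.category(ch).startswith("P"):
            return False
    return True


print(is_valid_stop_word("WebSphere Application Server"))
print(is_valid_stop_word("WebSphere, Application Server"))
print(is_valid_stop_word("WebSphere|Server"))
```

With this check, a multiple-word term that contains only letters and spaces passes, while entries that contain a comma or a vertical bar are rejected.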
Stop words are specified in XML files, with one XML file per language. Stop word files are located in the <ECMTS_HOME>\resource\uima directory and have the following naming format: language_code-Stw.xml. For example, the French stop word file is called fr-Stw.xml.
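The naming convention can be expressed as a short Python sketch that builds the expected file location for a given language code. The function name stop_word_file is hypothetical, and the installation directory C:\ecmts used in the example is only a stand-in for the actual value of <ECMTS_HOME> on your system.

```python
from pathlib import PureWindowsPath


def stop_word_file(ecmts_home: str, language_code: str) -> PureWindowsPath:
    """Build the stop word file path: <ECMTS_HOME>\\resource\\uima\\<language_code>-Stw.xml."""
    return PureWindowsPath(ecmts_home) / "resource" / "uima" / f"{language_code}-Stw.xml"


# "C:\ecmts" is an assumed installation directory, used for illustration only.
print(stop_word_file(r"C:\ecmts", "fr"))
```

PureWindowsPath is used so that the backslash-separated path is produced regardless of the operating system on which the sketch runs.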