Db2 Text Search locale and language

The locale that you specify can also affect the performance of a text search query.

Locale specification

When you search a text search index in a multilingual environment, it is recommended that you always use the QUERYLANGUAGE option in your search query to specify which locale (a combination of language and territory information) is used to interpret a search term. For example, to treat the search term bald as an English word, set QUERYLANGUAGE=en_US in the search query. To treat it as a German word instead, set QUERYLANGUAGE=de_DE. Note, however, that the results returned depend heavily on the LANGUAGE used for indexing, regardless of the QUERYLANGUAGE specified in the query.
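As an illustration, the following Python sketch builds the text of a search query that states its locale explicitly. It assumes the three-argument form of the CONTAINS function, in which the third argument carries the search options. The table BOOKS, the columns TITLE and ABSTRACT, and the helper name contains_predicate are hypothetical and are used only for this example.

```python
def contains_predicate(column: str, term: str, query_language: str) -> str:
    """Build a CONTAINS predicate that states the query locale explicitly."""
    # Double any single quotes so the term stays inside its SQL string literal.
    escaped = term.replace("'", "''")
    return f"CONTAINS({column}, '{escaped}', 'QUERYLANGUAGE={query_language}') = 1"


# BOOKS, TITLE, and ABSTRACT are hypothetical names used only for illustration.
select_prefix = "SELECT TITLE FROM BOOKS WHERE "

# Same search term, interpreted first as an English word, then as a German word.
english_query = select_prefix + contains_predicate("ABSTRACT", "bald", "en_US")
german_query = select_prefix + contains_predicate("ABSTRACT", "bald", "de_DE")

print(english_query)
print(german_query)
```

The two generated statements differ only in the QUERYLANGUAGE value, which is the part that controls how the term bald is interpreted.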

If the QUERYLANGUAGE is not specified in the search query, then the following logic is used:
  • The search term is interpreted using the locale that was set for the underlying text index during index creation.
  • If the locale set for the index during index creation is AUTO, the query locale defaults to English (en_US), and the search term is treated as an English word.

Restrictions:
  • If the locale specified in a search query is invalid (for example, QUERYLANGUAGE=Mongolian), the query is considered invalid and an exception is thrown.
  • Setting QUERYLANGUAGE=AUTO in a search query is not supported, and the results of such a query are undefined.
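
The fallback logic and restrictions above can be modeled in a few lines of Python. This is a minimal sketch of the described behavior, not the product's implementation. The function name effective_query_locale and the KNOWN_LOCALES set are hypothetical, and the set is only a small illustrative subset of locales. Because QUERYLANGUAGE=AUTO has undefined results, the sketch chooses to reject it outright.

```python
from typing import Optional

# Hypothetical subset of valid locales, for illustration only.
KNOWN_LOCALES = {"en_US", "de_DE", "fr_FR", "ja_JP"}
DEFAULT_LOCALE = "en_US"


def effective_query_locale(query_language: Optional[str], index_language: str) -> str:
    """Return the locale used to interpret a search term, per the rules above."""
    if query_language is None:
        # No QUERYLANGUAGE given: use the index locale, where AUTO means en_US.
        return DEFAULT_LOCALE if index_language == "AUTO" else index_language
    if query_language == "AUTO":
        # Unsupported in a query, with undefined results; this sketch rejects it.
        raise ValueError("QUERYLANGUAGE=AUTO is not supported in a search query")
    if query_language not in KNOWN_LOCALES:
        # An invalid locale makes the whole query invalid.
        raise ValueError(f"invalid locale: {query_language}")
    return query_language


print(effective_query_locale(None, "de_DE"))
print(effective_query_locale(None, "AUTO"))
print(effective_query_locale("de_DE", "en_US"))
```

A check like this on the client side can catch an invalid or unsupported locale before the query is sent, rather than relying on the exception raised by the search itself.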

Note that the locale specified by QUERYLANGUAGE has no effect on the locale of error messages resulting from search queries. The error-message locale depends on whether you started the text search instance services. If you did not start them, messages are written in en_US; if you did start them, messages are written in the locale of the environment in which you issued the START FOR TEXT command.