DB2 Net Search Extender (here after referred to as 'NSE') and DB2 Text Search allow applications to use full text search functionality with a DB2 database. NSE is the predecessor to DB2 Text Search, a recent addition to DB2. With many new features and enhancements, DB2 Text Search helps you to meet the challenges of today's competitive business environment.
Here are the key features and improvements available in DB2 Text Search:
- Stored procedure interface for index administration
- Integrated installation and configuration performed by the DB2 installer
- Extensive XML text searching capabilities including XQuery and XPath
- Support for linguistic processing
To take advantage of the new text search implementation, existing DB2 NSE customers should make plans to migrate their text indexes and query interfaces from DB2 NSE to DB2 Text Search.
- NSE and Text Search indexes differ in format. Thus it is very important to re-create the indexes in the background using the DB2 Text Search format, while NSE is still running in the production system.
- Due to syntactic and semantic differences between the two, applications using NSE will need to adapt to the new DB2 Text Search command interfaces. This requires the application layer command interfaces to be re-written to accommodate the new command interfaces provided by DB2 Text Search.
- Although NSE and Text Search are similar in many respects, there may be differences in the query results. See the Resources section for a link to the Information Center where this is discussed.
- Ensure that the system has sufficient storage capacity to store both the NSE and Text Search indexes simultaneously during the period when the Text Search index is created in the background.
- For all releases prior to V97 FP3, the database can have either NSE or Text Search enabled at one time. Therefore to migrate to Text Search on any of the earlier releases, you will need to drop the NSE index, disable NSE, and then enable Text Search and create Text Search indexes. To avoid production system down time, you should upgrade your NSE setup to V 9.7 FP3 before migrating to DB2 Text Search.
- Since the DB2 Text Search index is re-created from a DB2 table, it is important that all documents to be indexed be available in the table for re-indexing with DB2 Text Search.
As discussed above, migration from DB2 NSE to DB2 Text Search requires a certain level of manual intervention since it involves index re-creation and changes to the full text search command interface. There is no automated migration tool available for migrating NSE indexes to Text Search indexes or for changing the applications using NSE to Text Search.
The following sections discuss the different steps required for migrating existing NSE indexes and NSE applications to DB2 Text Search.
Migration of full text indexes
Since V97 FP3, the co-existence of NSE and Text Search indexes on the same column has been enabled. The co-existence feature allows a database to be enabled for both NSE and Text Search.
The below steps are for migrating the existing NSE test indexes to Text Search (V97 FP3 onwards) from a database that has a working NSE index:
- Install, configure and start DB2 Text Search server.
After installing DB2 Text Search, update the instance for text search.
db2iupdt <instanceName> -j "TEXT_SEARCH"
Then log in as the instance owner and configure text search.
configTool generateToken -configPath <configPath> -seed <myInstanceName>
Run the DB2 install and select TEXT SEARCH as feature to install from the SERVER SUPPORT tab.
Configure Text Search using
configTool generateToken -configPath <configPath> -seed <myInstanceName>
- Enable the database for text search using db2ts enable command:
db2ts ENABLE DATABASE FOR TEXT
- Create a text search index:
CREATE INDEXcommand creates the text index for a text column with the attributes as specified with the various arguments specified in the command. The
CREATE INDEXcommand is used to define the index characteristics such as codepage and storage options. The index will contain data only after execution of
UPDATE INDEX. The Text Search index update operation can be scheduled as in NSE, and the update batch size can be controlled. The indexes can be updated by explicitly executing the
UPDATE INDEXcommand or issued by the text server implicitly, using the update frequency specified. The update frequency can be specified with the
UPDATE FREQUENCYoption with the create index command or added later using the
db2ts "create index <indexName> for text on <table>(<column>)"
Text Search index administration commands such as
UPDATE INDEXare independent of any NSE index on the same column that is indexed and can be done when the NSE index is active.
If there exists an NSE index on the same column then the Text Search index by default will be created in inactive state. After migration, for enabling the searches using the Text Search index instead of NSE index then we need to use
ALTER INDEXand set index as active. Refer the below steps for further details.
- Update the Text Search index.
The update index command populates the indexes for Text Search from the column from the base table. The
USING UPDATE MINIMUMoption can be used to check for the minimum number of changes specified in the create index command to start an update index. By default, the update index starts unconditionally.
db2ts update index <indexName> for text
Once the index update is successful, the Text Search index is populated. The next step is to set the Text Search index to active,
such that it is searchable. In a case where both NSE and Text Search index is created on the same column, only one index is active
at a time and the other is set to inactive state. Since the NSE index was active, the newly created Text Search index will be in
inactive state and any searches on the column would use the NSE index and not the Text Search index.The active index is selected
based on the values for ACTIVEFLAG in SYSIBMTS.SYSTSCOLUMNS and DB2EXT.TTEXTCOLUMNS. The ACTIVEFLAG has the following states:
- 0 – INACTIVE
- 1 – ACTIVE
- 2 – INVALID
db2ts alter indexcommand with SET ACTIVE option.
db2ts "ALTER INDEX <indexName> FOR TEXT SET ACTIVE"
After completing the migration steps listed above, verify the new Text Search index by using searches. If the Text Search index is
successfully, back up and delete the NSE index. If there are issues with the Text Search index, the NSE index can be reactivated
ALTER INDEXcommand and used for searches.
The existing applications on NSE will need to be modified to accommodate the new syntax of the Text Search queries. Also, in case of co-existence, take care to ensure that the correct index (NSE or Text Search) is used by the application. The index used is not destermined based on the syntax used in the query, but is based on the ACTIVEFLAG set for the index on the respective column. Therefore, the query could fail if there is a mismatch between the syntax type of the query and the type of index that is currently active. Also, for similar queries, the result set may differ between NSE and Text Search. The application may need changes to cope with this change, or the query may need changes to get a similar result set in Text Search as in NSE.
The following examples compare various types of NSE queries and the corresponding Text Search queries:
Text search uses || and && operators as opposed to | and & in NSE.
SELECT AUTHOR, TITLE FROM DB2EXT.TEXTTAB WHERE CONTAINS(COMMENT, '"author" | "pulitzer" & "book"') = 1
SELECT AUTHOR, TITLE FROM DB2EXT.TEXTTAB WHERE CONTAINS(COMMENT, 'author || pulitzer && book') = 1
The query below will search for all words that start with “thr” and end with “er” with any number of characters between them.
SELECT AUTHOR,TITLE FROM DB2EXT.TEXTTAB WHERE CONTAINS(COMMENT, '"thr%er"') = 1
SELECT AUTHOR,TITLE FROM DB2EXT.TEXTTAB WHERE CONTAINS(COMMENT, 'thr*er') = 1
The query below will return any word that starts with “th” and ends with “iller” with a single character between them.
SELECT AUTHOR, TITLE FROM DB2EXT.TEXTTAB WHERE CONTAINS(COMMENT, '"th_iller"') = 1
SELECT AUTHOR, TITLE FROM DB2EXT.TEXTTAB WHERE CONTAINS(COMMENT, 'th?iller') = 1
NSE has fuzzy search feature which can be achieved in Text Search by using the '~' operator. To specify the degree explicitly
'~' followed by the number can be used.
SELECT AUTHOR, TITLE FROM DB2EXT.TEXTTAB WHERE CONTAINS(COMMENT, 'fuzzy form of 80 "pullitzer"') =1
SELECT AUTHOR, TITLE FROM DB2EXT.TEXTTAB WHERE CONTAINS(COMMENT, 'pullitzer~0.8') =1
NSE searches need to specify stemmed form explicitly whereas the default for Text Search is stemmed (search term in single quotes).
Double quotes can be used in Text Search for phrase search.
SELECT AUTHOR, TITLE FROM DB2EXT.TEXTTAB WHERE CONTAINS(COMMENT, 'stemmed form of "shock"') =1
SELECT AUTHOR, TITLE FROM DB2EXT.TEXTTAB WHERE CONTAINS(COMMENT, 'shock') =1
NSE specifies the escape character with the 'ESCAPE' keyword, where as in text Search '\' is the escape character.
SELECT AUTHOR, TITLE FROM DB2EXT.TEXTTAB WHERE CONTAINS(COMMENT, '"100!%" ESCAPE "!"') = 1
SELECT AUTHOR, TITLE FROM DB2EXT.TEXTTAB WHERE CONTAINS(COMMENT, '100\%') = 1
In Text Search,the SYNONYM keyword can be set to use thesaurus/synonym dictionary associated with the text search index.
SELECT CATEGORY, DATE FROM DB2EXT.HTMLTAB WHERE CONTAINS(HTMLFILE, 'THESAURUS "nsesamplethes" EXPAND SYNONYM TERM OF "product"') = 1
SELECT CATEGORY, DATE FROM DB2EXT.HTMLTAB WHERE CONTAINS(HTMLFILE, 'product', 'SYNONYM=ON') = 1
To use the features of Text Search effectively, you should also understand the various factors that affect Text Search performance and resource usage. The importance of planning the migration should not be neglected. For more details on migrating to DB2 Text Search, follow the links in the Resources section.
- Refer to the Information Center section on Text Search solution planning to gain more of an understanding of key concepts and to leverage benefits of Text Search.
- Examine factors influencing performance and resource use in the DB2 Text Search capacity planning section of the Information Center.
- Get started with DB2 Text Search using the tutorial " DB2 Text Search, Part 1: Full text search" (developerWorks, Apr 2010).
- Refer to the DB2 Text Search and Net Search Extender comparison to get more details on differences that may affect query results.
- Visit the developerWorks resource page for DB2 for Linux, UNIX, and Windows to read articles and tutorials and connect to other resources to expand your DB2 skills.
Get products and technologies
- Now you can use DB2 for free. Download DB2 Express-C a no-charge version of DB2 Express Edition for the community that offers the same core data features as DB2 Express Edition and provides a solid base to build and deploy applications.