Migrate DB2 Net Search Extender indexes to DB2 Text Search

In order to leverage the latest capabilities in IBM® DB2® for Linux®, UNIX®, and Windows® for full text search, users of DB2 Net Search Extended should migrate to DB2 Text Search. This article explains how to migrate from Net Search Extender to DB2 Text Search indexes, and what to consider when planning your migrarion.

Kavya Rao (kavyarao@in.ibm.com), DB2 Text Search Support, IBM

Kavya Rao is a software engineer with IBM India Software Labs. She is a certified DB2 administrator and a support specialist with several years' experience of full-text search in DB2, including DB2 Net Search Extender and DB2 Text Search.



08 March 2012

Overview

DB2 Net Search Extender (here after referred to as 'NSE') and DB2 Text Search allow applications to use full text search functionality with a DB2 database. NSE is the predecessor to DB2 Text Search, a recent addition to DB2. With many new features and enhancements, DB2 Text Search helps you to meet the challenges of today's competitive business environment.

Here are the key features and improvements available in DB2 Text Search:

  • Stored procedure interface for index administration
  • Integrated installation and configuration performed by the DB2 installer
  • Extensive XML text searching capabilities including XQuery and XPath
  • Support for linguistic processing

To take advantage of the new text search implementation, existing DB2 NSE customers should make plans to migrate their text indexes and query interfaces from DB2 NSE to DB2 Text Search.

Pre-migration considerations

  • NSE and Text Search indexes differ in format. Thus it is very important to re-create the indexes in the background using the DB2 Text Search format, while NSE is still running in the production system.
  • Due to syntactic and semantic differences between the two, applications using NSE will need to adapt to the new DB2 Text Search command interfaces. This requires the application layer command interfaces to be re-written to accommodate the new command interfaces provided by DB2 Text Search.
  • Although NSE and Text Search are similar in many respects, there may be differences in the query results. See the Resources section for a link to the Information Center where this is discussed.
  • Ensure that the system has sufficient storage capacity to store both the NSE and Text Search indexes simultaneously during the period when the Text Search index is created in the background.
  • For all releases prior to V97 FP3, the database can have either NSE or Text Search enabled at one time. Therefore to migrate to Text Search on any of the earlier releases, you will need to drop the NSE index, disable NSE, and then enable Text Search and create Text Search indexes. To avoid production system down time, you should upgrade your NSE setup to V 9.7 FP3 before migrating to DB2 Text Search.
  • Since the DB2 Text Search index is re-created from a DB2 table, it is important that all documents to be indexed be available in the table for re-indexing with DB2 Text Search.

Migration

As discussed above, migration from DB2 NSE to DB2 Text Search requires a certain level of manual intervention since it involves index re-creation and changes to the full text search command interface. There is no automated migration tool available for migrating NSE indexes to Text Search indexes or for changing the applications using NSE to Text Search.

The following sections discuss the different steps required for migrating existing NSE indexes and NSE applications to DB2 Text Search.


Migration of full text indexes

Since V97 FP3, the co-existence of NSE and Text Search indexes on the same column has been enabled. The co-existence feature allows a database to be enabled for both NSE and Text Search.

The below steps are for migrating the existing NSE test indexes to Text Search (V97 FP3 onwards) from a database that has a working NSE index:

  1. Install, configure and start DB2 Text Search server.
    • UNIX:
      After installing DB2 Text Search, update the instance for text search.
      Example: db2iupdt <instanceName> -j "TEXT_SEARCH"
      Then log in as the instance owner and configure text search.
      Example: configTool generateToken -configPath <configPath> -seed <myInstanceName>
    • Windows:
      Run the DB2 install and select TEXT SEARCH as feature to install from the SERVER SUPPORT tab.
      Configure Text Search using
      configTool generateToken -configPath <configPath> -seed <myInstanceName>
  2. Enable the database for text search using db2ts enable command:
    Example: db2ts ENABLE DATABASE FOR TEXT
  3. Create a text search index:
    The CREATE INDEX command creates the text index for a text column with the attributes as specified with the various arguments specified in the command. The CREATE INDEX command is used to define the index characteristics such as codepage and storage options. The index will contain data only after execution of UPDATE INDEX. The Text Search index update operation can be scheduled as in NSE, and the update batch size can be controlled. The indexes can be updated by explicitly executing the UPDATE INDEX command or issued by the text server implicitly, using the update frequency specified. The update frequency can be specified with the UPDATE FREQUENCY option with the create index command or added later using the ALTER INDEX command.

    Example: db2ts "create index <indexName> for text on <table>(<column>)"

    Text Search index administration commands such as CREATE INDEX and UPDATE INDEX are independent of any NSE index on the same column that is indexed and can be done when the NSE index is active.
    If there exists an NSE index on the same column then the Text Search index by default will be created in inactive state. After migration, for enabling the searches using the Text Search index instead of NSE index then we need to use ALTER INDEX and set index as active. Refer the below steps for further details.
  4. Update the Text Search index.
    The update index command populates the indexes for Text Search from the column from the base table. The USING UPDATE MINIMUM option can be used to check for the minimum number of changes specified in the create index command to start an update index. By default, the update index starts unconditionally.
    Example: db2ts update index <indexName> for text
  5. Once the index update is successful, the Text Search index is populated. The next step is to set the Text Search index to active, such that it is searchable. In a case where both NSE and Text Search index is created on the same column, only one index is active at a time and the other is set to inactive state. Since the NSE index was active, the newly created Text Search index will be in inactive state and any searches on the column would use the NSE index and not the Text Search index.The active index is selected based on the values for ACTIVEFLAG in SYSIBMTS.SYSTSCOLUMNS and DB2EXT.TTEXTCOLUMNS. The ACTIVEFLAG has the following states:
    • 0 – INACTIVE
    • 1 – ACTIVE
    • 2 – INVALID
    All administration commands can be executed on an active or inactive index, but active indexes qualify for searches in both NSE and Text Search. To set an index to active use the db2ts alter index command with SET ACTIVE option.
    Example : db2ts "ALTER INDEX <indexName> FOR TEXT SET ACTIVE"
  6. After completing the migration steps listed above, verify the new Text Search index by using searches. If the Text Search index is working successfully, back up and delete the NSE index. If there are issues with the Text Search index, the NSE index can be reactivated using the ALTER INDEX command and used for searches.

Application migration

The existing applications on NSE will need to be modified to accommodate the new syntax of the Text Search queries. Also, in case of co-existence, take care to ensure that the correct index (NSE or Text Search) is used by the application. The index used is not destermined based on the syntax used in the query, but is based on the ACTIVEFLAG set for the index on the respective column. Therefore, the query could fail if there is a mismatch between the syntax type of the query and the type of index that is currently active. Also, for similar queries, the result set may differ between NSE and Text Search. The application may need changes to cope with this change, or the query may need changes to get a similar result set in Text Search as in NSE.

The following examples compare various types of NSE queries and the corresponding Text Search queries:

  • Text search uses || and && operators as opposed to | and & in NSE.
    NSE: SELECT AUTHOR, TITLE FROM DB2EXT.TEXTTAB WHERE CONTAINS(COMMENT, '"author" | "pulitzer" & "book"') = 1
    Text Search: SELECT AUTHOR, TITLE FROM DB2EXT.TEXTTAB WHERE CONTAINS(COMMENT, 'author || pulitzer && book') = 1
  • The query below will search for all words that start with “thr” and end with “er” with any number of characters between them.
    NSE: SELECT AUTHOR,TITLE FROM DB2EXT.TEXTTAB WHERE CONTAINS(COMMENT, '"thr%er"') = 1
    Text Search: SELECT AUTHOR,TITLE FROM DB2EXT.TEXTTAB WHERE CONTAINS(COMMENT, 'thr*er') = 1
  • The query below will return any word that starts with “th” and ends with “iller” with a single character between them.
    NSE: SELECT AUTHOR, TITLE FROM DB2EXT.TEXTTAB WHERE CONTAINS(COMMENT, '"th_iller"') = 1
    Text Search: SELECT AUTHOR, TITLE FROM DB2EXT.TEXTTAB WHERE CONTAINS(COMMENT, 'th?iller') = 1
  • NSE has fuzzy search feature which can be achieved in Text Search by using the '~' operator. To specify the degree explicitly '~' followed by the number can be used.
    NSE: SELECT AUTHOR, TITLE FROM DB2EXT.TEXTTAB WHERE CONTAINS(COMMENT, 'fuzzy form of 80 "pullitzer"') =1
    Text Search: SELECT AUTHOR, TITLE FROM DB2EXT.TEXTTAB WHERE CONTAINS(COMMENT, 'pullitzer~0.8') =1
  • NSE searches need to specify stemmed form explicitly whereas the default for Text Search is stemmed (search term in single quotes). Double quotes can be used in Text Search for phrase search.
    NSE: SELECT AUTHOR, TITLE FROM DB2EXT.TEXTTAB WHERE CONTAINS(COMMENT, 'stemmed form of "shock"') =1
    Text Search: SELECT AUTHOR, TITLE FROM DB2EXT.TEXTTAB WHERE CONTAINS(COMMENT, 'shock') =1
  • NSE specifies the escape character with the 'ESCAPE' keyword, where as in text Search '\' is the escape character.
    NSE: SELECT AUTHOR, TITLE FROM DB2EXT.TEXTTAB WHERE CONTAINS(COMMENT, '"100!%" ESCAPE "!"') = 1
    Text Search: SELECT AUTHOR, TITLE FROM DB2EXT.TEXTTAB WHERE CONTAINS(COMMENT, '100\%') = 1
  • In Text Search,the SYNONYM keyword can be set to use thesaurus/synonym dictionary associated with the text search index.
    NSE: SELECT CATEGORY, DATE FROM DB2EXT.HTMLTAB WHERE CONTAINS(HTMLFILE, 'THESAURUS "nsesamplethes" EXPAND SYNONYM TERM OF "product"') = 1
    Text Search: SELECT CATEGORY, DATE FROM DB2EXT.HTMLTAB WHERE CONTAINS(HTMLFILE, 'product', 'SYNONYM=ON') = 1

Other considerations

To use the features of Text Search effectively, you should also understand the various factors that affect Text Search performance and resource usage. The importance of planning the migration should not be neglected. For more details on migrating to DB2 Text Search, follow the links in the Resources section.

Resources

Learn

Get products and technologies

  • Now you can use DB2 for free. Download DB2 Express-C a no-charge version of DB2 Express Edition for the community that offers the same core data features as DB2 Express Edition and provides a solid base to build and deploy applications.

Discuss

Comments

developerWorks: Sign in

Required fields are indicated with an asterisk (*).


Need an IBM ID?
Forgot your IBM ID?


Forgot your password?
Change your password

By clicking Submit, you agree to the developerWorks terms of use.

 


The first time you sign into developerWorks, a profile is created for you. Information in your profile (your name, country/region, and company name) is displayed to the public and will accompany any content you post, unless you opt to hide your company name. You may update your IBM account at any time.

All information submitted is secure.

Choose your display name



The first time you sign in to developerWorks, a profile is created for you, so you need to choose a display name. Your display name accompanies the content you post on developerWorks.

Please choose a display name between 3-31 characters. Your display name must be unique in the developerWorks community and should not be your email address for privacy reasons.

Required fields are indicated with an asterisk (*).

(Must be between 3 – 31 characters.)

By clicking Submit, you agree to the developerWorks terms of use.

 


All information submitted is secure.

Dig deeper into Information management on developerWorks


static.content.url=http://www.ibm.com/developerworks/js/artrating/
SITE_ID=1
Zone=Information Management
ArticleID=800571
ArticleTitle=Migrate DB2 Net Search Extender indexes to DB2 Text Search
publish-date=03082012