Configuring Net Search Extender (NSE) for high availability (HA)

Db2® Net Search Extender can be configured to support high availability by sharing the indexes between the high availability nodes and by using Net Search Extender index backup and restore operations. Net Search Extender full-text indexes consist of data stored in a Db2 database and of external files located on the file system. Configuring Db2 for high availability recovers only the Net Search Extender data within the database during failover. The NSE-specific external files must be shared with the failover node by using a file sharing technology applicable to the user scenario and the platform in use. If an index update operation is interrupted and the index files become corrupted, the external files are not restored automatically. They must be backed up beforehand so that they can be restored manually.

Interrupting an index update can irreparably and unpredictably corrupt the index. The severity of the corruption depends on the affected index files and on the phase of the index operation at the time of interruption. Some index files are also updated in place rather than through copies, which makes rollback recovery more difficult. Therefore, if failover occurs during an index update, the corrupted index files must be restored from the state of the last successful index update operation, which is saved as an index directory snapshot.

High availability configurations prevent index files that are located on shared storage from getting into an inconsistent state if an index update is interrupted during a failover. Database objects found on the failover system can be used to revert the index files back to a consistent state.

If file system snapshots are not supported on a platform, consider an equivalent file system sharing technology applicable to that platform for the NSE shared index folder or drive.

Index directory snapshots

  1. All Net Search Extender index files must be stored on dedicated file systems so that the latest index files can be backed up and restored. No other data should be stored on these file systems.
  2. Every index must reside on its own file system. Alternatively, indexes can share a file system, provided that the update schedules for those indexes are serialized so that no two updates occur at the same time. The number of distinct file systems for Net Search Extender indexes can then be adapted to the number of parallel update processes that the system is capable of handling.
  3. The space used by a snapshot is initially very small, but tends to increase as the file system content changes. Ensure that there is sufficient space for snapshots on the index file system, and monitor space usage so that ample space remains available.
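
As a minimal sketch of the monitoring described in item 3, the following shell functions warn when an index file system passes a usage threshold. The function names, the /myINDEX path, and the 80 percent limit are illustrative assumptions, not Net Search Extender settings.

```shell
#!/bin/sh
# usage_pct: print the used-capacity percentage (without the % sign) of the
# file system that holds the given path. Relies on the POSIX output format
# of `df -P`, where column 5 is the capacity.
usage_pct() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# check_space: return 0 if usage is below the limit, 1 if it is at or above
# the limit, and 2 if the usage could not be determined.
check_space() {
    fs=$1
    limit=$2
    used=$(usage_pct "$fs")
    [ -n "$used" ] || return 2
    if [ "$used" -ge "$limit" ]; then
        echo "WARNING: $fs is ${used}% full (limit ${limit}%)" >&2
        return 1
    fi
    return 0
}

# Example call (hypothetical path and threshold):
# check_space /myINDEX 80
```

A check like this could run from the same external scheduler that drives the index updates, so that a snapshot is never attempted on a nearly full file system.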

Preparing for failover

Indexes are located on shared storage between the high availability nodes. Every index update, whether manual or scheduled, should immediately be followed by a snapshot of its index directory. These instructions can be encapsulated in a script and executed by an external scheduler, as shown in the following steps:

  1. Verify whether the index files are located in the shared location between the high availability nodes.
  2. Check the Db2 Net Search Extender status from the db2ext.tcommandlocks table and the work directory.
  3. Run the snapshot procedure to take a snapshot of the Net Search Extender index file system on the shared storage.
  4. Invoke the Net Search Extender UPDATE INDEX command.
  5. Remove the self-defined mark after the index update completes.
Note: Because the Net Search Extender native scheduled index update can only invoke the DB2TEXT UPDATE INDEX command, disable it by setting UPDATE FREQUENCY to NONE. Use operating system specific index update scheduling instead, such as the CRON command on UNIX and Linux®, and the AT command on Windows operating systems. These commands invoke the wrapper script at the specified interval, with one crontab entry for every index that has an automated update schedule. This ensures that current snapshots of all indexes, taken after the most recent successful update, exist on shared storage.
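
The wrapper script mentioned in the note could look roughly like the following sketch. It is not an IBM-supplied script: the database name, index name, paths, marker file, and SNAPSHOT_CMD are all placeholders, and the marker file is only one possible reading of the "self-defined mark" in step 5. The sketch takes the snapshot after a successful update, as described at the start of this section, and leaves the platform-specific snapshot step to a site-provided command.

```shell
#!/bin/sh
# Hypothetical wrapper around one index update; every default is a placeholder.
DB=${DB:-mydb}                                   # database name (assumption)
INDEX=${INDEX:-myschema.myindex}                 # text index name (assumption)
INDEX_DIR=${INDEX_DIR:-/myINDEX}                 # index directory on shared storage
MARK=${MARK:-$INDEX_DIR/.update_in_progress}     # self-defined mark (assumption)
SNAPSHOT_CMD=${SNAPSHOT_CMD:-/usr/local/bin/take_index_snapshot}  # site-provided

# Print the number of rows in db2ext.tcommandlocks; 0 means nothing is locked.
count_locks() {
    db2 connect to "$DB" >/dev/null || return 1
    db2 -x "select count(*) from db2ext.tcommandlocks" | tr -d '[:space:]'
}

# Run the Net Search Extender index update.
run_update() {
    db2text "UPDATE INDEX $INDEX FOR TEXT CONNECT TO $DB"
}

# Take a snapshot of the index file system (platform specific).
take_snapshot() {
    "$SNAPSHOT_CMD" "$INDEX_DIR"
}

# Orchestrate one update. Return codes: 1 setup problem, 2 lock rows present,
# 3 update failed (mark is left in place), 4 snapshot failed.
update_with_snapshot() {
    [ -d "$INDEX_DIR" ] || { echo "index directory missing: $INDEX_DIR" >&2; return 1; }
    locks=$(count_locks) || return 1
    [ "$locks" = "0" ] || { echo "lock rows present, check for recovery" >&2; return 2; }
    : > "$MARK" || return 1
    run_update || return 3
    take_snapshot || return 4
    rm -f "$MARK"
}

# A script that ends with a call to update_with_snapshot could be scheduled
# with a crontab entry such as (hypothetical path and time):
# 0 2 * * * /path/to/nse_update_wrapper.sh
```

The lock check counts all rows in db2ext.tcommandlocks, which is conservative when several indexes are updated in parallel on separate file systems; a per-index check would need a filter on the table's index columns.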

Index characteristics during failover

The key to index recovery is determining whether the failover corrupted the index, in which case the index must fall back to its most recent known good state. This can be determined from the following characteristics of the Net Search Extender index update process:

  • Every index update is internally encapsulated in a pair of insert and delete operations on the db2ext.tcommandlocks table.
  • To prevent concurrent administration commands on that index, the index update starts by creating a row in this table that contains the index name, a timestamp, and the type of operation. The row is removed from the table again before the update terminates, making the index available for new administration commands.
  • If no index update occurs during a failover then the db2ext.tcommandlocks table contains no rows, and no further action is required. All data stored in the log table are immediately available on the failover system via high availability support, ready for the next regular index update.
  • If a failover happens during an index update, the db2ext.tcommandlocks table on the failover node shows a row for every index that was involved in an update at the time of the failure. More than one index can be affected, each corresponding to a single row in db2ext.tcommandlocks, so the following check must be repeated for each row. Manual recovery then needs to be initiated to restore the snapshot. Every affected index is protected against further (scheduled or manual) updates by the presence of the lock entry in the table.
  • Check whether the entries in the log table still persist. Compare the timestamp of the oldest entry in the log table of the index with the most recent CTE0003 entry in the event table of the index.

    If the oldest log table entry is newer than the most recent CTE0003, log table cleanup had already been done before the failover, but the db2ext.tcommandlocks entry could not be deleted yet. The index is uncorrupted in this case, so do not restore the snapshot. Only remove the db2ext.tcommandlocks entry manually and proceed as usual.

    If the oldest log table entry is older than the most recent CTE0003, the index should be restored from snapshot.
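
The timestamp comparison above can be sketched as a small shell function. This is an illustration only: the function name and its output labels are hypothetical, it assumes both timestamps have already been read from the index's log and event tables, and it assumes they are in Db2's fixed-width timestamp form so that string order matches chronological order. Treating equal or missing timestamps conservatively is a choice made here, not something the procedure specifies.

```shell
#!/bin/sh
# classify_index: decide what a locked index needs after failover.
#   $1 = timestamp of the oldest entry in the index's log table
#   $2 = timestamp of the most recent CTE0003 entry in the index's event table
# Timestamps are expected in the form yyyy-mm-dd-hh.mm.ss.nnnnnn.
classify_index() {
    oldest_log=$1
    last_cte0003=$2
    if [ -z "$oldest_log" ] || [ -z "$last_cte0003" ]; then
        # Assumption: missing data calls for a manual look.
        echo inspect
    elif LC_ALL=C expr "x$oldest_log" \> "x$last_cte0003" >/dev/null; then
        # Log cleanup already ran; only the lock row is left over.
        echo unlock_only
    else
        # Update was interrupted (equal timestamps are treated the same way).
        echo restore
    fi
}

# Example call with made-up timestamps:
# classify_index 2008-06-30-12.00.00.000000 2008-06-30-11.00.00.000000
```

The "x" prefix forces expr to compare the arguments as strings rather than attempt an integer comparison.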

Restoring an index from a snapshot

  1. Stop Net Search Extender, because searches against the affected index fail while its files are being replaced. Then remove all index files in the index directory of the affected index.
    rm -rf /myWORK/NODE0000/TMP_IX300608/*
  2. Replace the empty directory with the content of the snapshot. This takes time since it requires a physical copy of the files.
    rm -rf /myINDEX
    mount -o snapshot /dev/fslv06 /mnt/
    cp -pR /mnt/* /myINDEX
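  3. After restoring the index directory content, manually remove the row corresponding to the index from the db2ext.tcommandlocks table. The following statement has no WHERE clause and therefore deletes every row in the table, so run it as shown only after all affected indexes have been restored; otherwise restrict it to the row for the restored index.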
    db2 "delete from db2ext.tcommandlocks"
  4. Repeat the previous steps for all affected indexes.
  5. When done, restart Net Search Extender. Regular operation can proceed now on the failover node.

The content of the Net Search Extender log table remains intact, and a new call to DB2TEXT UPDATE INDEX processes it as before. Some manual cleanup of the event table might be necessary, because it can contain entries created during the original, interrupted index update operation.