DB2 Text Search indexes are hosted in the DB2 database and linked to text index collection data managed with the text-search server configured for the database instance.
A database failure can affect consistency for text-search index collections, and the database server and the text-search server can be affected by hardware, software, or network failures, or by other factors like human influence or environmental impact (power failures, for example). A comprehensive high-availability (HA) solution is necessary to ensure that all components of the system are available in a planned timeframe and to re-synchronize the database content with the text-search index content on a standby system.
The following sections discuss basic elements to enable text-search setups for an HA scenario. They are based on a DB2 HADR scenario, where the primary server transfers logs to the standby server via TCP/IP. Transaction logs are transmitted and applied on the standby server. The primary server acts as TCP Server and the standby server as client. If the primary server fails, the standby server takes over the role of primary.
Figure 1. Basic setup for text search in an HA scenario
A DB2 database enabled for text search connects through a local host TCP/IP connection to the text-search server. The text-search server runs as a separate process and manages the file system-based text-search index collections.
In general, the following scenarios can be distinguished:
- A failure affects database availability and leads to a fail-over to the standby server.
- A failure affects the text-search server availability, but the database server is unaffected.
- A failure affects availability of the text-search index collection.
The focus of the following sections is how to ensure that text-search functionality can be executed consistently after a database fail-over to the standby server. The other cases are briefly discussed here.
Text-search server availability
Provided that the installation and configuration of the text-search server are correct, possible reasons for unavailability of the text-search server include:
- Security system changes, such as authorization or access privilege changes, can prevent starting the text-search server.
- Network issues like port or socket configuration can cause communication failures.
- Runtime issues, such as the Java™ process, getting hung, killed, or crashing.
In each of these cases, the issue must be fixed on the primary system, and the text-search server associated with the instance must be restarted with the proper authority and the proper configuration.
Text-search index availability
If a text-search index becomes unavailable, two scenarios are most relevant:
- A hardware error — for example, a disk crash of a custom location. This type of error is not specific to text-search indexes. Hardware errors of this kind should, for example, be addressed by choosing disk devices that support proven technologies for disk availability.
- A human error — an unintended execution of a collection cleanup, for example.
To address index availability, the HA strategy must include a backup plan that allows restoring a text-search index collection. File permission changes may render a text-search index unavailable for access as well, but this is not a relevant scenario for this context.
Text-search server setup for a database fail-over scenario
If the machine that hosts the standby database system has different resource properties from the machine that hosts the primary database system, it might be necessary to have configuration differences for the text-search server. For example, configuration settings that impact resource usage, or the port number used for the local host TCP/IP connection between database server and text-search server.
For a smooth transition, prepare a configuration script that adjusts values for the standby text-search server. When a fail-over occurs: Ship the current text-search server configuration from the primary system to the standby system and run the configuration script to adapt configuration values before starting the text-search server on the standby system.
The re-configuration and startup of the text-search server may be
integrated into a script that is part of an automatically managed
solution. Note that HADR configuration settings do not apply to
text-search indices, but they may affect the integrated strategy. For
example, setting the parameter logindexbuild to ON will not keep track of
text-search index builds.
The choice of HADR synchronization mode may cause a slight increase of the risk for a failed text-search index update. When an asynchronous mode is configured (ASYNC or SUPERASYNC), recent transactions might be lost on a standby system. In this case, it is possible that a text-search index update was committed by the text-search server and on the primary server, but the transactions to wrap up the text-search index update in the database are not applied to the secondary server. The scenario is to be treated like the corresponding update failure in a non-HA setup, and the next incremental update will re-process the data.
For all HA scenarios, to distribute disk I/O workload, do not share devices between the logging file system, the table-space files, and the text-search index collections.
Text-search indexes in a database fail-over scenario
Using a shared disk for text-search indexes
Text-search index collections can be created in a custom location. The location can be on a shared disk that is accessible from the primary server as well as the secondary server. The file system must be set up identically for the primary and secondary system to access the collections with the same absolute path information.
Figure 2. Text indices on a shared disk
Depending on the reason for the database fail-over, the text-search server process on the primary system might still be active. If so, make sure to shut down the text-search server on the primary system before a startup of the text-search server on the secondary system.
The advantage of this setup is that it requires only space for one set of text-search indices and has a fast uptime because the text-search server on the standby system will have immediate access to the collections after starting. The disadvantage is that using a shared disk may be slower.
To resume full-text search on the fail-over system:
- Refresh the text-search server configuration.
- Identify and treat interrupted text-search administrative operations, if any.
- Start the text-search server.
Using a text-search index copy on the standby system
Another approach to make text-search index collections available to a standby server is to ship text-search index files to the standby system. Again, the file system and folder structure must be set up on the standby system to mirror the primary system.
Figure 3. Text index copy on standby system
Similarly to backup considerations, all files or only changed files may be shipped to the standby system. In a standby-with-copy scenario, use a backup strategy that ensures that either (changed) text-search index files are shipped after each index update or a shadow staging table keeps processed staging table entries until after a shipment. With such a strategy, the content of the database on the standby system and the content of the text-search index collections will be consistent. (See Part 3 of this series for information about backup strategies and staging table considerations.)
In this solution, the standby system is essentially used as repository for the most recent text-search index backup. To resume full-text search on the fail-over system:
- Refresh the text-search server configuration.
- Identify and treat interrupted text-search administrative operations, if any.
- If staging table copies or shadow staging tables are used, move entries back to the original staging tables.
- Start the text-search server.
- Run an incremental text-search index update to apply changes from the staging table.
Using a restore strategy for text-search indexes on the standby system
If time is not critical and the standby server is not appropriate to serve as a backup repository, a variation of the standby-with-copy scenario is to use DB2 HADR to manage the database fail-over and restore the text-search index collections from a backup.
Figure 4. Using an index backup to restore the standby system
To resume full-text search on the fail-over system:
- Refresh the text-search server configuration.
- Restore the text-search index collections from the current backup.
- Identify and treat interrupted text-search administrative operations, if any.
- If staging table copies or shadow staging tables are used, move entries back to the original staging tables.
- Start the text-search server.
- Run an incremental text-search index update to apply changes from the staging table.
Handling interrupted administrative operations
A critical element to enabling text-search on the standby server is to ensure that the text-search server on the standby system is configured to match the text-search server on the primary system. Similar to database management configuration and database configuration, text-search server configuration data is not automatically propagated from the primary to the standby system, but must be handled manually. Text-search server configuration data is on database instance level and persisted in configuration files located in .../sqllib/db2tss/config.
Independent of whether a shared disk, a standby-with-copy, or standby-with-restore solution is used to include text-search indices in a fail-over scenario, a fail-over from primary to standby server may involve incomplete text-search index administrative operations. This has implications for consistency between database content and text-search index content, and text-search server index metadata.
For example, an operation may be uncommitted on the text-search server and on the database server at the time of failure. Another example is that an operation may already have committed on the text-search server but not yet committed in the database, and the database did not send a rollback or undo operation request to the text-search server.
Before starting the text-search server to resume operations on the standby system, it is prudent to review the command locks view for interrupted operations:
- Create index. If a
CREATE INDEX FOR TEXToperation was active at the time of the database failure:- Remove the command lock.
- Find and remove superfluous collection entries by using
CLEANUP FOR TEXT. Starting the text server does not depend on this step. Cleanup can be done when it is convenient.
- Drop the index. If a
DROP INDEX FOR TEXToperation was active at the time of the database failure:- Remove the command lock.
- Repeat the operation.
- Update the index. If an
UPDATE INDEX FOR TEXToperation was active at the time of the database failure:- Remove the command lock.
- Restart the text-search index update to reprocess the data.
The text-search index collection status will be the last-committed state before the time of failure. This matches the last backed-up state of the text-search index combined with the saved staging table content up to the last successful incremental update.
If the update operation was committed by the text-search server, the collection will contain the updated data, even if the index update is marked as failed in the database. In this case, if the update operation has not already committed all transactions in the database (excluding the command lock update), the same update scope will again be applied with the next index update processed at the time of failure.
To include text-search indices into a high-availability strategy for DB2, it is necessary to ensure that the standby system is properly configured for text search and has access to the text-search index collections. Keep in mind that:
- The text-search server configuration on the standby system must match the configuration on the primary system, except for those configuration values that address system resource settings.
- The text server on the secondary system must have access to the text-search index collections in a file structure that matches the primary system.
- In a shared disk setup, the primary text server and the secondary text server must not use the text indices concurrently.
| Description | Name | Size | Download method |
|---|---|---|---|
| Simple HA sample | Sample_HA.txt | 2KB | HTTP |
Information about download methods
Learn
- "DB2
Text Search, Part 1: Full-text search" is a step-by-step guide to walk you through a
simple full text-search scenario.
- "DB2
Text Search, Part 2: Understanding linguistic processing for full-text search in DB2" explains why documents get included in search
results.
- "DB2 Text Search, Part 3: What to consider when planning text index backups" discusses procedures to back up
and restore text indices.
- In the DB2 for
Linux, UNIX, and Windows area on developerWorks, get the resources
you need to advance your DB2 skills.
- Stay current with
developerWorks technical events and webcasts.
- Follow developerWorks on
Twitter.
Get products and technologies
- You can use DB2
for free. Download DB2 Express-C, a no-charge version of DB2 Express Edition for the
community that offers the same core data features as DB2 Express Edition
and provides a solid base to build and deploy applications.
- Build your next
development project with
IBM trial software,
available for download directly from developerWorks.
Discuss
- Participate in the discussion forum.
- Check out the
developerWorks
blogs and get involved in the
developerWorks community.
Marion Behnen is a senior software engineer within the IBM Software Group and works as a technical lead for full-text search in DB2. She has more than 20 years of experience with database application development, business reporting, data warehousing, and business process integration. Prior to joining IBM, she was involved with many aspects of business process and data integration, in particular for the manufacturing industry.




