Db2 Text Search in a partitioned database environment

Db2 Text Search supports full-text search in a partitioned database environment. Text search indexes are distributed in a pattern that matches the base tables on which they are created. For each database partition, a text index partition, also called a collection, is created. This pattern facilitates text search maintenance by allowing text search index updates with parallel execution on all index partitions.

The staging tables used for multi-collection text search index updates are per index rather than per collection and are distributed in a manner similar to the base table. Staging tables use the DBPARTITIONNUM scalar function to find relevant changes that need to be applied to each index partition per index refresh. Data from each database partition server is updated in the corresponding text index partition during the text search index update to enable a parallelization of the update operation.

Every text search index update may result in multiple collection updates and text search server capacity planning is required. For workload distribution, a stand-alone remote text search server setup is recommended in partitioned database environments.

A Db2 Text Search server setup that is installed and configured separately from the Db2 instance is referred to as a stand-alone setup. A remote stand-alone setup, that is, a setup on a separate host from the database server, can be used for non-partitioned, single-partition and multi-partitioned Db2 instances to remove the resource-intensive text search server workload from the database server host.

The configuration of the integrated Text Search server during the default instance creation of a partitioned database instance applies to the lowest numbered database partition server. It is not required to configure during installation, the administration and configuration of the Text Search server in an existing partitioned database environment can be managed by Text Search server tools.

The following diagram depicts a Db2 instance with four database partitions. They are located on two dedicated hosts, Machine1 and Machine2 with two logical partitions per host. All database partition servers are served by a single text search server.
Figure 1. A Db2 Text Search server setup in a partitioned environment
A Db2 Text Search server setup in a partitioned environment
Stand-alone setups are suggested to help achieve a balanced workload and avoid sharing resources by the text search server with a single database partition server.

In a partitioned database environment, the db2ts START FOR TEXT command with the STATUS and VERIFY parameters can be issued on any one of the partition server hosts. To start the instance services, you must run the db2ts START FOR TEXT command on the integrated text search server host machine. The integrated text search server host machine is the host of the lowest-numbered database partition server. If custom collection directories are used, ensure that no lower numbered partitions are created later. This restriction is especially relevant for Linux and UNIX platforms. If you configure Db2 Text Search when creating an instance, the configuration initially determines the integrated text search server host. That configuration must always remain the host of the lowest-numbered database partition server.

Database partitions in a partitioned instance can be added and deleted. This is generally followed by data redistribution, using the REDISTRIBUTE DATABASE PARTITION GROUP command to move and rebalance data in the tables. If a text search index is hosted by one of the affected tables, such a data redistribution requires a reshuffling of the text index partition content to align the text index partitions with the new set of relevant database partitions. Incremental updates of text search indexes are usually inadequate for this purpose, instead, the text search index must be updated with the FOR DATA REDISTRIBUTION option. Note that this can result in significant downtimes for large workloads similar to initial updates.

When enabling and administering Db2 Text Search in a partitioned database environment, consider the following:
  • Ensure that the Db2 setup is complete as described in the Db2 documentation. The NFS mount must be configured with root access and setuid.
  • If startup fails, you need to check if Db2 Text Search has been configured correctly and then issue the db2ts START command a second time.
  • Before inserting or deleting partition numbers from the db2nodes.cfg file, stop the Db2 Text Search instance services.This applies to any command that might result in changes to the db2nodes.cfg configuration file.
  • On Windows platforms, while using Db2 Text Search in a partitioned database environment, the db2nodes.cfg file should not use IP addresses as well as host names for the same host.
You should be aware of the following considerations when conducting searches in a partitioned database environment:
  • The RESULT LIMIT is evaluated on every partition during search. This means that if you specify a RESULT LIMIT of 3 and use 4 partitions, you will get up to 12 results.
  • The SCORE value reflects the document's relevance when compared to the SCORE value of all documents from a single partition even if the query accesses multiple partitions.