Troubleshooting basic replication

If the replica server does not seem to be receiving updates from the replicating server (master or peer), there are several possible reasons. Check the following conditions for a possible quick fix:

  • Check for messages from the replicating server.
  • Verify that a replica entry for the replica server exists in the backend to be replicated on the replicating server, and that its values match the replica server. If cn=localhost is used as the suffix for all replica entries for a backend, perform an ldapsearch with a base of cn=localhost and a filter of objectClass=*. Otherwise, perform an ldapsearch where the search base is the suffix defined in the backend section of the LDAP server configuration file and the filter is objectClass=replicaObject. If more than one suffix is configured for TDBM or LDBM, repeat the search using each suffix as the search base.

    See z/OS IBM Tivoli Directory Server Client Programming for z/OS for more information about ldapsearch.

  • Verify that the replicaHost value in the replica entry for that replica specifies the machine on which the replica is running.
  • Check that the values listed in the replica entry for that replica match those of the replica server configuration. Specifically, the replicaPort, replicaBindDN, and replicaCredentials should be verified.
  • Check that the replicaUpdateTimeInterval specified in the replica entry for that replica has been set correctly.
  • Verify that the replica server is running by performing an ldapsearch against the replica.
  • Check that the default referral specified in the LDAP server configuration file in the replica server points to the replicating server.
  • If the replicaUseSSL attribute in the replica entry is set to TRUE, verify that the replicaPort attribute is set to the SSL port configured on the replica server. Also verify that the sslKeyRingFile value, and the sslKeyRingFilePW or sslKeyRingPWStashFile value, in the LDAP server configuration file are correct on both the replica server and the replicating server.
  • When adding many entries, ensure that the region size for the replicating server is sufficient for replicating the entries to the replica. Entries on the replicating server are kept in memory during replication. If the region size is not sufficient, an out of memory condition can occur in the LDAP server. If possible, set the region size on the replicating server to 0M (or unlimited). If that cannot be done, set the region size to 14M (needed to run the LDAP server itself) plus twenty times the size of the largest LDIF file that is to be added to the replicating server.
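
As a rough worked example of the region size guideline in the last item above, the following Python sketch computes 14M plus twenty times the size of the largest LDIF file. The function name and the rounding up to a whole megabyte are assumptions made for illustration; they are not part of the product.

```python
import math

BASE_REGION_MB = 14          # needed to run the LDAP server itself (from the guideline)
LDIF_MULTIPLIER = 20         # twenty times the largest LDIF file (from the guideline)
BYTES_PER_MB = 1024 * 1024


def min_region_size_mb(largest_ldif_bytes: int) -> int:
    """Hypothetical helper: estimated region size in megabytes, rounded up."""
    if largest_ldif_bytes < 0:
        raise ValueError("file size cannot be negative")
    ldif_mb = largest_ldif_bytes / BYTES_PER_MB
    return math.ceil(BASE_REGION_MB + LDIF_MULTIPLIER * ldif_mb)


if __name__ == "__main__":
    # A 5 MB LDIF file: 14 + 20 * 5 = 114
    print(f"{min_region_size_mb(5 * BYTES_PER_MB)}M")
```

Use the result only when the region size cannot be set to 0M; it is an estimate derived from the guideline, not a measured requirement.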

The ibm-slapdLog and ibm-slapdReplMaxErrors attributes in a replica entry can be used to configure a replication error log for this replica. If basic replication fails, the error log holds all errors that occurred during replication and the LDIF for the set aside replication operations.

Recovering from basic replication out-of-sync conditions

If a replica becomes out-of-sync with its replicating server for any reason, and normal replication processing is not correcting the situation, it might be necessary to reload the replica.

To reload a replica, follow this procedure:

  1. Use the LDAP server MAINTMODE ON operator modify command on the replicating server and on each of the replica servers to put them into maintenance mode.
  2. Using a root administrator DN, unload all the replica entries (entries that describe replica servers) from the replicating server. Use a search command like the one shown in Searching a replica entry to create LDIF output containing the replica entries for each suffix in the backend.
  3. Using a root administrator DN, run ldapdelete to remove the replica entries from the replicating server. This resets the replication information in the replicating server.
  4. For TDBM, run the following SPUFI on the replicating server to be sure that the server successfully completed the removal of the data in the DIR_REPLICA, DIR_REPENTRY and DIR_LONGREPENTRY tables:
    select count(*) from dbuserid.dir_replica
    where you substitute your database owner for dbuserid. The record count returned should be zero.
  5. Stop all the replica servers.
  6. Clear out the directory on each replica server.
  7. Run an unload utility on the replicating server. Use ds2ldif twice, once to unload the schema entry and a second time to unload the TDBM or LDBM directory entries.
  8. Start the replica servers in maintenance mode.
  9. Using an administrator DN, run ldapmodify to load the schema unloaded from the replicating server onto each replica.
  10. On each replica, load the directory data unloaded in step 7 from the replicating server. For LDBM, you must use ldapadd. For TDBM, you can use either ldapadd or the ldif2ds load utility. The ldapadd utility must be run using a root administrator DN. If you use ldif2ds, you must stop the replica server before loading entries. In this case, restart the replica in maintenance mode after loading it.
  11. Using an administrator DN, run ldapadd to add the replica entries unloaded in step 2 back into the replicating server.
  12. Use the LDAP server MAINTMODE OFF operator modify command to take the replicating server and each replica out of maintenance mode.
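
The replica entry search used for troubleshooting and in step 2 can be sketched as follows. This Python example only builds the ldapsearch command lines, one per search base, so that they can be reviewed before being run. The function name, host, port, DNs, and suffixes are hypothetical, and the -h, -p, -D, -w, -b, and -L options are assumed to behave as in common ldapsearch implementations; confirm them against the ldapsearch documentation for your release.

```python
import shlex


def replica_search_commands(host, port, bind_dn, password, suffixes,
                            use_localhost=False):
    """Hypothetical helper: return one ldapsearch argument list per search base."""
    if use_localhost:
        # All replica entries are kept under cn=localhost for this backend.
        bases, search_filter = ["cn=localhost"], "objectClass=*"
    else:
        # Repeat the search for every suffix configured for the backend.
        bases, search_filter = list(suffixes), "objectClass=replicaObject"
    return [
        ["ldapsearch", "-h", host, "-p", str(port),
         "-D", bind_dn, "-w", password,
         "-L",                      # assumed: request LDIF output
         "-b", base, search_filter]
        for base in bases
    ]


if __name__ == "__main__":
    # Example values only; substitute your own server, DN, and suffixes.
    for argv in replica_search_commands("replhost.example.com", 389,
                                        "cn=admin", "secret",
                                        ["o=Example", "o=Other"]):
        print(shlex.join(argv))
```

Building the commands as argument lists keeps DNs that contain spaces or commas intact, and shlex.join quotes them correctly when they are printed for review.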