There are numerous ways to back up and protect the information stored in a ClearCase repository. You can use standard operating system backup procedures (tape archives and so on) to secure the information and, in the event of a serious failure, use these archives to restore your repositories (see the ClearCase Administrator's Guide for restoration procedures). However, OS-based restoration can take a significant amount of time, and in the case of MultiSite you will then need to run "multitool restorereplica" to retrieve the updates made since the last backup. If you do not have other replicas, you will have lost all data since the last backup. If ClearCase availability is critical to your organization, maintaining a redundant replica lets you restore normal operations faster, and you lose only the data since the last replication (typically minutes) instead of the data since the last backup (typically a day or longer). This article outlines how, when your primary server fails, you can quickly switch to a backup server to prevent end-user downtime.
The Lotus ClearCase deployment consists of many geographically dispersed sites with varying numbers of users per site. For example, the Dublin site has 70+ developers working on a number of diverse projects, and a ClearCase failure would be catastrophic to their productivity. This environment calls for very high availability and reliability, so our preferred source code control configuration is ClearCase 2003.06 (with MultiSite and the latest recommended patches from IBM Rational) on Red Hat Enterprise Linux Enterprise Server 3.0. Additional reasons for choosing Linux include its security and ease of maintenance. While this configuration is itself robust, a hardware failure could still be devastating. Critical locations (such as Dublin) therefore procured two identical IBM xSeries servers to set up a local ClearCase cluster. Purchasing identical machines is not strictly necessary, but it makes ongoing maintenance, including maintenance of operating system files, easier over the lifetime of the servers. It also means that ClearCase users see the performance they expect after a switchover.
Both servers were installed and set up as identically as possible. We used the same Red Hat Enterprise Linux Enterprise Server 3.0 installation and configuration steps, and installed ClearCase from the same release area to ensure consistency. When setting up the release area, make sure that the package "ClearCase MultiSite Full Function Installation" is selected.
The optimal solution is to set up both servers as license servers and the redundant server as a backup registry server. Because ClearCase licenses are keyed to the Ethernet MAC address of the license server, this requires one of the following: purchasing two sets of ClearCase licenses, dividing ClearCase clients so that they point to one or the other license server, or a special license arrangement with IBM Rational certifying that the second set of licenses is for failover purposes only and will not be used as primary licenses. With this setup there is no dependence on the primary server in a failure situation. The primary and backup registry servers are configured during the site_prep stage.
After installation, confirm that the file /<install dir>/clearcase/rgy/rgy_hosts.conf on your primary and redundant servers contains:
    <primary_server_FQDN>
    <backup_server_FQDN>
For example:
    primary.ibm.com
    backup.ibm.com
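The check above is easy to script so that it can be repeated on both servers. The following is a minimal sketch, not a ClearCase utility: check_rgy_hosts is a hypothetical helper name, and it assumes the file lists one host per line with the primary registry host first and the backup host second.

```shell
#!/bin/sh
# check_rgy_hosts FILE PRIMARY BACKUP   (hypothetical helper, not part of ClearCase)
# Succeeds only if the first two non-empty lines of FILE are PRIMARY and BACKUP.
check_rgy_hosts() {
    file=$1 primary=$2 backup=$3
    [ -r "$file" ] || { echo "cannot read $file" >&2; return 2; }
    # Drop trailing whitespace and blank lines, then keep the first two entries.
    actual=$(sed -e 's/[[:space:]]*$//' -e '/^$/d' "$file" | head -n 2)
    expected=$(printf '%s\n%s' "$primary" "$backup")
    if [ "$actual" = "$expected" ]; then
        echo "OK: $file lists $primary then $backup"
    else
        echo "MISMATCH in $file" >&2
        return 1
    fi
}
```

For example, `check_rgy_hosts "/<install dir>/clearcase/rgy/rgy_hosts.conf" primary.ibm.com backup.ibm.com` could be run on each server after installation.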
The "Daily Registry Backup" job is set up to run automatically in a standard ClearCase installation, and the backup registry server will back up the registry provided rgy_hosts.conf is configured correctly. Verify that the job completes "OK" with no messages on both the primary and the redundant servers. Then verify that the directory /<install dir>/clearcase/rgy/backup on your backup server contains registry data with the current date. This confirms that the job is working as expected and that your registry is now replicated to the backup server on a daily basis.
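The date check on the backup directory can be automated along the following lines. This is only a sketch: has_recent_registry_backup is a hypothetical helper, and it approximates "the current date" as "modified within the last 24 hours".

```shell
#!/bin/sh
# has_recent_registry_backup DIR   (hypothetical helper, not part of ClearCase)
# Succeeds if DIR holds at least one regular file modified in the last 24 hours.
has_recent_registry_backup() {
    dir=$1
    [ -d "$dir" ] || { echo "no such directory: $dir" >&2; return 2; }
    # -mtime -1 matches files whose age is less than one full day.
    recent=$(find "$dir" -type f -mtime -1 | head -n 1)
    [ -n "$recent" ]
}
```

Run on the backup server against /<install dir>/clearcase/rgy/backup, a non-zero exit status suggests that the daily job has not copied the registry recently.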
Once this is set up, update your ClearCase client release area so that all clients have the "Backup registry host" configured automatically during client installation. Windows clients installed before this change can be updated manually via the Registry tab of the ClearCase Control Panel, and Unix clients can be updated by editing /<install dir>/clearcase/rgy/rgy_hosts.conf.
In the event of a primary server failure, all clients will need to switch to the backup license server. This can be done via the Licensing tab of the ClearCase Control Panel or via a re-install from an updated release area, but to avoid this, create a DNS CNAME (a DNS alias name, for example, cclicense.ibm.com) and use that as the configured license server. After a failure, only the CNAME in the DNS needs an update to point to the redundant license server's hostname.
Both servers should use separate regions for both Unix and Windows, so VOB tags are registered to different regions. For example, run commands on your primary server such as the following:
    cleartool mkregion -tag <PRIMARY_WIN>
    cleartool mkregion -tag <PRIMARY_UNIX>
    cleartool mkregion -tag <BACKUP_WIN>
    cleartool mkregion -tag <BACKUP_UNIX>
On the primary server, ensure that the default region is <PRIMARY_UNIX> by checking or editing the file /<install dir>/clearcase/rgy/rgy_region.conf. Similarly, ensure that the backup server is using the region <BACKUP_UNIX> by checking the same file on the backup server.
Replicating VOBs to the backup server
This article assumes that your VOBs are already set up on your primary server. These VOBs need to be replicated and set up on the redundant server. The export should run under the ID of the VOB owner, which for security reasons should preferably not be root. Switch to the VOB owner's login with su:
    su - <vob owner>
Then export the VOB to the backup server with a command similar to:
    multitool mkreplica -export -workdir /<tmp directory> backup_server_FQDN:backup_server@/vobs/vob1
Depending on when you do this, you might want to add the "-fship" option to cause the packets to be shipped to the receiving host immediately instead of waiting for it to happen according to a schedule. We have also found it best to use a fully qualified domain name when giving your hostname as this ensures other locations can resolve the hostname properly even if they do not have the same DNS domain search order configured.
Note: The export operation locks the VOB being exported, so choose a time when this will not disrupt users. It may be best to do this overnight.
Depending on the performance of the server, exporting can take anywhere from a few hours to many days. Also, make sure you have enough disk space on both the source and target servers:
- in the <tmp directory> on the primary server
- in the incoming shipping bay on the redundant server (<install dir>/shipping/ms_ship/incoming)
- in the VOB storage location on the redundant server
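A quick pre-flight check of free space in each of these locations can save a failed export. The sketch below uses only standard df and awk; has_free_space is a hypothetical helper, and the amount of space you require is something you must estimate yourself (for example from the size of the VOB storage directory).

```shell
#!/bin/sh
# has_free_space DIR REQUIRED_KB   (hypothetical helper)
# Succeeds if the file system holding DIR has at least REQUIRED_KB kilobytes free.
has_free_space() {
    dir=$1 required_kb=$2
    # df -P prints one POSIX-format line per file system; -k reports 1024-byte units.
    # Column 4 of the second output line is the available space.
    avail_kb=$(df -Pk "$dir" 2>/dev/null | awk 'NR == 2 { print $4 }')
    [ -n "$avail_kb" ] || { echo "cannot read free space for $dir" >&2; return 2; }
    [ "$avail_kb" -ge "$required_kb" ]
}
```

For example, `has_free_space "/<tmp directory>" 10485760` succeeds only if roughly 10 GB is free in the work directory.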
Importing the VOBs on the backup server
Wait for the arrival of all packets to the backup server. Then run the following commands on the target server. We find it useful to move the replica packets to a temporary directory from your standard incoming packet directory so that they can be worked on in isolation.
    mv /<shipping directory>/ms_ship/incoming/repl_* /<tmp directory>
Again it is very important that the import of the new replica be done as the VOB owner because this simplifies owner/identity and permission issues greatly.
    su - <vob owner>
Import the packets into a new replica, using a command similar to:
    multitool mkreplica -import -workdir /<tmp directory> -tag /vobs/vob1_tag -vob /<vob directory>/vob1 -<permission> -vreplica backup_server repl_<primary_server>_<date>_13668_1
Read the man page for multitool mkreplica for more information on the various options for exporting and importing VOB replicas. Specifically, choose the correct permissions and identity preserving options based on your VOB's configuration. If you have specific identity and permission configuration steps for your deployment, perform these now.
Create VOB tags so that clients will be able to access the VOBs in the <BACKUP_WIN> region. The win32 GUI tool "Region Synchronizer" (found in Start > Programs > Rational Software > Rational ClearCase > Administration > Region Synchronizer) can be useful for this. Click the Help button for more details on using this utility.
Setting up 2-way replication between your primary and redundant servers
To make replication administration easier between these servers we find it useful to create a file /<install dir>/clearcase/scheduler/tasks/backup_vob_list.script containing the list of VOBs to be replicated. On the primary server this file would contain VOBs to be replicated to the redundant server, for example:
    replicas:backup_server@/vobs/vob1
    replicas:backup_server@/vobs/vob2
    replicas:backup_server@/vobs/...
For ease of maintenance, use the same VOB order as the primary server VOB list file when creating the redundant server file. As new VOBs are added to the servers, modify both backup_vob_list.script files so that the new VOBs are replicated between the servers.
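Because both list files must stay in the same order, it can help to generate them from one list of VOB tags rather than editing them by hand. The sketch below simply prints lines in the format shown above; make_vob_list is a hypothetical helper, and the replica name and VOB tags passed to it are placeholders.

```shell
#!/bin/sh
# make_vob_list REPLICA VOB_TAG...   (hypothetical helper)
# Prints one "replicas:REPLICA@VOB_TAG" line per VOB tag, in the order given.
make_vob_list() {
    replica=$1
    shift
    for vob_tag in "$@"; do
        printf 'replicas:%s@%s\n' "$replica" "$vob_tag"
    done
}
```

On the primary server this might be used as `make_vob_list backup_server /vobs/vob1 /vobs/vob2 > backup_vob_list.script`; on the redundant server the same VOB tags would presumably be passed with the primary replica's name instead.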
Add the following scheduled tasks on both servers by using the command: "cleartool schedule -edit -schedule". You could also use the CCHostAdmin win32 GUI if you prefer:
Job.Begin
Job.Name: "Replicate Full"
Job.Schedule.Daily.Frequency: 1
Job.Schedule.StartDate: 31-Dec-00
Job.Schedule.FirstStartTime: 00:05:00
Job.DeleteWhenCompleted: FALSE
Job.Task: "MultiSite Sync Export"
Job.Args: -quiet 1 -update -ship -maxsize 20m /<install dir>/clearcase/scheduler/tasks/backup_vob_list.script
Job.NotifyInfo.OnEvents: JobEndFail,JobDeleted,JobModified
Job.NotifyInfo.Using: email
Job.NotifyInfo.Recipients: root
Job.End
Job.Begin
Job.Name: "Replicate Change"
Job.Schedule.Daily.Frequency: 1
Job.Schedule.StartDate: 31-Dec-00
Job.Schedule.FirstStartTime: 00:20:00
Job.Schedule.StartTimeRestartFrequency: 00:10:00
Job.Schedule.LastStartTime: 23:59:00
Job.DeleteWhenCompleted: FALSE
Job.Task: "MultiSite Sync Export"
Job.Args: -quiet 1 -ship -maxsize 20m /<install dir>/clearcase/scheduler/tasks/backup_vob_list.script
Job.NotifyInfo.OnEvents: JobEndFail,JobDeleted,JobModified
Job.NotifyInfo.Using: email
Job.NotifyInfo.Recipients: root
Job.End
The "Replicate Full" task runs only once a day, but with the -update argument it also updates the epoch tables, so replication recovers if a packet is lost (that is, autonomous, self-healing replication). The "Replicate Change" task repeats every 10 minutes and sends only deltas.
Note: In v2003.06.00 with no patches, it appears that a bug was introduced which inverts AM and PM in the times. Using the CCHostAdmin win32 GUI doesn't show this problem so add the jobs that way. Subsequent patches fix this problem.
After replication is completely set up and the new replicas are verified to be working, go back to your primary server and make the new replicas self-mastering. This step is purposely delayed until after the mkreplica import stage in case changes need to be made to the new replicas before they are functioning on the new server. Again, it is better to su to the VOB owner when running this command, for example:
    su - <vob owner>
    multitool chmaster replica:backup_server@/vobs/vob1 replica:backup_server@/vobs/vob1
    multitool chmaster replica:backup_server@/vobs/vob2 replica:backup_server@/vobs/vob2
    etc.
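With many VOBs, typing one chmaster command per replica is error-prone. The following sketch only prints the commands in the form shown above so that they can be reviewed before anything is run; print_self_master_cmds is a hypothetical helper, and the replica name and VOB tags are placeholders.

```shell
#!/bin/sh
# print_self_master_cmds REPLICA VOB_TAG...   (hypothetical helper)
# Prints, but does not run, one chmaster command per VOB that would make
# the named replica self-mastering.
print_self_master_cmds() {
    replica=$1
    shift
    for vob_tag in "$@"; do
        sel="replica:${replica}@${vob_tag}"
        printf 'multitool chmaster %s %s\n' "$sel" "$sel"
    done
}
```

For example, `print_self_master_cmds backup_server /vobs/vob1 /vobs/vob2` prints two commands, which the VOB owner can inspect and then run.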
This step is especially important in the case of a failure on the primary server. If objects on the redundant server are not self-mastering, this can be corrected, but it takes time and effort. During a failure you do not want to spend valuable time readjusting mastership when that time would be better spent switching to the backup server.
What do you do in the event of a server failure?
Actions for the ClearCase administrator:
License server switch: If you have set up a DNS CNAME, change it to point to the backup license server. If this isn't an option you will need to inform all users to change their client configuration via the Licensing tab of the ClearCase Control Panel to use the backup license server instead. You will also need to make changes to your ClearCase release area so that new users automatically get set up to use the redundant server.
Replication switch: If you replicate with other locations, inform them to replicate to the redundant server.
Registry switch: The easiest way to switch the registry to the redundant server is to use the rgy_switchover utility, which all administrators should install. The v6.0 ClearCase Administrator's guide gives ample detail on the correct usage of this utility in the section Administering the ClearCase Registry > ClearCase Registry Backup and Switchover.
If the primary server becomes available again you can also use this utility to switch back to the primary server. If some client users are unreachable when you are running this command you will need to get those users to manually switch over to the new registry. This is covered in the section below "Actions for the ClearCase client users". Therefore record the name of any clients for which the switchover fails. Depending on how many systems you have available to you, it may be wise to set up a third server at this stage as the new backup registry server.
Object mastership transfer: Mastership of all objects mastered by the primary replica needs to be transferred to the redundant server so that developers can continue to modify those objects in the redundant server's replicas. Assuming the primary server is not salvageable, run the following command in a view context using the redundant replicas:
    multitool chmaster -all -obsolete_replica <primary_server> -long <backup_server>
Note: Read the man page for multitool chmaster before running this command, as the -obsolete_replica flag should be used with caution.
Actions for the ClearCase client users
If no DNS CNAME is set up, the clients will first need to switch to using the backup license server by editing the Licensing tab of the ClearCase Control Panel.
Those clients which were inaccessible while the ClearCase administrator was running the rgy_switchover utility will need to switch to the backup registry server manually if this option wasn't already configured in the Registry tab of the ClearCase Control Panel (/<install dir>/clearcase/rgy/rgy_hosts.conf on Unix).
All clients must be updated to use the backup replica regions (<BACKUP_WIN> and <BACKUP_UNIX> in our case) so that they use the backup server's VOB tags. On Windows, change the region in the client configuration via the Registry tab of the ClearCase Control Panel; on Unix, edit /<install dir>/clearcase/rgy/rgy_region.conf. Unfortunately, most if not all views that were using the primary replicas will not work with the backup replicas, so new views will need to be created.
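On Unix clients, the region change amounts to rewriting one small file, which can be scripted if many clients need it. The sketch below assumes rgy_region.conf holds just the region name on a single line; set_region is a hypothetical helper that keeps a copy of the previous file and does nothing else to the ClearCase installation.

```shell
#!/bin/sh
# set_region FILE REGION   (hypothetical helper, not part of ClearCase)
# Saves the current FILE as FILE.bak (if it exists), then writes REGION to FILE.
set_region() {
    file=$1 region=$2
    [ -n "$region" ] || { echo "region name required" >&2; return 2; }
    if [ -f "$file" ]; then
        cp -p "$file" "$file.bak" || return 1
    fi
    printf '%s\n' "$region" > "$file"
}
```

For example, `set_region "/<install dir>/clearcase/rgy/rgy_region.conf" "<BACKUP_UNIX>"` switches a client to the backup region, and the .bak copy makes it easy to switch back once the primary server returns.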
What about backing up other system files?
It is also a good idea to put in place a system to back up other system files on a regular basis, in particular files related to user access, Samba, and network configuration, as well as any trigger scripts you may be using. In our case, on a Linux system, we regularly back up files in the /var, /usr, /boot, and /home directories as well as key files in the /etc directory.
Benefits of this backup system
In the event of a server failure, the system outlined above should have you operational again within a very short space of time. While it is expensive to purchase and maintain the redundant server, the system described in this article has very little ongoing overhead once it is set up. In most cases its cost is easily offset by the potential cost of a ClearCase server failure. If you haven't planned for a server failure, it will take a considerable amount of time to configure and set up a second ClearCase server from scratch. The costs of a server failure include not only the potentially high cost of lost development time, but also the loss of changes not yet replicated to other sites and of changes not yet checked in from local views. Indirect costs include losses to other functions, such as deployment and testing teams, which will no longer be able to track the change sets being delivered into different builds.
As with any backup system containing mission-critical data, you should schedule periodic audits of the system to make sure it is functioning as expected. You don't want to wait until a failure to discover that replications haven't been happening for a month because of a network issue or some other obscure system issue that was otherwise undetected.
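One simple audit that can be scheduled is to look for packets that have been sitting unprocessed in a shipping bay for longer than expected. This is a sketch under our own assumption that leftover files in the incoming bay indicate stalled imports; count_stale_packets is a hypothetical helper, and the -mmin test requires a find that supports it (GNU find on Linux does).

```shell
#!/bin/sh
# count_stale_packets DIR MINUTES   (hypothetical helper)
# Prints the number of regular files under DIR last modified more than
# MINUTES minutes ago.
count_stale_packets() {
    dir=$1 minutes=$2
    [ -d "$dir" ] || { echo "no such directory: $dir" >&2; return 2; }
    # -mmin +N matches files older than N minutes; tr strips any padding
    # that wc adds on some platforms.
    find "$dir" -type f -mmin "+$minutes" | wc -l | tr -d ' '
}
```

For example, a cron job could run `count_stale_packets "<install dir>/shipping/ms_ship/incoming" 60` and send mail to the administrator whenever the result is not 0.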
While CCWeb from a remote site may be used in the short term in the event of a failure, you will still lose development time, especially if you need to set up users on the remote system. CCWeb also does not have the same richness of functionality as the ClearCase client, although the new ClearCase Remote Client improves on this significantly. Rational also provides a number of other ways of restoring ClearCase archives via the operating system; however, these depend on a system already being set up.
Stephen Davies is a ClearCase administrator working in the Dublin Software Lab in IBM in Ireland. Having worked in a small software company for 4 years, he joined IBM Lotus in 1999. He has worked on a wide range of IBM Lotus products including Notes/Domino, Quickplace, Domino.doc and Sametime for Windows, Unix and Mac platforms. He is a certified Notes developer and a certified Solaris administrator, but now his preferred O.S. is Linux. Stephen is a graduate in Computer Engineering from the University of Limerick and has also recently completed a M.Sc. in Technology Management at the N.I.T.M. in University College Dublin.

Joel Abbott is a Senior Software Engineer working on IBM Lotus Sametime and IBM Lotus Workplace products where he leads Configuration Management and Release Engineering teams distributed around the world. He has been in the software industry since 1989 where he has had numerous technical and managerial roles. Joel is a graduate from the University of Kentucky and splits his time between Lexington, KY and Dublin, Ireland.




