The IBM® Rational® ClearCase MultiSite® Global Monitor feature is a monitoring tool that was released as part of ClearCase MultiSite Version 7.1 (for more information on this product, see the Resources section). It enables you to see your global ClearCase deployment from a single view point in a Web-based interface. It also provides a single method of notification, as well as customizable thresholds. For example, you can receive a text message on your cell phone when the system meets your customized thresholds. Global Monitor also provides notification for several immediately available events, such as the IBM® Rational® ClearCase® ALBD (Atria Location Broker Daemon) server going down, or a scheduled job failure.
What can you find in the Global Monitor console? You can find Rational ClearCase version numbers, VOBs (Versioned Object Bases), views, scheduled jobs, ClearCase service logs, and more. As a ClearCase MultiSite administrator, you can also find replicas, feature levels of replicas, epoch numbers, and synchronization packets in incoming and outgoing bays. Again, you can configure the monitoring system to notify you when specific conditions on the monitored data are met. Figure 1 is a screen capture of the Global Monitor console that shows replica synchronization packets in the incoming and outgoing bays.
Figure 1. Replica synchronization packets
You can find the path to the packet file, the type of packet (incoming or outgoing), and the age of the packet (how long the packet has been in the bay). When you navigate to the detailed view of a packet by clicking the blue link button at the left, you find more details about the packet, such as the originating VOB tag, replica name, oplog IDs, or the packet fragment number.
The user interface (UI) provides a logical navigation tree of your global Rational ClearCase deployment. ClearCase hosts are grouped by ClearCase MultiSite site and ClearCase region. When the system finds an issue in your ClearCase deployment, it also provides context-sensitive help documents called Expert Advice, which suggest possible solutions. The Global Monitor tool provides you with flexible deployment options, sufficient to pass through firewalls with limited data flow on one or two ports. In addition, it supports reporting tools such as the open source Eclipse BIRT (Business Intelligence and Reporting Tools) project or the IBM® Tivoli® Common Reporting tool.
The Global Monitor feature uses the IBM® Tivoli® Monitoring tool (also called ITM), which is a market-leading enterprise monitoring product (for more information on this product, see the Resources section). IBM Tivoli Monitoring is bundled with ClearCase MultiSite. Global Monitor's centralized user interface console is also provided by IBM Tivoli Monitoring, and it is called IBM® Tivoli® Enterprise Portal, or TEP.
One of the great features of the Global Monitor is its alert system which is called the ITM situation (for more information on this feature, see the Resources section). This document describes how to apply the ITM situations to check the status of the MultiSite synchronization of your deployment, but let's take a moment to learn what the ITM situation is and how to use it. If you are already familiar with the ITM situation, you can skip this section.
The event notification system of IBM Tivoli Monitoring is called a situation. It is one of the powerful features that ITM provides, and it is highly customizable. You can define your formula to trigger a situation event. You can create a situation formula on any monitored data, and it will be evaluated at specific intervals, which you can also customize. When a situation event fires, you can easily find it in the centralized console. Also, you can associate a script with a situation that is executed when the event fires. For example, an e-mail notification or text message on your cell phone can be sent by a script that is executed when a situation event fires.
The Global Monitor has the following predefined situations shown in Table 1:
Table 1. Global Monitor predefined situations
| Name | Auto start | Description |
|---|---|---|
| KRC_albddown | yes | The albd_server on this host is down |
| KRC_f_level_mismatch | no | The replica feature level is not supported on this host |
| KRC_failed_job | no | A ClearCase scheduled job has failed |
| KRC_family_f_level_low | no | The family feature level can be raised |
| KRC_family_f_level_too_high | no | The family and replica feature levels are not supported on this host |
| KRC_inbay_too_long | no | A packet has been in the shipping bay longer than expected |
| KRC_pool_space | no | The device hosting pool space has filled up beyond the configured limit |
| KRC_replica_f_level_low | no | The replica feature level can be raised |
| KRC_replica_f_level_too_high | no | The replica feature level is not supported on this host |
| KRC_replica_f_level_unknown | no | The replica feature level is unknown |
| KRC_rollback_critical | no | Epoch rollback detected: the VOB is not locked and not undergoing restorereplica |
| KRC_rollback_info | no | The replica is being restored with restorereplica |
| KRC_rollback_warning | no | Epoch rollback detected: the VOB is locked, but not undergoing restorereplica |
| KRC_shipping_bay_space | no | The device hosting bay space has filled up beyond the configured limit |
| KRC_updater | yes | An internal situation to run the Global Monitor cache updater |
| KRC_updater_log | yes | The Global Monitor cache updater has failed |
| KRC_view_space_low | no | The device hosting view space has filled up beyond the configured limit |
As you can see in the Auto start column, many of these are not enabled by default, which leaves plenty of room for customization. If you install the Operating System agent on your monitored host, you can also find many situations provided by default. In addition, you can define customized situations based on either ClearCase monitored data or Operating System level data.
When you configure a situation, you will find a panel like that shown in Figure 2:
Figure 2. KRC_albddown situation
The panel consists of 5 tabs at the top. The first, the Formula tab, provides a user interface to edit your formula for the event. You will see how to edit a formula in the next section. You can also edit the sampling interval in this tab.
The Distribution tab is where you configure the host or agent on which you would like to evaluate a situation. By default, all Global Monitor agents are selected.
The Expert Advice tab is where you provide text or a link to a help document. The Global Monitor system will display the help content when you navigate to the detailed panel of a situation.
On the Action tab (shown in Figure 3), you can specify a script or operating system command to be executed when the situation fires.
Figure 3. The Action tab of the KRC_albddown situation
The monitored agent data can be included in the argument of the system command. The script can be executed either on the monitoring server or on the agent host. A typical example is to enter a command in the System Command field to send an e-mail or text message. You can add arguments in the command field to provide details of the system status, such as the host name of the machine that the events fired on, the event's severity, and so on.
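As an illustration of the kind of script you might name in the System Command field, here is a minimal sketch in Python. It assumes that the situation name, host name, and severity are passed as three command-line arguments, and that a mail relay is reachable on localhost. The script name, argument order, and e-mail addresses are all hypothetical and are not defined by Global Monitor or ITM.

```python
import smtplib
import sys
from email.message import EmailMessage


def build_alert(situation, host, severity, sender, recipient):
    """Compose a plain-text alert from values passed in by the caller."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = f"[{severity}] {situation} fired on {host}"
    msg.set_content(
        f"Situation: {situation}\nHost: {host}\nSeverity: {severity}\n"
    )
    return msg


# Hypothetical invocation: python notify.py <situation> <host> <severity>
if __name__ == "__main__" and len(sys.argv) == 4:
    alert = build_alert(
        sys.argv[1], sys.argv[2], sys.argv[3],
        sender="monitor@example.com",      # assumed address
        recipient="cc-admin@example.com",  # assumed address
    )
    # Assumes an SMTP relay listening on localhost.
    with smtplib.SMTP("localhost") as smtp:
        smtp.send_message(alert)
```

Keeping the message construction in its own function makes it easy to check the wording without sending any mail.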
On the Until tab, you can customize when to close a situation event.
How to define and run situations
You can manage situations using the Situation Editor. First, let's look at a predefined situation, KRC_view_space_low, to examine its logical expression. On the Tivoli Enterprise Portal, type Ctrl+E or click the icon composed of blue and red dots in the toolbar shown in Figure 4 to open the Situation Editor.
Figure 4. Open the Situation Editor
The Situation Editor is displayed. Expand the ClearCase node in the navigator pane. You will see the predefined situations for the Global Monitor tool. Select the KRC_view_space_low situation, as shown in Figure 5.
Figure 5. The Situation Editor
The right frame of the Situation Editor shows tabbed editing areas (Formula, Distribution, and so on). You can see the logical expression, Usage Percentage > 95, in the Formula tab. Usage Percentage refers to the Usage_Percentage attribute of the KRC_VIEW_SPACE query, which reports the view storage usage obtained from the cleartool space -view command. This formula specifies that the KRC_view_space_low situation fires if the view storage usage exceeds 95%.
Now let's define a sample situation and run it. The Global Monitor tool has a predefined situation that alerts you when the view storage is running low, but there is no such situation for VOB storage space, so you can create a vob_space_low situation as an example.
Open the Situation Editor, and then click the Create new Situation button shown in Figure 6.
Figure 6. Create new situation
In the Create Situation dialog box, set Name to vob_space_low, and Monitored Application to ClearCase, as shown in Figure 7.
Figure 7. Create vob_space_low situation
Click OK to close the dialog. You will now see the Select condition dialog. Because the new situation is for monitoring VOB space, set Attribute Group to KRC_VOB_SPACE and select the Percent Used Attribute Item, as shown in Figure 8. Like the KRC_VIEW_SPACE query, the KRC_VOB_SPACE query obtains its values from a cleartool space command, in this case cleartool space -vob.
Figure 8. Select the condition for vob_space_low situation
In the Formula tab of the Situation Editor, set v (Value of expression), > (Greater than) and 95, respectively, as shown in Figure 9. You can also change the sampling interval here. The default value of the sampling interval is 15 minutes, but if you feel that is not appropriate for the new situation, just change it.
Figure 9. Definition of vob_space_low situation : Formula
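To make the formula concrete, the following sketch evaluates the same condition (Percent Used greater than 95) against a few sampled rows. ITM evaluates the real formula for you at each sampling interval. The VobSpaceRow class, the function name, and the sample values are hypothetical and serve only to illustrate the comparison.

```python
from dataclasses import dataclass


@dataclass
class VobSpaceRow:
    """One sampled row: a VOB tag and its storage usage in percent."""
    vob_tag: str
    percent_used: float


def rows_over_threshold(rows, threshold=95.0):
    """Return the VOB tags whose usage is strictly greater than the threshold."""
    return [row.vob_tag for row in rows if row.percent_used > threshold]


sample = [
    VobSpaceRow("/vobs/alpha", 97.2),
    VobSpaceRow("/vobs/beta", 95.0),
    VobSpaceRow("/vobs/gamma", 41.5),
]
print(rows_over_threshold(sample))
```

Because the operator is Greater than rather than Greater than or equal, a VOB at exactly 95% does not match.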
Open the Distribution tab and set Assigned to *CLEARCASE, as shown in Figure 10. The *CLEARCASE item represents all of the Global Monitor agents that are connected to the ITM system. You can choose specific agents if you would like to evaluate the situation on specific hosts.
Figure 10. Definition of vob_space_low situation : Distribution
Click the OK button to close the Situation Editor. The definition of the new situation, vob_space_low, is now completed. Now you need to associate the vob_space_low situation with a navigator node. In the Navigator view, select a navigator node, ClearCase > VOBs, right-click, and select the Situations menu item. You will see the Situation Editor that only displays Situations for - VOBs. Click the Set Situation filter criteria button, as shown in Figure 11.
Figure 11. Set situation filter criteria
The Show Situations dialog is displayed. Select the Eligible for Association check-box, and then click OK. All of the situations in the Global Monitor tool, including vob_space_low, are now listed. Select vob_space_low, and set State to Warning.
The setting for the new situation, vob_space_low, has been completed. When the specified conditions are matched (that is, KRC_VOB_SPACE.Percent_Used > 95), the situation is fired, as shown in Figure 12. In this example, lots of elements are checked in to the VOB, so pool space exceeds 95%. Note that you may need to run the space command (shown in Listing 1) or execute the Daily VOB Space scheduled job on the ClearCase host to report correct storage usage.
Listing 1. Command line to update VOB space
> cleartool space -gen -vob {vob_tag}
Figure 12. The vob_space_low situation is fired
You can confirm that the new situation is actually started by using the Manage Situation dialog. Select the ClearCase node in the Navigator view, right-click, and then select the Manage Situations menu item. You see the Manage Situations dialog, as shown in Figure 13, and can verify that vob_space_low is currently started and opened.
Figure 13. vob_space_low situation on Manage Situations dialog
Now that you are familiar with the ITM situation, you can learn how to monitor your ClearCase MultiSite synchronization. How do you know if your synchronization is in trouble? The most common symptom of a synchronization error is that packet files pile up in a shipping bay. The Global Monitor system collects packet file information so that you can monitor your synchronization status in your shipping bays.
Update packets are accumulating in an incoming shipping bay.
The problem is that synchronization update packets for a particular replicated VOB are accumulating in a shipping bay, and are not being imported. In general, this is caused by a syncreplica import failure. For example, if a packet has been lost in transit to the target host, subsequent packets will fail to be imported because they depend on changes (oplogs) that the target importer has not yet received. An import can also fail if the VOB is locked, and packets would again accumulate in the incoming shipping bay.
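The dependency between packets can be pictured with a simplified model: the importer knows the last oplog it has applied, and each waiting packet covers a contiguous range of oplog IDs. The sketch below finds the first missing range under that model. It is an illustration only, not the algorithm that multitool uses. The function name and the tuple representation of a packet are assumptions.

```python
def find_oplog_gap(last_imported, packet_ranges):
    """Return the first missing (start, end) oplog range, or None if there is no gap.

    last_imported: highest oplog ID already applied at the importer.
    packet_ranges: (first_oplog, last_oplog) pairs for the packets in the bay.
    """
    expected = last_imported + 1
    for start, end in sorted(packet_ranges):
        if start > expected:
            # Everything from `expected` up to `start - 1` never arrived.
            return (expected, start - 1)
        expected = max(expected, end + 1)
    return None


# The packet carrying oplogs 1523-1524 was lost, so later packets cannot be applied.
print(find_oplog_gap(1522, [(1525, 1530), (1531, 1540)]))
```

In this model, every packet after the gap stays in the bay until the missing range is delivered, which matches the accumulation described above.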
How to detect synchronization problems using Global Monitor
Enable the Global Monitor KRC_inbay_too_long situation, and customize the threshold to a value that makes sense for your business. The Global Monitor system collects all of the packet information, and also detects how long the packet file sits in a shipping bay. If you display the Family Health workspace (select Workspace > Family Health, as shown in Figure 14), you can find all of the packet files and durations at the shipping bay (as shown in Figure 1).
Figure 14. Family_Health workspace on ClearCase > MultiSite node
For example, if your scheduled syncreplica import job runs every 10 minutes, you may want to detect packets that remain in the bay longer than an hour. If the situation fires, you can use the Global Monitor feature to confirm failure of a multitool syncreplica -import command. Perform the following steps to enable the KRC_inbay_too_long situation:
- Open the Tivoli Enterprise Portal client and log in to the monitoring system.
- In the Navigator view, select a navigator node, ClearCase.
- Right-click it and select Manage Situations.
- In the Manage Situation at Managed System:<hostname> dialog box, right-click KRC_inbay_too_long and select Edit Situation to customize the formula of the situation, or the action when it fires (Figure 15).
- Click OK to close the Situations for - Situation dialog box when you finish the customization.
- To start the situation, select KRC_inbay_too_long, right-click, and select Start Situation (Figure 16).
- Close the Manage Situation at Managed System:<hostname> dialog.
Figure 15. Customize KRC_inbay_too_long situation
Figure 16. KRC_inbay_too_long situation on Manage Situation dialog
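The idea behind the threshold can be sketched outside of ITM as well. The following example lists files in a directory that are older than a given age, using the file modification time as a stand-in for the time the packet arrived. That stand-in, the function name, and the directory you pass in are assumptions. Global Monitor collects the real packet age itself.

```python
import os
import time


def packets_older_than(bay_dir, max_age_seconds, now=None):
    """Return (path, age_in_seconds) for files older than the limit, oldest first."""
    now = time.time() if now is None else now
    stale = []
    for entry in os.scandir(bay_dir):
        if entry.is_file():
            age = now - entry.stat().st_mtime
            if age > max_age_seconds:
                stale.append((entry.path, age))
    return sorted(stale, key=lambda item: item[1], reverse=True)
```

With an import job that runs every 10 minutes, calling packets_older_than(bay_dir, 3600) corresponds to the one-hour threshold suggested above.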
You must correct the issue that has caused the initial import failure. For instance, if you determine that the VOB was locked, you must unlock it.
If you determine that a packet was lost in transit, you can detect the missing oplogs at the importing host, and then create a packet at the exporting host that includes only the missing oplogs. Three command-line switches were added to the syncreplica command in V7.1.1 for this purpose, as shown in Table 2.
Table 2. New command line switches of syncreplica
| Switch name | Type | Explanation |
|---|---|---|
| -oprange | Export | You can specify a range of oplogs to export in a packet. When you find that one packet is missing for some reason, but all of the subsequent packets were transmitted successfully, you can create the missing packet again by running an export command with this -oprange switch. |
| -endrange | Export | This switch adds the end oplog information to the export packet. The end oplog information is displayed in the lspacket command output, and is used by the -diagnose switch described next. |
| -diagnose | Import | When you add this switch to an import command, packets are not actually imported but are surveyed to determine whether there is a gap of oplogs that would cause an import failure. If it detects a gap, it outputs a message about the gap like that shown in Listing 2. |
To detect the missing oplogs, you can run multitool syncreplica -import -diagnose at the failing importer. Note that -diagnose parses only packets that have been created using the -endrange switch.
Listing 2. Sample command line to diagnose missing packet
> multitool syncreplica -import -diagnose -receive
Suggested Export Replica "original@/vobs/testRep1"
multitool syncreplica -export -endrange -oprange original=1523:1524 testRep1
Create and send the packet again by running the multitool syncreplica -endrange -oprange ranges_suggested_by_-diagnose command at the suggested exporter, as shown in Listing 3.
Listing 3. Sample command line to create a packet that includes specific oplogs
> multitool syncreplica -export -endrange -ship -oprange original=1523:1524 testRep1@/vobs/testOrg1
Generating synchronization packet /var/adm/rational/clearcase/shipping/ms_ship/outgoing/sync_original_2010-01-05T175206-0500_5288
Shipping order "/var/adm/rational/clearcase/shipping/ms_ship/outgoing/sync_original_2010-01-05T175206-0500_5288" generated.
At the importer, run the multitool syncreplica -import command again.
Another possible source of ClearCase MultiSite synchronization failure is your export or import scripts, which are executed by the ClearCase scheduler. The Global Monitor system collects information about scheduled jobs so that you can be alerted to any job failure. For example, the predefined KRC_failed_job situation notifies you if any scheduled job finishes with an error code. This section explains how to create new situations, especially those that detect problems in replica synchronization.
How can the Global Monitor feature detect problems with scheduled jobs?
Commands for replica synchronization are typically executed by scheduled jobs. ClearCase provides some predefined jobs (for example, "Daily MultiSite Export", shown in Figure 17 as Job ID 12). This specific job is provided for the exporting phase of the syncreplica command, and it is the job that creates and sends packets to the shipping server at the receiving replica. For the importing phase, "Daily MultiSite Receive" (Job ID 14 in Figure 17) is the job that receives packets sent from the originating server. You can schedule these jobs to run periodically (or just once) by using the ClearCase administrator console. If jobs do not run as scheduled, replicas in a VOB family are left unsynchronized. Therefore, it's important for ClearCase administrators to detect whether jobs run correctly.
Figure 17. Predefined scheduled jobs (ClearCase)
Global Monitor collects values of job properties on each ClearCase host, and provides functions to create situations made up from those job properties. To create situations for jobs, you need to perform the following steps:
- Open the Tivoli Enterprise Portal client and log in to the monitoring system.
- In the Navigator view, select a navigator node (in this example, the Jobs node under the ClearCase node).
- Right-click and select Situations.
- In the Situations for - Jobs dialog, right-click the ClearCase node, and select Create New.
- Input a name in the Name field, and click OK.
- With KRC_JOBS highlighted, select the attributes that you need from the Attribute Item list, and click OK (as shown in Figure 18).
- Set conditions for each attribute selected at the previous step.
Figure 18. Select attribute group and items.
Table 3 shows the attributes of KRC_JOBS in detail. Situations for jobs are created from combinations of these attributes. Note that the attributes Last Finished Timestamp, Last Started Timestamp, and Running Started Timestamp were added in Version 7.1.1.
Table 3. KRC_JOBS attributes in detail
| Attribute name | Explanation | Type |
|---|---|---|
| ID | This attribute is the id for the job created by ClearCase. | Text |
| Job Description | This attribute describes what the job does. | Text |
| Job Name | This attribute is the name for the job created by ClearCase. | Text |
| Last Finished/Last Finished Timestamp | These attributes are the time when the last job finished. If no jobs have been executed, these attributes are left blank. | Text/Timestamp |
| Last Started/Last Started Timestamp | These attributes are the time when the last job started. If no jobs have been executed, these attributes are left blank. | Text/Timestamp |
| Node | The format of this attribute is <hostname>:<agent code>. | Text |
| Running Started Timestamp | This attribute is the time when the current running job started. It is only set while the job is running. | Timestamp |
| Status | This attribute is the returned state of the job. | Text |
| Timestamp | This attribute is the time when ITM collected the data from the agent. | Timestamp |
Misconfiguration of scheduled jobs
Sometimes scheduled jobs are stopped for some other troubleshooting (like restoring a VOB), and you may forget to restart them. In this case, those jobs would not have run for a long time, so you can use the Last Finished Timestamp attribute in the situation formula because the value of the Last Finished Timestamp attribute reflects the time when those jobs last finished running.
Suppose that exporting and importing jobs are executed once an hour. Then the KRC_not_run_job situation should be fired when the value of Local_Time.Timestamp exceeds the value of Last Finished Timestamp by about one hour plus the typical time of executing the jobs. (Local_Time.Timestamp is the local time when ITM collected the data from the agent.)
These are the steps that you should follow to create the situation KRC_not_run_job, as shown in Figures 19 and 20.
- In the Select condition dialog (shown in Figure 18):
  - Select KRC_JOBS from Attribute Group.
  - Select ID and Last Finished Timestamp (hold Ctrl while clicking) from Attribute Item.
  - Click OK.
- Select the first cell of the ID column in the Formula section, and input the target Job ID.
  - Set the value to 12 to identify the syncreplica export job.
  - Set the value to 14 to identify the syncreplica import job.
- Select the next cell in the Last Finished Timestamp column in the Formula section.
  - Click the left icon, and select Compare Time to a time + or - delta.
  - In the Select Time Comparison Criteria dialog, select Local_Time.Timestamp from Time Attribute for Comparison, and -, 70 (this value should be customized according to the interval and the typical execution time of jobs in your environment), Minutes from Time Delta. The formula will be: Last Finished Timestamp < Local_Time.Timestamp - (interval and execution time).
    - The time difference should be considered if ITM and the agent run in different time zones.
    - The formula including the time difference will be: Last Finished Timestamp < Local_Time.Timestamp - (interval and execution time) - {(ITM time zone) - (agent time zone)}.
    - For example, when ITM and the agent run in the UTC+9 and UTC-5 zones respectively, the added part in the formula including the time difference is calculated as: - {9h - (-5h)} = -14h.
  - Click OK.
  - Click the middle icon, and select Less than.
- Set 10 minutes as the value in the Sampling interval.
- Click OK.
Figure 19. Create KRC_not_run_job at the exporting host
Figure 20. KRC_not_run_job fires at the exporting host when the job is stopped incautiously
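The comparison built in the steps above, including the time zone correction, can be worked through with a short sketch. It assumes that both timestamps are naive wall-clock values (the last finished time in the agent's zone, the local time in the ITM zone) and that the zones are given as whole-hour UTC offsets. The function and parameter names are hypothetical. ITM evaluates the real formula itself.

```python
from datetime import datetime, timedelta


def job_not_run(last_finished, itm_local_time, delta_minutes,
                itm_utc_offset_hours=0, agent_utc_offset_hours=0):
    """Evaluate: last_finished < local_time - delta - (ITM zone - agent zone)."""
    zone_difference = timedelta(hours=itm_utc_offset_hours - agent_utc_offset_hours)
    threshold = itm_local_time - timedelta(minutes=delta_minutes) - zone_difference
    return last_finished < threshold


# ITM runs in UTC+9 and the agent in UTC-5, so the correction is 14 hours.
itm_now = datetime(2010, 1, 6, 9, 0)  # 09:00 in UTC+9 is 19:00 the previous day in UTC-5

# Job finished at 18:30 agent time, 30 minutes ago: within the 70-minute allowance.
print(job_not_run(datetime(2010, 1, 5, 18, 30), itm_now, 70, 9, -5))

# Job finished at 17:00 agent time, 2 hours ago: the situation would fire.
print(job_not_run(datetime(2010, 1, 5, 17, 0), itm_now, 70, 9, -5))
```

Without the 14-hour correction, the first job would be reported as overdue even though it finished only 30 minutes earlier.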
The KRC_not_run_job situation can also be applied when you move VOBs (for more information on this, see the Resources section). VOBs are sometimes moved to a new machine along with job histories and some other registry information. Before moving VOBs, scheduled jobs for synchronization must be stopped, but sometimes you may forget to restart them after the move is finished. If you apply the situation to the destination host before moving the VOBs, the KRC_not_run_job situation will fire if you have forgotten to restart the scheduled jobs, as shown in Figure 21.
Figure 21. KRC_not_run_job fires at the host to which the VOB is being moved
Sometimes it takes more time to export or import a large number of oplogs than the interval between the previous and next sessions of a scheduled syncreplica job, and the currently running session of the job will block the next session of the same job. For example, packets which arrive during a long-running instance of job 14 are imported at its next running, not during the current one.
In this case, the job has been running for a long time, and the value of the Running Started Timestamp attribute remains the same as when the job started running. If the exporting and importing syncreplica jobs are executed once an hour, the KRC_long_run_job situation should fire when the value of Local_Time.Timestamp exceeds the value of the Running Started Timestamp attribute by about one hour, as shown in Figures 22 and 23.
The steps to create the situation KRC_long_run_job are the same as for the KRC_not_run_job situation, except for selecting the Running Started Timestamp attribute, not the Last Finished Timestamp attribute.
Figure 22. Create the KRC_long_run_job situation on a host
Figure 23. The KRC_long_run_job situation fires on the host when the syncreplica import job runs too long
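The long-running check differs from the previous one mainly in that the Running Started Timestamp attribute is set only while a job is running. A minimal sketch of that condition follows. It assumes that an unset attribute is represented as None and that both timestamps are in the same time zone. The function name is hypothetical.

```python
from datetime import datetime, timedelta
from typing import Optional


def job_running_too_long(running_started: Optional[datetime],
                         local_time: datetime,
                         limit_minutes: int) -> bool:
    """Return True if a running job started more than limit_minutes ago."""
    if running_started is None:
        # No job is currently running, so there is nothing to report.
        return False
    return running_started < local_time - timedelta(minutes=limit_minutes)


now = datetime(2010, 1, 5, 18, 0)
print(job_running_too_long(datetime(2010, 1, 5, 16, 30), now, 60))
print(job_running_too_long(None, now, 60))
```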
The ClearCase MultiSite Global Monitor feature allows you to see global ClearCase deployment from a single view point in a Web-based interface. Global Monitor uses IBM Tivoli Monitoring (ITM) to provide customizable thresholds to monitor generic events (for example, the ALBD server going down, a scheduled job failure, and so on), and to provide a single method of notification. As examples of typical use cases, this article described how to leverage ITM situations to monitor specific conditions (for example, running low on space on the VOB storage device). This article also explained how well the Global Monitor feature can be applied to ClearCase MultiSite deployment to monitor typical issues of replica synchronization. Timely and appropriate information provided by the Global Monitor feature helps you take actions quickly to recover from ClearCase and ClearCase MultiSite issues.
The authors would like to thank Takehiko Amano for technical advice.
Learn
- Key Rational ClearCase resources:
- For technical information and other resources for developers, see the Rational ClearCase product page
- Visit the Rational ClearCase area on developerWorks for technical articles and tutorials, as well as links to other information.
- See the Information Center for documentation, tutorials, and other technical information.
- Join the ClearCase discussion forum to exchange information with your peers.
- Find answers to terminology questions in the ClearCase Glossary.
- Explore the Software Configuration Management Information Center to learn how features such as local and remote access, a proven use model, a wide range of supported environments, transparent access to files, and parallel development support give your team members instant, controlled access to the project assets that they need to create, update, build, reuse, and maintain your software.
- Learn about other applications in the IBM Rational Software Delivery Platform, including collaboration tools for parallel development and geographically dispersed teams, plus specialized software for architecture management, asset management, change and release management, integrated requirements management, process and portfolio management, and quality management. You can find product manuals, installation guides, and other documentation in the IBM Rational Online Documentation Center.
- Visit the Rational software area on developerWorks for technical resources and best practices for Rational Software Delivery Platform products.
- Explore Rational computer-based, Web-based, and instructor-led online courses. Hone your skills and learn more about Rational tools with these courses, which range from introductory to advanced. The courses on this catalog are available for purchase through computer-based training or Web-based training. Additionally, some "Getting Started" courses are available free of charge.
- Subscribe to the IBM developerWorks newsletter, a weekly update on the best of developerWorks tutorials, articles, downloads, community activities, webcasts and events.
- Learn more about IBM® Tivoli® Monitoring
- The Creating custom situations (IBM Tivoli Monitoring infocenter) page provides detailed information about how to create a custom situation.
- A way to create a sync packet containing a specified range of oplog without chepoch shows you how to use the new switches of the multitool syncreplica command in Version 7.1.1.
- The Moving VOBs (IBM Rational ClearCase infocenter) page provides detailed information about how to move VOBs.
Get products and technologies
- Download trial versions of other IBM Rational software.
- Download IBM product evaluation versions and get your hands on application development tools and middleware products from DB2®, Lotus®, Tivoli®, and WebSphere®.
Discuss
- Join the ClearCase discussion forum to exchange information with your peers.
- Check out developerWorks blogs and get involved in the developerWorks community.

Ken Kumagai is a software engineer at the Software Development Laboratory in Yamato (YSL), IBM Japan. He currently works on the IBM Rational ClearCase Global Monitoring team as a developer. One of his current interests is distributed software configuration management. He admires such extensible application platforms as Emacs and Firefox. In his spare time, he enjoys reading books at the nearest Starbucks coffee shop.

Yoshio Horiuchi is a software engineer in the Software Development Laboratory in Japan. He currently works on IBM Rational ClearCase Global Monitor.





