Comparing and merging UML models in IBM Rational Software Architect: Part 9. Model management in Rational Software Architect with IBM Rational ClearCase Remote Client and dynamic views

Massive models can be difficult to work with in IBM® Rational® Software Architect. New technologies such as the IBM Rational ClearCase® Remote Client, logical model fragmentation, and sparse model merging can dramatically improve the usability of large models by reducing the working set size (and thus the memory footprint). This improves performance for every user, and allows massive models to be modified and even merged on much smaller client machines than could have been used previously. This article documents these features and is a resource for anyone who works with large UML models.


Kim Letkeman (kletkema@ca.ibm.com), Senior Technical Staff Member, IBM Rational

Kim led the team that developed the model management solution for the IBM Rational Software Architect family of products, and has worked with customers for years to ensure their success with model-driven development. Kim's current responsibilities include model management for the Rational Software Architect product family and the standardization of a concrete syntax for the UML action language, for which he is a coauthor.



24 August 2010

Prerequisites

Detailed coverage of IBM® Rational® ClearCase® functionality, IBM® Rational® Software Architect integration issues, snapshot views, and dynamic views is beyond the scope of this article. For a discussion of ClearCase and issues surrounding its integration with Rational Software Architect, please read the article Comparing and merging UML models in IBM Rational Software Architect: Part 5 - Model management with IBM Rational ClearCase and IBM Rational Software Architect Version 7 and later by this author. Note that this article supersedes that article in areas where the information might conflict.


Introduction

The Rational Software Architect family of products has a strong integration with ClearCase. Rational Software Architect and ClearCase are integrated at the legacy application level, where ClearCase applications like the TreeView (version tree browser) are able to launch a compare or merge of versions of an artifact, and an Eclipse-based compare editor will open in a running instance of Rational Software Architect.

If there is no running instance of Rational Software Architect, the designated instance of Rational Software Architect with the designated auto-launch workspace will be launched automatically, and the compare or merge session loaded when the launch is complete. All of this happens without any significant intervention required of you. ClearCase and Rational Software Architect can be installed in any order and the integration remains automatic and seamless.

Further, there are two plug-ins for ClearCase integration at the Eclipse user interface level.

The SCM Plug-in

The original plug-in is called the Rational ClearCase SCM Adapter (for Eclipse), hereafter referred to as the SCM adapter or SCM plug-in. SCM stands for software configuration management and the adapter was named for the interface library it uses to communicate with the ClearCase servers.

This plug-in requires that you install the ClearCase client alongside Eclipse (Rational Software Architect) in order to function. You can use it to queue checkouts and other commands, and to interface with any view type that legacy ClearCase commands understand (that is, with both snapshot views and dynamic views).

The CCRC Plug-in

The modern plug-in, on which future development is based, is called CCRC, which stands for the ClearCase Remote Client. This plug-in was developed due to the need for a client that functions well in a widely distributed development environment (more specifically, for development over the WAN).

CCRC enables team members to work at home over a VPN, or from a different city than the main servers. This means that you can treat a large, distributed development team homogeneously, instead of dividing the work by region or location. However, dividing the work might still be more convenient if your development is organized around components to address security concerns with VOBs, network issues related to IBM® Rational® ClearCase MultiSite® (replicated VOBs), or differences between local network and WAN speeds.

Traditional ClearCase clients and the SCM plug-in use a chatty remote procedure call (RPC) protocol to talk to the server, and this makes most commands interminable over the WAN. The author ran a test on a large VOB and found that the update command took 84 times as long over a WAN connection as over a LAN connection. Low-latency connections such as local area networks easily support this protocol, but something else is needed for WAN connections.

CCRC addresses the issue by using a web protocol for its specialized CCRC views. These views are server-based, and maintain a copy area (file contents) on the client machine: a sandbox, if you will.

But CCRC versions later than 2009 use new custom code for modern features such as logical model merge and findmerge, and these cannot (yet) operate on the traditional ClearCase snapshot and dynamic views.

Large enterprise ClearCase installations

Large enterprises have used ClearCase very effectively over the years by scripting customized behaviors and global development using ClearCase MultiSite.

Dynamic views are commonly used for several powerful features:

  • Instant-on, with no need to populate a local copy of the view. This can make a significant difference in environments where there is minimal to moderate change of massive data sets.
  • When you perform a checkin, individual artifacts are immediately propagated to all other dynamic views that share the same configuration specification (thus modifying the same set of artifact versions on a shared branch or stream in real time).
  • Authoritative audited builds generate detailed traceability information for each build target artifact (also called a derived object). This includes the version of each file referenced during the build, as well as the date, user, machine, and other context information.
  • Builds can detect if a build target artifact (for example, an object file) has already been built in another view (using exactly the same versions and build options). If so, the derived object is made visible to your view without you having to build it. This feature is useful, for example, in environments where there is a common build such as with C++ development, where a new object file can be propagated to the whole team in seconds, saving everyone else the build step.
  • You can host views on a server if you want (instead of on the client machine), so even checked out files are kept on a backed-up server. Large storage farms can be useful in an environment where each developer has a private stream used for isolation from the integration stream, improving efficiency and cost control.

The bottom line is that there are numerous reasons to use Dynamic Views (shown in Figure 1) in large organizations.

Figure 1. Dynamic views
SCM Server, VOBs, and distributed workstations

Figure 1 shows the relationship of a typical user to a dynamic view that resides inside a server, and displays artifacts inside a Versioned Object Base (VOB), which is ClearCase's word for a group of related artifacts that are versioned (and otherwise managed) as a unit.

In many cases, your enterprise develops many processes around the behaviors inherent in the dynamic view. Because it resides on the network and can be used to propagate changes to other views on the same VOB, there is strong incentive to use such behaviors.

Once a set of scripts is built to work over the network in this way, it can be quite difficult to recast them to look at local data (for example, the data that might exist in a snapshot view or CCRC view, on the local disk or in the Eclipse workspace). Also, when such changes are made, there is an understanding that the local data must always be synchronized before it is used, which introduces an extra step and the potential for errors. For example, there could be a local build that sees local changes but not related remote changes from others.

For the purposes of this article, the major issue with the CCRC Plug-in in an enterprise environment is that it cannot yet see dynamic views, which means that the logical model merge solution is unavailable, leaving a modeler to merge individual fragments and risk model corruption.


Logical models and merging

A logical model in Eclipse is essentially defined by the list of model artifacts returned by a model provider snap in, which is a small piece of software that is given an artifact and returns the total list of artifacts for the logical model to which that artifact belongs. Logical models are an Eclipse concept, and are therefore only relevant when you are working in a view that is visible to an Eclipse plug-in.
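To make the model provider idea concrete, here is a minimal sketch in Python. It is not the Eclipse or Rational Software Architect API: the file names, the `PARENT` map, and both function names are hypothetical, chosen only to illustrate "given one artifact, return every artifact in the same logical model."

```python
# Hypothetical containment data: each file maps to the file that contains it.
# A root (.emx) file has no parent. These names are illustrative only.
PARENT = {
    "model.emx": None,
    "pkg1.efx": "model.emx",
    "class1.efx": "pkg1.efx",
    "other.emx": None,
    "pkgA.efx": "other.emx",
}


def find_root(artifact, parent=PARENT):
    """Follow containment links upward until the root file is reached."""
    while parent[artifact] is not None:
        artifact = parent[artifact]
    return artifact


def logical_model(artifact, parent=PARENT):
    """Return every file that shares a root with the given artifact."""
    root = find_root(artifact, parent)
    return sorted(f for f in parent if find_root(f, parent) == root)


print(logical_model("class1.efx"))
```

Asking about any one fragment yields the whole logical model it belongs to, and files that belong to a different root are left out.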

Logical model structure

A logical model can take one of three shapes, as shown in Figure 2.

Figure 2. Logical model structures
three model configurations

They are, from left to right:

  • One file containing multiple logical models
  • One file containing one logical model
  • One logical model made up of many files

Rational Software Architect currently supports the second and third structures for a logical model. In other words, you can put the entire model into a single file or you can break up the model into separate files, called fragments.

Separating a model into fragments allows a massive model to be viewed and changed in environments that would normally be unable to open such a model. Figure 3 illustrates the mechanism at play.

A logical model is made up of a single root file with the .emx extension. This can be a package or a model, but in effect they are the same thing.

Figure 3. Logical model memory footprint
diagram shows root, packages, classes, and diagrams

The root can contain as many or as few elements as you want. You can separate each contained package into its own fragment with the .efx extension. You can further separate each class and diagram into self-contained fragments with the same extension.

Thus, a logical model is made up of one EMX file with from zero to an unspecified (but very large) number of EFX files.

Figure 3 shows what would happen if a user were to open the model in the project explorer, then drill into package 4 and open diagram 3. Diagram 3 references class 4, which is contained by package 3. All opened fragments must have a path in memory to the root, so all parent packages are opened until the root is reached. The memory footprint is now made up of the root plus package 4, class 4, and package 3. That's four fragments out of 9 in this example.

It is not uncommon to find large models with more than 5,000 fragments, yet the same small machine that had no trouble opening the diagram from Figure 3 would have no trouble opening the same diagram even if there were 4,991 fragments in addition to those shown in the example.
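The on-demand loading described above can be sketched as a walk from each opened fragment up to the root. The layout below is an assumption, since the figure itself is not reproduced here: nine fragments (a root, four packages directly under it, and four classes), with diagram 3 assumed to be stored inside the package 4 fragment so that the count matches the text. The `footprint` function is a hypothetical illustration, not product code.

```python
# Assumed layout for the Figure 3 example (illustrative, not from the product).
PARENT = {
    "root": None,
    "package 1": "root",
    "package 2": "root",
    "package 3": "root",
    "package 4": "root",
    "class 1": "package 1",
    "class 2": "package 2",
    "class 3": "package 2",
    "class 4": "package 3",
}


def footprint(opened, parent):
    """Return the set of fragments in memory after opening the given ones.

    Every opened fragment needs a path to the root, so each ancestor
    is loaded as well.
    """
    loaded = set()
    for fragment in opened:
        while fragment is not None and fragment not in loaded:
            loaded.add(fragment)
            fragment = parent[fragment]
    return loaded


# Opening diagram 3 loads package 4 (assumed to hold the diagram)
# and class 4 (referenced by the diagram).
in_memory = footprint(["package 4", "class 4"], PARENT)
print(sorted(in_memory), len(in_memory), "of", len(PARENT))
```

Because only the opened fragments and their ancestors are visited, adding thousands of unrelated fragments to `PARENT` leaves the result unchanged.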

Fragment merging

There are internal relationships between the fragments in order for the model provider to be able to follow a path back to the root and find child fragments that have a containment relationship (that is, owned children that do not reside in the same file). Yet each fragment can still be treated as an individual file. Rational Software Architect even supports compare and merge of any individual EFX file.

However, merging individual fragments carries significant dangers. There are two major risks.

Housekeeping data

The presence of housekeeping data (links) inside fragments means that there are deltas that can do physical damage to the model structure. When a fragment is separated from its parent or two fragments are rejoined (in other words, the model is physically refactored), these data are changed in more than one place.

During the merge of each of these fragments as individual files, it is possible to miss the housekeeping changes and reject one of them accidentally. Because one is accepted and one rejected, the model is suddenly in an unknown state. In fact, it is corrupted to some extent. It may be possible to repair the damage, but models are complex and the risks build up if a lot of merging is performed.

Multiple fragment gestures with potential conflicts

The second danger is user-created and can be far greater. The modeler application lead gave me an excellent example of one action that can change four fragments! Reject just one of these and you have corruption.

This example entails a model with a diagram with two classes on it. Each class is a separate fragment, the parent package is a separate fragment, and the diagram itself is a separate fragment. That's four in play so far.

Figure 4 shows the result when a bi-directional link is created between the classes by drawing it on the diagram. The highlighted fragments are all changed.

Figure 4. Four fragments changed by one action
explorer on left, diagram on right

The specific changes written into the model are:

  • The parent package is modified in order to create the association itself.
  • One class fragment is modified to reference the new association element.
  • The other class fragment is modified to reference the new association element.
  • The diagram is modified to add the association's new view (the shape).

Accidentally reject any of those and the diagram and model are corrupted.

And here's a hidden danger: If any one of these changes comes into conflict with another change, accepting the conflicting change will reject this change automatically. And if the modeler performing the merge forgets to reject all of the others, the model is again corrupted. This is an insidious problem.
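The failure mode can be expressed as a simple check: a gesture that touches several fragments is safe only if every one of its changes receives the same decision. The sketch below is a hypothetical illustration (the fragment names and the `is_consistent` helper are not part of any product); it shows the bookkeeping a modeler must otherwise do by hand when merging fragments one file at a time.

```python
# The four fragments changed by drawing one bi-directional association
# (names are illustrative only).
GESTURE = {"package.efx", "class_a.efx", "class_b.efx", "diagram.efx"}


def is_consistent(decisions, gesture=GESTURE):
    """Return True if the gesture was accepted or rejected as a whole.

    decisions maps a fragment name to True (accepted) or False (rejected).
    """
    outcomes = {decisions[fragment] for fragment in gesture}
    return len(outcomes) == 1


all_accepted = {fragment: True for fragment in GESTURE}
one_rejected = dict(all_accepted, **{"class_b.efx": False})

print(is_consistent(all_accepted))
print(is_consistent(one_rejected))
```

When fragments are merged individually, nothing enforces this check, which is why a single missed rejection leaves the model corrupted.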

Logical model merging

Rational Software Architect addresses this danger with the logical model. Merging logical models is the same as merging a single model: the whole thing is in memory and Rational Software Architect is able to perform its very significant model integrity protection during the merge.

Model integrity protection is made up of several features:

  • Automated acceptance and rejection of prerequisite and dependent changes so that the model cannot get into an unsafe state
  • Specialized groups of changes, called atomic composite deltas, that provide an all-or-nothing behavior to related changes
  • Powerful conflict detection that ensures that changes that can corrupt the model or lose data when accepted together are forced into a pick A or B scenario. Note that custom conflict analyzers can be easily added to the Eclipse environment through extension points in the modeling compare support.
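The first of these features, automated handling of prerequisite and dependent changes, can be illustrated with a small sketch. This is not the Rational Software Architect implementation; the `Delta` class and the `accept` and `reject` functions are hypothetical names, and the only idea taken from the text is that accepting a change pulls in what it depends on, while rejecting a change also rejects whatever depends on it.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # compare deltas by identity, not by field values
class Delta:
    name: str
    requires: list = field(default_factory=list)  # prerequisite deltas
    accepted: bool = False


def accept(delta):
    """Accept a delta and, transitively, every prerequisite it needs."""
    if delta.accepted:
        return
    delta.accepted = True
    for prerequisite in delta.requires:
        accept(prerequisite)


def reject(delta, all_deltas):
    """Reject a delta and, transitively, every accepted delta that depends on it."""
    delta.accepted = False
    for other in all_deltas:
        if delta in other.requires and other.accepted:
            reject(other, all_deltas)


# The association example: the references and the view need the association.
association = Delta("create association in package")
ref_a = Delta("class A references association", [association])
ref_b = Delta("class B references association", [association])
view = Delta("diagram shows association", [association])
deltas = [association, ref_a, ref_b, view]

accept(view)                 # also accepts the association it depends on
reject(association, deltas)  # also rejects the view again
print([(d.name, d.accepted) for d in deltas])
```

An atomic composite delta goes one step further than this sketch by treating a whole group of related changes as a single accept-or-reject unit.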

For further information on integrity protection, or for a deeper understanding of model merging in the Rational Software Architect family of products, read Comparing and merging UML models in IBM Rational Software Architect: Part 3 A deeper understanding of model merging by this author.

Logical model merging in ClearCase

Terminology

I like to use Unified Change Management (UCM) terminology like Stream as shorthand for a grouping of artifacts under development over a common timeline: that is, to designate the same group of files at the same versions. You may prefer the term "branch" or "configuration specification" to designate the artifacts that are viewed. I consider these all synonyms for the purposes of this article (which ignores the other semantics of "stream," for example).

ClearCase supports logical model merging with the CCRC Plug-in. At this point in time, it is understood that the CCRC Plug-in does not yet support dynamic views. So how does an organization that is built on the use of dynamic views for day-to-day work get their models safely merged?

The answer is that all ClearCase views are simply views on a specific group of artifacts, and they can coexist. Multiple views, regardless of the view type, can easily be configured to see exactly the same group of files at exactly the same versions.

Figure 5 will provide some clarification, because it takes Figure 1 and adds the CCRC Plug-in based views to the mix. Suddenly, it is possible to work on the same set of artifacts in either style of interaction.

Figure 5. Dynamic and CCRC Plug-in views coexisting
static and dynamic views on the same artifacts

In Figure 5, user Fred has two views set up on the same Stream. The dynamic view is used for day-to-day work, and the CCRC Plug-in view is used only to merge incoming changes. Fred's role in this case might be that of a model manager, or he might be merging only his own work into the integration stream. It does not really matter, because the CCRC Plug-in can handle either mode.

This coexistence is simply a property of how views work and their role in the SCM architecture of ClearCase. There is nothing to set up, except to ensure that the ClearCase server has the ability to serve the CCRC style views.

It is always important, of course, to ensure that all changes are visible to all views. It is therefore useful to maintain a private stream (branch, and so on) in order to continue working in isolation from other modelers. Figure 6 illustrates one approach to this issue.

Figure 6. Isolation of multiple views from other modelers
view shows artifacts and private branches

Further notes on the coexistence of the dynamic view and the CCRC Plug-in view

Separation of workspaces

The CCRC Plug-in and the SCM Plug-in do not share an Eclipse workspace comfortably. It is important that they not be enabled at the same time in the same workspace.

It therefore makes sense to create two workspaces, one for the dynamic view and one for the CCRC Plug-in view. The appropriate SCM provider can then be enabled in its respective Eclipse workspace.

ClearCase unified client

In the future, there may be a unified client that will combine the capabilities inherent in the SCM and CCRC Plug-ins. This means that it will be possible to perform logical model merges and dynamic view operations in the same workspace.

Synchronization

The CCRC workspace is generally intended for merge work in an environment that requires the use of dynamic views for day-to-day tasks like building the load, so it is important to understand that each time you switch workspaces, it is necessary to perform a full refresh of the view contents. This action ensures that the latest file versions are in the view. In UCM, this would require a checkin to a private stream from the view in which the latest changes were made, followed by an update of the other view. Your work is visible to all dynamic views after it is checked in to the stream, but the snapshot and web views require manual synchronization.

Coexistence with UCM

For those using UCM, the CCRC Plug-in and logical model merging work well in stream to stream scenarios; however, there is a certain order that must be followed.

The person performing the merge (for example, a model manager) will load a view on the target stream (that is, the stream into which the merge is being performed). The source stream – that being the next model to be integrated – will be the merge source.

ClearCase knows how to calculate the correct common ancestor for all logical elements (fragments). However, the command to perform the logical model merge may not automatically be launched in the CCRC version of the deliver command, because it is possible at this time to have a directory merge performed on the server, interrupting the correct flow and leaving the client out of the merge operation.

A future unified client

Logical model merge requires complete control of the model structure, so server-based merging – which is not Eclipse based and is thus not logical model aware – is not a desirable behavior. Instead, a possible future version of the unified client and related server could make it possible to defer all merges to the client, thus protecting the integrity of the logical models.

In order to prevent this inappropriate behavior, you can run the CCRC Plug-in findmerge command (properly named the ClearCase Merge Search command in the CCRC Plug-in menus) at the client to preempt any potential directory merges.

The modeler will merge from the source (local) to the target (integration) stream, and then perform any logical model merges as they are launched. After all merges have completed, the modeler will perform the deliver command (or its equivalent, because deliver is a UCM term) from the source stream to the target stream. With all of the merging already completed on the appropriate file versions, the final artifacts are copied to the target stream. In the case of UCM, the project VOB metadata will then be updated and the delivery completed.

This pair of operations can be performed for all incoming source streams in such a way that the target stream is integrated with all of the modeling changes in this interval (or iteration, sprint, and so on).

Merging massive models

Although a fragmented model is well suited to smaller memory machines owing to its on-demand scaling of the memory footprint, a standard logical model merge does not work well on smaller memory machines.

Logical models have a great deal of metadata (or housekeeping) that is used to build the model in memory. These data are in fact a part of the model, and must occupy some of the memory themselves. Thus, a finely-grained, fragmented logical model with 5,000 classes is actually larger in memory than a single file model containing 5,000 classes.

Couple this issue with the need to load four copies of a model during a merge session: the common ancestor, the local copy (in the workspace), the remote copy (last checked in), and the merged result. Together, these place a massive load on the Java™ heap. In fact, 5,000 fragments is known to be more than can be handled during a merge by the standard Microsoft® Windows® 32-bit Java™ Virtual Machine (JVM), which has a heap limit somewhere around 1.6 GB.

Rational Software Architect addressed this with a patented technology called Sparse Model Merge. With this technology, only the actual change sets are loaded into memory and merged. Figures 7 and 8 illustrate the three incoming models with their changes, and the full change set to be merged.

Figure 7. Change set to be merged
Base (none), Contributor 1 (G, H), and Cont 2 (D)

In Figure 7, the changed fragments are shown in red. Out of 15 x 3 = 45 fragments, only 3 are changed.

Figure 8. Calculated memory footprint for merge
paths shown in green

In Figure 8, the final memory footprint after calculating the paths to the root is 18 out of 45 possible fragments. If we extrapolate to include the merged model, then the memory footprint is 4/3 x 18 = 24 out of 60 possible fragments. That constitutes a 60% memory savings on a small model. Imagine the savings for a 5,000-fragment model with the same change set: 24 out of 4 x 5,000 = 20,000 possible fragments, which is a 99.88% savings.
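The arithmetic above can be reproduced with a short sketch. The fragment layout is an assumption, because the figures are not reproduced here: 15 fragments named A to O arranged as a complete binary tree, which happens to yield the same counts as the text. The helper names are hypothetical, and the sketch assumes each loaded model holds the union of the changed fragments plus their paths to the root.

```python
import string

# Assumed layout: 15 fragments A..O in a complete binary tree rooted at A.
NAMES = string.ascii_uppercase[:15]
PARENT = {NAMES[i]: (NAMES[(i - 1) // 2] if i else None) for i in range(15)}


def with_ancestors(changed, parent):
    """Return the changed fragments plus every fragment on their paths to the root."""
    loaded = set()
    for fragment in changed:
        while fragment is not None and fragment not in loaded:
            loaded.add(fragment)
            fragment = parent[fragment]
    return loaded


def savings_percent(loaded, total):
    """Percentage of fragments that never need to be loaded."""
    return 100 * (total - loaded) / total


# Changes per contributor, as listed for Figure 7.
changes = {"base": set(), "contributor 1": {"G", "H"}, "contributor 2": {"D"}}
change_set = set().union(*changes.values())

per_model = with_ancestors(change_set, PARENT)  # fragments loaded per model
three_models = 3 * len(per_model)               # ancestor, local, remote
four_models = 4 * len(per_model)                # plus the merged model

print(three_models, "of", 3 * len(PARENT))
print(four_models, "of", 4 * len(PARENT))
print(savings_percent(four_models, 4 * len(PARENT)))
print(savings_percent(four_models, 4 * 5000))
```

The loaded count depends only on the change set and the depth of the changed fragments, not on the total number of fragments, which is why the savings grow as the model grows.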

Needless to say, this results in spectacular performance for merging massive models on any machine and on any network. The bottom line is that sparse model merging memory and bandwidth requirements scale based on the size of the change set, not the model.

Is there a disadvantage to sparse model merging? Yes, a small one. Those fragments that are not in memory are shown as broken references.

This does not tend to affect diagrams because references to semantic elements contain hints that enable a reasonable rendering of the element. This is true even when the reference is broken because the element is not available (for example, all references are always broken when merging individual fragments in ClearCase, because only one piece of the model is in memory at all). However, if you want to explore the model in browse mode during a merge, then sparse model merge will restrict the exploration to the change set only.

It might not be obvious, but sparse model merging makes a very good case for total fragmentation of a model. Every package, class, and diagram can reside in a separate file, which minimizes the memory footprint for every editing session and all merges.


What you have learned

The CCRC Plug-in provides a powerful and safe way to merge logical models. The CCRC Plug-in can coexist in any environment with snapshot or dynamic views. Use it to protect your corporate modeling assets and to reduce memory usage.
