About this article
This article, Part 1 of a four-part series, lays the foundation for the other articles. The other articles offer guidelines for modeling according to particular process styles, and this one establishes the concepts and terminology that will be used throughout the series. This article also focuses on considerations of structuring models to support team modeling efforts. Those considerations should inform your thinking regardless of what specific modeling style you choose to adopt.
Basic concepts and terminology
Readers familiar with Eclipse, IBM® Rational® Application Developer, or the IBM® WebSphere® products that were the predecessors of Rational Application Developer will already be familiar with some of the terms that we use in this article.
Workspaces, Projects, and Project types
You may already know that, in Eclipse, files reside within projects, that projects can be of various types (or in Eclipse terms, projects have natures), and that projects are grouped and managed within workspaces. For purposes of this discussion, not all of the project types that are available in Rational Application Developer and the Eclipse-based Rational UML modeling tools need be explained in detail. We are primarily interested in two categories of projects:
- UML projects, which are really just generic projects that contain a UML Model
- Implementation projects, which include the specialized project types, such as Enterprise Project, Enterprise JavaBeans™ (EJB) Project, Web project, Java™ Project, and C++ Project
The IBM Rational Unified Process (RUP) defines a model as "a complete specification of a problem or solution domain from a particular perspective." A problem domain or a system may be specified by several models that represent different perspectives on the domain or the system. For instance, traditional RUP guidance proposes a specific set of UML-based models:
- Business Analysis Model
- Business Use-Case Model
- Use-Case Model
- Design Model
- Analysis Model (may be subsumed within the Design Model)
- Implementation Model
- Deployment Model
- Data Model
Also, RUP is tool-agnostic. Therefore, for RUP purposes, a model could be a drawing on a napkin or a whiteboard, something in a modeling tool, or even a mental image. From the RUP perspective, a model is a logical concept, as opposed to a physical one, and we will adopt that viewpoint here.
In the context of the products under discussion, models are of two general types: conceptual and concrete:
- Conceptual models represent and manipulate ideas. Typical UML models are prime examples. Conceptual models have no mechanized binding to an executable realization. Note: There are products, such as IBM® Rational Rose® RealTime™, that support the execution, debugging, and testing of UML models in a runtime environment designed for that purpose. Such UML models are considered to be concrete.
- Concrete models represent a way to graphically depict and directly manipulate implementation artifacts that can be mechanically rendered into an ordinary executable file. Java models and C++ models are prime examples. Concrete models are also known simply as code models when the underlying semantics are 3GLs. Another class of concrete models is based on declarative languages. A prime example is physical data models that directly manipulate SQL Data Definition Language.
In the products under discussion, you interact with models primarily through diagrams and through the Eclipse Project Explorer.
The distinction between conceptual and concrete models is not the same as the distinction that the Object Management Group (OMG) Model Driven Architecture (MDA) makes between platform-independent and platform-specific models. A model can be platform-specific, yet still be conceptual.
Model files (persistence mechanisms for models)
The products under discussion persist models as files. In Eclipse, a file is considered a type of resource. Resources may have additional properties and behavior within the Eclipse environment; thus, resource connotes more than simply file. The modeling files described here are implemented as Eclipse resources. Therefore, when you see the term modeling resource here or in other writings, it means the same thing as modeling file.
In the broadest sense, the software supports two kinds of modeling files:
- Conceptual UML Model files are stored within Eclipse projects and have file name extensions of .emx and .efx (the distinction between these file types will be made clear later on). These files contain two types of content:
- UML semantic elements (Classes, Activities, relationships, and so forth)
- UML notational elements that have been composed into diagrams that depict the UML semantic elements (these diagrams may also depict visual references to things in other semantic domains, such as Java, C++, or DDL)
- Concrete modeling files (Java, C++, DDL, for instance) are also stored within Eclipse projects in an Eclipse workspace and contain a mixture of semantic and notational information, but in this case the semantic and notational contents are more clearly delineated:
- Concrete model semantic elements are located in implementation artifacts. For instance, in the case of Java, the semantic model is serialized and stored as a collection of Java source code files. (When you are running the tool, the semantic model resides in memory as a Java Abstract Syntax Tree.)
- Each diagram is stored in its own file. Diagram files can use various extensions, but the most common one is .dnx. Concrete modeling diagrams may use UML notation but may also use other notations (for example, IDEF1X or information engineering notations for data visualization, or IBM proprietary notations used for designing Web tiers).
These model structure guidelines are primarily about how to structure the artifacts and contents of conceptual models. You can find guidance for organizing the contents of concrete models (that is, implementation projects) in other sources, such as Help in Rational Software Architect, Rational Application Developer, and Eclipse. To some extent, the organization of concrete models is imposed by Eclipse project type conventions. However, the general principles regarding the logical organization of solutions, as discussed in the Team development and model management considerations section of this article, apply to both conceptual models and implementation artifacts.
In the products under discussion, you interact with modeling files primarily through the Eclipse Project Explorer.
UML model files: Logical Units and Fragments
Sometimes, UML models become so large that they must be persisted in smaller pieces. Further, when teams of modelers work together on highly interrelated model content, issues of contention and ownership must be managed. To deal with these factors, the products under discussion provide two ways to partition the logical content of UML Models into physical storage (persistence) containers:
- Logical Units
- Fragments
The following attributes describe a Logical Unit (LU hereafter) as supported by the software available at the time of this writing (in future versions of the software, these conventions and limitations may change):
- In the user interface of the software, an LU is called a UML Model. Therefore, when you perform a New > UML Model action, you are actually creating an LU.
- The UML Model (and therefore, the LU) is the smallest unit of UML content that can be "opened" and "closed" by using the commands in the File menu and in the Project Explorer menu. (It is possible to open a Fragment by selecting it in the Navigator view, but doing so will result in also opening the LU that contains the Fragment.)
- Each LU has a root element, which is the top-most Package. The root element logically contains (owns or is the parent of) the other UML elements that depend on it to provide their requirements for containership and namespace. Thus, in this regard, a root element behaves like any other UML Package. However, a special property of the root element is that it also stores information about the LU as a whole, such as:
- The set of UML capabilities that are enabled when working with the LU
- The UML profiles that are applied to the LU
- An LU is persisted as a file with an .emx extension (see Figure 1).
- At a minimum, the .emx file stores the root element. It may also store other elements, but it is possible to treat any UML classifier or diagram that is logically owned by the root element as a Fragment stored in a separate .efx file.
- A UML Model (LU) always appears as a top-level construct within the Project Explorer view. In other words, in the current Rational user interface, LUs cannot be visually arranged into arbitrary logical containment hierarchies. It is possible to create arbitrary containment hierarchies in a pure sense by using <<PackageImport>> relationships. Nonetheless, again, these will not be reflected in the hierarchies as depicted in the Project Explorer, where a UML Model/LU always appears as a top-most item.
In addition, the software supports the notion of model Fragments, which have these properties:
- A Fragment is stored in a file with an .efx file extension.
- Although a Fragment is persisted independently, in terms of logical containership and namespace, it is still fully dependent upon the LU to which it belongs.
- Otherwise, just like an LU, a Fragment can contain an arbitrary subset of a logical model.
- Unlike an LU, a Fragment does not necessarily define a logical containment structure or a namespace. In fact, a Fragment can be defined at the UML classifier or diagram level. For instance, a Fragment might contain only a Class, a Component, or an Activity, but a Fragment might also be defined at the Package level, in which case it will correspond to a namespace and may contain other elements. But again, the entire structure defined by such a Fragment remains a child of the containing LU.
- Because this containment and namespace dependency exists, a Fragment cannot be opened out of context. In other words, to open a Fragment, you must open the LU to which it belongs.
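The containment rules listed above can be pictured with a small data-structure sketch. The Python below is only an illustration of the relationship between an LU and its Fragments as described in this section; the class and method names are hypothetical and do not correspond to any API in the products under discussion.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class LogicalUnit:
    """A model/LU: persisted as an .emx file and opened or closed as a whole."""
    FILE_EXTENSION = ".emx"  # plain class attribute, not a dataclass field

    name: str
    is_open: bool = False
    fragments: List["Fragment"] = field(default_factory=list)

    def open(self) -> None:
        self.is_open = True

    def add_fragment(self, element_name: str) -> "Fragment":
        """Persist one owned element separately while keeping it a logical child."""
        fragment = Fragment(element_name=element_name, owner=self)
        self.fragments.append(fragment)
        return fragment


@dataclass
class Fragment:
    """Persisted in its own .efx file, but always a logical child of one LU."""
    FILE_EXTENSION = ".efx"  # plain class attribute, not a dataclass field

    element_name: str
    owner: LogicalUnit = field(repr=False, compare=False)

    def open(self) -> None:
        # A Fragment cannot be opened out of context: opening it opens its LU.
        self.owner.open()


if __name__ == "__main__":
    lu = LogicalUnit("Core Capabilities")   # hypothetical model name
    patient = lu.add_fragment("Patient")    # hypothetical element name
    patient.open()
    print(lu.is_open)
```

In this sketch, the back-reference from `Fragment` to its owner is what makes the "cannot be opened out of context" rule explicit: there is no way to construct a `Fragment` without an owning `LogicalUnit`.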
In the products under discussion, you can create multiple LUs (as UML Models). You can create one or more in a single project, and multiple projects in a workspace can contain LUs. Any number of LUs can be open and available for editing at the same time. Relationships can be defined to exist between elements that are contained in different LUs.
As mentioned previously, you interact with the logical content of UML models through diagrams, as well as through the Project Explorer. Where UML models are concerned, the elements that are contained by LUs and Fragments (the logical content of the models) appear in the Project Explorer, but the actual .emx and .efx files (physical persistence mechanisms) do not. These points are important for these reasons:
- The use of multiple LUs as a way to physically partition model content is logically opaque. That is, you see LUs clearly reflected in the Project Explorer because they always appear as top-level items.
- The use of Fragments as a way to physically partition model content is logically transparent. That is, by default, the only way that you see any indication of Fragments is that the UML elements that correspond to Fragments are decorated with glyphs when viewed in the Project Explorer (see Figure 1).
So far, we have discussed Logical Units and Fragments, which are model partitioning mechanisms. In the section called Team development and model management considerations, we will discuss how to use these mechanisms in model partitioning strategies.
Use of these terms: model, modeling file, Logical Unit, and model/LU
In a previous section, we defined a model as a logical construct, according to RUP. We followed that with a definition of modeling file. Then we precisely defined a Logical Unit as a physical persistence mechanism for units of logical model content. In the current versions of the software, when you create, open, close, rename, delete, or combine "UML models," you are actually performing those actions on an LU. In future versions, it will become possible to compose LUs into virtual containment hierarchies and the notions of "UML model" and LU will become distinct. Because the current design merges the concepts of UML Model and LU, we use the precise term model/LU to refer to LUs as currently implemented. This is to ensure that you can clearly understand when the narrative is referring to a model or UML model in the purely logical sense, as opposed to a model/LU.
Here is how we use these terms throughout the remainder of these model structure guidelines:
modeling file: Unless further qualified (for example, as a "concrete modeling file"), used when referring generically to either an .emx or .efx file.
Logical Unit: Used when referring to a Logical Unit as a tool for physically partitioning the logical contents of models.
model or UML model: Used when discussing models in general, or UML models in particular, in the purely logical sense.
model/LU: Used when referring to a Logical Unit as the physical artifact that is operated on when you create, open, close, rename, delete, or combine "UML Models" in the user interface.
Allocating models to projects
As already mentioned, Eclipse supports several types of projects, and the products under discussion support two kinds of models:
- UML models (a type of conceptual model) persisted as LUs
- Concrete models (of various types) persisted as Eclipse projects that contain:
- A variety of file types that contain semantic information based on several technology-specific metamodels (Web diagrams, for instance) or Abstract Syntax Trees (Java™, for example)
- Files with .dnx extensions that contain notational information plus references to semantic elements (in other words, diagrams)
Remember these guidelines for allocating models to projects:
- Concrete models take care of themselves because, in essence: [concrete model] = [Eclipse Project]
- When you use UML models for the Java, C++, or C# transformations that are provided with the products under discussion:
- If you are using the transformations to iteratively reapply the UML-to-3GL transformation, or if you use both forward and reverse transformations and the Reconcile function, put the model/LUs in Eclipse projects that are separate from the corresponding concrete models (generated code).
- If you are using the transformations to replace elements, and you generally use mixed modeling (depicting both UML elements and 3GL elements in the same diagrams), put the model/LUs in the same Eclipse projects, along with the generated code. In that case, only design content (class-level modeling) should be in these model/LUs, and other content (use cases, analysis, design-level interaction or state modeling, and so forth) should be in separate model/LUs in other Eclipse projects.
In some cases, this may not be possible. For instance, there are specific conventions for using multiple Eclipse projects of different types (Java, Web, and Enterprise JavaBeans™ (EJB), for example) to contain the implementation of an enterprise Java solution.
- Otherwise (when transformations are not in play), you could place conceptual (UML) models that are closely associated with specific concrete models in a folder named Conceptual Models within the same Eclipse project. This reflects the architectural principle of high functional cohesion, which is a recurring theme in the model structuring guidance that follows.
- In general, one of the purposes of Eclipse projects is to serve as a coarse-grained unit of configuration management. Therefore, it follows that if a particular model/LU has been structured to support strong ownership by a particular practitioner, then it should be a separate Eclipse project. A variant on this theme is the case in which a single practitioner strongly owns a collection of model/LUs that relate to a particular functional concern, but that practitioner models concerns across the lifecycle and wants to use separate model/LUs for use case, analysis, and design-level modeling specific to that functional concern. In such cases, it could make sense to place those multiple model/LUs in a single, functionally oriented Eclipse project and put all of the conceptual models for the whole development lifecycle together in that project. However, the previous guidance regarding use of transformations, if applicable, takes precedence over this.
- Model/LUs that reflect shared concerns and must be referenced by several other highly cohesive, loosely coupled model/LUs should be placed into UML projects that encapsulate those common concerns. For example, if a Use-Case Model is referenced by the contents of multiple projects, it should not reside in any of those referencing projects. Instead, it should reside in its own project or in a project that aggregates a number of other common concerns.
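The allocation guidelines above amount to a short decision procedure. The following Python sketch encodes them as a single function, purely as a reading aid. The names (`TransformationUse`, `suggest_project`) and the returned labels are hypothetical. The transformation rules are checked first because the guidelines give them precedence over the ownership grouping; checking them ahead of the shared-concerns rule as well is an assumption of this sketch.

```python
from enum import Enum


class TransformationUse(Enum):
    """How the UML-to-3GL transformations are used with a model/LU."""
    NONE = "none"
    ITERATIVE_OR_ROUND_TRIP = "iterative reapply, or forward/reverse with Reconcile"
    REPLACE_WITH_MIXED_MODELING = "replace elements, with mixed modeling"


def suggest_project(transformation_use: TransformationUse,
                    shared_by_many: bool = False,
                    tied_to_one_implementation: bool = False) -> str:
    """Return a suggested project allocation for one model/LU."""
    # Transformation guidance is checked first (assumed to outrank all other rules).
    if transformation_use is TransformationUse.ITERATIVE_OR_ROUND_TRIP:
        return "Eclipse project separate from the generated code"
    if transformation_use is TransformationUse.REPLACE_WITH_MIXED_MODELING:
        return "same Eclipse project as the generated code"
    # Shared concerns are encapsulated in their own UML project.
    if shared_by_many:
        return "UML project for common concerns"
    # Closely associated conceptual models can live next to the implementation.
    if tied_to_one_implementation:
        return "'Conceptual Models' folder in the implementation project"
    # Otherwise, strong ownership suggests a project of its own.
    return "separate Eclipse project owned by one practitioner"


if __name__ == "__main__":
    print(suggest_project(TransformationUse.NONE, shared_by_many=True))
```

A real team would likely refine these rules (for example, to handle the enterprise Java project conventions mentioned above), so the function should be treated as a starting checklist and not as a complete policy.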
UML Model types
In the products under discussion, UML Models are not strongly typed, but you can follow a convention of using multiple model/LUs that are weakly typed. You can establish weak typing in either of two ways:
- Start with a blank model/LU, and establish its type simply by how you name it and what kind of content you place in it (including what UML profiles you apply to it).
- Create a model/LU based upon a predefined template that corresponds to a particular type of model. The products under discussion provide a default set of model templates for the model types described in this article. You can also create your own model/LUs to use as templates.
Either way, the so-called "type" of a model is really just a matter of convention concerning the naming and content of the model/LU and, possibly, the UML profiles that have been applied to it. For example, the tool will not prevent a model/LU that is, by convention, a Use-Case Model from also containing the Classes that realize the use cases (which RUP guidelines would consider to be part of the Analysis or Design model).
Summary of basic concepts
Suppose that you have several teams working on a set of applications that are part of the enterprise system for a large, integrated healthcare provider network that (improbably) has chosen to develop most of its own key applications. The following factors (and probably many others) must be considered in this situation:
- Several teams work on core or shared capabilities, such as these, that are used by multiple solutions:
- Services that expose and maintain consistency of the coded vocabularies that are used in healthcare applications
- Services that represent access to the key entities (patients, providers, payers, and so forth) that use the applications
- Shared applications that control which system users have access to which features of which applications
- A few teams work on various subsystems of the Laboratory Information System (LIS) solution, which handles these tasks:
- Processing lab service orders
- Managing and returning lab results
- One team works on a Radiology Information System (RIS) solution that handles these tasks:
- Processing radiology service orders
- Managing and returning radiology results
- Scheduling radiology services
Figure 1 illustrates some of the ways that these teams might choose to partition their models to reflect their ownership of specific functional capabilities. For the moment, we are ignoring questions of what the conceptual model types are, and we are not attempting to show what the corresponding implementation projects might look like. Instead, we need to focus on the various model partitioning strategies that they might choose.
Figure 1. Examples of ways that different teams might partition models
For concise illustration, Figure 1 shows all of the projects in a single workspace. It is more likely that, in a real-world situation, each team's workspace would look different. For instance, assuming that no direct dependencies exist between the Radiology and Laboratory systems, the workspaces of the teams working on the LIS and RIS might include only the shared Eclipse projects that they depend on plus their own Eclipse projects. The workspace of a team that owns the shared coded vocabularies might contain only a Core Capabilities model/LU within its own Eclipse project. It is also likely that the LIS team would allocate each of its multiple model/LUs to its own Eclipse project, so that those could be worked on and the configurations could be managed more effectively as independent units.
In this scenario, we see these results:
- The team working on the RIS has chosen not to partition their model. Perhaps that reflects that, in this team, one person does all of the modeling and the others write code.
- The team working on the LIS has chosen to partition by using multiple model/LUs. Perhaps this reflects that the LIS architecture is very strong and consists of a number of highly cohesive subsystems that are lightly interdependent, so that individual members seldom need to see content that they "own" within the context of the entire LIS content.
- The teams working on the shared capabilities have used both model/LUs and Fragments in their partitioning strategy. Perhaps this is because, had they used Fragment partitioning exclusively, the single model/LU would have proven to be inconvenient as the basis for publishing the model. Or perhaps their choices simply reflect their preferences regarding the depth of logical containment structures in which they organize the contents of their model.
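The earlier observation that a team's workspace might hold only its own Eclipse projects plus the shared projects they depend on can be expressed as a small dependency walk. The Python sketch below is illustrative only: the project names and the dependency map are hypothetical, and following dependencies transitively is an assumption of this sketch.

```python
from typing import Dict, Iterable, List, Set


def workspace_for(own_projects: Iterable[str],
                  depends_on: Dict[str, List[str]]) -> Set[str]:
    """Return the team's own projects plus every project they depend on."""
    needed: Set[str] = set()
    pending = list(own_projects)
    while pending:
        project = pending.pop()
        if project not in needed:
            needed.add(project)
            # Follow dependencies transitively (an assumption of this sketch).
            pending.extend(depends_on.get(project, ()))
    return needed


if __name__ == "__main__":
    # Hypothetical project names, loosely based on the healthcare scenario.
    dependencies = {
        "RIS": ["Core Capabilities", "Access Control"],
        "LIS Orders": ["Core Capabilities"],
        "LIS Results": ["Core Capabilities", "Access Control"],
    }
    print(sorted(workspace_for(["RIS"], dependencies)))
    print(sorted(workspace_for(["LIS Orders", "LIS Results"], dependencies)))
```

Because the RIS and LIS projects do not depend on each other in this made-up map, neither team's workspace contains the other's projects, which matches the situation described for Figure 1.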
Team development and model management considerations
The fundamental challenge for teams working collaboratively on any type of specification, regardless of whether it is a set of documents, a set of models, or a code base, is this: controlling change.
With document-based specifications, a team has a few choices:
- Those responsible for specific aspects of a specification that are included in a common document do their work serially. Just turn on the change-tracking function and pass the document from one contributor to the next. Periodically, some authority reviews and accepts or rejects changes to establish a baseline.
- Contributors work in parallel. To establish baselines, the authority must perform a multiple-document merge of all of the contributors’ copies of the shared document (using either the capabilities of the word processor or the text merge capabilities of a configuration management tool).
- The specification is partitioned into multiple documents that are strongly owned by individual team members who can work concurrently without creating a need for merges. But often (to avoid creating redundant copies of certain portions of the spec) there might be a designated strong owner of common content who makes changes as required by the other team members. Alternatively, just the common pieces might be managed by using a merge strategy.
In each case, a policy must exist regarding how, and how often, document versions are preserved, protected, and, if necessary, merged and reconciled.
For a code base, the issues are similar. However, it's not an option to use change tracking in source files (code editors and IDEs are not designed for that). Therefore, when source files are not strongly owned and, instead, are worked concurrently by multiple team members, the team must rely on the delta tracking and text merging capabilities of a configuration management system. Often, code bases also introduce the challenge of working on multiple versions or streams of the same code that represent different release versions in parallel. The need to port some changes from one stream to another is a task that, in some cases, can also be handled by the merge capabilities of a configuration management tool.
Models can represent all of the same challenges as code, particularly as models begin to express a level of detail approaching that expressed by code. Models that are more abstract might lend themselves to a more lightweight change management approach, perhaps closer to what would be practiced with spec documents. But models can also represent special challenges of their own:
- The number and directionality of linkages (interdependencies) within models are more complex than within code bases.
- Where code would implement a dependency simply by referencing the supplier by name, models implement relationships as actual semantic elements as defined by the metamodel (for example Association, Generalization, Realization). These elements must be carefully managed to avoid overly tangled models.
- Models support a richer set of relationship types than those found in code. Thus, for instance, not only can there be cases where a model element uses or specializes another element (similar to how one Java class might use or extend another), but model elements might also refine other elements (in terms of abstraction level).
- Model semantic elements might be referenced in common by multiple diagrams that depict different aspects of the overall specification.
- In models, move-refactoring changes have high impact, because Package and namespace containment is specified more rigorously than in code. In models, it is done in a persistent (static) manner with actual logical links, whereas in code, it is handled dynamically by the build engine, driving off of fully qualified names.
- Another cause of problems is that models often use first class elements (for example, associations and abstractions) for what in code would be just simple references by name. These elements must be carefully managed or they can lead to tangled models.
When your practice involves concurrent team development (shared ownership) of model files, uncoordinated changes can be made to the files that require merging. Parallel development processes under which that might occur include these examples:
- A modeling file can be checked out for nonexclusive access.
- A modeling file is worked on in parallel in the development streams of multiple practitioners.
If it turns out that some of the changes conflict with one another, the merge is considered nontrivial, because someone must make decisions about which of the conflicting changes should prevail. (A trivial merge is one in which the uncoordinated changes are not in conflict, and the model merge tool can perform the merges without human intervention.) The products under discussion provide powerful capabilities to support both trivial and nontrivial model merges, but nontrivial merges can represent nontrivial work for the individual who mediates them.
Avoidance of nontrivial model merges should be the primary goal when formulating a model organization strategy in support of team modeling.
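The difference between a trivial and a nontrivial merge can be made concrete with a deliberately simplified three-way merge. The Python sketch below works on flat dictionaries that map hypothetical element names to values. It is not how the model merge tooling in the products under discussion works; it only illustrates when human decisions become necessary.

```python
from typing import Dict, List, Tuple

_MISSING = object()  # marks "element absent in this version"


def three_way_merge(base: Dict[str, str],
                    ours: Dict[str, str],
                    theirs: Dict[str, str]) -> Tuple[Dict[str, str], List[str]]:
    """Merge two edited copies of `base`.

    Returns (merged, conflicts). An empty conflicts list means the merge was
    trivial. Conflicting keys are left out of `merged` for a person to resolve.
    """
    merged: Dict[str, str] = {}
    conflicts: List[str] = []
    for key in sorted(set(base) | set(ours) | set(theirs)):
        b = base.get(key, _MISSING)
        o = ours.get(key, _MISSING)
        t = theirs.get(key, _MISSING)
        if o == t:
            chosen = o              # both sides agree, or neither changed it
        elif o == b:
            chosen = t              # only "theirs" changed it
        elif t == b:
            chosen = o              # only "ours" changed it
        else:
            conflicts.append(key)   # both changed it differently
            continue
        if chosen is not _MISSING:
            merged[key] = chosen
    return merged, conflicts


if __name__ == "__main__":
    base = {"Patient.name": "String", "Order.status": "String"}
    ours = {"Patient.name": "String", "Order.status": "OrderStatus"}
    theirs = {"Patient.name": "PersonName", "Order.status": "String"}
    print(three_way_merge(base, ours, theirs))  # different elements changed: trivial
```

When the two practitioners in the example touch different elements, the conflicts list is empty. If both had changed `Order.status` to different values, that key would be reported as a conflict, which is the situation that strong ownership is meant to make rare.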
Modeling in teams: General principles
Before getting into a discussion about model organization techniques, we must discuss the general principles that should guide a model partitioning strategy. Keeping in mind that the goal is to avoid nontrivial merges, there are two primary principles: strong architecture and strong ownership. They go hand-in-hand, because strong architecture enables strong ownership.
Principle 1: Strong architecture
In this context, strong architecture refers primarily to decomposition. The principles of architectural decomposition that apply here are the same ones that drive object-oriented development, component-based design, and service oriented architecture.
- Strive for maximum functional cohesion.
- Strive for maximal decoupling of business functions.
- Group things together that must remain tightly coupled, and then isolate those groupings from one another.
If your resulting decomposition is too detailed, then, depending on your staffing model (remember: strong architecture and strong ownership go hand-in-hand), you may want to start grouping those details into high-affinity aggregates (in UML Modeling terms, this means Packages).
There will always be some things that must be touched by many units of decomposition (and in some cases, by all units). Group those things together in common subcategories or groups, and plan each development iteration so that it includes some leeway at the start of the iteration that focuses on stabilizing the common things.
There is also a time element. As you move from a more abstract to a more concrete understanding of a solution, your sense of the best organization for the architecture (and model) will evolve. Accordingly, plan to do model refactoring (reorganization) on an ongoing basis.
If you look at your solution and everything appears to be highly interdependent and tightly coupled, either your architecture needs work or there is something about the nature of your problem domain that means you truly can't decompose the problem. In either case, you have only a couple of choices:
- Resolve to assign the project to a very small team that shares a physical space with members who communicate very actively with each other regarding any changes that they make that could affect other artifacts (apply agile development principles with respect to your modeling work).
- Be prepared to perform lots of nontrivial merges.
A good way of looking at strong architecture for models is to examine how strong architecture manifests in code bases. Here, we clearly see the principles of stabilizing common pieces first and then successively layering more specialized pieces. Consider a hypothetical architecture for a solution based on Java™ 2 Platform, Enterprise Edition (J2EE) technology, as depicted in Figure 2.
Figure 2. Hypothetical architecture based on a J2EE solution
Here, we see that Java™ Platform, Standard Edition (Java SE) APIs are quite stable. Java 2 Platform, Enterprise Edition (J2EE) APIs are nearly as stable as Java SE. Both are used by all services and applications. Above the basic J2EE framework, we see things like Access Layer for "Common" Data and Common Utilities. These are, again, things that are used by all services and applications, so they are isolated from the services and applications and are stabilized early in projects. Above these, there is a layer of shared or common services (or components, if you prefer). These depend upon the layer below, and all applications that use them depend on them, so, ideally, they are pretty stable by the time that applications are built atop them.
These principles translate to models in a generally straightforward manner. But where code is concerned, it is intuitive (at least to a developer) that the providers in the architecture (those that provide something that other parts use) are interfaces (APIs), and the consumers (those that use what’s provided) are implementations. In models, there is no such distinction. Subject to the constraints of the UML metamodel, any model element might be a provider, a consumer, or both. It is up to the modeling team to establish conventions of model organization and interdependency.
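One convention of this kind is that dependencies may point only toward layers that are at least as stable as the consumer's own layer. The Python sketch below checks a list of dependencies against such a rule. The layer names loosely follow the description of Figure 2, while the numeric levels, the function name, and the choice to allow same-layer dependencies are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Lower number = more stable layer (levels are hypothetical).
LAYERS: Dict[str, int] = {
    "Java SE APIs": 0,
    "J2EE APIs": 1,
    "Common Utilities": 2,
    "Shared Services": 3,
    "Applications": 4,
}


def find_layering_violations(dependencies: List[Tuple[str, str]],
                             layers: Dict[str, int] = LAYERS) -> List[Tuple[str, str]]:
    """Return (consumer, provider) pairs in which the provider sits in a
    less stable (higher) layer than the consumer."""
    return [(consumer, provider)
            for consumer, provider in dependencies
            if layers[consumer] < layers[provider]]


if __name__ == "__main__":
    deps = [
        ("Applications", "Shared Services"),   # fine: points downward
        ("Shared Services", "J2EE APIs"),      # fine: points downward
        ("Common Utilities", "Applications"),  # violation: points upward
    ]
    print(find_layering_violations(deps))
```

A modeling team could apply the same idea to Packages or model/LUs by agreeing on which units count as providers and then reviewing any relationship that points the wrong way.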
Principle 2: Strong ownership
Setting aside the issue of specialized skill sets, after you have established strong architectural decomposition, it should prove fairly straightforward to map strong ownership of architectural components to individual practitioners. When each unit of model organization can be worked on exclusively by one practitioner, the potential for introducing conflicts is limited to only those places where cross-unit relationships are required (refinement or other kinds of dependency, association, depiction in diagrams, and so on). Where strong ownership exists, then regardless of whether the organizational units consist of entire model/LUs or Packages within a model/LU, and regardless of whether Packages are configured to be treated as Fragments, merges will be mostly trivial, and thus fast and relatively painless. The same may be said if each organizational unit can be worked exclusively by a small team in the same location, with members who can coordinate their efforts to avoid introduction of conflicting changes within their subunits.
Modeling in teams: Model partitioning
In the Basic concepts and terminology section, you learned that Rational architecture management software offers two ways to partition the physical persistence of UML models: model/LUs and Fragments. (Important: If you missed that part, go back and read it now.) This section answers these questions:
- When should I partition?
- When partitioning, should I use multiple model/LUs or Fragments as the mechanism?
There are several drivers of those decisions:
- Model size and complexity. As models become (or are expected to become) unwieldy due to their size, or perhaps due to the depth of their logical containment (Package) structures, it may become necessary to partition them into multiple model/LUs or Fragments to improve performance or the convenience of certain operations (such as reporting).
- Model organization. Models that have strong architecture and reflect strong ownership are less likely to require partitioning. The architectural strength of the model organization may also influence the choice of a configuration management solution, which is the third driver.
- Choice of configuration management (CM) software. Some CM tools can support highly partitioned models better than others.
- Use of transformations. The designs of some transformations reflect constraints that may influence model partitioning decisions. For instance, the standard Java-to-UML transformation that is provided with some of the products under discussion is designed to permit only a model/LU as its target. Thus, if there is a need to transform multiple Java projects individually, it may work best to establish a separate model/LU for each one.
Actually, it is pretty easy to anticipate when models will need to be partitioned for reasons of size or complexity. Partitioning is required mainly when files grow too large for the machines that are in use in the user community. For example, a model that grows to 30 MB on disk becomes very difficult to work with day-to-day on a 1 GB RAM machine. A guideline is to partition with the goal of having 5 to 10 MB worth of models in RAM at any time. An alternative is to upgrade the RAM, which has benefits beyond modeling. A 2 GB RAM machine with no swap file performs almost every Eclipse operation faster. It also makes very large models perform very well again.
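As a rough illustration of that guideline, the arithmetic can be sketched as follows. This assumes that the model content can be split evenly and that a practitioner typically needs only one partition loaded at a time; the function name and its parameters are hypothetical.

```python
import math


def partition_count_range(model_size_mb, min_target_mb=5, max_target_mb=10):
    """Return (fewest, most) partitions needed so that each partition
    falls within the target size range, assuming an even split."""
    fewest = math.ceil(model_size_mb / max_target_mb)
    most = math.ceil(model_size_mb / min_target_mb)
    return fewest, most


# The 30 MB model from the example above
print(partition_count_range(30))  # (3, 6)
```

Real models rarely split evenly, so treat the result as a starting estimate rather than a target.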
When to partition models for team reasons also is a short discussion. Try to do so only when the model files (model/LUs or Fragments) that you create can typically be worked on both exclusively (only one team member has a file checked out at any point in time) and in isolation (most changes can be made to the file without also requiring access to other files that contain related model elements). This is because there is a tradeoff. Partitioning narrows the scope of logical context that is available during merge sessions. As the logical context diminishes, the human mediator must increasingly rely on guesswork when making merge decisions. Simply put, fewer partitions means better context for merging.
This question often arises: Can nontrivial merging be avoided by partitioning models into multiple model/LUs or Fragments? In a word: No.
However, that "no" comes with this caveat: If you use a CM solution that effectively supports a pessimistic locking scheme, then physical partitioning can be a way to avoid merges. It also limits the extent of what you can do. When working with a discrete partition, for instance, you can't perform a rename refactoring without being able to check out and modify every other partition that needs to be updated by the action.
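The rename limitation can be sketched as a small calculation of which partitions must be checked out. This is an illustration only: the element names, partition names, and the `partitions_to_check_out` function are hypothetical, and the products under discussion determine the affected files themselves.

```python
def partitions_to_check_out(element, owner_partition, references):
    """Return the partitions that must be writable to rename `element`:
    its own partition plus every partition that holds an element
    referencing it."""
    needed = {owner_partition[element]}
    for source, target in references:
        if target == element:
            needed.add(owner_partition[source])
    return needed


# Hypothetical model: which partition holds each element
owner_partition = {
    "Order": "orders",
    "Invoice": "billing",
    "Shipment": "shipping",
    "Customer": "customers",
}
# (source, target) pairs: source refers to target
references = [
    ("Invoice", "Order"),
    ("Shipment", "Order"),
    ("Order", "Customer"),
]

print(sorted(partitions_to_check_out("Order", owner_partition, references)))
```

Under pessimistic locking, the rename is blocked if any one of those partitions is checked out by someone else.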
Architectural interdependencies are a logical phenomenon, not a physical one. When you partition a model into multiple model/LUs or Fragments, the representations of the element interdependencies become simply cross-file references instead of in-file references. This does not make it any easier to resolve conflicts (in fact, it makes it harder). And when you introduce cross-file references, you introduce potential points of breakage (see sidebar).
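To make the point concrete, here is a minimal sketch that counts the same set of element references under two physical layouts. The element and file names are hypothetical; the total number of interdependencies does not change, only how many of them cross file boundaries.

```python
def count_references(references, file_of):
    """Classify each (source, target) reference as in-file or cross-file
    for a given assignment of elements to files."""
    in_file = sum(1 for s, t in references if file_of[s] == file_of[t])
    return {"in_file": in_file, "cross_file": len(references) - in_file}


references = [("A", "B"), ("B", "C"), ("C", "A")]

# One physical file versus the same elements split across two files
single_file = {"A": "model", "B": "model", "C": "model"}
partitioned = {"A": "part1", "B": "part1", "C": "part2"}

print(count_references(references, single_file))
print(count_references(references, partitioned))
```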
Yet some manner of partitioning is often unavoidable. When you must partition, you might wonder how you should choose between model/LUs and Fragments as a physical partitioning mechanism. Keep these basic points in mind regarding model/LUs and Fragments:
- Fragments preserve the appearance of the hierarchical containment structure of a model, as reflected in the Project Explorer.
- Model/LUs appear as separate top-level containers in the Project Explorer and cannot (at this time) be nested within other logical containment structures.
- Fragments will slow down the CM system, because it must process more files.
- Fragments can also slow down some model processing operations, such as reporting, but they can speed up certain other operations.
- Fragments can be merged, but when you merge them, you will not have much information about the contextual model content, and conflicting changes increase the likelihood of model inconsistencies after a merge.
- Use of model/LUs generally provides an adequate amount of context to support comprehensible and safe merging during a parallel check-in unless, for some reason, you make your models very small (such as when you are trying to use models to package small bits of UML content that you are treating as reusable assets).
- A model/LU merge mechanism exists with the IBM® Rational® ClearCase® remote client user interface. It restores context for widespread, conflicting changes but also slows the merge process.
- Model/LUs are well-supported by the Unified Change Management (UCM) mechanism; whereas, limitations exist regarding Fragments.
- Model/LUs work with all ClearCase user interfaces and the command line; whereas, limitations exist regarding Fragments.
Taking these points into account, consider these as preferable options:
- If the team is located in one place and prefers to avoid merging (and especially if the models are very large), consider using Rational ClearCase with dynamic views sharing a single integration stream and reserved checkouts, and partition the model as Fragments.
- If the team is globally distributed, consider one of these alternatives, instead:
- Use IBM® Rational® ClearCase MultiSite®, with major model/LU components worked on in each location and merged centrally with UCM (this works best if the model/LUs are not, themselves, Fragmented).
- Use a Concurrent Versions System (CVS) with the model partitioned as multiple model/LUs.
- Use CVS with the model partitioned as Fragments. This requires the practice of strong ownership to avoid merging as much as possible, because any merges will have to be performed with very little context.
Let's look at how everything we have discussed so far about team modeling applies to certain scenarios. We’ll organize the scenarios according to the first principles: Are the models strongly designed, and is strong ownership practiced? The secondary consideration will be: What is the CM solution?
Well-designed model architecture:
- Package structure reflects high functional cohesion, loose coupling, and strong individual practitioner ownership.
- Packages that are used by multiple functional areas (and thus have many practitioners as consumers or co-owners) are isolated from strongly owned content, are stabilized early, and changes to them are tightly controlled by process conventions.
In rank order of best to worst, here's what typically works best:
- Use a single non-Fragmented model/LU. Merges will happen, but nontrivial merges should be infrequent and, when they occur, will be in full-model context.
- Use a closure of multiple model/LUs with cross-file references. Merges will happen less frequently, yet context remains reasonably high.
- Use a single model/LU with fine-grained Fragmentation. Use a configuration management solution (such as ClearCase, with shared dynamic views on an integration stream with reserved checkouts enforced) that effectively locks changed files (Fragments).
- Use a configuration management system such as ClearCase or CVS that can synchronize the whole workspace and allow merging of individual Fragments when rare conflicts occur at that level.
Not-so-well-designed model architecture:
- Packages are highly interdependent.
- Related concerns are not strongly grouped or strongly owned.
Options in rank order, best to worst:
- Use a single model/LU with fine-grained Fragmentation in combination with a configuration management solution that locks changed files (Fragments), such as ClearCase, using shared dynamic views on an integration stream with reserved checkouts enforced.
- Use a single non-Fragmented model/LU. Use ClearCase with private views (static or dynamic). Practice UCM rebasing and delivery to maintain integrity of each practitioner’s private copy up to the last moment of integration. UCM also provides atomic delivery, so that failed merges can be backed out easily. Merges become challenging and lengthy as the size of the collision set grows. It will grow between integrations, so integrate as frequently as possible.
- Use a closure of many interrelated model/LUs. Merges will be complex, and context for those merges will be lower, but model/LU footprints will be smaller and collisions slightly less frequent. Again, integrate frequently to reduce the likelihood of conflicting changes at the file level.
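The two rankings above can be captured as a small lookup table, which some teams may find handy when recording the rationale for a partitioning decision. This is only a sketch: the option labels are abbreviated paraphrases of the lists above, and the `ranked_options` function is hypothetical.

```python
# Options in rank order (best first), keyed by whether the model
# architecture is well designed and strongly owned.
RANKED_OPTIONS = {
    True: [
        "single non-Fragmented model/LU",
        "closure of multiple model/LUs with cross-file references",
        "single model/LU, fine-grained Fragments, CM locks changed files",
        "Fragments with whole-workspace synchronization and per-Fragment merges",
    ],
    False: [
        "single model/LU, fine-grained Fragments, CM locks changed files",
        "single non-Fragmented model/LU, private views, frequent UCM rebase and delivery",
        "closure of many interrelated model/LUs, frequent integration",
    ],
}


def ranked_options(well_designed):
    """Return the partitioning options, best first, for the given
    state of the model architecture."""
    return RANKED_OPTIONS[well_designed]


for rank, option in enumerate(ranked_options(False), start=1):
    print(rank, option)
```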
Let's summarize again:
Partitioning models into multiple files isn't nearly as important as logically structuring models to enable multiple practitioners to work on a model file in parallel without introducing conflicting changes. Strong architecture and strong ownership are the keys to being most productive with the tools:
- If you lack strong architecture, or if you have strong architecture but you lack strong ownership, you will experience frequent nontrivial merging that no amount of model partitioning can relieve.
- If you have strong architecture and strong ownership, you will greatly reduce (but not eliminate) the frequency of nontrivial merging. You will not eliminate it because there will always be component interdependencies. The aforementioned common elements are one example, although hardly the only example.
The good news is that the products under discussion handle model merging far faster and more effectively than any other modeling tool available.
Figure 3. Summary of model management and partitioning considerations
Techniques for partitioning models
The following techniques are available for managing and partitioning models:
- Creation of a new model/LU in a project
- Breaking off a Package of an existing model/LU so that the Package becomes a separate model/LU
- Combining two existing model/LUs into a single model/LU (which is called fusing)
- Partitioning a model/LU into Fragments
- Absorbing Fragments into model/LUs (the inverse of creating Fragments)
The specific menus, dialogs, and wizards that you use to do these things are covered in the Help section.
Other useful tools and techniques
As previously noted, when each unit of model organization can be worked on exclusively by one practitioner, the potential for introducing conflicts is limited to only those places where cross-unit relationships are required (refinement, use or other dependency, association, depiction in diagrams, and so forth). Many of the cross-unit relationships in a typical model are the result of depicting the same semantic elements in multiple diagrams that express different aspects of the problem or solution. The following techniques are useful for minimizing and managing these kinds of relationships:
In contrast to typical diagrams, wherein you manually place the elements that you want to depict, the contents of a query-driven diagram are determined by running a query against current-state model contents to repopulate the diagram to reflect semantic changes that have occurred since the diagram was last viewed. The use of query-driven diagrams instead of hand-drawn diagrams can reduce the number of diagram-related merge conflicts that might otherwise occur. There are two types of query-driven diagrams supported by the products under discussion: Topic Diagrams and Browse Diagrams.
- Topic Diagrams: To create a Topic Diagram, select a topical model element or set of elements, and then define what other elements you want to show in the diagram, based on the types of relationships that they have to the topical elements. As the semantic content of the model changes, the Topic diagrams that depict the changed semantic elements will adjust accordingly. The definition of a named Topic Diagram can be persisted so that the same query can be rerun at any time. Topic Diagrams can be persisted within UML Model files, but they can also be persisted directly in Eclipse projects. The fact that they are automatically rendered means that they can be ignored by model merges, thereby making them an attractive alternative to hand-drawn diagrams from a team modeling standpoint.
- Browse Diagrams: These are similar to Topic Diagrams in that you begin by selecting topical elements and then define filters that govern which kinds of related elements will be depicted. However, Browse Diagrams do not have a persisted definition. Their purpose is to facilitate discovery and understanding of model content by enabling you to graphically navigate through a model. After a Browse Diagram with a chosen focal element has rendered, you can then double-click on any of the related elements to create another Browse Diagram that has that element as the focal element. This is repeatable indefinitely. You can also reset the relationship filters as you browse. Plus, you can navigate forward and backward through a stack of generated Browse Diagrams (or navigate back to the home diagram), just as the name implies.
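The idea behind a query-driven diagram can be sketched in a few lines: what is persisted is the query, not the placed elements, so the content is recomputed from the current model each time. The `TopicQuery` class and `render` function below are hypothetical and greatly simplified; they are not the API of the products under discussion.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TopicQuery:
    """A persisted diagram definition: topical elements plus the
    relationship types to follow from them."""
    topics: frozenset
    relationship_types: frozenset


def render(query, relationships):
    """Return the set of elements the diagram shows for the current
    model state. `relationships` holds (source, type, target) triples."""
    shown = set(query.topics)
    for source, rel_type, target in relationships:
        if rel_type in query.relationship_types:
            if source in query.topics:
                shown.add(target)
            if target in query.topics:
                shown.add(source)
    return shown


query = TopicQuery(frozenset({"Order"}), frozenset({"association"}))

model_v1 = [
    ("Order", "association", "Customer"),
    ("Order", "dependency", "Logger"),
]
# A later model state: someone adds an association to Order
model_v2 = model_v1 + [("Invoice", "association", "Order")]

print(sorted(render(query, model_v1)))
print(sorted(render(query, model_v2)))
```

Because the second rendering picks up the new element without anyone editing the diagram, there is no hand-placed diagram content to conflict during a merge.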
Architecture Overview Model
As part of your overall modeling work, you may find it useful to define an Architecture Overview Model to capture a high-level view of your architecture that helps you understand how to organize and partition your other models (among other possible uses). The point at which you create such a model and the way that you refactor it as your projects evolve could depend upon a number of factors, such as your overall development process and your choice of modeling approach (for example, a basic RUP approach or business-driven development approach). As such, any detailed discussion of its use belongs in the style-specific sections of these guidelines.
To provide a sense of the scope and abstraction level that such a model might reflect, consider Figure 4. This figure illustrates how code bases are separated into modules for strong architecture and ownership. But it is also indicative of the kind of architectural overview (modules and their dependencies) that you might express in an Architecture Overview Model.
You could also use an Architecture Overview Model to sketch your anticipated workspace structures, as Figure 4 shows.
Figure 4. Example of an Architecture Overview Model
Another possible use of an Architecture Overview Model is to capture informal diagrams of various aspects of your solution, such as the high-concept diagram of an auction system in Figure 5.
Figure 5. Diagram of architecture for an auction system
Of course, an Architecture Overview Model can be used for any combination of such purposes. You could also use it as a place to gather diagrams from the more detailed models of a solution, so that you can depict various architecturally significant viewpoints of that solution. More formally, you could treat it as the equivalent of the RUP Software Architecture Document. Given the tools for organizing models that are available in the products under discussion (such as support for multiple model files with cross-file references and diagram links), it becomes an almost trivial matter to do this (the next section, "General techniques for organizing logical content of models," describes those tools). For instance, if you want to create a model that presents the "4+1 Views of Architecture," you might do something along the lines of what you see in Figure 6.
The example is shown without a Package for Process View, because the system in this example doesn't exhibit much in the way of concurrency.
Figure 6. Possible high-level organization of an Architecture Overview Model depicting 4+1 views of Architecture
- Just create a model/LU, and populate it with a simple set of Packages that correspond to the 4+1 Views (<<perspective>> Packages are also described in the next section).
- Then, just create diagrams in the Software Architecture Document model using these approaches:
- Create diagrams that are composed using UML semantic elements from other model files and that depict new views that were not found in those other model files but are needed as part of the architecture document.
- Create diagrams that are composed of geometric shapes or ad-hoc UML elements that reside in the Software Architecture Document model file. (Such UML elements should be only for purposes of documentation or clarification and should have no semantic significance to the actual implementation of the solution being described.)
- Create diagrams that simply contain links to the existing diagrams in other model files. This technique will work well if the architecture document model file is to be distributed to readers along with the other model files. (If the architecture document is going to be published on the Web instead, follow one of the other approaches.)
Additional information about modeling in teams
If you are preparing to embark on a team modeling practice, an indispensable source of wisdom is the series of articles on "Comparing and merging UML models in IBM Rational Software Architect" (Parts 1 through 7) by Kim Letkeman, available on developerWorks. Part 5 is particularly valuable (see Resources). Parts 1 through 4 and Part 6 are valuable as well, but they are a bit older and do not reflect more recent enhancements to the Rational team modeling capabilities, such as compare-merge improvements and support for model Fragments. Part 7, "Ad-hoc modeling: Fusing two models with diagrams," covers the scenario where two individuals have independently developed their own models of a proposed solution and now want to merge them (a situation that is not as uncommon as you might think).
If you are interested in reading more about the general principles of managing dependencies to achieve strong cohesion and low coupling, a good source is Designing Object Oriented C++ Applications Using The Booch Method, by Robert C. Martin (Prentice Hall, 1995), Chapter 3, the section on Cohesion, Closure, and Reusability.
General techniques for organizing logical content of models
The primary tools for organizing the logical content of UML Models are model/LUs (as discussed in previous sections) and UML Packages. UML Packages serve two primary purposes:
- Logically partitioning and organizing model information within a model by grouping elements that correspond to a specific subject matter in the problem or solution domain
- Separating different types of model information, such as interfaces, implementations, diagrams, and so forth:
- Grouping elements to define and control their dependencies on other elements
- Grouping diagrams that provide alternative views on the same model
- Establishing namespaces
- For model elements
- For implementation artifacts generated from model elements (this may involve mappings between model and implementation language namespaces)
- For a unit of reuse
The products under discussion also support additional organizational tools that help primarily with how models can be navigated and how diagrams are grouped. (One navigation capability, Browse Diagrams, was discussed in the previous section.)
Represent viewpoints by using <<perspective>> Packages
In cases where it is desirable to see elements organized in more than one way, you can create additional Packages with diagrams that depict the alternate organizational schemes. This same technique can be useful anywhere that there is a need to represent a particular view of model content that cuts across the model’s packaging scheme. The products under discussion support this technique by providing a <<perspective>> Package stereotype as part of their UML base profile. You can think of a <<perspective>> Package as generally the equivalent of a RUP for Model-Driven Systems Development or IEEE 1471-2000 Viewpoint.
Applying the <<perspective>> stereotype to a Package does several things:
- Visually identifies that Package as representing a particular viewpoint
- Supports a model validation rule that warns you when semantic elements are placed in <<perspective>> Packages
- Designates Packages that should be bypassed by Rational transformations
For the most part, <<perspective>> Packages are meant to hold only diagrams that depict views based on the alternate organizational concern or application viewpoint. However, there are a couple of situations where you might want to place semantic elements into <<perspective>> Packages:
- To prevent those elements from being processed by transformations.
- To depict behavior in the <<perspective>>. The products under discussion treat behavioral (or "machine") diagrams as "canonical," meaning simply that the content of such diagrams must fully and exclusively reflect the semantics of the UML semantic element that owns the diagram:
- An Activity diagram must be owned by an Activity and must fully and exclusively depict the semantics of that Activity.
- A Sequence diagram or Communication diagram must be owned by an Interaction and must fully and exclusively depict the semantics of that Interaction.
- A State Machine diagram must be owned by a State Machine and must fully and exclusively depict the semantics of that State Machine.
To depict behavior in a <<perspective>> Package, you therefore need an owning semantic element in that <<perspective>> Package that expresses only the semantic details you wish to express within that <<perspective>>.
In such cases, just ignore the validation rule warnings about semantic content in the <<perspective>> Package.
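The effect of the stereotype on validation and on transformations can be sketched as two small checks. This is a simplified illustration under assumed data structures: the dictionary layout, package names, and function names are hypothetical and do not reflect how the products represent models internally.

```python
def perspective_warnings(packages):
    """Return one warning per semantic element found in a Package that
    carries the perspective stereotype."""
    return [
        f"{name}: semantic element '{element}' in <<perspective>> Package"
        for name, pkg in packages.items()
        if "perspective" in pkg["stereotypes"]
        for element in pkg["semantic_elements"]
    ]


def packages_for_transformation(packages):
    """Return the names of Packages a transformation would process,
    bypassing those that carry the perspective stereotype."""
    return [
        name
        for name, pkg in packages.items()
        if "perspective" not in pkg["stereotypes"]
    ]


# Hypothetical model content
packages = {
    "Orders": {
        "stereotypes": set(),
        "semantic_elements": ["Order", "OrderLine"],
    },
    "Security View": {
        "stereotypes": {"perspective"},
        "semantic_elements": ["LoginInteraction"],
    },
}

print(perspective_warnings(packages))
print(packages_for_transformation(packages))
```

In the second situation described above, the warning for an element such as the hypothetical `LoginInteraction` is expected and can be ignored.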
Use interdiagram navigation
In the products under discussion, there are two mechanisms that support interdiagram navigation:
- You can drag a diagram node from the Project Explorer onto some other "host" diagram. Then you can double-click the resultant icon on the host diagram to open the referenced diagram.
- Any Package in a Rational UML model can have a default diagram. The default diagram in a Package offers special behavior: If you place the Package itself onto any diagram in any model, you can then double-click on the Package icon, and that will open the Package's default diagram.
You can set diagram Preferences so that whenever you create a new UML Package in a model, a Main diagram is automatically created and set as the default diagram for that Package. The default settings in Preferences for the products under discussion are that a free-form diagram is created and set as the default diagram of each new Package. However, the default settings can be changed so that other types of main diagrams can be created and so that a diagram can be created but not designated as the default diagram for a Package. It is also possible to designate another diagram in any Package as the default diagram of the Package.
These mechanisms support the following organizational guidelines, which can apply to models of any type:
- Compose the Main diagram (or other default diagram) of each model/LU to depict:
- Each top-level Package in the model/LU
- The diagram icons for any other diagrams that reside in the root Package of the model/LU. (In other words, do not depict the icon for the default diagram itself.)
- Compose the Main diagram (or other default diagram) of each top-level Package to depict:
- The Packages that it directly contains
- The diagram icons for any other diagrams that it directly contains.
- Repeat this pattern for each successively lower level of Packages.
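The pattern above is recursive, which a short sketch can make explicit. The `Package` class, the `package:` and `diagram:` labels, and the sample names are all hypothetical; the sketch only computes what each default diagram would depict and does not create anything in a real model.

```python
from dataclasses import dataclass, field


@dataclass
class Package:
    name: str
    packages: list = field(default_factory=list)
    diagrams: list = field(default_factory=list)
    default_diagram: str = "Main"


def main_diagram_contents(package):
    """List what the default diagram depicts: each directly contained
    Package, plus an icon for every other diagram in this Package."""
    contents = [f"package:{child.name}" for child in package.packages]
    contents += [
        f"diagram:{d}" for d in package.diagrams
        if d != package.default_diagram  # never depict the diagram itself
    ]
    return contents


def compose_all(package, path=""):
    """Apply the pattern to a Package and, recursively, to each
    successively lower level of Packages."""
    qualified = f"{path}::{package.name}" if path else package.name
    result = {qualified: main_diagram_contents(package)}
    for child in package.packages:
        result.update(compose_all(child, qualified))
    return result


root = Package(
    "Design Model",
    packages=[
        Package("Orders", diagrams=["Main", "Order Lifecycle"]),
        Package("Billing", diagrams=["Main"]),
    ],
    diagrams=["Main", "Overview"],
)

for name, contents in compose_all(root).items():
    print(name, contents)
```

Following the pattern this way means a reader can start at the root default diagram and reach any diagram in the model by double-clicking.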
What to model and how much is "enough"
This is the shortest and yet the most important section in this article.
There is an exquisite tension when it comes to providing modeling guidance. Those who are new to modeling want to be given highly prescriptive guidance so they can get going quickly. But having such guidance can lead them into the trap of doing more modeling than is really necessary.
The subsequent articles in this series will present fairly prescriptive guidance about how to structure models when practicing a particular modeling style, such as "Classic" RUP, Business-Driven Development for SOA, or Model Driven Systems Development. These can provide value to a broad cross-section of readers, but what is important to take from them is a sense of possibilities, not a set of rules. They are meant to convey "Here is how you might do it," not "Here is how you must do it."
An important exception is that you might use a transformation that requires the input to be a model that is composed in a very specific way. If you will be using such transformations, then formulate, publish, and educate team members with highly prescriptive guidance on how to create such source models.
Much of the value of models lies in abstracting away details to focus on separated concerns. How much you model depends on what level of abstraction you need to understand your problem or solution domain, how you drive automation in your development process, and how you communicate to project stakeholders. The best modeling guidance of all is this:
Model something only if doing so has recognized business value.
Accordingly, the style-specific guidelines in this series strive to identify the business values of particular modeling activities and, thereby, help you decide how much of the guidance applies to your situation.
- More context for this series of articles is available at the series summary page.
- Read Part 2 of this series, IBM Rational Architecture Management Software model structure guidelines: Part 2. Classic Rational Unified Process.
- Read Comparing and merging UML Models in IBM Rational Software Architect: Part 5. Model management with IBM Rational ClearCase and IBM Rational Software Architect Version 7 and later (IBM® developerWorks®, July 2007) and browse the other six parts of this series.
- For more about the general principles of managing dependencies to achieve strong cohesion and low coupling, read Designing Object Oriented C++ Applications Using The Booch Method, by Robert C. Martin (Prentice Hall, 1995), Chapter 3, the section on Cohesion, Closure, and Reusability.
- Visit the Architecture area of developerWorks for technical resources and best practices.
- Explore the articles and tutorials in the Architecture Technical Library on developerWorks. The library includes a wide range of technical articles and tips, tutorials, standards and specifications, and IBM® Redbooks®.
- Visit the Rational software area on developerWorks for technical resources and best practices for Rational Software Delivery Platform products.
- Subscribe to the developerWorks Rational zone newsletter. Keep up with developerWorks Rational content. Every other week, you'll receive updates on the latest technical resources and best practices for the Rational Software Delivery Platform.
- Subscribe to the Rational Edge newsletter for articles on the concepts behind effective software development.
- Browse the technology bookstore for books on these and other technical topics.
Get products and technologies
- Download trial versions of IBM Rational software.
- Download IBM product evaluation versions and get your hands on application development tools and middleware products from DB2®, Lotus®, Tivoli®, and WebSphere®.
- Join the Architecture forum on developerWorks to connect with others and draw on their expertise and experience for tips that can help you, as a developer or architect, apply the principles of service-oriented architecture (SOA).
- Check out developerWorks blogs and get involved in the developerWorks community.
- Rational Software Architect, Data Architect, Software Modeler, Application Developer and Web Developer forum: Ask questions about Rational Software Architect and other modeling products.