Unified Modeling Language version 2.0

In support of model-driven development

So-called "model-driven" development (MDD) methods, which are based on higher levels of abstraction and greater use of automation compared to traditional methods, have already demonstrated their potential for radical improvements in the quality of software and the productivity of development. Since the role of modeling languages is crucial to the success of MDD, a major revision of the industry-standard Unified Modeling Language (UML) was recently completed. While several important new modeling capabilities were added -- such as the ability to more accurately capture software architectures -- the dominant characteristic of this revision is the heightened precision of the language definition that enables greater levels of automation. This article explains how this was achieved and also describes other highlights of UML 2.0.


Bran Selic, Distinguished Engineer, IBM

Bran Selic is an IBM Distinguished Engineer and works at IBM Rational software Canada. He has close to 30 years of experience in the software industry, particularly in the design and development of large-scale real-time systems. Bran received his Bachelor's degree in electrical engineering in 1972 and a Master's degree in systems theory in 1974, both from the University of Belgrade in Yugoslavia. He has been living and working in Canada since 1977.

11 November 2005 (First published 21 March 2005)



The early part of the 1990s saw a greatly heightened interest in the object paradigm and related technologies. New programming languages based on this paradigm (such as Smalltalk, Eiffel, C++, and Java) were devised and adopted. These were accompanied by a prodigious and confusing glut of object-oriented (OO) software design methods and modeling notations. For instance, in his very thorough overview of OO analysis and design methods (covering over 800 pages), Graham lists over 50 seminal methods [Graham01]. Given that the object paradigm consists of a relatively small set of fundamental concepts (including encapsulation, inheritance, and polymorphism), there was clearly a great deal of overlap and conceptual alignment across these methods -- much of it obscured by notational and other differences of no consequence. This caused a great deal of confusion and needless market fragmentation -- which, in turn, impeded the adoption of the useful new paradigm. Software developers had to make difficult and binding choices between mutually incompatible languages, tools, methods, and vendors.

For this reason, when what was then Rational Software proposed the Unified Modeling Language (UML) initiative -- led by Grady Booch, Ivar Jacobson, and Jim Rumbaugh -- the reaction was immediate and positive. The intent was not to propose anything new, but -- through collaboration among top industry thought leaders -- to consolidate the best features of the various OO approaches into a single, vendor-independent modeling language and notation. Because of that, UML very quickly became a widely practiced standard. Following its adoption by the Object Management Group in 1996, it became an accepted industry standard [OMG03a] [OMG04] [RJB05].

Since then, UML:

  • Has been adopted and supported by the majority of modeling tool vendors
  • Has become an essential part of the computer science and engineering curricula in universities throughout the world and in various professional training programs
  • Has been used by academic and other researchers as a convenient common language

UML also helped raise general awareness of the value of modeling when dealing with software complexity. Although this highly useful technique is almost as old as software itself -- with flowcharts and finite state machines as very early examples -- most developers have been slow to accept it as anything more than a helpful minor tool. It is fair to say that this is still the dominant attitude, which is why model-driven methods are encountering a great deal of resistance in this community.

There are some very valid reasons for this situation (as well as some not-so-valid reasons, such as a general human distrust of innovation). The main reason is that software models can often be terribly inaccurate in unpredictable ways: clearly, the practical value of any model is directly proportional to its accuracy. If a model cannot be trusted to tell you true things that you want to know about the software system that it represents, then it is worse than useless, since it can lead you to the wrong conclusions. The key to increasing the value of software models, then, is to narrow the gap between them and the systems they are modeling. Paradoxically, as this article discusses later, this is easier to do in software than in any other engineering discipline.

Some of the inaccuracy of software models can be blamed on the extremely detailed and sensitive nature of current programming language technologies. Minor lapses and barely-detectable coding errors -- such as misaligned pointers or uninitialized variables -- can have enormous consequences. For instance, there is a well-documented case where a single missing break in one case of a nested switch statement resulted in the loss of long-distance telephone service for a large part of the United States, causing immense economic losses [Lee92]. If such seemingly minute detail can have such dire consequences, how can we trust models to be accurate (since models, by definition, are supposed to hide or remove detail)?

Model-driven development

The solution to this conundrum is to formally link a model to its corresponding software implementation through one or more automated model transformations. Perhaps the best and most successful example of this is a compiler, which translates a high-level language program into an equivalent machine language implementation. The model, in this case, is the high-level language program -- which, like all useful models, hides irrelevant detail about the idiosyncrasies of the underlying computing technology (such as internal word size, the numbers of accumulators and index registers, the type of ALU, and so on).

It is interesting to note that few, if any, other engineering media can provide such a tight coupling between a model and its corresponding engineering artifact. This is because the artifact that you are modeling is software rather than hardware. A model of any kind of physical artifact (for instance, an automobile, building, bridge, and so on) inevitably involves an informal step of abstracting the physical characteristics into a corresponding formal model (such as a mathematical or scale model). Similarly, implementing an abstract model using physical materials involves an informal transformation from the abstract into the concrete. The informal nature of this step can lead to inaccuracies that, as noted above, can render the models ineffective or even counter-productive. In software, however, this transformation can, in principle, be performed formally in either direction.

The potential behind this powerful combination of abstraction and automation has led to the emergence of new modeling technologies and corresponding development methods collectively referred to as model-driven development (MDD) [Brown04] [Booch04]. The defining feature of MDD is that models have become primary artifacts of software design, shifting much of the focus away from the corresponding program code. They serve as blueprints from which various automated and semi-automated processes derive programs and related models. The degrees of automation being used today in MDD vary from deriving simple skeleton code to automatically generating complete code (which is comparable to traditional compilation). Clearly, the greater the level of automation, the more accurate the models and the greater the benefits of MDD.
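
To make the low end of that automation range concrete, here is a minimal sketch of deriving skeleton code from a model. It assumes a deliberately tiny, hypothetical model format (a dictionary of class names with attribute and operation names); neither this format nor the `generate_skeleton` function comes from UML or from any particular MDD tool.

```python
# Hypothetical mini-model: class names mapped to attribute and operation names.
model = {
    "Account": {
        "attributes": ["owner", "balance"],
        "operations": ["deposit", "withdraw"],
    },
}


def generate_skeleton(model):
    """Derive Python class skeletons (structure only, no behavior) from the model."""
    lines = []
    for class_name, spec in model.items():
        lines.append(f"class {class_name}:")
        params = "".join(f", {name}" for name in spec["attributes"])
        lines.append(f"    def __init__(self{params}):")
        if spec["attributes"]:
            for name in spec["attributes"]:
                lines.append(f"        self.{name} = {name}")
        else:
            lines.append("        pass")
        for op in spec["operations"]:
            lines.append("")
            lines.append(f"    def {op}(self):")
            lines.append("        raise NotImplementedError  # body left to the developer")
        lines.append("")
    return "\n".join(lines)


code = generate_skeleton(model)
print(code)
```

Because the transformation is automated, the generated structure cannot drift from the model; only the operation bodies remain manual work in this sketch.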

Model-driven methods of software development are not particularly new, and they have been used in the past with varying degrees of success. The reason they are receiving much more attention now is that the supporting technologies have matured to the point where much more can be automated practically than was the case in the past. This is not just in terms of efficiency, but also in terms of scalability, as well as the ability of such tools to be integrated with legacy tools and methods. This maturation is reflected in the emergence of MDD standards that result in the commoditization of corresponding tools and obvious benefits to users. One of these standards is the revised version of the Unified Modeling Language.

The rationale behind revising UML 1

UML 2.0 is the first major revision of the UML standard, following a series of minor revisions [OMG04] [RJB05]. Why was it necessary to revise UML?

The primary motivation for revising the language came from the desire to better support MDD tools and methods. In the past decade, a number of vendors had developed UML-based tools that supported significantly greater levels of automation than traditional CASE (computer-aided software engineering) tools.

To support these higher forms of automation, it was necessary to define UML in a much more precise manner than provided for in the original standard. (In tune with the times, the original UML standard was primarily designed to serve as an auxiliary tool for the informal capture and communication of design intent.) Unfortunately, these more precise definitions varied from vendor to vendor, threatening once again to lead to the kind of fragmentation that the original standard was intended to eliminate. A new version of the standard could rectify this.

In addition, after close to a decade of practical experience in using UML -- as well as the emergence of important new technologies (such as web-based applications and service-oriented architectures) during that time -- new modeling capabilities were identified. While practically all of these could be represented by appropriate combinations of existing UML concepts, there were clear benefits to introducing some of these as first-class built-in language features.

Finally, during the same extensive period, the industry has learned a lot about suitable ways of using, structuring, and defining modeling languages. For example, there are now emerging theories of meta-modeling and of model transformations, which impose certain demands on how a modeling language should be defined. While we still lack a consolidated and systematic theory of modeling language design that is comparable to the current theory of programming language design, these and similar developments needed to be incorporated in UML to ensure its utility and longevity.

The highlights of UML 2.0 functionality

The new developments in UML 2.0 can be grouped into the following five major categories, listed in order of significance:

  • A significantly increased degree of precision in the definition of the language: This is a result of the need to support the higher levels of automation required for MDD. Automation implies the elimination of ambiguity and imprecision from models (and, hence, from the modeling language) so that computer programs can transform and manipulate models.
  • An improved language organization: This is characterized by a modularity that not only makes the language more approachable to new users, but also facilitates inter-working between tools.
  • Significant improvements in the ability to model large-scale software systems: Some modern software applications represent integrations of existing stand-alone applications into more complex systems of systems. This trend will likely continue, resulting in ever more complex systems. To support such trends, flexible new hierarchical capabilities were added to the language to support software modeling at arbitrary levels of complexity.
  • Improved support for domain-specific specialization: Practical experience with UML demonstrated the value of its so-called "extension" mechanisms. These were consolidated and refined to allow simpler and more precise refinements of the base language.
  • Overall consolidation, rationalization, and clarifications of various modeling concepts: This resulted in a simplified and more consistent language. It involved consolidating and -- in a few cases -- removing redundant concepts, refining numerous definitions, and adding textual clarifications and examples.

We'll now delve into each of these in more detail.

Degree of precision

Most early software modeling languages were defined informally, with little attention paid to precision. More often than not, modeling concepts were explained using imprecise natural language. This was deemed sufficient at the time, since the majority of modeling languages were used either for documentation or for what Martin Fowler refers to as design sketching [Fowler04]. The idea was to convey the essential properties of a design, leaving detail to be worked out during implementation.

However, this often led to confusion because models expressed in such languages could be -- and often were -- interpreted differently by different individuals. Furthermore, unless the question of model interpretation was explicitly discussed up front, such differences could remain undetected, to be discovered only in the later phases of development (when the cost of fixing the resulting problems was much greater).

To minimize ambiguity -- and in contrast to most other modeling languages of the time -- the first standardized definition of UML was specified using a metamodel. This is a model that defines the characteristics of each UML modeling concept, and its relationships to other modeling concepts. The metamodel was defined using an elementary subset of UML, and was supplemented by a set of formal constraints written in the Object Constraint Language (OCL).

Note: This subset of UML, primarily comprising concepts defined in UML class diagrams, is called the Meta-Object Facility (MOF). This subset was chosen such that it could be used to define other modeling languages.

This combination represented a formal specification of the abstract syntax of UML, so-called because it is independent of the actual notation or concrete syntax (that is, text and graphics) that is used to represent models. In other words, it defined the set of rules that can be used to determine whether a given model is well formed. For example, such rules would allow us to determine that it is incorrect to connect two UML classes by a state machine transition.
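
The kind of well-formedness rule mentioned above can be sketched in a few lines. This is an illustration only: the `Element` and `Connection` types and the `ALLOWED_ENDS` table are hypothetical simplifications, not the UML metamodel, and a real checker would evaluate OCL constraints against the full abstract syntax.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    name: str
    metaclass: str  # for example "Class" or "State"


@dataclass(frozen=True)
class Connection:
    kind: str  # for example "Association" or "Transition"
    source: Element
    target: Element


# Assumed rule table: the metaclass that each kind of connection may join.
ALLOWED_ENDS = {"Association": "Class", "Transition": "State"}


def well_formedness_errors(connections):
    """Return one message for every connection end that breaks a rule."""
    errors = []
    for c in connections:
        required = ALLOWED_ENDS.get(c.kind)
        if required is None:
            errors.append(f"unknown connection kind {c.kind!r}")
            continue
        for end in (c.source, c.target):
            if end.metaclass != required:
                errors.append(f"{c.kind} cannot connect {end.metaclass} {end.name!r}")
    return errors


order = Element("Order", "Class")
customer = Element("Customer", "Class")
for message in well_formedness_errors([Connection("Transition", order, customer)]):
    print(message)
```

Note that the check never looks at how the model is drawn; it works purely on the abstract syntax, which is why the same rules apply regardless of notation.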

However, the degree of precision used in this initial UML metamodel proved insufficient to support the full potential behind MDD (see for example the discussion in [Stevens02]). In particular, the specification of the semantics (or meaning) of the UML modeling concepts remained inadequate for such MDD-oriented activities as automatic code generation or formal verification.

Consequently, the degree of precision used in the definition of UML 2.0 has increased significantly. This was achieved by the following means:

  • A major refactoring of the metamodel infrastructure: The infrastructure of UML 2.0 comprises a set of low-level modeling concepts and patterns that are in most cases too rudimentary or too abstract to be used directly in modeling software applications. However, their relative simplicity makes it much easier to be precise about their semantics and the corresponding well-formedness rules. These finer-grained concepts are then combined in different ways to produce more complex user-level modeling concepts. For instance, in UML 1, the notion of ownership (that is, elements owning other elements), the concept of namespaces (named collections of uniquely named elements), and the concept of classifiers (elements that can be categorized according to their features) were all inextricably bound into a single semantically complex notion. (Note that this also meant that it was impossible to use any one of these without implying the other two.) In the new UML 2.0 infrastructure, these concepts were separated, and their syntax and semantics defined separately.
  • Extended and more precise semantics descriptions: The definition of semantics of the UML 1 modeling concepts was problematic in a number of ways. The level of description was highly uneven, with some areas having extensive and detailed descriptions (for instance, state machines), while others had little or no explanation. The UML 2.0 specification puts much more emphasis on semantics, particularly in the key area of basic behavioral dynamics (see below). For a more detailed discussion of the semantics of UML 2.0, refer to [Selic04] in the Resources section.
  • A clearly defined dynamic semantic framework: The UML 2.0 specification fills some of the critical semantic gaps in the original version. This framework is depicted in Figure 1, and is described in more detail in [Selic04] (see Resources). In particular, the following issues are addressed explicitly by this framework:
    • The structural semantics of links and instances at run time
    • The relationship between structure and behavior
    • The semantic underpinnings or causality model shared by all current high-level behavioral formalisms in UML (that is, state machines, activities, and interactions) as well as potential future ones. This also ensures that objects whose behaviors are expressed using different formalisms can interact with each other.
Figure 1. The UML 2.0 semantics framework
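
The separation of ownership, namespaces, and classifiers described in the first item of the list above can be sketched as a small class hierarchy. The class names are loosely inspired by the UML 2.0 infrastructure, but the methods and fields are illustrative assumptions, not the actual metamodel definitions.

```python
class Element:
    """Ownership only: an element may own other elements."""

    def __init__(self):
        self.owner = None
        self.owned_elements = []

    def own(self, element):
        element.owner = self
        self.owned_elements.append(element)
        return element


class NamedElement(Element):
    def __init__(self, name):
        super().__init__()
        self.name = name


class Namespace(NamedElement):
    """Naming: a collection whose named members must be uniquely named."""

    def members(self):
        return [e for e in self.owned_elements if isinstance(e, NamedElement)]

    def add_member(self, member):
        if any(m.name == member.name for m in self.members()):
            raise ValueError(f"duplicate name {member.name!r} in {self.name!r}")
        return self.own(member)


class Classifier(Namespace):
    """Classification: adds features on top of naming and ownership."""

    def __init__(self, name):
        super().__init__(name)
        self.features = []


class Comment(Element):
    """Uses ownership alone; it has no name and no features."""

    def __init__(self, body):
        super().__init__()
        self.body = body


class Package(Namespace):
    """Uses ownership and naming, but is not a classifier."""
```

In this sketch a `Comment` can be owned without being named, and a `Package` can hold uniquely named members without carrying features, which is the kind of selective use that a single fused concept would rule out.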

New language architecture

One of the immediate consequences of the increased level of precision in UML 2.0 is that the language definition has gotten bigger -- even without accounting for the new modeling capabilities. This would normally be of concern, especially given that the original UML was criticized as being too rich (and, therefore, too cumbersome to learn and use).

Such criticisms typically ignore the fact that UML is intended to address some of the most complex of today's software problems, and that such problems require sufficiently powerful tools. (Successful technologies -- such as automobiles and electronics -- have never gotten simpler; it is a part of human nature to persistently demand more of our machinery, which, ultimately, implies more sophisticated tools. For example, no one would even contemplate building a modern skyscraper using basic hand tools.)

Nevertheless, to deal with the problem of language complexity, UML 2.0 was modularized in a way that allows selective use of language modules. The general form of this structure is shown in Figure 2. It consists of a foundation comprising shared concepts (such as classes and associations), on top of which is a collection of vertical sub-languages or language units. Each one of these is suited to modeling a specific form or aspect of a system (Table 1). These vertical language units are generally independent of each other, and thus you can use them independently. (Note that this was not the case in UML 1, where, for example, the activities formalism was based entirely on the state machine formalism.)

Figure 2. The language architecture of UML 2.0

Furthermore, the vertical language units are hierarchically organized into up to three levels, with each successive level adding more modeling capabilities to those available in the levels below. This provides an additional dimension of modularity so that, even within a given language unit, it is possible for you to use only specific subsets.

This architecture means that you can learn and use only the subset of UML that suits you best. It is no more necessary to become familiar with the full extent of UML in order to use it effectively than it is to learn all of English to speak effectively. As you gain experience, you have the option of gradually introducing more powerful modeling concepts as necessary.

Table 1. The Language Units of UML 2.0
Language Unit        Purpose
Actions              (Foundation) modeling of fine-grained actions
Activities           Data and control flow behavior modeling
Classes              (Foundation) modeling of basic structures
Components           Complex structure modeling for component technologies
Deployments          Deployment modeling
General Behaviors    (Foundation) common behavioral semantic base and time modeling
Information Flows    Abstract data flow modeling
Interactions         Inter-object behavior modeling
Models               Model organization
Profiles             Language customization
State Machines       Event-driven behavior modeling
Structures           Complex structure modeling
Templates            Pattern modeling
Use Cases            Informal behavioral requirements modeling

As part of the same architectural reorganization, the definition and structure of compliance has been significantly simplified in UML 2.0. In UML 1, the basic units of compliance were defined by the packages of the metamodel, with literally hundreds of possible combinations. (In fact, because UML 1 formalized the notion of incomplete compliance to a given compliance point, the possible number of different combinations of capabilities that allowed a vendor to claim compliance was orders of magnitude greater.) This meant that it was highly unlikely to find two or more modeling tools that could interchange models, since each would likely support a different combination of packages.

In UML 2.0, only three levels of compliance are defined, and these correspond to the hierarchical language unit levels already mentioned and depicted in Figure 2. These are defined in such a way that models at level (n) are compliant with models at any of the higher levels (n+1, etc.). In other words, a tool compliant to a given level can import models -- without loss of information -- from tools that are compliant to any level equal to or below its own.

Note: Formally, UML 2 also defines a fourth level (Level 0), but this is an internal level intended primarily for tool implementers.

Four types of compliance are defined:

  • Compliance to the abstract syntax
  • Compliance to the concrete syntax (that is, the UML notation)
  • Compliance to both abstract and concrete syntax
  • Compliance to both the abstract and concrete syntax, and to the diagram interchange standard [OMG03b]

This means that there is a maximum of only 12 different compliance combinations with clear dependency relationships between them (for example, abstract and concrete syntax compliance is compatible with only concrete syntax compliance or only abstract syntax compliance). Consequently, in UML 2.0, model interchange between compliant tools from multiple vendors becomes more than just a theoretical possibility.
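
The arithmetic behind the figure of 12, and the import rule between levels, can be checked with a short sketch. The level numbers and type labels below are shorthand chosen for this illustration; the specification's own names for them may differ.

```python
from itertools import product

# Shorthand labels for the three compliance levels and four compliance types.
LEVELS = (1, 2, 3)
TYPES = (
    "abstract syntax",
    "concrete syntax",
    "abstract and concrete syntax",
    "abstract and concrete syntax with diagram interchange",
)

# Every compliance claim pairs one level with one type: 3 x 4 = 12.
combinations = list(product(LEVELS, TYPES))


def can_import(tool_level, model_level):
    """A tool can import, without loss, models from its own level or below."""
    return model_level <= tool_level


print(len(combinations))
```

Compare this with UML 1, where the unit of compliance was the individual metamodel package and the number of possible combinations ran into the hundreds.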

Large-scale system modeling capabilities

The number of features added in UML 2.0 is relatively small. This was done intentionally to avoid the infamous second-system effect [Brooks95], whereby a language gets bloated by an excess of new features demanded by a highly diverse user community. In fact, the majority of new modeling capabilities are, in essence, simply extensions of existing features that allow you to use them for modeling large-scale software systems.

Moreover, these extensions were all achieved using the same basic approach: recursive application of the same basic set of concepts at different levels of abstraction. This means that you can combine model elements of a given type into units that, in turn, you can use as the building blocks to be combined in the same way at the next level of abstraction, and so on. This is analogous to the way that procedures in programming languages can be nested within other procedures to any desired depth.

Specifically, the following modeling capabilities are extended in this way:

  • Complex structures
  • Activities
  • Interactions
  • State machines

The first three of these account for more than 90% of the new features in UML 2.0.

Complex structures

The basis for this set of features comes from long-term experience with various architectural description languages, such as UML-RT [SR98], Acme [GMW97], and SDL [ITU02]. These languages are characterized by a relatively simple set of graph-like concepts: basic structural nodes called parts that may have one or more ports, and which are interconnected via communication channels called connectors (as shown in Figure 3). These aggregates may be encapsulated within higher-level units that have their own ports so that they can be combined with other higher-level units into yet higher-level units, and so on.

Figure 3. Complex structure modeling concepts

To a degree, these concepts could already be found in the UML 1 definition of collaborations, except that they were not applied recursively. To allow recursion, a collaboration structure is nested within a class specification, which means that all instances of that class will have an internal structure specified by the class definition. For example, in Figure 3, parts /a:A and /b:B are nested within part /c:C, which represents an instance of the composite structure class C. Other instances of that class would have the same structural pattern (including all the ports, parts, and interconnections).

It turns out that with these three simple concepts and their recursive application, it is possible for you to model arbitrarily complex software architectures.
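
The recursive use of parts, ports, and connectors can be sketched as a single data structure whose parts are instances of the same kind of structure. The `StructuredClass` type and its methods are assumptions made for illustration; the part and class names echo the example in Figure 3, while the port names and the outer class D are invented.

```python
from dataclasses import dataclass, field


@dataclass
class StructuredClass:
    name: str
    ports: list = field(default_factory=list)       # port names on the boundary
    parts: dict = field(default_factory=dict)       # part name -> StructuredClass
    connectors: list = field(default_factory=list)  # pairs of (part name, port name)

    def add_part(self, part_name, cls):
        self.parts[part_name] = cls

    def connect(self, end1, end2):
        """Connect two ends; a part name of None means this class's own port."""
        for part_name, port in (end1, end2):
            owner = self if part_name is None else self.parts[part_name]
            if port not in owner.ports:
                raise ValueError(f"{owner.name} has no port {port!r}")
        self.connectors.append((end1, end2))

    def depth(self):
        """Number of nesting levels, counting this class as one."""
        return 1 + max((p.depth() for p in self.parts.values()), default=0)


A = StructuredClass("A", ports=["p"])
B = StructuredClass("B", ports=["q"])
C = StructuredClass("C", ports=["r"])
C.add_part("a", A)
C.add_part("b", B)
C.connect(("a", "p"), ("b", "q"))   # connector between two internal parts
C.connect((None, "r"), ("a", "p"))  # boundary port wired to an internal part

D = StructuredClass("D")
D.add_part("c", C)                  # C is itself reused as a part one level up
print(D.depth())
```

Nothing in the structure limits the nesting depth, which is the property that lets the same three concepts scale to larger architectures.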


Activities

Activities in UML are used to model flows of various kinds: signal or data flows, as well as algorithmic or procedural flows. Needless to say, there are numerous domains and applications that are most naturally rendered by such flow-based descriptions. In particular, this formalism was embraced by business-process modelers -- and also by systems engineers, who tend to view their systems primarily as flow-through signal processors. Unfortunately, the UML 1 version of activity modeling had a number of serious limitations on the types of flows that could be represented. Many of these limits were due to the fact that activities were overlaid on top of the basic state machine formalism and were, therefore, constrained to the semantics of state machines.

UML 2.0 replaced the state machine underpinning with a much more general semantic base that eliminated all of these restrictions. (In fact, the semantic foundations are represented by a variant of generalized colored Petri nets [pet].) Furthermore, inspired by a number of industry-standard business-processing formalisms -- including, notably, BPEL4WS [BPEL03] -- a very rich set of new and highly refined modeling features was added to the basic formalism. These include the ability to represent:

  • Interrupted activity flows
  • Sophisticated forms of concurrency control
  • Diverse buffering schemes

The result is a very rich modeling toolset that can represent a wide variety of flow types.

As with complex structures, you can recursively group activities and their interconnection flows into higher-level activities with clearly defined inputs and outputs. You can, in turn, combine these with other activities to form more complex activities, up to the highest system levels.
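
To give a feel for token-based flow semantics, here is a minimal sketch of a plain place/transition net with a fork and a join. It is a strong simplification: it is not a colored Petri net and not the actual UML 2.0 activity semantics, and the place names and the `fire` function are invented for the example.

```python
def fire(marking, transition):
    """Fire one transition if every input place holds at least one token."""
    inputs, outputs = transition
    if any(marking.get(place, 0) < 1 for place in inputs):
        return False  # not enabled, so the marking is left unchanged
    for place in inputs:
        marking[place] -= 1
    for place in outputs:
        marking[place] = marking.get(place, 0) + 1
    return True


# A fork into two concurrent branches followed by a join.
fork = (["start"], ["a", "b"])
work_a = (["a"], ["a_done"])
work_b = (["b"], ["b_done"])
join = (["a_done", "b_done"], ["end"])

marking = {"start": 1}
fire(marking, fork)
fire(marking, work_a)
blocked = not fire(marking, join)  # join must wait for the second branch
fire(marking, work_b)
fire(marking, join)
print(blocked, marking["end"])
```

A flow like this, in which two branches proceed independently and synchronize later, is awkward to express when every activity must map onto a single current state, which is the restriction the text attributes to UML 1.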


Interactions

Interactions in UML 1 were represented either as sequenced message annotations on collaboration diagrams, or as separate sequence diagrams. Unfortunately, two fundamental capabilities were missing:

  1. The ability to reuse sequences that may be repeated in the context of more extensive (higher level) sequences. For example, a sequence that validates a password may appear in multiple contexts in a given application. Without the ability to package such repeated sequences into separate units, you had to define them numerous times. This not only added overhead, but also complicated model maintenance (for instance, when the sequence needed to be changed).
  2. The ability to adequately model various complex control flows that are common in representing interactions of complex systems. These include repeating subsequences, alternative execution paths, concurrent and order-independent execution, and so on.

Fortunately, the problem of specifying complex interactions was extensively studied in the telecommunications domain, where a standard evolved based on many years of practical experience in defining communications protocols [ITU04]. This standard was used as a basis for representing interactions in UML 2.0.

The key innovation was the introduction of an interaction as a separate named modeling unit. Such an interaction represents a sequence of inter-object communications of arbitrary complexity. It may even be parameterized to allow the specification of context-independent interaction patterns.

You can invoke these packaged interactions recursively from within higher-level interactions, in a manner analogous to macro invocations (Figure 4). As one might expect, you can nest these to an arbitrary degree. Furthermore, interactions can serve as operands in complex control constructs such as loops (for example, a given interaction may have to be repeated some number of times) and alternatives. UML 2.0 defines a number of convenient modeling constructs of this type, providing you with a very rich facility for modeling complex end-to-end behavior at any level of decomposition.

Figure 4. An example of a complex interaction model

Figure 4 illustrates an extended interaction model. In this case, the interaction ATMAccess first "invokes" another lower-level interaction called CheckPIN (the contents of this interaction are not shown in the diagram). Note that the latter interaction has a parameter (in this case, say, the number of times an invalid PIN can be entered before the transaction is cancelled). After that, the client sends an asynchronous message specifying what kind of interaction is required and -- based on the value specified -- either the DispenseCash interaction or the PayBill interaction is performed.

Interactions in UML 2.0 can be represented by sequence diagrams (as shown in the example above), as well as by other diagram types (including the collaboration-based form defined in UML 1). There is even a non-graphical tabular representation.
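
The macro-like nesting of named interactions can be sketched by expanding a nested description into a flat message trace. The tuple encoding, the `expand` functions, and the message names are all hypothetical; only the interaction names ATMAccess, CheckPIN, DispenseCash, and PayBill come from the example above, and their contents here are invented.

```python
def expand(fragment, interactions, env):
    """Expand one fragment into the list of messages it stands for."""
    kind = fragment[0]
    if kind == "msg":
        return [fragment[1]]
    if kind == "ref":  # reference to a separately named interaction
        return expand_all(interactions[fragment[1]], interactions, env)
    if kind == "loop":
        _, count, body = fragment
        return [m for _ in range(count) for m in expand_all(body, interactions, env)]
    if kind == "alt":  # pick one operand based on a value in env
        _, variable, branches = fragment
        return expand_all(branches[env[variable]], interactions, env)
    raise ValueError(f"unknown fragment kind {kind!r}")


def expand_all(fragments, interactions, env):
    trace = []
    for fragment in fragments:
        trace.extend(expand(fragment, interactions, env))
    return trace


interactions = {
    "CheckPIN": [("msg", "enterPIN"), ("msg", "pinOK")],
    "DispenseCash": [("msg", "dispense")],
    "PayBill": [("msg", "pay")],
    "ATMAccess": [
        ("ref", "CheckPIN"),
        ("msg", "selectService"),
        ("alt", "service", {
            "cash": [("ref", "DispenseCash")],
            "bill": [("ref", "PayBill")],
        }),
    ],
}

print(expand_all(interactions["ATMAccess"], interactions, {"service": "cash"}))
```

Because CheckPIN is defined once and referenced by name, a change to the PIN dialogue is made in one place, which addresses the first of the two missing capabilities listed earlier.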

State machines

The main new capability added to state machines in UML 2.0 is quite similar to the previous cases. The basic idea is that you can make a composite state fully modular, with explicit points of transition entry and transition exit. This, in turn, allows you to define the internal decomposition of that state separately by a discrete and reusable state machine specification. That is, the same specification can be reused in multiple places within the state machine or some other state machines. This simplifies the specification of shared behavior patterns in different contexts.

One other notable state machine innovation in UML 2.0 is a clarification of state machine inheritance between a class and its subclasses.
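
The reuse of one state machine specification inside several composite states can be sketched as follows. Everything here is an assumption made for illustration: the dictionary layout, the `RETRY` specification, and the state, event, and exit point names are invented, and real UML 2.0 submachine semantics are considerably richer.

```python
# One reusable specification with a named entry point and two exit points.
RETRY = {
    "entry": {"begin": "Trying"},
    "transitions": {
        ("Trying", "fail"): "Trying",
        ("Trying", "ok"): "done",
        ("Trying", "abort"): "gave_up",
    },
    "exits": {"done", "gave_up"},
}

# Two composite states reuse RETRY; each maps the exit points to its own targets.
OUTER = {
    "Login": {"done": "Upload", "gave_up": "Locked"},
    "Upload": {"done": "Finished", "gave_up": "Idle"},
}


def run_submachine(spec, entry_point, events):
    """Run the submachine until an exit point is reached; return it, or None."""
    state = spec["entry"][entry_point]
    for event in events:
        state = spec["transitions"][(state, event)]
        if state in spec["exits"]:
            return state
    return None


def run_outer(start, event_batches):
    """Feed one batch of events to each composite state visited in turn."""
    state = start
    for events in event_batches:
        if state not in OUTER:
            break  # reached a simple state with no internal decomposition
        exit_point = run_submachine(RETRY, "begin", events)
        if exit_point is None:
            break  # still inside the composite state
        state = OUTER[state][exit_point]
    return state


print(run_outer("Login", [["fail", "ok"], ["ok"]]))
```

The explicit entry and exit points are what make the reuse possible: the enclosing machine only needs to know those names, not the internal states of `RETRY`.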

Language specialization capabilities

Experience with UML 1 indicated that a very common way of applying UML was first to define a UML profile for a particular problem or domain, and then to use that profile instead of or in addition to general UML. In essence, profiles were a way of producing what are now commonly referred to as domain-specific languages (DSLs).

An alternative to using UML profiles is to define a new custom modeling language using the MOF standard and tools. The latter approach has the obvious advantage of providing a clean slate, enabling the definition of a language that is optimally suited to the problem at hand. At first glance, this may seem the preferred approach to DSL definition, but closer scrutiny reveals that there can be serious drawbacks to this approach.

As noted in the introduction, too much diversity leads to the kind of fragmentation problems that UML was designed to eliminate. In fact, this is one of the primary reasons why it was accepted so widely and so rapidly.

Fortunately, the profile mechanism provides a convenient solution here for many practical cases. This is because there is typically a lot of commonality even between diverse DSLs. For example, practically any object-oriented modeling language will need to define the concepts of classes, attributes, associations, interactions, and so on. UML, which is a general-purpose modeling language, provides just such a convenient and carefully-defined collection of useful concepts. This makes it a good starting point for a large number of possible DSLs.

But there is more than just conceptual reuse at play here, because a UML profile, by definition, has to be compatible with standard UML. In other words, a UML profile can only specialize the standard UML concepts by defining constraints on those concepts that give them a unique domain-specific interpretation. For example, a constraint may disallow multiple inheritance, or it may require that a class have a particular type of attribute. This means that:

  • Any tool that supports standard UML can be used for manipulating models based on that profile
  • Any knowledge of and experience with standard UML is directly applicable
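A minimal Python sketch can make the "constraints only" idea concrete. It assumes a deliberately tiny stand-in for a UML class; UmlClass, Profile, and the "EmbeddedDevices" profile are hypothetical names, and a real profile would state its constraints in a constraint language such as OCL rather than as Python functions. The two constraints are the ones used as examples above.

```python
from dataclasses import dataclass, field


@dataclass
class UmlClass:
    """A tiny stand-in for a UML class (not the real metamodel)."""
    name: str
    superclasses: list = field(default_factory=list)
    attributes: dict = field(default_factory=dict)  # attribute name -> type name


def no_multiple_inheritance(cls):
    """Constraint: a class may have at most one superclass."""
    return len(cls.superclasses) <= 1


def has_attribute_of_type(type_name):
    """Constraint factory: a class must own an attribute of the given type."""
    def check(cls):
        return type_name in cls.attributes.values()
    return check


@dataclass
class Profile:
    """A profile here only adds constraints; it never changes the model classes."""
    name: str
    constraints: dict  # constraint label -> predicate over UmlClass

    def violations(self, model):
        return [
            (cls.name, label)
            for cls in model
            for label, holds in self.constraints.items()
            if not holds(cls)
        ]


# An ordinary model: any code that understands UmlClass can still read it.
model = [
    UmlClass("Sensor", [], {"id": "DeviceId"}),
    UmlClass("Hybrid", ["Sensor", "Actuator"], {"id": "DeviceId"}),
    UmlClass("Logger"),
]

profile = Profile(
    "EmbeddedDevices",
    {
        "single inheritance": no_multiple_inheritance,
        "has DeviceId": has_attribute_of_type("DeviceId"),
    },
)
```

Because the profile narrows what counts as a valid model without adding new kinds of model elements, code written against the plain model keeps working.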

Therefore, many of the fragmentation problems stemming from diversity can be mitigated or even avoided altogether. This type of reasoning is what led the international standards body responsible for the SDL language [ITU02] -- a DSL widely used in telecommunications -- to redefine SDL as a UML profile [ITU00] [ITU03].

This is not to say that any DSL can and should be realized as a UML profile; there are indeed many cases where UML may lack the requisite foundational concepts that can be cast into corresponding DSL concepts. However, given the generality of UML, it may be more widely applicable than many people might think.

With these considerations in mind, the profiling mechanism in UML 2.0 has been rationalized and its capabilities extended. The conceptual connection between a stereotype and the UML concepts that it extends has also been clarified. In effect, a UML 2.0 stereotype is defined as if it were simply a subclass of an existing UML metaclass, with associated attributes (representing tags for tagged values), operations, and constraints. The mechanisms for writing such constraints using a language such as OCL have been fully specified.

In addition to constraining individual modeling concepts, a UML 2.0 profile can also explicitly hide UML concepts that make no sense or are unnecessary in a given DSL. This allows the definition of minimal DSL profiles.

Finally, the UML 2.0 profiling mechanism can also be used for viewing a complex UML model from multiple domain-specific perspectives -- something not generally possible with DSLs. That is, any profile can be selectively applied or de-applied without affecting the underlying UML model in any way. For example, a performance engineer might choose to apply a performance modeling interpretation over a model, attaching various performance-related measures to elements of the model. These can then be used by an automated performance analysis tool to determine the fundamental performance properties of a software design. At the same time -- and independently of the performance modeler -- a reliability engineer might overlay a reliability-specific view on the same model to determine its overall reliability characteristics, and so on.
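The overlay idea can be sketched as follows, assuming that stereotype applications are stored beside the model rather than inside it. All names here (Element, Stereotype, ProfileApplication, "TimedStep", "Critical", and the tag names) are hypothetical and do not come from any standard profile.

```python
import copy
from dataclasses import dataclass


@dataclass
class Element:
    name: str
    kind: str  # name of the UML metaclass, for example "Operation"


@dataclass
class Stereotype:
    name: str
    extends: str  # metaclass this stereotype may be attached to
    tags: tuple   # names of the tagged values it allows


class ProfileApplication:
    """Keeps stereotype applications next to the model, never inside it."""

    def __init__(self, profile_name, stereotypes):
        self.profile_name = profile_name
        self.stereotypes = {s.name: s for s in stereotypes}
        self._applied = {}  # element name -> (stereotype name, tag values)

    def apply(self, element, stereotype_name, **tag_values):
        stereotype = self.stereotypes[stereotype_name]
        if element.kind != stereotype.extends:
            raise TypeError(f"{stereotype_name} cannot extend a {element.kind}")
        unknown = set(tag_values) - set(stereotype.tags)
        if unknown:
            raise ValueError(f"unknown tags: {sorted(unknown)}")
        self._applied[element.name] = (stereotype_name, dict(tag_values))

    def tags_of(self, element):
        return self._applied.get(element.name, (None, {}))[1]


model = [Element("checkout", "Operation"), Element("Cart", "Class")]
snapshot = copy.deepcopy(model)

# Two engineers overlay independent views on the same model.
perf = ProfileApplication("Performance", [Stereotype("TimedStep", "Operation", ("max_latency_ms",))])
rel = ProfileApplication("Reliability", [Stereotype("Critical", "Operation", ("mtbf_hours",))])
perf.apply(model[0], "TimedStep", max_latency_ms=200)
rel.apply(model[0], "Critical", mtbf_hours=10000)
```

In this sketch, de-applying a view amounts to discarding its ProfileApplication object; the model itself still compares equal to the snapshot taken before either view was applied.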

General Consolidation

This item covers a number of areas, including the removal of overlapping concepts as well as numerous editorial modifications (such as adding clarifications to confusing descriptions, and standardizing terminology and specification formats).

Removing overlapping concepts and clarifying poorly defined ones was another important requirement for UML 2.0. The three major areas affected by this requirement were:

  • Actions and activities
  • Templates
  • Component-based design concepts

Actions were introduced in UML 1.5. The conceptual model of actions was intentionally made general enough to accommodate both data-flow and control-flow computing models. This resulted in a significant conceptual similarity to the activities model. UML 2.0 exploits this similarity to provide a common syntactic and semantic foundation for actions and activities. From your point of view, these remain formalisms at different levels of abstraction, since they typically model phenomena at different levels of granularity. However, the shared conceptual base results in an overall simplification and greater clarity.

In UML 1, templates were defined very generally: any UML concept could be made into a template. Unfortunately, this generality was an impediment to its application, since it allowed for potentially meaningless template types and template substitutions. The template mechanism in UML 2.0 was restricted to cases that were well understood: classifiers, operations, and packages. The first two were modeled after template mechanisms found in popular programming languages.
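As a loose programming-language analogy (not UML notation), Python's typing module offers a generic class and a generic function, which correspond roughly to a classifier template and an operation template. The Stack and first names are illustrative only, and the type parameters are checked by static type checkers rather than at run time.

```python
from typing import Generic, List, Sequence, TypeVar

T = TypeVar("T")  # the template parameter


class Stack(Generic[T]):
    """Analogous to a classifier template with one parameter."""

    def __init__(self) -> None:
        self._items: List[T] = []

    def push(self, item: T) -> None:
        self._items.append(item)

    def pop(self) -> T:
        return self._items.pop()


def first(items: Sequence[T]) -> T:
    """Analogous to an operation template with one parameter."""
    return items[0]


# Binding the parameter gives a concrete, meaningful type.
IntStack = Stack[int]
numbers = IntStack()
numbers.push(1)
numbers.push(2)
```

Restricting templates to a few well-understood kinds of element is what keeps each binding, such as Stack[int], meaningful.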

In the area of component-based design, UML 1 had a confusing abundance of concepts. You could use classes, components, or subsystems. These concepts had a lot in common but were subtly different in non-obvious ways. There was no clear delineation as to which to use in any given situation. Was a subsystem just a "big" component? If so, how big did a component have to be before it became a subsystem? Classes provided encapsulation and realized interfaces, but so did components and subsystems.

In UML 2.0, all of these concepts were aligned, so that components were simply defined as a special case of the more general concept of a structured class; similarly, subsystems were merely a special case of the component concept. The qualitative differences between these were clearly identified so that you can make decisions about when to use which concept on the basis of objective criteria.
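The alignment can be pictured as a plain specialization chain. The Python sketch below is an illustration only: the class bodies are hypothetical, and the features that distinguish a component or a subsystem are left out because they are not enumerated here.

```python
class StructuredClass:
    """A class with internal parts that realizes (provides) interfaces."""

    def __init__(self, name, parts=(), provided=()):
        self.name = name
        self.parts = list(parts)
        self.provided = set(provided)


class Component(StructuredClass):
    """A component is a special case of a structured class."""


class Subsystem(Component):
    """A subsystem is a special case of a component."""


def provided_interfaces(element):
    """Collect the interfaces of an element and, recursively, of its parts."""
    result = set(element.provided)
    for part in element.parts:
        result |= provided_interfaces(part)
    return result


# Illustrative model: a subsystem containing a component and a structured class.
billing = Subsystem(
    "Billing",
    parts=[
        Component("Invoicing", provided={"IInvoice"}),
        StructuredClass("Ledger", provided={"ILedger"}),
    ],
    provided={"IBilling"},
)
```

Because all three concepts share one base, a single function such as provided_interfaces handles every level of nesting without special cases.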

On the editorial side, the format of the specification was consolidated, with the semantics and notation specifications for each modeling concept combined for easier reference. Each metaclass specification was expanded with information that explicitly identifies semantic variation points, notational options, and its relationship to the UML 1 specifications. Finally, the terminology was made consistent so that a given term (for example, type, instance, specification, or occurrence) has the same general connotation in all contexts in which it appears.


UML 2.0 was designed to give you a gradual introduction to model-driven methods. Those who prefer to use it as a sketching tool (as described earlier in this article) can still use it in the same informal way as UML 1. Moreover, since the new modeling capabilities are non-intrusive, in most cases such users will not see any change in the look and feel of the language.

However, the opportunity to move forward on the MDD scale is now available in a standardized way. UML 2.0 provides the increased precision that this requires, and, if you wish, you can use its new capabilities all the way to completely automated code generation.

The language structure was carefully reorganized to allow a modular and graduated approach to adoption: you only need to learn the parts of the language that are of interest to you, and can safely ignore the rest. As your experience and knowledge increase, you can selectively add new capabilities. Along with this reorganization comes an immense simplification of the compliance definitions, which will facilitate interoperability between complementary tools as well as between tools from different vendors.

Only a small number of new features were added (to avoid language bloat), and practically all of those are designed along the same recursive principle that enables you to model very large and complex systems. In particular, extensions were added to more directly model software architectures, complex system interactions, and flow-based models, which makes the language ideal for applications such as business process modeling and systems engineering.

The language extension mechanisms were slightly restructured and simplified, providing a more direct way for you to define domain-specific languages based on UML. These languages have the distinct benefit of being able to take advantage of UML tools and expertise directly, both of which are abundantly available.

The overall result is a second-generation modeling language that will help you develop more sophisticated software systems faster and more reliably -- while allowing you to continue using the same type of intuition and expertise that is the bread and butter of every software developer. In essence, it is still program design, only at a higher level -- comparable to the step that occurred in hardware design, when discrete components gave way to large-scale integration.


  • [BPEL03] BEA, et al., Business Process Execution Language for Web Services (Version 1.1), 5 May 2003.
  • [Brooks95] Brooks Jr., F., The Mythical Man-Month (1995 edition), Addison-Wesley, 1995.
  • [Brown04] Brown, A., An Introduction to Model Driven Architecture, IBM Rational Developer Works, 2004.
  • [Fowler04] Fowler, M., UML Distilled (3rd edition), Addison-Wesley, 2004.
  • [GMW97] Garlan, D., Monroe, R., and Wile, D., Acme: an architecture description interchange language, in Proceedings of the 1997 Conference of the Centre for Advanced Studies on Collaborative Research, Association For Computing Machinery (ACM), 1997.
  • [Graham01] Graham, I., Object-Oriented Methods: Principles and Practice (3rd edition), Addison-Wesley, 2001.
  • [ITU00] International Telecommunications Union, ITU Recommendation Z.109: SDL Combined with UML, ITU-T, 2000.
  • [ITU02] International Telecommunications Union, ITU Recommendation Z.100: Specification and Description Language (SDL), (08/02), ITU-T, 2002.
  • [ITU04] International Telecommunications Union, ITU Recommendation Z.120: Message Sequence Chart (MSC), (04/04), ITU-T, 2004.
  • [ITU05] International Telecommunications Union, Study Group 17: Question 13/17 -- System Design Languages Framework and Unified Modelling Language, ITU-T Study Group 17, 2003.
  • [Lee92] Lee, L., The Day the Phones Stopped Ringing, Plume Publishing, 1992.
  • [Booch04] Booch, G., et al., An MDA Manifesto, in Frankel, D., and Parodi, J. (eds.), The MDA Journal, Meghan-Kiffer Press, 2004.
  • [OMG03a] Object Management Group, Unified Modeling Language (UML), Version 1.5, OMG document formal/03-03-01, 2003.
  • [OMG03b] Object Management Group, UML 2.0 Diagram Interchange, Final Adopted Specification, OMG document ptc/03-09-01, 2004.
  • [OMG04] Object Management Group, UML 2.0 Superstructure, Available Specification, OMG document ptc/04-10-02, 2004.
  • [RJB05] Rumbaugh, J., Jacobson, I., and Booch, G., The Unified Modeling Language Reference Manual (2nd edition), Addison-Wesley, 2005.
  • [Stevens02] Stevens, P., On the interpretation of binary associations in the Unified Modeling Language, Journal of Software and Systems Modeling, vol.1, no.1, Springer-Verlag, September 2002.
  • [Selic04] Selic, B., On the Semantic Foundations of Standard UML 2.0, in Bernardo, M., and Corradini, F. (eds.), Formal Methods for the Design of Real-Time Systems, Lecture Notes in Computer Science vol. 3185, Springer-Verlag, 2004.
  • [SR98] Selic, B., and Rumbaugh, J., Using UML for Modeling Complex Real-Time Systems, unpublished white paper, 4 April 1998.

