Representing nonfunctional aspects using TOGAF ArchiMate
The focus of this article is on the graphical and formal representation of the nonfunctional aspects of an IT system, starting from nonfunctional requirements (NFRs), as the representation is performed by the architect.
Before presenting some of the available notations and how the represented concepts can be used or reused in The Open Group Architecture Framework (TOGAF), we quote here an excerpt from Alfred North Whitehead's An Introduction to Mathematics on notations, because it explains why we devote so much attention to notations in this article:
By relieving the mind of all unnecessary work, a good notation sets it free to concentrate on more advanced problems, and in effect increases the mental power of the [human] race.
The flip side of the coin is the model: we use the notation to represent a model formally.
Models are central to the architect's work. Architects are mainly responsible for the integrity of the design of the system. Models enable them to represent and specify the system, its governance, and its implementation.
Models that support multiple views (function, performance, availability, management, for example) are particularly relevant for designing a complex system. Those multiple views need to be linked to each other and with the system model in a coherent and easy-to-manage fashion. The integrated model also needs to be as domain-independent as possible (although this article focuses on software-intensive systems).
Last but not least, the model needs to be expressed in a formal language, such as UML (or IBM's SDS or OMG's SysML). In a nutshell, modeling is driven by two needs (see Maier and Rechtin, Chapter 8, cited in Resources, for a more in-depth discussion of this point):
- The need to share and communicate with the client and the design and implementation teams
- The need to maintain the overall integrity of the design across views, layers, requirements, and so forth
In the rest of this article, we introduce two major modeling languages and notations, one largely used inside IBM and the other largely used in the systems engineering community. We will see how they enable the architect to represent and integrate nonfunctional requirements into the overall system design. They provide input and guidance to designing the TOGAF ArchiMate extensions required to support NFRs.
IBM's System Description Standard
The IBM System Description Standard, or SDS (formerly Architecture Description Standard, or ADS), is a set of conventions for notation, terminology, and semantics that IBM architects use to describe the architecture of an IT system (see the reference to A Standard for Architecture Description in Resources for a more detailed presentation).
Overview of the System Description Standard
Wherever possible, SDS has conformed to the OMG UML concepts and notations, and leverages IBM's experience in large IT projects. In practice, such projects include work groups concerned with two kinds of design and development:
- Application design and development, with a focus on the following purposes:
- Break down the complexity of the IT system so that developers can analyze and design components in relative isolation from each other
- Infrastructure design and development, with the focus on the following purposes:
- Analyze the functionality so that required technical components can be identified
- Assist in the analysis of service-level requirements so that the means of delivering them can be designed
- Provide a basis for the specification of the physical computer systems on which the IT system will run and mapping components onto these computer systems
SDS and the associated viewpoints reflect this separation of concerns by identifying two main viewpoints of the architecture:
- Functional viewpoint
- Operational viewpoint
The functional viewpoint's focus is on describing the function of the IT system, and it is primarily concerned with these factors:
- The structure and modularity of the software components (both application and technical)
- Interactions between components, including protocols
- Interfaces provided by components and their use
- Dynamic behavior, expressed as collaboration between components
The operational viewpoint's focus is on describing how the function of the IT system is deployed over the geographical structure of the system. It is primarily concerned with:
- Representing the system's topology (hardware platforms, locations, external systems, and such)
- Describing what runs where — where software and data are placed within this topology
- Providing the basis for the definition of the service management aspects of the IT system (capacity planning, software distribution, backup and recovery).
The following picture presents a view of a system represented using SDS.
Nonfunctional requirements management in SDS
In SDS, nonfunctional requirements are defined as quality requirements or constraints that an IT system must satisfy. They are the key influencer for placement of groups of deployment units (set of functional components merged into a single package) onto nodes (platforms on which software executes) and the collection of nodes into zones (areas with homogeneous nonfunctional requirements).
In SDS, NFRs can be loosely classified into two groups:
- Service level requirements (performance, availability, security)
- System qualities (ability to maintain)
NFRs can be attached to any component or node. In SDS, no graphic standard (notation) is defined for the representation of NFRs. NFRs are expected to be described in free-format text.
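The placement of deployment units onto nodes, and of nodes into zones, can be sketched as a small data structure. The following Python sketch is only an illustration and is not part of SDS: the class names, the nfrs_for_unit helper, and the sample values are assumptions, and attaching NFR text to zones and deployment units is a simplification of ours. NFRs are kept as free-format text, as SDS expects.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DeploymentUnit:
    """A set of functional components merged into a single package."""
    name: str
    components: List[str]
    nfrs: List[str] = field(default_factory=list)  # free-format text


@dataclass
class Node:
    """A platform on which software executes."""
    name: str
    units: List[DeploymentUnit] = field(default_factory=list)
    nfrs: List[str] = field(default_factory=list)


@dataclass
class Zone:
    """An area whose nodes share homogeneous nonfunctional requirements."""
    name: str
    nfrs: List[str]
    nodes: List[Node] = field(default_factory=list)


def nfrs_for_unit(zone: Zone, unit_name: str) -> List[str]:
    """Return the NFR texts that apply to a unit: zone, then node, then unit."""
    for node in zone.nodes:
        for unit in node.units:
            if unit.name == unit_name:
                return zone.nfrs + node.nfrs + unit.nfrs
    raise KeyError(f"{unit_name!r} is not placed in zone {zone.name!r}")


# Hypothetical sample data, for illustration only.
payments = DeploymentUnit("payments", ["InvoicePayments"],
                          ["Response time under 2 seconds"])
server = Node("central-server", [payments], ["Backed up nightly"])
zone = Zone("high-availability", ["No outage longer than 3 minutes"], [server])

print(nfrs_for_unit(zone, "payments"))
```

Because the requirements are plain text, the sketch can only collect them; it cannot check them against each other, which is one practical consequence of the missing notation.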
OMG's SysML specification
SysML supports the specification, analysis, design, verification, and validation of a broad range of complex systems. These systems can include hardware, software, information, processes, personnel, and facilities. The origins of the SysML initiative can be traced to a strategic decision by the International Council on Systems Engineering's (INCOSE) Model-Driven Systems Design workgroup in January 2001 to customize the Unified Modeling Language (UML) for systems engineering applications.
Similar to SDS, SysML reuses a subset of UML 2.1 and provides additional extensions needed to address the requirements for systems engineering applications.
Because SysML uses UML 2.1 as its foundation, systems engineers who model with SysML and software engineers who model with UML 2.1 are able to collaborate on models of software-intensive systems. This improves communication among the various stakeholders who participate in the systems development process and promotes interoperability among modeling tools. It is anticipated that SysML will be customized to model domain-specific applications, such as automotive, aerospace, communications, and information systems.
SysML provides facilities to capture so-called text-based requirements and then integrate them into the system model. Text-based requirements serve as a general-purpose tool that lets the architect model virtually any kind of requirement. In practice, functional requirements are modeled in SysML by using use cases, as in UML, whereas nonfunctional requirements are modeled by using a text-based notation.
Integrating text-based requirements into the model lets the architect precisely track their impact on the system components. In SysML, similar requirements are usually collected together into a specification. Specifications are typically organized as trees inside of packages. Each requirement or specification can be linked to both other requirements and specifications or to model elements that, in turn, can be components, test cases, and so forth. This is the same approach that we take in the proposed extension to ArchiMate.
An example of a nonfunctional requirements representation
In SysML, we have two different ways to represent text-based requirements:
- Requirement diagram
- Requirement tables (used to represent requirements) and matrices (used to represent the relationships between requirements and other objects).
Given our focus on graphical representations, hereafter, we focus on the diagram representation only. However, the table and matrix form can be very useful, too, especially with a very large set of requirements. Both are also much easier to implement as interfaces to data stored in relational databases.
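To make the remark about relational storage concrete, here is a minimal sketch of the table and matrix forms built on an in-memory SQLite database. It is not taken from the SysML specification: the schema, the requirement identifiers, and the element names are hypothetical and chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE requirement (
        id   TEXT PRIMARY KEY,
        text TEXT NOT NULL
    );
    CREATE TABLE relationship (
        requirement_id TEXT NOT NULL REFERENCES requirement(id),
        kind           TEXT NOT NULL,   -- for example: satisfy, verify, refine
        element        TEXT NOT NULL    -- name of the related model element
    );
    """
)

# Hypothetical rows, for illustration only.
conn.executemany(
    "INSERT INTO requirement VALUES (?, ?)",
    [("R1", "Response time under 2 seconds"),
     ("R2", "No outage longer than 3 minutes")],
)
conn.executemany(
    "INSERT INTO relationship VALUES (?, ?, ?)",
    [("R1", "verify", "PerformanceTest"),
     ("R1", "satisfy", "WebServicesCluster"),
     ("R2", "satisfy", "DatabaseCluster")],
)


def requirement_table(db):
    """Table view: one row per requirement."""
    return db.execute("SELECT id, text FROM requirement ORDER BY id").fetchall()


def relationship_matrix(db):
    """Matrix view: requirement id -> {model element: relationship kind}."""
    matrix = {req_id: {} for req_id, _ in requirement_table(db)}
    rows = db.execute("SELECT requirement_id, kind, element FROM relationship")
    for req_id, kind, element in rows:
        matrix[req_id][element] = kind
    return matrix


for req_id, row in relationship_matrix(conn).items():
    print(req_id, row)
```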
Rather than presenting the Requirement diagram in detail, we present a simple example that we hope conveys the nature of requirements management in SysML. The example is taken from the OMG's non-normative example (see Resources) that describes the design of a hybrid SUV (sports utility vehicle). We summarize the main points, closely following the description in the cited OMG document (with minor adaptation).
The vehicle system specification contains several NFRs that are managed as text-based requirements. They are components of a package called HSUV Requirements. Figure 3 shows the decomposition of requirements. A few requirements are highlighted, including the requirement for the vehicle to pass emissions standards, which is expanded for illustration purposes. The containment (crosshair) relationship refers to the practice of decomposing a complex requirement into simpler, single requirements.
Figure 4 shows the derivation of requirements and the rationale. A set of requirements is derived from the lowest-tier requirements and displayed in the figure. Other model elements might be necessary to help develop a derived requirement, and these model elements can be related by a «refinedBy» relationship. For example, notice how PowerSourceManagement is refined by the HSUVOperationalStates model. Also notice that rationale can be attached to the
Lastly, we see how SysML lets us relate the requirements to other model elements. The next figure models the Acceleration requirement. The refine relationship, introduced in the previous figure, shows how the Acceleration requirement is refined by a similarly named use case. The Power requirement is satisfied by the PowerSubsystem (a component of the system architecture), and a Max Acceleration test case verifies the Acceleration requirement.
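The three relationships just described can be held as simple links and queried for gaps. The sketch below is an assumption-laden illustration, not SysML tooling: the tuple layout and the trace and missing helpers are ours, and the use case label is a placeholder because the text only says that it is similarly named.

```python
from collections import defaultdict

# (relationship, requirement, related model element)
links = [
    ("refine", "Acceleration", "Acceleration use case"),  # label is assumed
    ("satisfy", "Power", "PowerSubsystem"),
    ("verify", "Acceleration", "Max Acceleration"),
]


def trace(links):
    """Index the links as requirement -> relationship -> [model elements]."""
    index = defaultdict(lambda: defaultdict(list))
    for kind, requirement, element in links:
        index[requirement][kind].append(element)
    return index


def missing(links, requirements, kind):
    """List the requirements that have no link of the given kind."""
    index = trace(links)
    return [r for r in requirements if not index[r][kind]]


requirements = ["Acceleration", "Power"]
print(missing(links, requirements, "verify"))   # requirements without a test case
print(missing(links, requirements, "satisfy"))  # requirements without a component
```

A query of this kind is what makes the integration worthwhile: a requirement with no verifying test case or no satisfying component is visible immediately.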
Now, let us see how these concepts can be introduced into an EA framework like the one provided by ArchiMate.
In the following, we will not focus on the functional content of (the work performed by) the system, but rather on how the system is structured to perform this work. We can, in fact, think of a parallel architecture in the model, almost completely decoupled from the functional component architecture and almost entirely driven by the NFRs affecting the system.
The TOGAF ArchiMate approach to nonfunctional requirements (NFRs)
ArchiMate is the modeling language sponsored by the Open Group as a companion to the well-known TOGAF enterprise architecture framework. ArchiMate elements are categorized as structural and behavioral (dynamic), and are organized in three layers: business, application, and technology. Each layer uses the services exposed by the underlying one. Viewpoints specific to the needs of one of the EA stakeholders can be defined either within a layer or across layers.
ArchiMate 1.0 Specifications do not mention the word nonfunctional at all.
In ArchiMate 2.0 Specifications, the only mention of nonfunctional aspects is in this statement:
"For the external users, only this exposed functionality and value (of a service) together with nonfunctional aspects, such as the quality of service, costs, etc., are relevant. These can be specified in a contract or Service Level Agreement (SLA)."
The method also suggests that additional viewpoints (performance, for example) can be added to the model by using "profiles" composed of attributes of the model elements.
This would result in a data structure which is, in fact, "…separate from the ArchiMate language, but can be dynamically coupled with concepts or relationships…" (ArchiMate v2)
However, we think that this approach is severely reductive when compared to the importance of the nonfunctional aspects for the IT architecture of the enterprise. In particular, it lacks essential graphic modeling capabilities and standardization.
We propose, instead, using the Model Extension mechanism of ArchiMate 2.0 to add a Nonfunctional Aspects Metamodel to the base ArchiMate specifications.
Proposed extensions to the TOGAF model
The Nonfunctional Aspects Metamodel (hereafter, we use the more practical term: operational) is composed of the following model elements (in white in the diagram):
- Nonfunctional requirement
- Stress case
- Architectural pattern
- Technical component
- Technical interaction
Most of these concepts and their definitions are borrowed from SDS R3. In addition, while requirement, zone, system, component, and interaction are extensions of elements already used in the ArchiMate base model, stress case and architectural pattern are mostly new concepts.
In our model, NFRs originate from the needs of a business role. All NFRs are originally business requirements, with an identifiable business reason behind them (see The operational context diagram, cited in Resources). For instance, the system that supports bank tellers shall not stop for more than three minutes; otherwise, the queue of customers can grow to a level that makes the customers' "banking experience" unsatisfactory, regardless of the time actually spent in their specific transactions.
As a further example, a security requirement like the one to encrypt critical data may equally derive from the enforcement of a legal or regulatory rule by the company's lawyer, from the desire of the CFO to avoid financial losses, or from the need to protect vital company information enforced by the security officer.
Business NFRs can give rise to application-specific or technology-specific NFRs. For instance, a maintainability requirement, or one to make an application reasonably easy to maintain, might ultimately derive from the desire to keep the application maintenance budget low or from the desire to be vendor-independent in the assignment of this task.
NFRs lead to the identification of zones with a homogeneous set of requirements. The zone concept is implemented as an extension of the ArchiMate grouping relationship.
For instance, the no-interruption requirement referred to earlier ("the system shall not stop for more than three minutes") will define a zone where all model elements share the need to be continuously available. Notice that the zone concept applies to all three layers of the ArchiMate model: there are business functions that share the same set of requirements (fall within the same zone), as well as application functions and infrastructure services.
Well-known examples of zones are security zones: Internet zone, DMZ, protected LAN, internal LAN. However, in a complex system, many types of zones might exist: critical business continuity zone (RTO close to zero), critical data zone (RPO close to zero), sensitive data zone (all data encrypted with no access from operation personnel to non-encrypted images), and so on.
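When several types of zones exist, an element that sits in more than one of them has to honor the strictest target of each kind. The following sketch shows one possible way to compute that. It is our assumption, not part of the proposed metamodel: the Zone fields, the effective_requirements helper, and the 180-second figure (borrowed from the three-minute example) are illustrative only.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Zone:
    name: str
    max_rto_s: Optional[float] = None  # recovery time objective, in seconds
    max_rpo_s: Optional[float] = None  # recovery point objective, in seconds
    encrypted: bool = False            # all data must be encrypted


def _strictest(values):
    """Smallest target among those that are set, or None if none is set."""
    present = [v for v in values if v is not None]
    return min(present) if present else None


def effective_requirements(zones: Iterable[Zone]) -> Zone:
    """Combine the zones an element belongs to into their strictest targets."""
    zones = list(zones)
    return Zone(
        name=" + ".join(z.name for z in zones),
        max_rto_s=_strictest(z.max_rto_s for z in zones),
        max_rpo_s=_strictest(z.max_rpo_s for z in zones),
        encrypted=any(z.encrypted for z in zones),
    )


# Hypothetical zones, for illustration only.
continuity = Zone("critical business continuity", max_rto_s=180)
critical_data = Zone("critical data", max_rpo_s=0)
sensitive = Zone("sensitive data", encrypted=True)

combined = effective_requirements([continuity, critical_data, sensitive])
print(combined)
```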
As introduced in previous articles, the IT environment in production is "stressed" by combinations of mutually reinforcing stress factors. During real production operations, NFRs that occur at the same time might combine to form stress cases (stress conditions).
A stress case is a representation of the real load on the IT environment (or on a part of it), much as static loads (weight of furniture, people, snow) and dynamic loads (winds, for example) are the loads that require a steel-and-concrete building (see An Introduction to the IBM Views and Viewpoints Framework for IT Systems, cited in Resources).
Although this concept is more readily applicable to the combination of application and technology, it also applies to the application layer, because it might be necessary to introduce modifications in the application architecture (for example, replication of components in different locations to move processes close to data) to handle specific conditions (see The operational context diagram, cited in Resources).
An example of a stress case could be: the system response time at the teller interface must be under 2 seconds for all of the most common operations, with 20,000 tellers concurrently active.
Notice that, during development, this stress case must be tested by combining two different test cases (a scalability test and a performance test). On the other hand, each stress case can be associated with a specific test environment, that is, with a unique set of system configuration and data.
A stress case is implemented as an extension of the ArchiMate Business Contract element.
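The teller example shows why a stress case is more than the sum of its NFRs: the response-time target only counts if it holds under the required load. A minimal sketch of that idea follows. The StressCase class, its field names, and the measurements are hypothetical and serve only as an illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StressCase:
    name: str
    max_response_time_s: float
    concurrent_users: int

    def is_met_by(self, measured_users: int,
                  measured_response_time_s: float) -> bool:
        """True only if the response-time target held at the required load."""
        return (measured_users >= self.concurrent_users
                and measured_response_time_s < self.max_response_time_s)


teller_peak = StressCase("teller peak load",
                         max_response_time_s=2.0,
                         concurrent_users=20_000)

# Hypothetical measurements: (concurrent users, response time in seconds).
print(teller_peak.is_met_by(20_000, 1.6))  # True
print(teller_peak.is_met_by(5_000, 0.9))   # False: fast, but not under load
print(teller_peak.is_met_by(20_000, 2.4))  # False: loaded, but too slow
```

The second measurement would pass a stand-alone performance test and the third a stand-alone scalability test, yet neither satisfies the stress case.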
To simplify the high-level mapping of zones and stress cases to the IT environment, we will use the popular concept of system.
We model the system as an extension of the ArchiMate Application Collaboration element (for lack of a similar element in the technology layer), but it has technology as well as application significance. A system can expose application interfaces and infrastructure interfaces.
Stress conditions on a system will ultimately affect all or part of its elements. For example: The Teller Support System must achieve a 2-second response time with the expected peak number of concurrent users.
For a system, the mapping to zones might be partial, in the sense that only specific parts of a multi-purpose system (sub-systems) can be affected by a certain requirement and need to be zoned. For example, the teller system can be made of two sub-systems: the teller's workstation and the central server. It may be that, in order to ensure process continuity, only the server needs to be in the high-availability zone, since the teller can continue the process by simply moving to the next counter if her original workstation stops working.
Technically, a system is structured into architectural patterns that organize and control technical components. The architectural pattern is usually the first attribute you mention when describing the system or the sub-system: "It is a web application", or "It is a RAC cluster", or "It is a portal"...
An architectural pattern is an extension of the Application Collaboration element of ArchiMate to the application and to the technology layers. It can be the realization of application services (an application architectural pattern) or technology services (a technology architectural pattern).
An architectural pattern coordinates technical components in the execution of a specific behavior that can be described with an ArchiMate interaction.
Examples of architectural patterns are server-side, stateless services, database clusters, horizontally scaling servers, queue-based systems, REST systems, and so on.
A technical interaction is an extension of the Application Interaction element of ArchiMate to the application and to the technology layers. It represents the dynamic behavior of an architectural pattern.
A technical component is modeled as an extension of the ArchiMate component element to the application and to the technology layers.
Application technical components might, or might not, be implemented by functional application components, because it might be worthwhile to keep the operational view parallel and separated from the functional view.
On the other hand, a technology technical component might be a composition of several infrastructure elements. As an example, the "node of a DB cluster" technical component is a complex combination of system software (the relational database management system, or RDBMS), infrastructure nodes (the servers), devices (storage devices), the network (the communication path with other nodes of the same cluster), and exposed interfaces.
In a system, a technical component can be used by one or several architectural patterns.
Of course, the ultimate operational aspects lie in the characteristics of the model's technology elements: CPU speed, memory size, available storage, transmission rates, and so forth.
In fact, these represent the capabilities of the technical components in our model. In this sense, each technical component can be associated with an ArchiMate 2.0 profile of its nonfunctional characteristics.
Mapping to business domain elements
If you have ever "talked business" with a businessperson, you surely ended up talking about things such as market conditions, competition, marketing strategy, cash flow, and debt.
None of these is captured in the business layer of an enterprise architecture. The EA focus is really the way the business operates, meaning its processes. Therefore, an EA business model is truly a business operational model. As such, it is primarily a consequence of the nonfunctional requirements of the business.
Functional and nonfunctional requirements are the expression of the business needs that one or more actors have while performing particular roles. As an example, a bank employee will have very different requirements when acting as a teller with customers lined up, waiting for service, than when performing back-office work, such as checking someone's credit for a loan application.
As a consequence, business functions and the processes that implement them might fall into specific NFR zones. For instance, all operations performed at the bank's counter share the critical need for "continuity of the service" that is required for the Teller role.
Mapping to application domain elements
The best way to visualize an application system in the application domain is to think of its System Context diagram. A system is an object that is capable of delivering a specific set of application functions to a well-identified set of business actors.
Application functions will fall in the same NFR zone as their corresponding business functions. Therefore, a system might span multiple NFR zones.
The operational elements of a system, architectural patterns and technical components, are the privileged clients of lower-layer (technology) services. In other words, the application operational view is the best way to identify and track relationships with the underlying hardware.
Mapping to technology domain elements
The concepts of zone and system are familiar in the infrastructure realm, because they often relate to physical entities and physical layouts. In the technology domain, the new model elements of architectural pattern and technical component are added to stress the conceptual view of the infrastructure as support elements of specific business requirements.
For instance, consider a database (DB) cluster. We might have pure active-passive DB clusters, based on shared storage, which will guarantee that no data will be lost (recovery point objective, or RPO, equal to 0) but will take several minutes before being able to resume normal operations (recovery time objective, or RTO, different from 0). On the other hand, an active-standby cluster, where the secondary DB is kept up-to-date with an asynchronous mechanism, will bring this RTO close to 0, but some updates might be temporarily unavailable during the operation of the "promoted" secondary (hence, an RPO different from 0).
The choice between those two (or more) DB technologies or configurations depends on what was deemed critical for the business that the DB has to support. The introduction of the DB cluster architectural pattern highlights this dependency by making the reason for the choice explicit, whereas the introduction of technical components as cluster nodes highlights the fact that each node is a complex combination of several hardware and software elements and their interaction patterns.
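The dependency between the business targets and the choice of DB cluster pattern can be sketched as a simple filter. The numbers below are illustrative assumptions, not measurements: they only encode the qualitative statements above (RPO of 0 with an RTO of several minutes for the shared-storage cluster, RTO close to 0 with a nonzero RPO for the asynchronous standby). The ClusterPattern class and the candidates helper are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterPattern:
    name: str
    rpo_s: float  # worst-case data loss window, in seconds
    rto_s: float  # worst-case time to resume operations, in seconds


# Illustrative figures only; real values depend on product and configuration.
PATTERNS = [
    ClusterPattern("active-passive, shared storage", rpo_s=0, rto_s=300),
    ClusterPattern("active-standby, asynchronous replication", rpo_s=30, rto_s=10),
]


def candidates(max_rpo_s, max_rto_s, patterns=PATTERNS):
    """Return the names of the patterns that meet both business targets."""
    return [p.name for p in patterns
            if p.rpo_s <= max_rpo_s and p.rto_s <= max_rto_s]


print(candidates(max_rpo_s=0, max_rto_s=600))  # no data loss, slow restart accepted
print(candidates(max_rpo_s=60, max_rto_s=60))  # fast restart, some loss accepted
print(candidates(max_rpo_s=0, max_rto_s=60))   # both strict: neither pattern fits
```

The last call returns an empty list, which is the situation where the architect has to look for a different pattern or renegotiate the requirement.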
Given that architectural patterns and technical components are mainly conceptual elements of the technology domain, they are very natural providers of (infrastructure) services to the upper (application) layer. For the same reason, architectural patterns are bound to follow zone boundaries (a zone being another conceptual element); thus, all of a pattern's elements shall stay in the same zone.
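The rule that a pattern must not cross a zone boundary lends itself to an automated model check. The sketch below is one possible form of such a check, under our own assumptions: the zone_violations helper and the dictionary layout are hypothetical, and the sample placement deliberately contains one misplaced component.

```python
def zone_violations(pattern_components, component_zone):
    """Return {pattern: sorted zones} for each pattern spanning several zones."""
    violations = {}
    for pattern, components in pattern_components.items():
        zones = {component_zone[c] for c in components}
        if len(zones) > 1:
            violations[pattern] = sorted(zones)
    return violations


# Hypothetical model content, for illustration only.
pattern_components = {
    "web services cluster": ["session EJB", "web services provider"],
    "DB cluster": ["active node", "standby node"],
}
component_zone = {
    "session EJB": "Critical Continuity of Operations",
    "web services provider": "DMZ",  # deliberately misplaced
    "active node": "Critical Continuity of Operations",
    "standby node": "Critical Continuity of Operations",
}

print(zone_violations(pattern_components, component_zone))
```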
Example of using the notation
To clarify the use of the notation, we will work out a small example of the role of operational aspects in a retail banking enterprise architecture.
At the business level, we will consider the collaboration between a Customer and a Teller that occurs for a Cash Payment and that is triggered by a customer showing up at the counter with the money necessary for the operation.
All over-the-counter interactions between Teller and Customer are subject to the dynamics of queues, because there are often more Customers willing to be served than Tellers ready to serve them.
Because even small outages of the service may result in abnormal queue length and customer dissatisfaction, a requirement exists for the service not to stop for more than 3 minutes during the workday. The requirement affects all functions that are performed at the counter, including the one of Payment of Invoices and the underlying process that implements it (Payments Management).
We will represent this commonality of requisites by drawing a Critical Continuity of Operations zone and mapping these functions and processes to it.
As usual for the ArchiMate model, the business function will use some application services for its execution. We will focus on the Payments service to explore the operational aspects of the application layer.
The Payments application service is implemented by, among others, the Invoice Payments function. This, in turn, is one of the functions of the Teller Platform system that is normally used by the tellers of the bank.
Because the system is structured in client/server subsystems, Payments and the Teller Platform both fall into the Critical Operation Continuity zone. The client subsystem, which is responsible for the application presentation and the navigation, is technically an Eclipse stateful desktop. In the example, the server subsystem is built as a cluster of stateless web services.
This second architectural pattern is realized by using two technical (application) components:
- A session Enterprise JavaBean (EJB) with the function to route the request to the appropriate functional component
- A web services provider that maps the EJB interface into the WSDL of the web services.
They are coordinated in a way that ensures load balancing among incoming requests. In addition, each operation produces an entry in the Accounts database through a specialized Accounting component.
Now, while the function Invoice Payments shares only the 3-minute maximum unavailability business requirement of its business-layer counterpart, the Teller Platform server, which all 50,000 Tellers at the bank branches use concurrently, must ensure continuity of service for all of them. That is, the server must be able to survive the stress condition of 50,000 concurrent service recovery operations in a 3-minute time interval.
From the model described here, it is very clear that these are the main questions to be answered by the IT architect:
- Is the Eclipse client able to restart in 3 minutes?
- Is this server design capable of ensuring service restart in 3 minutes with 50,000 concurrent users?
While a pure functional model would have hidden these problems in complex relationships of functional details, the operational model is very effective in pointing out what the real issues are.
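A quick calculation shows the size of the server-side question. The sketch below assumes, as a simplification of ours, that the whole 3-minute window is available for session recovery and that recoveries are spread evenly over it. The per-server capacity of 40 recoveries per second is a hypothetical figure used only to show how the sizing would proceed.

```python
import math


def required_recovery_rate(users, window_s):
    """Average number of session recoveries per second needed in the window."""
    return users / window_s


def servers_needed(users, window_s, recoveries_per_server_per_s):
    """Servers required if each one sustains the given recovery rate."""
    rate = required_recovery_rate(users, window_s)
    return math.ceil(rate / recoveries_per_server_per_s)


rate = required_recovery_rate(50_000, 3 * 60)
print(round(rate, 1))                     # 277.8 recoveries per second
print(servers_needed(50_000, 3 * 60, 40))  # 7 servers at the assumed capacity
```

In practice, failure detection and restart consume part of the window, so the real rate to sustain is higher than this lower bound.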
To continue the model at the technology layer, the Accounts database uses the services of a database cluster, and the web services cluster uses the services of a cell of IBM® WebSphere® Application Servers.
Although the WebSphere cell and database cluster services are conceptually well-separated, they can be supported by the same logical infrastructure system — in this case, the Teller Platform, which falls entirely in the Critical Continuity of Operations zone.
Of course, each service is provided by a different subsystem with its own architectural pattern. The DB cluster is provided as an active/standby database that implements near-synchronous transaction replication to a secondary "hot" database (as with High Availability & Disaster Recovery, or HADR, in IBM DB2 for multiplatform).
Notice here that a simple active/passive architecture with a "cold" restart after failure might not meet the 3-minute target, but an active/standby architecture with full asynchronous data replication might be equally inadequate in the event of too many queued updates (because of the 50,000 concurrent users) when database switching needs to occur.
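The effect of queued updates on the switchover can be estimated with simple arithmetic: the standby cannot take over until it has applied its backlog. The following sketch uses a deliberately simple additive model of our own (detection, then backlog drain, then activation). Every number in it is a hypothetical input, not a measured DB2 or HADR figure.

```python
def switchover_time_s(detection_s, backlog_updates, apply_rate_per_s, activation_s):
    """Estimated outage: detect the failure, drain the backlog, activate standby."""
    return detection_s + backlog_updates / apply_rate_per_s + activation_s


def meets_target(outage_s, target_s=180):
    """Compare the estimated outage with the 3-minute requirement."""
    return outage_s <= target_s


# Hypothetical inputs: 20 s detection, 500 updates/s apply rate, 30 s activation.
small_backlog = switchover_time_s(20, 60_000, 500, 30)
large_backlog = switchover_time_s(20, 120_000, 500, 30)

print(small_backlog, meets_target(small_backlog))  # 170.0 True
print(large_backlog, meets_target(large_backlog))  # 290.0 False
```

Under these assumptions, doubling the backlog is enough to miss the 3-minute target, which is why the replication mode has to be chosen with the stress case in mind.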
The cluster technical components (active node and standby node) are, in fact, a combination of hardware (the node) and middleware (the RDBMS), using devices such as the disk storage and the communication path to the storage in the disaster recovery site.
For the WebSphere cell, the choice of a redundant architecture is justified by both the number of concurrent users and by the need to have a backup processing environment in case the primary one fails.
Extensions implemented in the IBM Rational System Architect plug-in for ArchiMate
These extensions to the ArchiMate model have been implemented on top of the Rational System Architect CORSO ArchiMate 1.0 plug-in, which is available from IBM. The extensions are contained in a Rational System Architect USRPROPS file, and they add four new types of diagrams and related resources:
- Operational Aspects diagram: This is the main tool to draw the new operational view of the enterprise architecture. In addition to the NFR elements, it includes all of the ArchiMate elements shown in Figure 6.
- Business Operational diagram: Its objective is to detail relevant nonfunctional aspects of the enterprise business model. For this purpose, it includes most of the business layer elements plus the nonfunctional aspects and zone symbols.
- Application Operational diagram: Its objective is to detail relevant nonfunctional aspects of the enterprise application model. For this purpose, it includes most of the application layer elements and all operational symbols.
- Technology Operational diagram: The same considerations for the application layer apply here.
We have shown a proposal to complement the TOGAF ArchiMate enterprise architecture model with an Operational Aspects view, which aims at describing not WHAT the system does but HOW it does it and WHY, that is, which requirements have originated which architectural decisions.
We have implemented this view as an extension to the Rational System Architect ArchiMate plug-in, which introduces a new category of nonfunctional diagrams.
The authors thank Philippe Spaas, Executive IT Architect at IBM and the author of the SDS architecture description standard, for his review and for the many ideas and suggestions that he gave to us in the process, as well as Pete J. Cripps, Senior IT Architect at IBM and an appreciated blogger on the subject of architecture, for his review.
- A selected bibliography on the topic of how NFRs can be integrated into the architectural description of complex, software-intensive systems:
- ISO/IEC JTC 1/SC7. Systems and software engineering: Architecture description. Draft, ISO and IEEE, 2010.
- Object Management Group. OMG Systems Modeling Language (OMG SysML). Version 1.2, 2010.
- Object Management Group. OMG SysML Hybrid SUV Non-Normative Example. PDF, 2010.
- Mark W. Maier and Eberhardt Rechtin. The Art of Systems Architecting, Third Edition. CRC Press, 2009.
- TOGAF Version 9, The Open Group, 2009.
- Sanford Friedenthal, Alan Moore, Rick Steiner. A Practical Guide to SysML. Morgan Kaufmann/OMG Press, 2008.
- Denise Cook, Peter J. Cripps, Philippe Spaas. An Introduction to the IBM Views and Viewpoints Framework for IT Systems. IBM developerWorks, 2007.
- Sanford Friedenthal, Alan Moore, Rick Steiner. OMG Systems Modeling Language (OMG SysML) Tutorial. INCOSE 2007 conference, Final 2009.
- R. Youngs, D. Redmond-Pyle, P. Spaas, E. Kahan. A Standard for Architecture Description. IBM Systems Journal, Vol. 38, No. 1, 1999.
- The Open Group. ArchiMate 2.0 Specification.
- Vito Losacco and Fabio Castiglioni. The operational context diagram. IBM developerWorks, 2009.
- Nick Rozanski and Eóin Woods. Software Systems Architecture: Working with Stakeholders Using Viewpoints and Perspectives (2nd Edition). Addison-Wesley Professional, 2011.