Bruce's Top Ten Modeling Hints: #10 - Forget "7+/- 2". Every diagram should have a singular mission.
Back in the '80s (yes, I am that old), people used to model software and systems with data flow diagrams. Among the problems with the commonly used approaches to modeling was the notion of "7+/-2" diagram elements per diagram. To comply with that goal, deeply nested decomposition hierarchies were created, sometimes dozens of levels deep, all to ensure that each diagram had only a tiny number of elements on it. The resulting models were virtually impossible to navigate and practically impossible to actually use. Why did people even try?
The reason is that someone, whose identity is lost, decided to apply the results of neurolinguistic research - notably the 1956 paper "The Magical Number Seven, Plus or Minus Two: Some Limits on Our Capacity for Processing Information" (by George A. Miller, published in Psychological Review) - to visual models. The basic outcome of the actual paper is a discussion about the correlation between the limits of short-term memory and our capacity to reason about things. Specifically, if a person is presented with a set of stimuli that varies along one dimension (e.g. tones varying in pitch) and asked to give a value-specific response to each, performance drops abruptly at around 7 distinct stimuli. This correlates, according to Miller, with memory span - the longest list of items that a person can recall correctly 50% or more of the time. Interestingly, Miller recognized that the correlation is actually coincidental, a point ignored by many subsequent readers of the paper.
So there is nothing "magic" about the magic number of 7 +/- 2, and even if there were, Miller's research doesn't apply to visual modeling. In the case of visual modeling, memory span and one-dimensional absolute judgement are irrelevant because the subject elements are right in front of you all the time and don't have to be recalled in order to be used in reasoning. Further, adherence to this rule in visual modeling leads to arbitrarily decomposing collections of directly related elements into distributed aggregated decomposition hierarchies. This means you are converting one-dimensional absolute judgements into multi-dimensional absolute judgements - that is, you are increasing the conceptual distance between directly related elements by introducing unnecessary and arbitrary separation. This makes comprehension of the relations among those elements far more difficult than it would be if the direct relationships were maintained.
So what's the alternative? The Harmony(r) process uses the notion of a "diagram mission statement" - a singular concept or purpose visualized by the diagram. The primary reason that many models are so difficult to navigate and understand is that either the diagrams have no coherent intent or that they have too many. If we create a set of class diagrams each with a specific mission - such as show the collaboration realizing a use case, show a generalization taxonomy, show the contents of a package, show an architectural viewpoint, etc - each diagram becomes a clear statement of that mission. This means that class diagrams are built up around interesting aspects of the model so that the stakeholders can address specific questions. If you have a new question, you can build up a viewpoint (i.e. diagram) around that question. As an aside, I usually explicitly state the mission of the diagram in a comment in a corner of the diagram. For example, such a comment might read "The mission of this diagram is to show the collaboration of high-level elements realizing the 'Track Tactical Objects' Use Case." And the diagram then shows all of the elements that contribute to that mission, even if they number 30 or more.
Another consequence is that the same class will likely show up on many different diagrams. That's OK because, as long as you are using a modeling tool (as opposed to a drawing tool), the model repository keeps all the views in sync. A modeling tool, such as Rational Rhapsody(tm), manages a semantic repository of information that dynamically links the diagrams and their elements to the underlying semantic basis of the model. If the repository is changed - such as happens when an element on a diagram is modified - then all the relevant diagrams are likewise changed. This is because the tooling ensures the elements on the diagrams are dynamically linked to that repository. Similarly, in high fidelity modeling, the source code is simply another view of the semantic repository and is likewise linked dynamically to it. But that's another hint, for another day.
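To make the repository idea concrete, here is a minimal Python sketch. It is not how Rhapsody or any real modeling tool is implemented; the ModelElement and Diagram classes and the element names are hypothetical, chosen only to show two diagrams referencing one shared element rather than each holding a copy.

```python
from dataclasses import dataclass, field


@dataclass
class ModelElement:
    """One semantic element (for example a class), stored once in the repository."""
    name: str


@dataclass
class Diagram:
    """A view with a stated mission; it references elements, it does not copy them."""
    mission: str
    elements: list[ModelElement] = field(default_factory=list)

    def labels(self) -> list[str]:
        # Labels are read from the shared elements every time the view is drawn.
        return [element.name for element in self.elements]


# Hypothetical element and diagrams, for illustration only.
tracker = ModelElement("TrackManager")
use_case_view = Diagram("Show the collaboration realizing 'Track Tactical Objects'", [tracker])
package_view = Diagram("Show the contents of the Tracking package", [tracker])

# Renaming the element once in the repository is visible in every diagram.
tracker.name = "TacticalTrackManager"
print(use_case_view.labels())
print(package_view.labels())
```

A drawing tool behaves as if each diagram held its own copy of the name, which is why the views drift apart there.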
One complaint I hear about models is that they're abstract and therefore not useful. My initial response is "Duh!" to the first but "Huh?" to the second.
Humans cogitate almost exclusively using abstractions, hence the initial response. An abstraction focuses on some properties of an object (not using the term in the OO sense) to the exclusion of others. When we want to reason about some characteristic of a thing - such as its behavior under certain circumstances - we can ignore other aspects as irrelevant to that question, even though those omitted properties may be very relevant to reasoning about other qualities of the object.
Take a chair. If I'm designing the chair, I really need to focus on the parts, how they connect, their structural strength, the stability of the base, and so on. If I'm trying to put the chair together, I need the parts list and the order in which those pieces must be assembled. If I'm an interior designer, I care about the chair's style and color. If I'm trying to arrange seating at an event, I care about the physical space it takes up and its capacity. Which model represents the chair?
The answer, of course, is that they all do. Each focuses on some aspects of the chair and elides detail not relevant to the reasoning required of the model.
This does not mean that models, and the abstractions underlying them, are either imprecise or useless. We can, and should, build models for systems and software that are both precise enough and broad enough to support the necessary reasoning. Sometimes, this entails detailed state behavioral modeling; other cases might result in mathematically precise models using languages/tools such as Simulink. In other cases, we'll build detailed architectural structure models to understand what the large scale pieces of the system are and how they connect.
To be generally helpful in systems and software modeling, the models must represent precise details about the system but not necessarily all the available detail. For example, I might model a device driver with UML. I precisely specify the structure in a UML class diagram, identifying the set of relevant attributes, their types and subranges, as well as the services that manipulate those attributes. I might also specify the order of the execution of the services within a state machine as the driver responds to various environmental events. I can then generate code using tools such as Rational Rhapsody(tm) and download it to the target, execute it, and visualize that execution using Rhapsody's execution and animation environment. I might very well have elided details about unrelated structural aspects such as links to an event logger implemented as an observer or attributes not related to the device driver's manipulation of the hardware. I certainly have omitted specifying the source level language constructs managing the state machine execution. I have certainly not specified which CPU registers should be used. Those details aren't part of the model, but that doesn't mean that the model was imprecise with respect to the behavior and structure relevant to my viewpoint.
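As a rough illustration of "precise but not exhaustive", the sketch below writes down in Python the kind of information such a driver model carries: one attribute with a subrange, a service that enforces it, and a state machine that orders the services. It is a hand-written stand-in, not output from Rhapsody or any other code generator, and the SensorDriver class, its gain range, states, and event names are all assumptions made up for the example.

```python
from enum import Enum, auto


class State(Enum):
    OFF = auto()
    IDLE = auto()
    SAMPLING = auto()


class SensorDriver:
    """Hypothetical device driver: one attribute with a subrange plus a state machine."""

    GAIN_RANGE = range(1, 9)  # assumed legal subrange: gain is 1..8

    # (current state, event) -> next state; any other pair is ignored.
    TRANSITIONS = {
        (State.OFF, "power_on"): State.IDLE,
        (State.IDLE, "start"): State.SAMPLING,
        (State.SAMPLING, "stop"): State.IDLE,
        (State.IDLE, "power_off"): State.OFF,
    }

    def __init__(self) -> None:
        self.state = State.OFF
        self.gain = 1

    def set_gain(self, gain: int) -> None:
        # Service that enforces the attribute's subrange.
        if gain not in self.GAIN_RANGE:
            raise ValueError(f"gain {gain} outside 1..8")
        self.gain = gain

    def handle(self, event: str) -> State:
        # Events that are not legal in the current state leave it unchanged.
        self.state = self.TRANSITIONS.get((self.state, event), self.state)
        return self.state


driver = SensorDriver()
driver.handle("start")     # ignored: the driver is still OFF
driver.handle("power_on")  # OFF -> IDLE
driver.handle("start")     # IDLE -> SAMPLING
print(driver.state)
```

Nothing here says how the transition table is stored on the target or which registers are touched; those are exactly the details the model is free to leave out.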
There are people who think that source code is the ultimate answer and modeling should - at most - be a very high level vague notional view, scribbled on a napkin and then discarded. I am not among them. I have seen tremendous benefits in the application of models precise enough to support execution/simulation and code generation.
Over the course of an entire project, 10-20 source lines of code (SLOC) per programmer per day is the norm for software productivity. I understand that when you're actually sitting in front of your computer wielding vi (or Emacs - my wife and I argue about which is best!) to write code, your productivity is higher. Nevertheless, over the lifespan of an entire project, 10-20 SLOC/day per programmer is typical. Interestingly, this seems to be independent of the abstraction level of the language. This is why source level languages smacked assembly language over the head: you can do a lot more with a line of high level source language than with a single assembly language statement. 10-20 lines of a high level source language might result in between 100 and 300 lines of assembly language. Properly applied, UML modeling can provide a similar magnification of productivity over source level languages. I've seen 200-300 SLOC/day per programmer for effective modeling teams. It's all about the maturity of the organization with respect to its use of modeling.
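The magnification implied by those ranges can be worked out with simple interval arithmetic. The sketch below assumes the low and high ends of each range can be paired freely, which is a simplification; the ratio_bounds helper is hypothetical and the only data used are the figures quoted above.

```python
def ratio_bounds(old: tuple[int, int], new: tuple[int, int]) -> tuple[float, float]:
    """Smallest and largest ratio new/old when both are (low, high) ranges."""
    old_low, old_high = old
    new_low, new_high = new
    return new_low / old_high, new_high / old_low


high_level_lines = (10, 20)   # lines of high level source, from the text
assembly_lines = (100, 300)   # equivalent assembly lines, from the text
modeling_sloc = (200, 300)    # SLOC/day for effective modeling teams, from the text

print(ratio_bounds(high_level_lines, assembly_lines))  # (5.0, 30.0)
print(ratio_bounds(high_level_lines, modeling_sloc))   # (10.0, 30.0)
```

On these numbers, moving from assembly to a high level language and moving from hand-written source to effective modeling both land in roughly the same order-of-magnitude band.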
I've created a model of modeling maturity for software development teams called the UML Modeling Maturity Index (UMMI), shown below:
The percent benefit is (informally) derived from observing the productivity of hundreds of teams using UML and/or SysML with varying degrees of success. As expected, the more mature the modeling an organization applies, the more it benefits from modeling.
To do precise modeling, be sure to specify the purpose of the model and of each diagram. This is the "mission statement" for the diagrams that I talked about in my last blog, "Forget 7 +/- 2". Provide all the detail relevant to the purpose of the model and the diagrams. If that purpose is to develop running software, then you'll need to specify the structural elements (classes with attributes and services for OO designs; functions, variables, and data types for structured designs; along with structural relations), state based behavior (with state diagrams) and algorithmic behavior (with activity diagrams or flow charts). This precise modeling of the software structure and behavior takes less effort than writing the equivalent source code, and the source code can be generated from it. Further, you get the benefit of automatic design documentation because you know the design represents the actual shipping code.
It's a win-win scenario.
Ports are a design pattern
Ports are a design pattern (see my book Real-Time Design Patterns for a more detailed description of the pattern). All design patterns are an attempt to optimize some set of qualities of a system at the expense of other aspects. All patterns have both pros (benefits) and cons (costs), just like any other design optimization decision. In the case of ports, they allow the connection points of elements to be identified and characterized, independent of the types of things connecting to them. That means that the class types at either end of the ports need not be known to the port – the only requirement is that the interface (or one of its subtypes) must be supported. Ports therefore support an interface-oriented design strategy pretty well. In fact, their best use arises when the behavior being invoked via a port is actually met by an internal part of a composite class; ports enforce information hiding by not allowing the client of the service to know who delivers the service or how it is done. This greatly improves maintainability of the system since internal changes to a composite element only require changes to the client when the interface changes, not when the implementation strategy changes.
As with all design patterns, there are costs to using ports. The first, and least serious, is that they clutter the diagram by adding more visual elements to the model. Second, they add a small amount of overhead that is not always easy to optimize away. Third, although multiplicity of ports is provided for by the UML specification, tools generally don’t support a multicast to send the same message to all elements connected over a (multi)port, and so designers have to fill in means to do that when necessary. Lastly, ports can only be connected via links, not associations. That means that in order to show connected collaborations, only instances of the containing class can be shown on the diagram. Many developers find this limiting.
Technically speaking, a port is a feature of its containing class. Ports can pass both asynchronous (e.g. asynchronous events) and synchronous messages (e.g. function calls). Ports are “typed” by the interfaces they support. An interface is a collection of message specifications. Each port has two sets of interfaces supported, each of which can contain zero or more interfaces. The first are known as “provided” interfaces. These are services provided by the containing element or parts within that element. The second are called “required” interfaces. These refer to sets of services that must be provided by elements (outside the current one) that connect to this port. Ports that can be connected (what is provided by one is required by the other and vice versa) are said to be port conjugate pairs.
Ports may either terminate on the element that actually provides the behavior (in which case, they are called “behavioral ports”) or they may delegate to other ports on internal parts of a composite element (in which case, they are “delegation” or “relay” ports).
Delegation and behavioral ports form the core value of the usage paradigm of ports because they strongly enforce implementation hiding. If a service implementation is completely refactored inside a composite element, the client can remain unchanged. Such information hiding is crucial for the development of large-scale reusable software elements and for product line engineering.
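The following Python sketch shows the shape of the pattern: a provided interface, a delegation (relay) port on a composite, and a client that depends only on the interface. It is a hand-rolled illustration, not the code any UML tool generates for ports, and every name in it (TrackService, RadarPart, DelegationPort, TrackingSubsystem, Display) is hypothetical.

```python
from typing import Protocol


class TrackService(Protocol):
    """Provided interface: the only thing a client may depend on."""

    def track_count(self) -> int: ...


class RadarPart:
    """Internal part that actually implements the service."""

    def __init__(self) -> None:
        self._tracks = ["T1", "T2"]

    def track_count(self) -> int:
        return len(self._tracks)


class DelegationPort:
    """Relay port: forwards calls on the provided interface to an internal part."""

    def __init__(self, target: TrackService) -> None:
        self._target = target

    def track_count(self) -> int:
        return self._target.track_count()


class TrackingSubsystem:
    """Composite: clients see the port, never the part behind it."""

    def __init__(self) -> None:
        self._radar = RadarPart()
        self.tracks_port = DelegationPort(self._radar)


class Display:
    """Client with a required interface; it knows TrackService, not RadarPart."""

    def __init__(self, tracks: TrackService) -> None:
        self._tracks = tracks

    def render(self) -> str:
        return f"{self._tracks.track_count()} tracks"


subsystem = TrackingSubsystem()
display = Display(subsystem.tracks_port)  # the "link" between the two ports
print(display.render())
```

If RadarPart is later replaced by some other part that satisfies TrackService, only TrackingSubsystem changes; Display is untouched, which is the maintainability benefit described above. The extra forwarding call in DelegationPort is also a small picture of the overhead cost.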
Are Ports for Systems Engineering or Software Development?
Modeling is used in both systems engineering and software development, but how about ports? The output of systems engineering is ultimately specification, not implementation. Within system modeling, done either with UML or SysML, ports are useful in two different cases. First, the Harmony Process (see my book Real-Time Agility for more information) contains a requirements capture practice known as “System Functional Analysis”. In this practice, use cases are modeled as blocks (classes) and given state-based behavior that represents the externally visible aspects of the requirements relevant to that use case. These blocks are then linked to actors (also blocks) that represent external elements that interact with the system and this collaboration is then executed. I’ve used this in many systems projects, from automotive to spacecraft development, and it’s always been helpful in finding missing, incorrect, incomplete, and conflicting requirements. These blocks are typically connected with ports. Although system functional analysis is inherently a requirements definition activity, constructing an executable model of the use case is made easier through the use of ports.
Another kind of specification output from systems engineering is the system architecture. The key here is to identify and characterize the subsystems to which we will allocate functionality. These subsystems are typically multi-disciplinary; that is, the implementation will be some combination of engineering disciplines such as pneumatic, hydraulic, mechanical, electronic, and software. Subsystems are most commonly modeled as blocks in SysML (classes in UML) that connect to other subsystems through ports characterized with well-specified interfaces. Ports provide a powerful means to model the connectivity of the subsystems at a “logical interface” definition level without having to specify even the implementation discipline, let alone the specific technology.
Having said all that, ports are still used even more in software. The Port Pattern is very common, especially in large-scale software units, called composite classes. These classes realize their behavior primarily through the orchestration of their internal parts. These parts are themselves typed by classes and may also be composites. This whole-part taxonomy provides a simple means of organizing the most complex software architectures and ports provide message delegation and information hiding throughout the software architecture.
Ports with links or associations between classes?
Of course, ports aren’t the only way to connect model elements. Normal associations between classes specify links that exist at run-time. Associations are most commonly implemented as pointers or references and are both lightweight and easy to manage. Non-unary multiplicity can be managed with arrays of pointers or references or more elaborate containers such as those provided by the C++ STL. Although they are lightweight, associations also have a cost; the navigable end must know the type (class) of the element on the other end of the association. This ties the relationship to one between specific types and their subtypes. This is more confining than using ports because ports only require interface compliance not type dependency.
Although associations can be used to connect to a part object within a composite from outside that composite, ports provide a cleaner way to do so. Generally, I recommend that when traversing a link to an internal part object, ports be used if the overhead is acceptable. A common use of ports in this way is to send messages across thread boundaries.
The primary means for thread management in the UML is via active classes. An active class owns its own thread and event queue and its parts typically run inside of that thread. Active classes are typically composites as well and their internal parts perform the semantic functionality of the thread (the active object normally does the thread management and event distribution). Thus, to send an event to an object inside another thread, it is common to connect that object with its client via ports on the boundary of the active class.
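A minimal Python sketch of that arrangement follows, using the standard threading and queue modules. It assumes a very simple active object (one thread, one FIFO queue, one internal part) and is not the framework any particular UML tool provides; the ActiveObject and Worker names and the port_send method are hypothetical.

```python
import queue
import threading


class Worker:
    """Passive part: does the semantic work, runs in the active object's thread."""

    def __init__(self) -> None:
        self.handled: list[str] = []

    def on_event(self, event: str) -> None:
        self.handled.append(event)


class ActiveObject:
    """Owns a thread and an event queue and dispatches events to its part."""

    _STOP = object()  # sentinel that ends the event loop

    def __init__(self) -> None:
        self._events: queue.Queue = queue.Queue()
        self._part = Worker()
        self._thread = threading.Thread(target=self._run)
        self._thread.start()

    def port_send(self, event: str) -> None:
        # Boundary "port": callers in other threads only enqueue the event.
        self._events.put(event)

    def _run(self) -> None:
        while True:
            event = self._events.get()
            if event is self._STOP:
                break
            self._part.on_event(event)

    def stop(self) -> list[str]:
        # Events queued before the sentinel are still processed, in order.
        self._events.put(self._STOP)
        self._thread.join()
        return self._part.handled


task = ActiveObject()
task.port_send("start")
task.port_send("sample")
handled = task.stop()
print(handled)
```

The client never touches Worker directly; it only posts to the boundary, and the active object does the thread management and event distribution.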
Normal or Flow?
Normal ports in UML or SysML support fundamentally discrete communication. This communication can be synchronous (function calls) or asynchronous (asynchronous events) but the message is fundamentally discrete in nature. So, how do you model a continuous flow?
With SysML, a new kind of port was introduced – flow ports. These ports are used in systems engineering as a means to model continuous flows, such as the (continuous) pressure on a brake pedal, the voltage level on a wire, or the energy in a battery. Flow ports are bound to attributes of a block and when linked, the change of an attribute in one block flows along the link to update the value of an attribute bound to the flow port at the other end of the link.
Since UML defines four different kinds of event (asynchronous, synchronous, time, and change), these value changes can induce an event on the receiver’s state machine via a change event. Flow ports thus provide a simple means for data distribution in systems. Used in this way, you can even think of flow ports as providing a rudimentary publish-subscribe pattern since, when the server object’s data changes, it automatically propagates to the receiver.
Ports are a very useful design pattern for requirements analysis, architecture definition, and data propagation. They do have some overhead that may not always be optimized away, but they support an interface-based development strategy as well as simple data distribution. I personally recommend ports be used primarily between architectural or large-scale elements such as subsystems, software components, structured classes, and task threads. However, once you’re inside a composite, associations are recommended for inter-class communications.
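The publish-subscribe reading of flow ports can be sketched in a few lines of Python. This is an illustration of the idea only, not SysML semantics as implemented by any tool: the FlowPort class, its link/publish methods, and the brake-pedal blocks are all hypothetical, and the on_change callback merely stands in for a change event on the receiver's state machine.

```python
from typing import Callable, Optional


class FlowPort:
    """Port bound to one attribute of its owning block."""

    def __init__(self, owner: object, attribute: str) -> None:
        self._owner = owner
        self._attribute = attribute
        self._linked: list["FlowPort"] = []
        self.on_change: Optional[Callable[[float], None]] = None

    def link(self, other: "FlowPort") -> None:
        self._linked.append(other)

    def publish(self, value: float) -> None:
        # Push the new value along every link.
        for port in self._linked:
            port._receive(value)

    def _receive(self, value: float) -> None:
        # Update the bound attribute, then raise the stand-in change event.
        setattr(self._owner, self._attribute, value)
        if self.on_change is not None:
            self.on_change(value)


class BrakePedal:
    def __init__(self) -> None:
        self.pressure = 0.0
        self.pressure_out = FlowPort(self, "pressure")

    def press(self, pressure: float) -> None:
        self.pressure = pressure
        self.pressure_out.publish(pressure)


class BrakeController:
    def __init__(self) -> None:
        self.pressure = 0.0
        self.state = "released"
        self.pressure_in = FlowPort(self, "pressure")
        self.pressure_in.on_change = self._pressure_changed

    def _pressure_changed(self, value: float) -> None:
        # The change event drives a two-state machine.
        self.state = "braking" if value > 0.0 else "released"


pedal = BrakePedal()
controller = BrakeController()
pedal.pressure_out.link(controller.pressure_in)
pedal.press(3.5)
print(controller.pressure, controller.state)
```

Note that a real continuous flow is sampled or simulated by the tool; here each call to press is one discrete update of the flowing value.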
This webcast was pretty popular and had a large number of questions that I wasn’t able to address on line, so I’ll address some of them here.
Can these design processes be applied to systems engineering as well?
For the most part, absolutely – especially for the architectural patterns. However, keep in mind that the outcome of system engineering is specification while the outcome of software development is implementation. In addition, the Concurrency and Resource view of architecture is specific to the software domain.
Can you speak about the implications for tool qualification in a DO-178 level A environment?
The DO-178 standard identifies two types of tools (formally three types in DO-330, but it still works out to be effectively two): tools that could insert a defect through their use (development tools) and tools that could fail to identify a defect (verification tools). In general, you need not qualify development tools because verification will identify the defects so they may be fixed – provided, of course, that verification is adequate to discover such defects. However, verification tools are usually relied upon to identify defects, so they usually must be qualified. So modeling tools, editors, and compilers are not usually qualified but testing and test coverage tools usually are.
How large a development team does this scale to and what are some of the scaling issues?
The approach and practices discussed in the talk can be applied to small teams and large teams. Large teams always carry along with them greater concerns for project management and governance. The biggest issue in terms of the design practices per se has to do with the coordination of the work items among the various subteams. Hierarchical planning addresses that concern but does require more diligence and effort. Distributed teams also add some barriers that can be addressed through a combination of process and tooling. For example, Design Manager supports web-based reviews of models to support the review objectives of a number of safety standards. Practices around those issues will be addressed in an upcoming webcast on project management practices.
Can you comment on deploying this approach across a distributed team?
The concerns specific to distributed teams include:
1. Lack of project control
2. Lack of visibility of project and product status
3. Detailed coordination is more difficult because of delays, bandwidth limitations, and difficulty in
4. Difficulty in communication
5. Cultural and organizational differences
6. Configuration management and work product sharing
Cloud-based CM solutions are an approach to share work products and related information. Distributed project management/governance tools such as Rational Team Concert can address visibility of metrics and project status of distributed teams. These concerns will be addressed in the aforementioned upcoming webcast.
Hello Bruce- How what you discuss maps to ISO 26262. Have you ever check/tried Harmony mapping into this?
Mappings between the Harmony process content and IEC 61508 and ISO 26262 are under development. Right now, we have detailed mappings to DO-178B, DO-178C, and IEC 62304.
You have provided and example of SW Design Pattern, more related to SW architecture than to SYS. Are there any good Systems Architecture patterns used in embedded world?
There certainly are patterns defined within the Harmony process. They are similar or identical to some of the software design practices, but apply to designs that have not been allocated to specific engineering disciplines. The Deployment view, done with block diagrams (SysML) or UML (software), is used to show the allocation of responsibilities and requirements to different disciplines and the specification of the cross-discipline interfaces, such as the interface between the software and electronics.
You mentioned using TDD and other processes which include testing into a cross-functional team, how do you deal with keeping the independence of the test team as some standards like Cenelec state in order to verify and validate your safety critical system.
Independence of the development of the system verification test cases is required – that means that the designers/developers cannot be the same people that develop the test cases (it doesn’t affect who executes the test cases though). Independence is not required for integration, although it is useful, and it is not recommended for white box developer testing. TDD specifically refers to the white box developer tests, so independence is not a concern there. It remains a requirement for verification, however.
You said we were mainly talking about collaboration between electronics and software. but are there more specific issues when developers interface with nuclear, avionics, mechanical, etc. as different disciplines?
The process I outlined specifies systems level requirements and creates systems architectures. The subsequent hand-off to downstream engineering takes place primarily at the subsystem level, including the allocation of subsystem requirements to different engineering disciplines and the specification of the cross-disciplinary interfaces. It is expected that disciplines such as mechanical, chemical and nuclear will accept their requirements and interface specifications and then proceed with the discipline-specific design in appropriate tools. The software development will continue in UML but it is difficult to imagine how mechanical or chemical design would be done in UML or SysML.
Is system level testing applied in Harmony? At what level of development probably not during the nanocycles? My understanding is that during nanocycle there is rather unit testing in place. can you confirm?
There are system verification activities done in the Harmony process but not during the nanocycles. The microcycle (sprint) has a verification activity at the end of each iteration that verifies the produced (but possibly incomplete) system meets all the requirements developed during that and previous iterations. The nanocycle TDD testing is fundamentally white box developer testing.
Bruce, please do not throw away napkins model. We often use them when we communicate and it works as long as we can share ideas during meetings etc. You well know how difficult it is to create precise and fully ready model, often not worth time spend on it. Can you please comment that?
I’m not saying that you should never develop napkin models, just that they shouldn’t be the primary form of model and they shouldn’t be treated as normative. Napkin models are useful for framing a discussion. They are not useful, in my experience, for specifying a normative design. If a set of napkin models is used to discuss trade-offs, for example, the winning napkin should probably be developed into a normative high-fidelity model. IMHO.
Is Rational considering AUTOSAR in their Rhapsody Road Map?
Rhapsody already supports AUTOSAR as a profile and is continuing to evolve that support based on customer experiences.
What Time Frame do you think is necessary put together Design Pattern/ implement system
I think of the microcycle (sprint) as a 4-6 week period of time during which requirements are elaborated, functional software is developed, designs optimize the implementation, and verification is performed. The functional software and design optimizations are usually in the 2-3 week timeframe but this can vary +/- a week and still stay within the overall microcycle timeframe.
What if Design Flaw shows up, can you trace it back?
Design flaws manifest themselves initially as failed test outcomes, then change requests, then as work items, and ultimately as updated design, code and tests. Good tooling – such as Change or ClearQuest – can manage the traceability among the various forms defects take, the work products affected, and the workers making the changes.
In defensive design couldn't that also result in extra money which may be a deterrent and as a result not performed?
That is absolutely true and this gives rise to the software folk wisdom “we have time to fix it but we don’t have time to do it right.” I personally believe that the “rush to code” and the lack of defensive design is short sighted. As Grady Booch says “If we developed houses the same way we develop software, the first woodpecker would have destroyed civilization.”
I would like to know why this approach cannot be applied to Waterfall models of development
It can be, but the result is no longer waterfall.
Should the aspects of Defensive Design be considered during analysis (and then designed and implemented) or after should this be put off after implementation of the "working model" - please explain tradeoffs
Defensive design is, first and foremost, an optimization of robustness and so it is typically added in the design phase, which follows the creation of the initial functionally-correct high-fidelity model. Code, of course, is continuously being generated and updated during both the functional modeling and the design optimization. The problems with deferring the defensive design approach include 1. if time is short, it will be removed and 2. it is much harder to go back and “make it right” later than it is to “make it right” the first time. This is why, for example, TDD puts developer testing as a parallel activity to modeling and coding. The idea is to avoid defects rather than introduce defects now and find and remove them later.
Can you elaborate on what is meant by 'Continuous Safety Assessment'?
With the software development activities, design and technology decisions are made. These can have impacts on safety. For example, deciding to add a third-party math library or to use a commercial CORBA ORB has impacts on safety that must be addressed. Therefore, in parallel with the software development activities, such decisions are examined by the Safety Czar to look at how they impact safety. This may result in new concerns and work items that must be addressed. In the Harmony process, the “Update the Hazard Analysis” task runs in parallel to the entire microcycle so that technical decisions can be reviewed and safety concerns can be addressed.
Bruce, for someone who knows UML, how do you suggest to pick up SysML most effectively?
If you know UML, then the biggest changes when you move to SysML are just adopting the name changes (for example, “class” becomes “block”). There are some other important changes, such as the addition of parametric diagrams (useful for trade studies) and continuous behavioral modeling in activity diagrams. There are a few books on SysML but none that I’m really happy with. I’d probably look online for presentations that address my concerns. Also, you can defer learning about SysML features that might be new to you until you need them.
Note the project I'm using employs Rational Software Architecture Real-Time Edition (RSARTE); to what extent are some of these approaches applicable to that as well?
All the design practices can be performed within RSARTE. The Harmony process and practices are tool agnostic. I spend most of my time with Rational Rhapsody so I use that tool for examples and work, but other tools can be applied as well. Certain of the flows are made more difficult with some of the less capable tools. For example, in the High-Fidelity Modeling workflow there is a Translation task which produces code from the model. With Rhapsody (or RSARTE), that is mostly a matter of pushing the Generate/Make/Run button. In a less capable tool, you might pop up vi or edlin and bang out the code by hand. The workflow remains the same though.
The system seems extremely thorough but not very agile - how is this more agile then previous methods?
Agility is really all about avoiding defects with continuous testing, being responsive to changing needs and circumstances, and dynamic project planning and governance. This talk focused specifically on the design practices, which included high-fidelity modeling, TDD, continuous integration, continuous safety assessment, and continuous traceability. These practices are what make the approach agile. These certainly aren’t all the practices in place – if they were, then I wouldn’t need the other four parts to the series! In addition, developing safety critical systems levies another set of needs on the development team and these can be addressed in an agile way as well.
What if you have portion of your system that is safety-critical (vital) and non-vital, would you recommend splitting to use your techniques (Rhapsody) and standard not critical processes (maybe like Enterprise Architect)
If part of the system is non-critical, then you must have architectural partitioning to ensure that any defects in the less-robust-and-verified parts of the system cannot affect the safety of the safety critical parts. Personally, I would tend to use the same tool for everything to facilitate architectural allocation and refactoring.
What are your recommendations for Information Security-specific design patterns?
Check out Security Patterns: Integrating Security and Systems Engineering http://www.amazon.com/Security-Patterns-Integrating-Engineering-Software/dp/0470858842/ref=sr_1_1?ie=UTF8&qid=1362597852&sr=8-1&keywords=security+patterns.