Bruce's Top Ten Modeling Hints: #10 - Forget "7+/- 2". Every diagram should have a singular mission.
Back in the '80s (yes, I am that old), people used to model software and systems with data flow diagrams. Among the problems with the commonly used approaches to modeling was the notion of "7+/-2" diagram elements per diagram. In order to comply with that goal, deeply nested decomposition hierarchies were created, sometimes dozens of levels deep, all to ensure that each diagram had only a tiny number of elements on it. The resulting models were virtually impossible to navigate and practically impossible to actually use. Why did people even try?
The reason is that someone, whose identity is lost, decided to apply the results of neurolinguistic research - notably the 1956 paper "The Magical Number Seven, Plus or Minus Two: Some Limits on Our Capacity for Processing Information" (by George A. Miller, published in Psychological Review) - to visual models. The basic outcome of the actual paper is a discussion about the correlation between the limits of short-term memory and our capacity to reason about things. Specifically, if a person is presented with a set of stimuli that vary along one dimension (e.g. tones varying in pitch) and asked to give a value-specific response to each, performance drops abruptly at around 7 distinct stimuli. This correlates, according to Miller, with memory span - the longest list of items that a person can recall correctly 50% or more of the time. Interestingly, Miller recognized that the correlation is actually coincidental, a point ignored by many subsequent readers of the paper.
So there is nothing "magic" about the magic number of 7 +/- 2, and even if there were, Miller's research doesn't apply to visual modeling. In the case of visual modeling, memory span and one-dimensional absolute judgement are irrelevant because the subject elements are right in front of you all the time and don't have to be recalled in order to be used in reasoning. Further, adherence to this rule in visual modeling leads to arbitrarily decomposing collections of directly related elements into distributed aggregated decomposition hierarchies. This means you are converting one-dimensional absolute judgements into multi-dimensional absolute judgements - that is, you are increasing the conceptual distance between directly related elements by introducing unnecessary and arbitrary separation. This makes comprehension of the relations among those elements far more difficult than it would be if the direct relationships were maintained.
So what's the alternative? The Harmony(r) process uses the notion of a "diagram mission statement" - a singular concept or purpose visualized by the diagram. The primary reason that many models are so difficult to navigate and understand is that the diagrams either have no coherent intent or have too many. If we create a set of class diagrams each with a specific mission - such as show the collaboration realizing a use case, show a generalization taxonomy, show the contents of a package, show an architectural viewpoint, etc. - each diagram becomes a clear statement of that mission. This means that class diagrams are built up around interesting aspects of the model so that the stakeholders can address specific questions. If you have a new question, you can build up a viewpoint (i.e. diagram) around that question. As an aside, I usually explicitly state the mission of the diagram in a comment in a corner of the diagram. For example, such a comment might read "The mission of this diagram is to show the collaboration of high-level elements realizing the 'Track Tactical Objects' Use Case." And the diagram then shows all of the elements that contribute to that mission, even if they number 30 or more.
Another consequence is that the same class will likely show up on many different diagrams. That's ok, because as long as you are using a modeling tool (as opposed to a drawing tool), the model repository keeps all the views in sync. A modeling tool, such as Rational Rhapsody(tm), manages a semantic repository of information that dynamically links the diagrams and their elements to the underlying semantic basis of the model. If the repository is changed - such as happens when an element on a diagram is modified - then all the relevant diagrams are likewise changed. This is because the tooling ensures the elements on the diagrams are dynamically linked to that repository. Similarly, in high fidelity modeling, the source code is simply another view of the semantic repository and is likewise linked dynamically to it. But that's another hint, for another day.
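To make the "one repository, many views" idea concrete, here is a minimal sketch in Python. It is only an illustration of the principle, not how Rhapsody or any real tool is implemented; the names Repository, ModelClass, and Diagram are hypothetical. Each diagram carries its mission statement and holds references to elements by name, so a change made once in the repository shows up on every diagram that includes that element.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ModelClass:
    """A single semantic element, stored exactly once in the repository."""
    name: str
    attributes: list[str] = field(default_factory=list)


class Repository:
    """Hypothetical semantic repository: the one place model elements live."""

    def __init__(self) -> None:
        self._elements: dict[str, ModelClass] = {}

    def add(self, element: ModelClass) -> ModelClass:
        self._elements[element.name] = element
        return element

    def get(self, name: str) -> ModelClass:
        return self._elements[name]


class Diagram:
    """A view with a stated mission; it references elements, never copies them."""

    def __init__(self, mission: str, repo: Repository, names: list[str]) -> None:
        self.mission = mission
        self._repo = repo
        self._names = names

    def render(self) -> list[str]:
        # Elements are looked up at render time, so the view is always current.
        lines = []
        for name in self._names:
            element = self._repo.get(name)
            lines.append(f"{element.name}({', '.join(element.attributes)})")
        return lines


repo = Repository()
repo.add(ModelClass("Track", ["position"]))
repo.add(ModelClass("Sensor", ["range"]))

collaboration = Diagram(
    "Show the collaboration realizing 'Track Tactical Objects'",
    repo, ["Track", "Sensor"])
package_view = Diagram(
    "Show the contents of the Tracking package", repo, ["Track"])

# One change in the repository is visible on both diagrams.
repo.get("Track").attributes.append("velocity")
```

Here the Track class appears on two diagrams with different missions, and adding the velocity attribute once updates both views, which is the property a drawing tool cannot give you.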
One complaint I hear about models is that they're abstract and therefore not useful. My initial response is "Duh!" to the first but "Huh?" to the second.
Humans cogitate almost exclusively using abstractions, hence the initial response. An abstraction focuses on some properties of an object (not using the term in the OO sense) to the exclusion of others. When we want to reason about some characteristic of a thing - such as its behavior under certain circumstances - we can ignore other aspects as irrelevant to that question, even though those omitted properties may be very relevant to reasoning about other qualities of the object.
Take a chair. If I'm designing the chair, I really need to focus on the parts, how they connect, their structural strength, the stability of the base, and so on. If I'm trying to put the chair together, I need the parts list and the order in which those pieces must be assembled. If I'm an interior designer, I care about the chair's style and color. If I'm trying to arrange seating at an event, I care about the physical space it takes up and its capacity. Which model represents the chair?
The answer, of course, is that they all do. Each focuses on some aspects of the chair and elides detail not relevant to the reasoning required of the model.
This does not mean that models, and the abstractions underlying them, are either imprecise or useless. We can, and should, build models for systems and software that are both precise enough and broad enough to support the necessary reasoning. Sometimes, this entails detailed state behavioral modeling; other cases might result in mathematically precise models using languages/tools such as Simulink. In other cases, we'll build detailed architectural structure models to understand what the large scale pieces of the system are and how they connect.
To be generally helpful in systems and software modeling, the models must represent precise details about the system but not necessarily all the available detail. For example, I might model a device driver with UML. I precisely specify the structure in a UML class diagram, identifying the set of relevant attributes, their types and subranges, as well as the services that manipulate those attributes. I might also specify the order of the execution of the services within a state machine as the driver responds to various environmental events. I can then generate code using tools such as Rational Rhapsody(tm) and download it to the target, execute it, and visualize that execution using Rhapsody's execution and animation environment. I might very well have elided details about unrelated structural aspects such as links to an event logger implemented as an observer or attributes not related to the device driver's manipulation of the hardware. I certainly have omitted specifying the source level language constructs managing the state machine execution. I have certainly not specified which CPU registers should be used. Those details aren't part of the model, but that doesn't mean that the model was imprecise with respect to the behavior and structure relevant to my viewpoint.
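As a rough illustration of what "precise but not exhaustive" means, here is a hand-written Python sketch of such a driver. It is not generated code and not any tool's output; MotorDriver, its speed attribute, the 0..1000 subrange, and the three states are all invented for the example. The sketch pins down the attribute subrange and the legal order of the services, and says nothing about logging, registers, or how a real state machine framework would be implemented.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    CONFIGURED = auto()
    RUNNING = auto()


class MotorDriver:
    """Hypothetical device driver: structure plus state-based behavior."""

    MAX_SPEED = 1000  # assumed subrange of the speed attribute: 0..1000

    def __init__(self) -> None:
        self.state = State.IDLE
        self.speed = 0

    def configure(self, speed: int) -> None:
        # Enforce the attribute subrange the class diagram would specify.
        if not 0 <= speed <= self.MAX_SPEED:
            raise ValueError(f"speed {speed} outside 0..{self.MAX_SPEED}")
        if self.state is State.RUNNING:
            raise RuntimeError("cannot configure while RUNNING")
        self.speed = speed
        self.state = State.CONFIGURED

    def start(self) -> None:
        # The state machine fixes the order of services: configure, then start.
        if self.state is not State.CONFIGURED:
            raise RuntimeError("start is only legal in CONFIGURED")
        self.state = State.RUNNING

    def stop(self) -> None:
        if self.state is not State.RUNNING:
            raise RuntimeError("stop is only legal in RUNNING")
        self.state = State.CONFIGURED


driver = MotorDriver()
driver.configure(500)
driver.start()
```

Everything a reader needs to reason about the driver's legal behavior is here, and nothing else is, which is the point of the abstraction.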
There are people who think that source code is the ultimate answer and modeling should - at most - be a very high level vague notional view, scribbled on a napkin and then discarded. I am not among them. I have seen tremendous benefits in the application of models precise enough to support execution/simulation and code generation.
Over the course of an entire project, 10-20 source lines of code (SLOC) per day per programmer is the norm for software productivity. I understand that when you're actually sitting in front of your computer wielding vi (or Emacs - my wife and I argue about which is best!) to write code, your productivity is higher. Nevertheless, over the lifespan of an entire project, 10-20 SLOC/day per programmer is typical. Interestingly, this seems to be independent of the abstraction level of the language. This is why source level languages smacked assembly language over the head. You can do a lot more with a line of high level source language than with a single assembly language statement. 10-20 lines of a high level source language might result in between 100 and 300 lines of assembly language. Properly applied, UML modeling can provide a similar magnification of productivity over source level languages. I've seen 200-300 SLOC/day per programmer for effective modeling teams. It's all about the maturity of the organization with respect to its use of modeling.
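To put rough bounds on that magnification, the ranges quoted above can be turned into ratios. The small Python helper below is just arithmetic on those figures; the function name is made up, and pairing the low end of one range with the high end of the other is my assumption for getting the widest plausible spread.

```python
def magnification(baseline, improved):
    """Return the (smallest, largest) ratio between two (min, max) ranges."""
    base_min, base_max = baseline
    imp_min, imp_max = improved
    # Smallest ratio: worst improved case over best baseline case, and vice versa.
    return imp_min / base_max, imp_max / base_min


# 10-20 high level lines correspond to roughly 100-300 assembly lines.
assembly_to_source = magnification((10, 20), (100, 300))

# 10-20 SLOC/day hand-coded versus 200-300 SLOC/day for effective modeling teams.
source_to_model = magnification((10, 20), (200, 300))

print(assembly_to_source)  # (5.0, 30.0)
print(source_to_model)     # (10.0, 30.0)
```

On these numbers, the step from source code to effective modeling is of the same order as the step from assembly language to source code.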
I've created a model of modeling maturity for software development teams called the UML Modeling Maturity Index (UMMI), shown below:
The percent benefit is (informally) derived from the observation about the productivity of hundreds of teams using UML and/or SysML with varying degrees of success. As expected, the more mature the modeling an organization applies, the more they benefit from modeling.
To do precise modeling, be sure to specify what the purpose of the model is, and of each diagram. This is the "mission statement" for the diagrams that I talked about in my last blog "Forget 7 +/- 2". Provide all the detail relevant to the purpose of the model and the diagrams. If that purpose is to develop running software, then you'll need to specify the structural elements (classes with attributes and services for OO designs; functions, variables, and data types for structured designs; along with structural relations), state based behavior (with state diagrams) and algorithmic behavior (with activity diagrams or flow charts). This precise modeling of the software structure and behavior takes less effort than writing the equivalent source code, and the source code can be generated from it. Further, you get the benefit of automatic design documentation because you know the design represents the actual shipping code.
It's a win-win scenario.
Ports are a design pattern
Ports are a design pattern (see my book Real-Time Design Patterns for a more detailed description of the pattern). All design patterns are an attempt to optimize some set of qualities of a system at the expense of other aspects. All patterns have both pros (benefits) and cons (costs), just like any other design optimization decision. In the case of ports, they allow the connection points of elements to be identified and characterized, independent of the types of things connecting to them. That means that the class types at either end of the ports need not be known to the port – the only requirement is that the interface (or one of its subtypes) must be supported. Ports therefore support an interface-oriented design strategy pretty well. In fact, their best use arises when the behavior being invoked via a port is actually met by an internal part of a composite class; ports enforce information hiding by not allowing the client of the service to know who delivers the service or how it is done. This greatly improves maintainability of the system since internal changes to a composite element only require changes to the client when the interface changes, not when the implementation strategy changes.
As with all design patterns, there are costs to using ports. The first, and less serious, is that they clutter the diagram by adding more visual elements to the model. In addition, they add a small amount of overhead that is not always easy to optimize away. Thirdly, although multiplicity of ports is provided for by the UML specification, tools generally don’t support a multicast to send the same message to all elements connected over a (multi)port and so designers have to fill in means to do that when necessary. Lastly, ports can only be connected via links, not associations. That means that in order to show connected collaborations, only instances of the containing class can be shown on the diagram. Many developers find this limiting.
Technically speaking, a port is a feature of its containing class. Ports can pass both asynchronous (e.g. asynchronous events) and synchronous messages (e.g. function calls). Ports are “typed” by the interfaces they support. An interface is a collection of message specifications. Each port has two sets of interfaces supported, each of which can contain zero or more interfaces. The first are known as “provided” interfaces. These are services provided by the containing element or parts within that element. The second are called “required” interfaces. These refer to sets of services that must be provided by elements (outside the current one) that connect to this port. Ports that can be connected (what is provided by one is required by the other and vice versa) are said to be port conjugate pairs.
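The provided/required bookkeeping can be sketched in a few lines of Python. This is only an illustration of the definition above, not the UML metamodel or any tool's API; the Port class, the are_conjugates function, and the interface names are all hypothetical, and I have taken "conjugate" to mean an exact match in both directions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Port:
    """A named connection point typed by two sets of interface names."""
    name: str
    provided: frozenset = frozenset()
    required: frozenset = frozenset()


def are_conjugates(a: Port, b: Port) -> bool:
    # Connectable when each side provides exactly what the other requires.
    return a.provided == b.required and a.required == b.provided


sensor_port = Port(
    "sensor_out",
    provided=frozenset({"ISensorData"}),
    required=frozenset({"ISensorControl"}),
)
controller_port = Port(
    "sensor_in",
    provided=frozenset({"ISensorControl"}),
    required=frozenset({"ISensorData"}),
)
logger_port = Port("log_in", provided=frozenset({"ILog"}))
```

Note that neither port knows the class on the other end; the check is made purely on interfaces, which is what decouples the connected elements.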
Ports may either terminate on the element that actually provides the behavior (in which case, they are called “behavioral ports”) or they may delegate to other ports on internal parts of a composite element (in which case, they are “delegation” or “relay” ports).
Delegation and behavioral ports form the core value of the usage paradigm of ports because they strongly enforce implementation hiding. If a service implementation is completely refactored inside a composite element, the client can remain unchanged. Such information hiding is crucial for the development of large-scale reusable software elements and for product line engineering.
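A small Python sketch shows why this matters. It is a hand-rolled illustration under my own assumptions, not generated code: RelayPort, TrackingSubsystem, and the two tracker classes are invented names. The client talks only to the port on the composite's boundary, and the composite is free to swap the internal part that actually provides the behavior.

```python
from typing import Optional, Protocol


class Tracker(Protocol):
    """The interface that types the port."""
    def track(self, target_id: int) -> str: ...


class RadarTracker:
    def track(self, target_id: int) -> str:
        return f"radar:{target_id}"


class FusionTracker:
    def track(self, target_id: int) -> str:
        return f"fusion:{target_id}"


class RelayPort:
    """Delegation port: forwards requests to whichever internal part is wired in."""

    def __init__(self) -> None:
        self._target: Optional[Tracker] = None

    def delegate_to(self, target: Tracker) -> None:
        self._target = target

    def track(self, target_id: int) -> str:
        if self._target is None:
            raise RuntimeError("port is not connected to an internal part")
        return self._target.track(target_id)


class TrackingSubsystem:
    """Composite: clients see only tracking_port, never the internal part."""

    def __init__(self) -> None:
        self.tracking_port = RelayPort()
        self._tracker: Tracker = RadarTracker()
        self.tracking_port.delegate_to(self._tracker)

    def refactor_to_fusion(self) -> None:
        # An internal implementation change; the port and its interface stay put.
        self._tracker = FusionTracker()
        self.tracking_port.delegate_to(self._tracker)


def client(port: Tracker) -> str:
    # The client depends on the interface only.
    return port.track(7)


subsystem = TrackingSubsystem()
before = client(subsystem.tracking_port)
subsystem.refactor_to_fusion()
after = client(subsystem.tracking_port)
```

The client function is untouched by the refactoring; only a change to the Tracker interface itself would force it to change.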
Are Ports for Systems Engineering or Software Development?
Modeling is used in both systems engineering and software development, but how about ports? The output of systems engineering is ultimately specification, not implementation. Within system modeling, done either with UML or SysML, ports are useful in two different cases. First, the Harmony Process (see my book Real-Time Agility for more information) contains a requirements capture practice known as “System Functional Analysis”. In this practice, use cases are modeled as blocks (classes) and given state-based behavior that represents the externally visible aspects of the requirements relevant to that use case. These blocks are then linked to actors (also blocks) that represent external elements that interact with the system and this collaboration is then executed. I’ve used this in many systems projects, from automotive to spacecraft development, and it’s always been helpful in finding missing, incorrect, incomplete, and conflicting requirements. These blocks are typically connected with ports. Although system functional analysis is inherently a requirements definition activity, constructing an executable model of the use case is made easier through the use of ports.
Another kind of specification output from systems engineering is the system architecture. The key here is to identify and characterize the subsystems to which we will allocate functionality. These subsystems are typically multi-disciplinary; that is, the implementation will be some combination of engineering disciplines such as pneumatic, hydraulic, mechanical, electronic, and software. Subsystems are most commonly modeled as blocks in SysML (classes in UML) that connect to other subsystems through ports characterized with well-specified interfaces. Ports provide a powerful means to model the connectivity of the subsystems at a “logical interface” definition level without having to specify even the implementation discipline, let alone the specific technology.
Having said all that, ports are still used even more in software. The Port Pattern is very common, especially in large-scale software units, called composite classes. These classes realize their behavior primarily through the orchestration of their internal parts. These parts are themselves typed by classes and may also be composites. This whole-part taxonomy provides a simple means of organizing the most complex software architectures and ports provide message delegation and information hiding throughout the software architecture.
Ports with links or associations between classes?
Of course, ports aren’t the only way to connect model elements. Normal associations between classes specify links that exist at run-time. Associations are most commonly implemented as pointers or references and are both lightweight and easy to manage. Non-unary multiplicity can be managed with arrays of pointers or references or more elaborate containers such as those provided by the C++ STL. Although they are lightweight, associations also have a cost; the navigable end must know the type (class) of the element on the other end of the association. This ties the relationship to one between specific types and their subtypes. This is more confining than using ports because ports only require interface compliance not type dependency.
Although associations can be used to connect to a part object within a composite from outside that composite, ports provide a cleaner way to do so. Generally, I recommend that when traversing a link to an internal part object, ports be used if the overhead is acceptable. A common use of ports in this way is to send messages across thread boundaries.
The primary means for thread management in the UML is via active classes. An active class owns its own thread and event queue and its parts typically run inside of that thread. Active classes are typically composites as well and their internal parts perform the semantic functionality of the thread (the active object normally does the thread management and event distribution). Thus, to send an event to an object inside another thread, it is common to connect that object with its client via ports on the boundary of the active class.
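Here is a minimal Python sketch of that arrangement, using only the standard threading and queue modules. The ActiveObject and Filter classes are hypothetical, and the post method merely stands in for a port on the active class boundary; a real framework would add event types, priorities, and so on.

```python
import queue
import threading


class Filter:
    """Internal part: does the semantic work inside the active object's thread."""

    def __init__(self) -> None:
        self.handled = []

    def handle(self, event: str) -> None:
        # Record which thread actually processed the event.
        self.handled.append((event, threading.current_thread().name))


class ActiveObject:
    """Hypothetical active class: owns a thread and an event queue."""

    _STOP = object()  # sentinel that ends the event loop

    def __init__(self, name: str, part: Filter) -> None:
        self._events = queue.Queue()
        self._part = part
        self._thread = threading.Thread(target=self._run, name=name)
        self._thread.start()

    def post(self, event: str) -> None:
        # Boundary "port": callers only enqueue; they never call the part directly.
        self._events.put(event)

    def shutdown(self) -> None:
        self._events.put(self._STOP)
        self._thread.join()

    def _run(self) -> None:
        # Thread management and event distribution live in the active object.
        while True:
            event = self._events.get()
            if event is self._STOP:
                break
            self._part.handle(event)


part = Filter()
active = ActiveObject("tracker-thread", part)
active.post("new_contact")
active.post("lost_contact")
active.shutdown()
```

The client's thread never executes the part's code; the events are handled in order on the active object's own thread.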
Normal or Flow?
Normal ports in UML or SysML support fundamentally discrete communication. This communication can be synchronous (function calls) or asynchronous (asynchronous events) but the message is fundamentally discrete in nature. So, how do you model a continuous flow?
With SysML, a new kind of port was introduced – flow ports. These ports are used in systems engineering as a means to model continuous flows, such as the (continuous) pressure on a brake pedal, the voltage level on a wire, or the energy in a battery. Flow ports are bound to attributes of a block and when linked, the change of an attribute in one block flows along the link to update the value of an attribute bound to the flow port at the other end of the link.
Since UML defines four different kinds of event (asynchronous, synchronous, time, and change), these value changes can induce an event on the receiver's state machine via a change event. Flow ports thus provide a simple means for data distribution in systems. Used in this way, you can even think of flow ports as providing a rudimentary publish-subscribe pattern, since when the server object's data changes it automatically propagates to the receiver.
Ports are a very useful design pattern for requirements analysis, architecture definition, and data propagation. They do have some overhead that may not always be optimized away, but they support an interface-based development strategy as well as simple data distribution. I personally recommend ports be used primarily between architectural or large-scale elements such as subsystems, software components, structured classes, and task threads. However, once you're inside a composite, associations are recommended for inter-class communications.
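The propagation and the resulting change event can be sketched as follows. This is a discrete approximation in Python of what is conceptually a continuous flow, and it is not SysML semantics as any tool implements them; FlowPort, BrakeController, and the 0.1 pressure threshold are invented for the example.

```python
from typing import Callable, List


class FlowPort:
    """Hypothetical flow port bound to a single value of its owning block."""

    def __init__(self) -> None:
        self.value = 0.0
        self._links: List["FlowPort"] = []
        self._listeners: List[Callable[[float], None]] = []

    def link_to(self, other: "FlowPort") -> None:
        self._links.append(other)

    def on_change(self, listener: Callable[[float], None]) -> None:
        self._listeners.append(listener)

    def set(self, value: float) -> None:
        if value == self.value:
            return  # no change, so no change event
        self.value = value
        for listener in self._listeners:
            listener(value)   # raise the change event on the owning block
        for other in self._links:
            other.set(value)  # flow along the link to the far end


class BrakeController:
    """Receiver whose (very small) state machine reacts to change events."""

    def __init__(self) -> None:
        self.state = "released"
        self.pressure_in = FlowPort()
        self.pressure_in.on_change(self._pressure_changed)

    def _pressure_changed(self, pressure: float) -> None:
        # Assumed threshold: any pressure above 0.1 counts as braking.
        self.state = "braking" if pressure > 0.1 else "released"


pedal_pressure_out = FlowPort()
controller = BrakeController()
pedal_pressure_out.link_to(controller.pressure_in)

pedal_pressure_out.set(0.6)
state_when_pressed = controller.state
pedal_pressure_out.set(0.0)
state_when_released = controller.state
```

The pedal side never calls the controller; it just updates its own value, and the link does the publishing.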