This article was originally published in February, 2003.
During the summer of 1999, I started trying to depict visually the often complex workloads that needed to be modeled for accurate performance testing. Because those early whiteboard sketches were so well received, I began using the same technique extensively to design and document performance test workloads. I later used the technique to demonstrate various workloads throughout my "User Experience, Not Metrics" series of articles. About the same time, I started jokingly referring to the technique as the User Community Modeling Language, or UCML. Much to my surprise, I received a lot of requests for the specification document for the language. Even more to my surprise, one of my readers (Nathan White, a retired military officer currently working for A. G. Edwards) wrote a specification document based on the articles and graciously sent me a copy.
This article describes UCML and discusses its use. Additional examples can be found in many of my articles, presentations, and document samples here on RDD and on my Web site. I hope you find this technique as useful as I have over the years.
UCML is a set of symbols that can be used to create visual user community models and depict associated user parameters. These symbols can serve to represent the workload distributions, operational profiles, pivot tables, matrixes, and Markov chains that performance testers often employ to determine what activities are to be included in a performance test, as well as with what frequency they'll occur. You can effectively use the resulting diagrams for documentation -- even of automated test designs -- and as a basis for discussions and data-gathering workshops. You can also use them to review and verify workflow and workload requirements with business analysts and users.
I developed UCML over the course of several years during my work on dozens of projects, with input from numerous clients, colleagues, and friends. My intent was to create an intuitive visual model of the often complex mathematical models that go along with performance-related testing: not as a replacement for those models, but as a supplement to them. This visualization technique has proven to be at least reasonably intuitive to testers, application users, developers, managers, and business owners alike. In my experience, it has enabled productive discussions about important topics like "Is this activity really that common?" and "Why does activity B have a prerequisite of activity D? That's not right," rather than "How do I read this pivot table, again?"
While I draw most of my UCML models with SmartDraw, I still find myself drawing them by hand in sketches and on whiteboards, and building them in Microsoft® PowerPoint® presentations, Microsoft® Visio®, and various other media. I've also seen UML state diagrams based on the general rules of UCML drawn with the IBM® Rational Rose® visual modeling tool. At this time, there's no "UCML-compliant" drawing tool, and I don't recommend using any software product in particular to create these models.
My sole intent in publishing this article is to provide access to a tool that performance test teams may find useful. This isn't an industry standard, nor has it been evaluated to comply with any industry standard or specification such as IEEE. You can use the symbols in whole or in part, and modify or improve upon them as you see fit. I ask only that if you have comments, improvements, or constructive criticisms, you send those to me by e-mail so that I can evaluate them for inclusion in any future documents about this technique.
UCML's basic symbols serve to represent possible user paths through the system. You can also think of a user path as an application session. One of the critical concepts of UCML is that it represents system usage during a specific, finite period of time -- often an hour. The fact that user paths are the basis of UCML is part of what makes it powerful, because all members of the project team already have an understanding of how users navigate through the system.
The quantity circle is used to answer the questions "How many?" and "How often?" Inside the circle is a number or a percentage, as illustrated in Figure 1.
A number is used when an exact quantity is being modeled (for instance, one administrator). A percentage is used when a portion of the entire population of the user community, regardless of the size of that population, is being modeled.
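The distinction between the two kinds of quantity can be sketched in a few lines of Python. This is only an illustration, not part of UCML itself: the `Quantity` class, its field names, and the population of 500 users are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quantity:
    """Contents of a quantity circle (hypothetical representation)."""
    value: float
    is_percentage: bool

    def users(self, population: int) -> int:
        """Resolve the circle to a whole number of modeled users."""
        if self.is_percentage:
            # A percentage scales with the size of the user community.
            return round(population * self.value / 100)
        # An exact number is independent of the population size.
        return int(self.value)


one_admin = Quantity(1, is_percentage=False)
basic_users = Quantity(20, is_percentage=True)

print(one_admin.users(500))    # exact quantity: stays at 1
print(basic_users.users(500))  # 20% of an assumed 500-user community
```

Changing the assumed population changes only the percentage-based result, which is the reason for using a percentage when the size of the community isn't fixed.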
Description lines are solid horizontal lines that represent activities and user types. Each description line is labeled with one or more descriptors indicating the user type, activity, or navigation path represented. Enclosed in parentheses after the descriptor is the percentage of total users of that type, or the percentage of users performing that activity or following that navigation path over the period of time represented by the user community model.
Figure 2 shows a user type description line, which appears at the far left side of the model and represents the type(s) of user(s) that will access the application being modeled -- for example, basic user, power user, or administrator. The percentage associated with the user type indicates the proportion of all users accessing the system that this type represents.
Activity type description line
An activity type description line represents the activity or navigation path of each type of user through the system. The user actions/activities completed, or the pages viewed on the navigation path, are listed above the line. It's the modeler's choice whether to list each step of a particular activity or simply name the activity itself. The key point to note is that each horizontal line represents a straight-line path (with no diversion) through that activity, as illustrated in Figure 3.
This line may branch or merge (see below), representing a choice of navigation paths through the system. The percentage of users choosing a particular navigation path is indicated by a quantity circle at the beginning of the path. At any given point in the model, the aggregate of users on all navigation paths is 100% of the users visiting the site.
Figure 4 presents a vertical dashed line representing a synchronization (sync) point in the model. If the synchronization point is named, the name is shown above the line.
Synchronization points are used to depict convergence. There are two types of convergence that can be represented by the synchronization point symbol -- convergence in time, or in navigational location.
- A convergence in time means that the modeled user that reaches the symbolic sync point first will wait for all other modeled users in the system to also reach that point before proceeding. This is often useful when modeling messaging systems.
- A convergence in navigational location means that the modeled users all navigate past the same place in the application but not necessarily at the same time. A common example is the login page of a Web site. In that example, all users must enter their credentials on the same page before being permitted to see the other pages. Identifying these navigational synchronization points often becomes useful when developing scripts, as they can serve as common start/stop points for scripts, scriptlets, functions, procedures, split scripts, and so on.
Some people find it useful to annotate the type of synchronization point parenthetically after the name -- for instance, Sync Point 1 (time) or Home Page (nav).
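For readers who script their own load, a convergence in time behaves much like a barrier. The following Python sketch is one possible way to express that idea, assuming each modeled user runs in its own thread; the function name `run_with_sync_point` and the event log are hypothetical, and real load-generation tools provide their own mechanisms for this.

```python
import threading


def run_with_sync_point(user_count: int) -> list[str]:
    """Run modeled users that all wait at one time-based sync point."""
    barrier = threading.Barrier(user_count)  # the sync point (time)
    events: list[str] = []
    lock = threading.Lock()

    def modeled_user(user_id: int) -> None:
        with lock:
            events.append(f"arrive {user_id}")
        # Each user blocks here until every other user has arrived.
        barrier.wait()
        with lock:
            events.append(f"proceed {user_id}")

    threads = [
        threading.Thread(target=modeled_user, args=(i,))
        for i in range(user_count)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return events


for event in run_with_sync_point(3):
    print(event)
```

Every "arrive" entry is logged before any "proceed" entry, although the order of users within each group can vary from run to run. A navigational sync point needs no such mechanism, because users pass through it independently.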
As users navigate through a system, they're often presented with an option or asked to provide data to the system that doesn't change their overall activity. These options are represented in the model by a dashed box suspended below the navigation path, demonstrated in Figure 5.
The option box should be labeled with a word or phrase that describes the option or data that varies from user to user. The specific data will be recorded in the spreadsheet below the model, described later in the section Providing Supporting Information.
Conditions represent points in the model where users will change their navigation path based on the results displayed to them. In Figure 6 below, imagine a user trying to purchase a book. If after searching for the book, the user finds it to be in stock, the user follows the Yes path and purchases the book. If, however, the book isn't in stock, the user follows the No path and abandons the system.
Figure 7 shows a dashed semicircle representing a loop on the navigation path, where the activity encircled by the semicircle is repeated. Either the number of iterations to be completed or the percentage of users to repeat the activity (not the percentage of all users represented by the model) is indicated in a quantity circle.
It's often important to model users exiting the system at points other than the end of a navigation path. A downward-curving line pointing to a quantity circle indicates the percentage of users leaving the system at a specific point on the path. Put simply, the model should show that all users who enter the system leave the system, as Figure 8 illustrates. The type of system exit should be noted by a label -- for instance, Log out (if the user leaves the system via the preferred means) or Abandon (if the user doesn't leave the system via the preferred means).
The following example depicts multiple exit paths from one navigation path. In Figure 9, 50% of the users are abandoning the system and 40% of the users are logging out and thus exiting the system in the preferred manner. (The other 10% are exiting from another navigation path, not shown.)
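The arithmetic behind this example can be checked mechanically. The sketch below is only an illustration: the helper name `unaccounted_exit_pct` is hypothetical, and the label "Other path" stands in for the exit that the figure doesn't show.

```python
def unaccounted_exit_pct(exit_pcts: dict[str, float],
                         entering_pct: float = 100.0) -> float:
    """Return the percentage of entering users with no modeled exit."""
    return entering_pct - sum(exit_pcts.values())


# Exits shown on the navigation path described above.
shown = {"Abandon": 50, "Log out": 40}
print(unaccounted_exit_pct(shown))  # users who must exit elsewhere

# Adding the exit on the other navigation path accounts for everyone.
complete = {**shown, "Other path": 10}
print(unaccounted_exit_pct(complete))
```

A nonzero result for the complete model would suggest that an exit is missing or that a percentage was entered incorrectly.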
Most applications aren't limited to straight-line navigation paths. A group of users may start out performing the same activity, for instance, only to branch off later into different navigation paths. If we think of an activity line as a road on which a car is traveling, then a branch represents an intersection where the driver can choose which direction to turn. The following example shows a single branch where one of two paths can be selected from the home page. The sum of the percentages of users on the branches must equal the percentage of users on the page leading into the branches. In Figure 10, the home page has 100% of the users and the sum of the percentages on the branches is 100%.
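The branching rule lends itself to a simple consistency check when a model is kept in electronic form. The following is a minimal sketch; the function name `branches_balance` and the 60/40 split are assumptions chosen for illustration, not values taken from the figure.

```python
import math


def branches_balance(parent_pct: float, branch_pcts: list[float]) -> bool:
    """Check that the branch percentages add up to the incoming percentage."""
    return math.isclose(sum(branch_pcts), parent_pct, abs_tol=1e-9)


# A home page carrying 100% of users, split across two hypothetical branches.
print(branches_balance(100, [60, 40]))
# A split that loses 10% of the users somewhere.
print(branches_balance(100, [60, 30]))
```

The same check applies in reverse to a merge, where the paths coming in play the role of the branches.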
A single description line can have any number of branches, as shown in Figure 11 below.
Applications with branching navigation paths often also have instances where different navigation paths merge back together. In keeping with the traffic analogy, this is similar to cars from different roads merging onto the same road. Figure 12 shows two different paths converging on the Update Billing Information activity. Once again, the total percentage of users before the merge must equal the total percentage of users after the merge.
Another common use of the merge is to show different user types entering an application at the same point, as Figure 13 demonstrates. In this case each different user type is represented by a different color. This is done so that later in the model, activities that are specific to a particular user group can be identified by color. Each user type is followed by a parenthetical abbreviation that can also be used later in the model. This is particularly useful for models that must be represented in black and white.
Combining symbols into a user community model
Combining the basic symbols is what gives UCML its power. You can form visual models of entire communities by combining the symbols to represent all the possible user navigation paths through a system. Using the traffic analogy again, you can envision the lines in the model as roads, with each modeled user as a car driving those roads. Think of the places where two or more lines meet as intersections, synchronization points as traffic signals, and exits as off-ramps. Quantity circles show how heavily a section of "road" is traveled, thus creating a "map" of how users will traverse the application.
In order to demonstrate this usage map, I'll build a UCML diagram for an online bookstore. Let's assume this is a new application so we have no existing usage data to analyze. Through a series of interviews we've collected the following information about the activities users conduct on the site, along with their anticipated relative volume:
- There will be four user types:
  - New Users (20%)
  - Members (70%)
  - Administrators (4%)
  - Vendors (6%)
- All user types enter through the home page.
- New Users and Members can conduct the following activities:
  - Search books by
    - Title
    - Author
    - Keyword
  - Add one or more books to their cart
  - Save their cart to order later
- New Users will also be able to choose from these activities:
  - Create an account
  - Order items (only after creating an account and becoming members)
- Members will also be able to choose from these activities:
  - Log in
  - Update their account
  - Order items
  - Check order status
- Administrators and Vendors must log in from the home page and then start on the administration page.
- Administrators can choose from the following activities:
  - Add new books
  - Check order status
  - Update order status
  - Cancel orders
- Vendors can run the following reports:
  - In stock
  - Sales last week
  - Sales last month
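When it comes time to implement a test, the user-type percentages from the interview summary above have to be turned into whole numbers of simulated users. The sketch below shows one way to do that, using largest-remainder rounding so that the counts always add up to the total. The function name `allocate_users` and the totals of 250 and 33 users are assumptions for illustration; the rounding method is my choice here, not something UCML prescribes.

```python
def allocate_users(mix_pct: dict[str, float], total_users: int) -> dict[str, int]:
    """Split total_users across user types according to their percentages."""
    if abs(sum(mix_pct.values()) - 100) > 1e-9:
        raise ValueError("user type percentages must total 100%")
    exact = {name: total_users * pct / 100 for name, pct in mix_pct.items()}
    # Start from the whole-number part of each share.
    counts = {name: int(share) for name, share in exact.items()}
    leftover = total_users - sum(counts.values())
    # Hand out any remaining users to the largest fractional remainders.
    by_remainder = sorted(
        exact, key=lambda name: exact[name] - counts[name], reverse=True
    )
    for name in by_remainder[:leftover]:
        counts[name] += 1
    return counts


bookstore_mix = {
    "New Users": 20,
    "Members": 70,
    "Administrators": 4,
    "Vendors": 6,
}

print(allocate_users(bookstore_mix, 250))
print(allocate_users(bookstore_mix, 33))
```

With 250 users the percentages divide evenly. With 33 they don't, and the two leftover users go to the types whose exact shares were closest to the next whole number.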
The UCML diagram in Figure 14 takes a first cut at representing this information and uses all of the UCML symbols.
Figure 14: UCML diagram for an online bookstore
It's obvious that this diagram contains information not mentioned in the interview summary. For the purposes of this first draft, we can estimate the information. In some cases, the first draft will contain notes to reviewers or question marks to indicate that more information is needed. A close look will also reveal that not all information from the interview summary is included. This is also typical for a first draft. The diagram can now be circulated back to the people who were interviewed so that they can validate or correct navigation paths and user volumes, and add or remove activities.
Providing supporting information
Unfortunately, the UCML diagram alone can't provide all of the information required to implement the depicted workload. To fully implement the user community model, several more pieces of information are needed. This information includes how long users may spend on a page, what data may need to be entered on that page, and what the actual conditions are behind a condition symbol. The information itself is mostly useless without a model, and the model can't be implemented without this supporting information.
Instead of trying to cram all of this information onto the diagram, it makes sense to supplement the diagram with a spreadsheet. This spreadsheet organizes supporting information by model component (activity, sync point, condition, and so on) so that it's easy to relate it back to the model. Figure 15 is a subset of the spreadsheet that contains the supporting information for the diagram in Figure 14. You can download the template for this spreadsheet if you want. The nature of much of this information -- such as user think time, abandonment, and data variance -- is discussed in more detail in the "User Experience, Not Metrics" and "Beyond Performance Testing" series.
Figure 15: Part of the spreadsheet that supports the UCML model in Figure 14
If done carefully, the UCML diagram and supporting spreadsheet are all you need in order to plan, design, and document any given workload distribution. Together with the actual data files, they also provide all of the information you'll need to implement the workload distribution using your load-generation tool.
The User Community Modeling Language (UCML) is a simple yet powerful way to visually depict the workload distribution models and associated data necessary to create an effective performance test. The models created using UCML are easily understandable by users, managers, analysts, developers, and testers alike with little or no explanation. This makes conversations easier and results in more accurate user models than simply presenting complex spreadsheets of data, thus increasing your confidence that your performance tests will accurately predict performance in production.
Thanks go to Nathan White for sharing with me the UCML specification he wrote for his company. Many of the examples in this article are largely unmodified from his specification. Thanks, Nate -- I probably never would have formally defined UCML if not for your assistance.
Currently, Scott Barber serves as the lead Systems Test Engineer for AuthenTec. AuthenTec is the leading semiconductor provider of fingerprint sensors for PCs, wireless devices, PDAs, embedded access control devices and automotive markets. He is also a member of the Technical Advisory Board for Stanley-Reid Consulting, Inc.
With a background in consulting, training, network architecture, systems design, database design and administration, programming, and management, Scott has become a recognized thought leader in the field of performance testing and analysis. Before joining AuthenTec, he was a software testing consultant, a company commander in the United States Army and a government contractor in the transportation industry.
Scott is a co-founder of WOPR (the Workshop on Performance and Reliability), a semi-annual gathering of performance testing experts from around the world, a member of the Context-Driven School of Software Testing and a signatory of the Agile Manifesto. He is a discussion facilitator for the Performance and VU Testing forum on Rational DeveloperWorks and a moderator for the performance testing and Rational TestStudio related forums on QAForums.com. Scott speaks regularly at a variety of venues about relevant and timely testing topics. Scott's Web site complements this series and contains much of the rest of his public work. You can address questions/comments to him on either forum or contact him directly via e-mail.