Evolutionary architecture and emergent design

Harvesting idiomatic patterns

Tying together emergent design techniques to find and harvest idiomatic patterns


In the first installment of this series, "Investigating architecture and design," I asserted that every project of any significant size includes design elements that no one ever anticipated. When you get into the details of a problem, it is quite common to discover that things you thought would be hard are easier than expected, and vice versa. Subsequent installments demonstrated techniques for uncovering hidden but interesting design elements. In this article, I tie those ideas together and present an extended case study, using tools and approaches to discover a neglected but important part of a code base.

I introduced the concept of idiomatic patterns in "Composed method and SLAP." In contrast to the formal Design Patterns popularized by the Gang of Four's Design Patterns book (see Related topics), idiomatic patterns don't apply across all projects. But they are pervasive, representing common design idioms in your code. They can range from purely technical patterns (for example, the way a project handles transactions) to problem-domain patterns (such as "always check the customer's credit before processing an order"). Discovering these patterns is the key to emergent design.

Proponents of Big Design Up Front design methodologies spend a lot of time before coding starts trying to determine all the required design elements for the application at hand. Most of what they document is important to the solution's overall design. However, implementing software uncovers surprises. Every design element you implement is coupled to other design elements, creating an extremely complex web of dependencies and relationships. Aspects of the code that you think are trivial will magnify in complexity once you implement all the other required parts of the system. Inability to understand the complex interactions of disparate design elements in code leads to enormous difficulties in estimating the effort required to complete a solution. Estimation is still somewhat of a black art in software precisely because it's difficult to understand, and therefore analyze, this complex spider web of coupling and interaction.

Agile methodologies that rely on emergent design try a different approach. Agile architecture and design don't eschew design before coding, but their practitioners have figured out that you cannot understand the full extent of the problem until you've implemented a nontrivial portion of the entire thing. Developing skills in emergent design allows you to defer decisions until you have more context. The lean software movement (see Related topics) has a great concept called the last responsible moment: defer decisions not until the last moment, but until the last responsible moment. The longer you can put off design decisions, the more information you have, enabling a more nuanced and contextualized decision.

Harvesting idiomatic patterns

Emergent design implies the ability to find design elements in existing code. You can think about those elements as effective abstractions with a reuse potential. One technique for harvesting those idiomatic patterns uses a combination of metrics. To illustrate this technique, I'm going to use (as I have in previous installments) the Apache Struts code base (see Related topics). I'm using Struts not because I think it has deficiencies (actually just the opposite), but because it is well-known and open source. I contend that every code base includes idiomatic patterns, so any project would do.

Using metrics, redux

In "Emergent design through metrics," I discussed using metrics to uncover interesting parts of an unfamiliar code base as the target for refactoring to improve design. I used two metrics: cyclomatic complexity and afferent coupling. Cyclomatic complexity is purely a measure of the relative complexity of one method versus another. Thus, it makes sense only when compared to other cyclomatic-complexity measures. However, it is reasonable to state that a method with a lower cyclomatic complexity is generally less complex. Afferent coupling is the count of other classes that reference the current class, via either fields or parameters. I use the ckjm metrics tool (see Related topics) to gather these numbers on the Struts code base.
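To make the afferent-coupling definition concrete, here is a minimal Java sketch that counts incoming references from a hand-written dependency map. The class AfferentCouplingSketch, its method name, and the dependency data are all hypothetical; this illustrates the definition only and is not how ckjm computes the metric.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Illustrative only: counts afferent coupling (Ca) from a hand-written dependency map. */
final class AfferentCouplingSketch {

    /**
     * @param references maps each class name to the set of classes it references
     * @return for every class mentioned, how many other classes reference it
     */
    static Map<String, Integer> afferentCoupling(Map<String, Set<String>> references) {
        Map<String, Integer> ca = new TreeMap<>();
        // Start every known class at zero so unused classes still appear in the result.
        references.keySet().forEach(name -> ca.putIfAbsent(name, 0));
        for (Map.Entry<String, Set<String>> entry : references.entrySet()) {
            for (String target : entry.getValue()) {
                if (!target.equals(entry.getKey())) { // a class is not its own client
                    ca.merge(target, 1, Integer::sum);
                }
            }
        }
        return ca;
    }

    public static void main(String[] args) {
        // Hypothetical dependency data, not measured from Struts.
        Map<String, Set<String>> references = Map.of(
                "Form", Set.of("Component", "UIBean"),
                "TextField", Set.of("Component", "UIBean"),
                "UIBean", Set.of("Component"),
                "Component", Set.<String>of());
        System.out.println(afferentCoupling(references));
    }
}
```

In this made-up data set, Component has the highest count because every other class references it, which is the kind of class a change would ripple out from.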

Running these two metrics against the Struts 2 code base produces the table in Figure 1, showing only the two metrics in question:

Figure 1. ckjm metrics results in a table
Tabular view of ckjm metrics results

Figure 2 shows the same table, sorted by Weighted Methods per Class (WMC):

Figure 2. ckjm metrics, sorted by WMC
ckjm results, sorted by WMC, shows the DoubleListUIBean class is the most complex class in the Struts code base.

Just by looking at this result, you can tell that the DoubleListUIBean class is the most complex class in the Struts code base. That suggests that it is a good candidate for refactoring to remove some of the complexity and see if you can find some abstractable, repeating patterns. However, the WMC number doesn't tell you whether investing in refactoring this class toward better design is a good use of time. Notice the Ca (afferent coupling) metric for this class, which has a value of 3. That means that only three other classes use this class. It might not be worth investing lots of time to improve this class's design.

Figure 3 shows the same ckjm results, sorted this time by Ca:

Figure 3. ckjm results, sorted by Ca (afferent coupling)
ckjm results, sorted by Ca, shows that the most-used class in Struts is Component

This combined view shows that the most-used class in Struts is Component (not surprising, given that Struts is a Web framework). While Component isn't as complex as DoubleListUIBean, it is used by 177 other classes, which makes it a good candidate for design improvements. Making the design of Component better has a ripple effect on a large number of other classes.

The view shown in Figure 3 allows you to see the complexity and number of references side by side. To find classes that have design challenges, look for high combinations of numbers (implying a complex class used by many other classes). My prime candidate for investigation is the UIBean class, which has a cyclomatic complexity of 53 and afferent coupling of 22. This is a complex class used by many other classes, so I'm going to investigate it further.
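Looking for "high combinations" can be automated. The following Java sketch ranks classes by the product of the two numbers. The product is my assumed scoring rule, not something ckjm provides, and the class RefactoringCandidates and the two "Hypothetical" entries are invented for illustration. The UIBean and WebTable values are the ones quoted in this article.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Illustrative only: ranks classes by a combined complexity-and-usage score. */
final class RefactoringCandidates {

    static final class ClassMetrics {
        final String name;
        final int wmc; // summed method complexity for the class
        final int ca;  // afferent coupling

        ClassMetrics(String name, int wmc, int ca) {
            this.name = name;
            this.wmc = wmc;
            this.ca = ca;
        }

        /** Assumed scoring rule: the product rewards classes that are high on both axes. */
        int score() {
            return wmc * ca;
        }
    }

    /** Returns a new list ordered from highest to lowest score. */
    static List<ClassMetrics> rank(List<ClassMetrics> metrics) {
        List<ClassMetrics> sorted = new ArrayList<>(metrics);
        Comparator<ClassMetrics> byScore = Comparator.comparingInt(ClassMetrics::score);
        sorted.sort(byScore.reversed());
        return sorted;
    }

    public static void main(String[] args) {
        List<ClassMetrics> metrics = List.of(
                new ClassMetrics("UIBean", 53, 22),            // values quoted in this article
                new ClassMetrics("WebTable", 33, 12),          // values quoted in this article
                new ClassMetrics("HypotheticalHelper", 60, 1), // made up: complex but rarely used
                new ClassMetrics("HypotheticalBase", 2, 90));  // made up: widely used but simple
        for (ClassMetrics m : rank(metrics)) {
            System.out.println(m.name + " " + m.score());
        }
    }
}
```

A product is only one way to combine the metrics. It pushes classes that score high on a single axis down the list, which matches the reasoning applied to DoubleListUIBean earlier.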

The cyclomatic complexity number ckjm reports represents the sum of the complexity of all the methods in the class. I want to determine what makes this class so complex, so I need individual complexity numbers for methods. Running JavaNCSS, an open source cyclomatic-complexity tool (see Related topics), on this single class yields the results shown in Figure 4:

Figure 4. Individual complexity numbers for the UIBean class
method complexity values, shows the most complex method is evaluateParams(), with a complexity of 43

By far, the most complex method is evaluateParams(), with a complexity of 43 (and also the lion's share of lines of code). This method apparently handles the common case of extra parameters passed as part of the request to the Struts controller, dispatching the parameter types to actual Struts classes and components. Much structural duplication exists in this code, as shown in Listing 1:

Listing 1. Partial contents of the evaluateParams() method showing structural duplication
if (label != null) {
    addParameter("label", findString(label));
}

if (labelPosition != null) {
    addParameter("labelposition", findString(labelPosition));
}

if (requiredposition != null) {
    addParameter("requiredposition", findString(requiredposition));
}

if (required != null) {
    addParameter("required", findValue(required, Boolean.class));
}

if (disabled != null) {
    addParameter("disabled", findValue(disabled, Boolean.class));
}

if (tabindex != null) {
    addParameter("tabindex", findString(tabindex));
}

if (onclick != null) {
    addParameter("onclick", findString(onclick));
}
// much more code elided for space considerations

This code is a candidate for improvement (see the upcoming section, Improving the code, part 1), but I want to poke around a little more as to the reason this code exists and perhaps why it contains so much complexity.

Looking at the other high combinations of cyclomatic complexity and afferent coupling, I find WebTable, which has values of 33 and 12, respectively. Running JavaNCSS on it confirms my suspicion: its second most complex method is evaluateExtraParams(). I see a pattern here! Seeing this repeated complex element in lots of different classes makes me suspect that a lot of accidental complexity around parameters exists, so I conduct an experiment. Using a bit of UNIX® command-line magic, I look to see how many classes in Struts have a method named either evaluateParams() or evaluateExtraParams():

find . -name "*.java" | xargs grep -l "void evaluate.*Params" | pbcopy

This command finds all the Java™ source files from the current directory downward, and for each found file it searches within the file for any method definition that starts with evaluate and ends with Params. The final pipe (|) sends the resulting file list to pbcopy, which places it on the clipboard (at least on the Mac). When I paste the results, I get a surprise:



All these classes have one or both of those methods in them! I have found an idiomatic pattern. Obviously, lots of classes in Struts need to override and customize the behavior of how parameters are handled, and all of these classes handle custom cases themselves. Now the question is: how do I make this better?
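If find, grep, and pbcopy are not available on your platform, the same search can be approximated in Java. This is a minimal sketch: the class name EvaluateParamsFinder is hypothetical, it reuses the regular expression from the command line, and it prints the matching files instead of copying them to the clipboard.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Illustrative only: lists .java files containing a line that matches evaluate...Params. */
final class EvaluateParamsFinder {

    // Same regular expression the grep command uses.
    private static final Pattern METHOD = Pattern.compile("void evaluate.*Params");

    /** Walks the tree under root and returns the matching .java files in sorted order. */
    static List<Path> find(Path root) throws IOException {
        try (Stream<Path> paths = Files.walk(root)) {
            return paths
                    .filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString().endsWith(".java"))
                    .filter(EvaluateParamsFinder::containsMatch)
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    /** Returns true when at least one line of the file matches the pattern. */
    private static boolean containsMatch(Path file) {
        try (Stream<String> lines = Files.lines(file)) {
            return lines.anyMatch(line -> METHOD.matcher(line).find());
        } catch (IOException | UncheckedIOException e) {
            return false; // unreadable or non-UTF-8 files are skipped in this sketch
        }
    }

    public static void main(String[] args) throws IOException {
        Path root = Path.of(args.length > 0 ? args[0] : ".");
        find(root).forEach(System.out::println);
    }
}
```

Like grep -l, the sketch reports each file once no matter how many matching lines it contains.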

Improving the code, part 1

In UIBean's evaluateParams() method, you see lots of the variety of structural duplication that one of my colleagues calls "same white space, different values." In other words, the structure is the same with substitutions for different class or variable names. This represents a code smell because you have essentially copied-and-pasted code throughout your application with minor variations.

A common technique for fixing structural duplication uses metaprogramming to encapsulate the repeated structure in a single place. Using reflection to supply different required values, Listing 2 shows a new method and an improved prelude to the evaluateParams() method:

Listing 2. Metaprogrammatically removed structural duplication
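// requires: import java.lang.reflect.Field;
protected void handleDefaultParameters(final String paramName) {
  try {
      // getDeclaredField() also finds non-public fields; getField() sees only public ones
      Field f = UIBean.class.getDeclaredField(paramName);
      Object value = f.get(this);
      if (value != null)
          addParameter(paramName, findString((String) value));
  } catch (Exception e) {
      // keep the original exception as the cause
      throw new RuntimeException(e);
  }
}

public void evaluateParams() {

  addParameter("templateDir", getTemplateDir());
  addParameter("theme", getTheme());

  // each name must match a field name exactly (Listing 1 uses requiredposition);
  // the field name also becomes the parameter key
  String[] defaultParameters = new String[] {"label", "labelPosition", "requiredposition",
      "tabindex", "onclick", "ondoubleclick", "onmousedown", "onmouseup", "onmouseover", 
      "onmousemove", "onmouseout", "onfocus", "onblur", "onkeypress", "onkeydown", 
      "onkeyup", "onselect", "onchange", "accesskey", "cssClass", "cssStyle", "title"};

  for (String s : defaultParameters)
      handleDefaultParameters(s);

  // remainder of the method elided
}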

The handleDefaultParameters() method in Listing 2 encapsulates the repeated structure from the original into a single if statement. It accepts a parameter that specifies the Struts parameter name and uses reflection to grab the appropriate field programmatically. Then, it does the null check from the original code, finally calling the Struts addParameter() method.

Once I have the handleDefaultParameters() method, I can significantly reduce the number of code lines (and the cyclomatic complexity) of the original. I create an array of Strings holding the applicable Struts parameter names and iterate over that array, calling the handleDefaultParameters() method on each one.
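Because the listing above depends on the rest of UIBean, it cannot run on its own. The following self-contained Java sketch shows the same reflection technique on a stand-in class. The class ReflectiveParameters, its three fields, and the plain Map that replaces addParameter() are all assumptions made for illustration.

```java
import java.lang.reflect.Field;
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative only: a stand-in for UIBean that replaces repeated null checks with reflection. */
final class ReflectiveParameters {

    // Hypothetical fields standing in for the tag attributes on UIBean.
    String label = "Name";
    String tabindex;            // left null on purpose
    String onclick = "save()";

    private final Map<String, Object> parameters = new LinkedHashMap<>();

    /** Adds the named field's value as a parameter, but only when the field is non-null. */
    void handleDefaultParameter(String fieldName) {
        try {
            Field field = ReflectiveParameters.class.getDeclaredField(fieldName);
            Object value = field.get(this);
            if (value != null) {
                parameters.put(fieldName, value);
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("No readable field named " + fieldName, e);
        }
    }

    /** One loop replaces one if block per attribute. */
    Map<String, Object> evaluateParams() {
        for (String name : new String[] {"label", "tabindex", "onclick"}) {
            handleDefaultParameter(name);
        }
        return parameters;
    }

    public static void main(String[] args) {
        // prints {label=Name, onclick=save()}
        System.out.println(new ReflectiveParameters().evaluateParams());
    }
}
```

One trade-off is worth noting: a misspelled name in the array fails at run time rather than at compile time, so a test that exercises the full list is a sensible companion to this style.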

By consolidating all the parameter checking into a concise location, I've done more than cut down on the method's size. The original method had a cyclomatic complexity of 43. Each of the former if blocks took 3 lines of code (and contributed 1 cyclomatic complexity point). I removed the duplication with a single 9-line method (with a cyclomatic complexity of 4) and eliminated 66 lines of code (22 parameters x 3 lines each). That means that this simple change removed a net 57 lines of code from this class (66 lines eliminated - 9 lines added) and dropped the cyclomatic complexity by a net 18 points (22 parameters x 1 CC point each - 4 CC points for the new method). For such a small change, I greatly improved the application's readability, metrics, size, and maintainability. If in the future I need to change the way I call the Struts addParameter() method, I can do it in one place.

This is a short-term fix, but I show it to illustrate how simple changes can have a profound effect on the cleanliness of code. However, if this were my code base, I would put a longer-term solution in place.

Improving the code, part 2

If this were my project, I would abstract the entire parameter-handling mechanism to its own set of classes, essentially building a subframework within Struts. The complexity of the code for handling parameters, as well as its pervasiveness and quantity, suggests that it should be treated as a first-class citizen within Struts. Doing so is beyond the scope of a single article, but you can see that a huge amount of the complexity of Struts (based on the metrics) revolves around this problem.
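To give a rough idea of the direction, here is one possible shape for the smallest piece of such a subframework. Everything in it is hypothetical: the ParameterBinder class, its bind() and evaluate() methods, and the use of suppliers are my assumptions for illustration, not part of Struts and not a worked-out design. The point is only that the "add it if it is non-null" rule can live in one class instead of in every evaluateParams() and evaluateExtraParams() method.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

/** Illustrative only: a hypothetical registry that owns the "add if non-null" rule. */
final class ParameterBinder {

    private static final class Binding {
        final String name;
        final Supplier<?> value;

        Binding(String name, Supplier<?> value) {
            this.name = name;
            this.value = value;
        }
    }

    private final List<Binding> bindings = new ArrayList<>();

    /** Registers a parameter; returns this so registrations can be chained. */
    ParameterBinder bind(String name, Supplier<?> value) {
        bindings.add(new Binding(name, value));
        return this;
    }

    /** Evaluates every binding, keeping only non-null values, in registration order. */
    Map<String, Object> evaluate() {
        Map<String, Object> parameters = new LinkedHashMap<>();
        for (Binding binding : bindings) {
            Object value = binding.value.get();
            if (value != null) {
                parameters.put(binding.name, value);
            }
        }
        return parameters;
    }

    public static void main(String[] args) {
        String label = "Name";
        String tabindex = null;
        Map<String, Object> parameters = new ParameterBinder()
                .bind("label", () -> label)
                .bind("tabindex", () -> tabindex)
                .evaluate();
        System.out.println(parameters); // prints {label=Name}
    }
}
```

With something like this in place, a component would declare its parameters once and leave the evaluation policy to the binder. A real design would also have to cover type conversion and expression evaluation, which this sketch ignores.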

Emergent design and idiomatic patterns

Do you think that the original designers of Struts ever dreamed how much code would be required to handle parameters? Software is like that. You can sometimes predict complexity based on speculative knowledge of the problem domain, but writing code creates new constraints and opportunities that are virtually impossible to predict. In fact, senior developers don't get that much better at predicting the hard stuff; they get better at guessing that mysterious hard stuff will eventually rear its head.

Part of the appeal of emergent design is the realization that we cannot reliably predict what is going to be tough, but we should keep a wary eye out for it. If you look at a code base with the expectation that you'll find abstractions and patterns, they get easier to see.

I finish with a case study, based on a ThoughtWorks project I've worked on intermittently. Very early in this large Ruby on Rails project, the tech lead realized that we needed asynchronous behavior in a few isolated cases (for example, when uploading a large number of images, the user wanted to be able to leave the page and come back later for status). If we'd had a Big Design Up Front mentality, we would have immediately gone for a message queue. And at the outset of the project, when we couldn't know all the things that would require asynchronicity, the default position would have been to acquire the most elaborate message queue we could find, to ensure that it could handle future requirements. But the tech lead intelligently did not do that. He decided that what we had was good enough for the situation at hand.

Fast-forward two years. By this point, the application had three distinct asynchronous behaviors, and the current solution started to become a bottleneck. Now it was time to get a message queue. But because the tech lead delayed the decision so long, we knew exactly what this application needed in terms of messaging, allowing us to get the simplest tool that did the job. By waiting until the last responsible moment, we saved years of working around accidental complexity imposed by a tool that was more elaborate than we needed — yielding cleaner code, higher velocity for new features, and less annoying stuff to work around.

Allowing the code to lead you toward design means that you have a better understanding of what you need. The longer you can defer design decisions, the clearer the path once it comes time to commit to a decision that will have long-term implications.


Much of this article series has been loading context into your head so that the real benefits can appear. This installment ties together virtually every previous article in the series, leveraging the techniques, tools, and attitudes presented. Emergent design requires the ability to see and harvest idiomatic patterns and abstractions, and tools and techniques to take advantage of those things when they appear. Design up front helps you discover the important parts (and agile projects do enough design up front to determine those things), but keeping eyes and mind open once you start coding will lead you to surprisingly important design elements. Every code base has idiomatic patterns; you just have to learn to see them and act.

I've mostly been covering design and related concerns in the last few installments. Next time, I'll delve back into evolutionary architecture and show some common concerns and solutions that arise when agile development techniques are married with architectural concepts.

Related topics

  • The Productive Programmer (Neal Ford, O'Reilly Media, 2008): Neal Ford's most recent book expands on a number of the topics in this series.
  • Design Patterns: Elements of Reusable Object-Oriented Software (Erich Gamma et al., Addison-Wesley, 1994): The Gang of Four's classic work on design patterns.
  • Lean software development: This Wikipedia article provides a good overview of the lean software movement, which borrows techniques from lean manufacturing.
  • Mary Poppendieck: Poppendieck is a well-known pioneer in the lean software development movement. Her Web site includes numerous resources highlighting this approach.
  • Apache Struts: This article's examples use the code base for Struts, a popular open source Web framework for Java.
  • ckjm: ckjm is an open source tool for generating Chidamber/Kemerer object-oriented metrics suite results for Java code.
  • JavaNCSS: JavaNCSS is an open source metrics tool that provides method-level values for cyclomatic complexity.