Way back in the very first installment of this series, I suggested some definitions for architecture in the software world. However, if you've been reading this whole series (and if you're not my mother, I thank you for that!), you've noticed that I've spent most of my time on design. I've done so for a couple of reasons. First, many definitions of architecture exist in the software world (for better or worse), while emergent design currently enjoys less fame. Second, many of the concerns in design have concrete, less-contextualized solutions. Architecture always involves a lot of coupling with physical and logical infrastructure within organizations, making it harder to talk about in isolation.
This installment rectifies the lack of material about agile architecture. Here I talk about how to distinguish architecture from design, cover some broad architectural considerations, then dip a toe into the agile service-oriented architecture (SOA) space with a discussion about versioning endpoints.
Distinguishing architecture from design
Martin Fowler's definition of architecture (from conversations with him) is my favorite:
Architecture is the stuff that's hard to change later. And there should be as little of that stuff as possible.
You can think of the interaction between architecture and design as the relationship shown in Figure 1:
Figure 1. Relationship between architecture and design
The architecture of a software system forms the foundation upon which everything else sits, represented in Figure 1 as gray boxes. Design elements sit atop the architecture, as shown in red boxes. Being more foundational, architectural elements are harder to move around and replace because you'll have to move all the things on top of them to accommodate the changes. This distinction makes it easier to identify design vs. architecture. For example, the Web framework you use is an architectural element because it is hard to replace. Within that Web framework, though, you can use different design patterns to express specific goals, which suggests that most of the formal design patterns are indeed part of design rather than architecture.
The corollary to Fowler's definition of architecture is that you should construct the architectural elements so that they become easier to replace if you really need to. But how can you ensure that? Here is an example.
Lots of frameworks try to seduce you into using some of their classes rather than the more general ones that come either with the JDK or from an open-standards body (such as OASIS). This is the crack-dealer model of coupling; if you yield to the temptation, you are coupled to the framework forever. The general approach these frameworks take is to make something significantly easier if you use their classes. A perfect example of this comes from the Apache Struts Web framework (see Resources).
The classes within your application that include business rules and other noninfrastructural code are domain classes: they hold the interesting information about your problem domain. One of the cool helper classes included within Struts is the ActionForm class. If you inherit your domain objects from ActionForm, all sorts of magic happens in your application. You get automatic population of form fields from parameters, automatic validation (at both the Web and server tiers), and other handy stuff. All you have to do is subclass the Struts ActionForm class, as shown in Figure 2:
Figure 2. Using the Struts ActionForm class
In Figure 2, the box labeled Model includes your domain object. It extends Struts's ActionForm, making this structure hard to change later. If, at some point in the future, you decide that the ScheduleItem also needs to work within a Swing application, you're sunk. You are left with two equally unpalatable solutions: drag all of Struts into the Swing application (and don't use it) or untangle the dependency to Struts.
The better alternative uses composition rather than inheritance, as illustrated in Figure 3:
Figure 3. Decoupling your domain class through composition
In this version, the domain class (in yellow) includes an interface that defines the semantics of a schedule item. The original ScheduleItem implements this interface, which is also implemented by the ScheduleItemForm, forcing the semantics of the two classes always to agree. The ScheduleItemForm in turn owns an instance of the ScheduleItem domain object, and all of ScheduleItemForm's accessors and mutators pass through to the underlying accessors and mutators of the encapsulated ScheduleItem. This allows you to take advantage of Struts's cool features while keeping you decoupled from the framework.
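The composition arrangement can be sketched in a few lines of Java. This is a minimal illustration, not the code behind the figure: the interface name Schedulable and the event and duration properties are assumptions, and the empty ActionForm class is a hypothetical stand-in for org.apache.struts.action.ActionForm so that the sketch compiles without Struts on the classpath.

```java
// Hypothetical stand-in for org.apache.struts.action.ActionForm, used only so
// this sketch compiles on its own.
abstract class ActionForm {}

// Interface that defines the semantics of a schedule item.
// The name and the two properties are assumptions for illustration.
interface Schedulable {
    String getEvent();
    void setEvent(String event);
    int getDuration();
    void setDuration(int duration);
}

// Plain domain class: it knows nothing about the framework, so it can be
// reused from a Swing application or anywhere else.
class ScheduleItem implements Schedulable {
    private String event;
    private int duration;

    @Override public String getEvent() { return event; }
    @Override public void setEvent(String event) { this.event = event; }
    @Override public int getDuration() { return duration; }
    @Override public void setDuration(int duration) { this.duration = duration; }
}

// Framework-facing class: it extends the framework type, owns a ScheduleItem,
// and passes every accessor and mutator through to it.
class ScheduleItemForm extends ActionForm implements Schedulable {
    private final ScheduleItem item;

    ScheduleItemForm() { this(new ScheduleItem()); }
    ScheduleItemForm(ScheduleItem item) { this.item = item; }

    // Hands the framework-free domain object to the rest of the application.
    ScheduleItem getScheduleItem() { return item; }

    @Override public String getEvent() { return item.getEvent(); }
    @Override public void setEvent(String event) { item.setEvent(event); }
    @Override public int getDuration() { return item.getDuration(); }
    @Override public void setDuration(int duration) { item.setDuration(duration); }
}
```

Only ScheduleItemForm mentions the framework type. Because both classes implement the same interface, the compiler flags any drift between the form and the domain object.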
The rule of thumb is this: it is okay for the framework to know about you, but it's not okay for you to know about the framework. As long as you can maintain that relationship, you prevent coupling your code to the infrastructure, allowing you to make changes to both architecture and design more easily. It sometimes takes a bit more work to do this, but you end up with improved flexibility. Struts isn't the only framework to offer these tempting affordances. Virtually every framework includes some helpers that will couple you to the framework. If you ever find yourself importing packages from a framework or vendor in your domain classes, you are probably causing yourself a future headache.
Some architectural considerations
Beyond the definition of architecture, a wide variety of concerns arise in typical enterprise settings. I'll cover agile architectural approaches to a few of them here.
Politics of architecture
Corporate politics is one of the first rude awakenings you encounter when promoted to an architectural position. Because architect is generally the highest technical position within companies, you become the spokesperson (and defender) of all the decisions happening in the IT department, for better or worse. Actually, you generally get blamed for the bad things and get no credit for the good ones. Some burgeoning architects try to ignore this (which seemed to work pretty well while you were in the technical trenches), but it will no longer work at your new position.
Remember that communication is more important than technology in most software projects. If you've ever been on a failed software project, consider the reasons why it failed. Was it because of a technology reason or because of some communication problem? Most of the time, it's communication rather than technology. Technological problems have solutions. (Sometimes they are hard solutions, but they always have a solution.) Social problems are much stickier and harder to resolve. One of the famous quotes from the book Peopleware (see Resources) is:
It's always a people problem.
Even for technology decisions that you think are cut and dried, politics will rear its head, especially if you find yourself in the position of approving purchases of enterprise tools. (On the bright side, you might get to go on an exotic golf outing, courtesy of one of the tool vendors.) Remember that, as an architect, you not only have to make important decisions; you must defend them too. Sometimes the people you are talking to have their own agendas that don't make logical sense but do make sense in the crucible of corporate politics. Don't get frustrated, and remember why you made the decision in the first place.
Build vs. buy
One of the common questions that arise in big companies is whether to build or buy: for the current requirements, should we buy COTS (Commercial Off-the-Shelf Software) or build it ourselves? The motivation for this decision is understandable: if the company can find some already written software that does exactly what's needed, it saves time and money. Unfortunately, lots of software vendors understand this desire and write packaged software that can be customized if it doesn't do exactly what the client needs. They are motivated to build the most generic software they can because it will potentially fit into more ecosystems. But the more general it is, the more customization is required. That's when an army of consultants shows up, sometimes taking years to get all the custom coding done.
The question of whether you should buy COTS really boils down to another question: is the business process supported by that software strategic or overhead? Buying COTS makes perfect sense if the business process in question is merely overhead. Examples of this type of software include human resources, financials, and other common business processes. Strategic software affords a competitive advantage in your field of business. That competitive advantage shouldn't be given away lightly.
The flowchart in Figure 4 is designed to help you decide between build and buy:
Figure 4. Decision flowchart for build vs. buy
In this flowchart, the first decision you must make revolves around the important distinction between strategic and overhead. If the need is strategic, you should always build the solution yourself. If you don't, you are purposely putting yourself on a level playing field with your competitors rather than building something exactly suited to your current and future needs. Package software touts its customizability, but there are limits to how much can be tailored. If you write your own, it takes longer, but you have a platform upon which you can build things that distinguish you from your competitors.
The second decision in the flowchart asks if the package software is immediately useful. A common trap in buying package software is to misunderstand exactly how long it will take to morph it into your business process; most companies misjudge this by an order of magnitude. The more you must customize it, the longer it will take. Even worse, some companies allow their business process to change to accommodate the software. This is a mistake because, for better or worse, your business process should be distinct from your competitors'.
The third step in the decision tree asks if the package is extensible as opposed to customizable. Extensible systems have well-defined ways to extend functionality without needing to hack anything into place. These extension points include well-defined APIs, SOAP calls, and the like. Customization implies that you must "cheat" to get the package to do something you want. For example, if you find yourself cracking open a WAR file so that you can replace the file named index.gif with a different image (which must be named index.gif), you are customizing, not extending. The litmus test is whether or not your changes have a fighting chance of surviving an upgrade. If so, you've extended the package; if not, you've customized it. Customization discourages you from keeping the package up to date because you realize how much effort is required to make the same changes to the new version. Thus, the tendency is not to upgrade, eventually leaving you four or five versions behind the latest, which puts you at risk of losing support for the ancient version you are using.
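The three decisions can be written down as a small function. Treat this as one plausible reading rather than a transcription of the flowchart: the text states only that a strategic need should always be built, so the outcomes attached to the second and third decisions (buy when the package is immediately useful or extensible, build otherwise) are assumptions, as are the names Decision, BuildOrBuy, and decide.

```java
enum Decision { BUILD, BUY }

final class BuildOrBuy {
    private BuildOrBuy() {}

    // Walks the three questions in order.
    static Decision decide(boolean strategic, boolean immediatelyUseful, boolean extensible) {
        // First decision: a strategic need is always built in-house.
        if (strategic) {
            return Decision.BUILD;
        }
        // Second decision (assumed outcome): overhead software that fits the
        // business process as delivered is worth buying.
        if (immediatelyUseful) {
            return Decision.BUY;
        }
        // Third decision (assumed outcome): a package that can be extended
        // through supported extension points is still a reasonable purchase;
        // one that can only be customized is not.
        return extensible ? Decision.BUY : Decision.BUILD;
    }
}
```

For instance, the HR system described in the next paragraph would enter with strategic set to true and come out as BUILD regardless of the other two answers.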
What is overhead to some businesses is strategic to others. For example, I've done some consulting for a financial-services company whose hiring process is considered one of its key strategic advantages. They hire the best and brightest, spending a lot of time and effort to find the right people. They once asked me for advice about purchasing a COTS human-resources system, and I advised them against it: why should they put themselves on a level playing field with their competitors? Instead, they took my advice and wrote their own HR system. It took longer to write, but once it was done they had a platform that facilitated tasks that were more labor-intensive for their competitors. Hiring is simply overhead for many organizations, but for this company it was strategic.
Typing in architecture
One more-technical (less process-oriented) topic that arises often in SOA initiatives has to do with typing and versioning for distributed systems. This is one of the more common pitfalls in these types of projects. It is common both because it is easy to follow a path laid down by tool vendors and because it takes a while before the problem manifests itself — and the hardest problems arise from not knowing what you don't know in a project's early stages.
The debate over whether you can build "enterprise" systems with dynamically typed languages has been beaten to death, and the arguments now offer much heat with little light. However, this debate informs an important consideration for distributed systems with respect to the typing of endpoints. By endpoints, I'm referring to the communication portal between two disparate systems. The two competing typing styles are SOAP, which typically engenders strong typing using standards such as Web Services Description Language (WSDL), and Representational State Transfer (REST), which favors a more loosely typed document-centric approach (see Resources). The detailed pros and cons of SOAP vs. REST are outside this article's scope; here I mainly want to talk about the benefits of loose typing at the endpoint level, which you can achieve with either style.
More-dynamic typing is important at endpoints because those endpoints form a published integration API between systems that usually evolve at different paces. You want to avoid tightly coupling specific signatures (types and parameter names) between those systems, which would make either side of the conversation brittle and prone to break, hampering your ability to version the two applications independently.
Here is an example. In traditional SOAP-style integration, you use a Remote Procedure Call (RPC) type of protocol, using WSDL to define the details of the conversation between the two applications. This is illustrated in Figure 5:
Figure 5. Using RPC-style calls between applications
The RPC-style integration uses WSDL to take a "regular" method call and abstract it out to SOAP. Thus, each class maps to a type in WSDL, including the types of all its parameters. This approach strongly couples the two sides of the conversation together because they both rely on WSDL to define what is sent and what is expected. The problem lies with this strict definition. What if you need to modify one of the applications to take different parameters or change the type of an existing one, and you can't modify both applications at the same time? How do you version the endpoint? Several ways are possible, but all of them have serious compromises. For example, you could create another endpoint with the new WSDL definition. If the original endpoint was named addOrder, you could create another endpoint named addOrder2. You can see the dark place that this leads to. Soon, you'll have dozens of slightly different endpoints, with duplicated code everywhere, handling one-off situations because it is hard to anticipate how people will use the integration point once it is published. You can also play games with endpoint resolution using tools like Universal Description, Discovery, and Integration (UDDI) (or just a hash map), but that doesn't scale well either. The fundamental problem is the tight coupling between the endpoints, which prevents them from evolving at a natural, independent pace.
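Seen from the Java side, the proliferation looks roughly like the sketch below. Only the operation names addOrder and addOrder2 come from the text; the parameter lists, the default currency, and the class names are hypothetical, and a real SOAP stack would generate this contract from WSDL rather than have it written by hand.

```java
import java.util.ArrayList;
import java.util.List;

// Strongly typed, RPC-style contract. Every caller is compiled against these
// exact signatures.
interface RpcOrderService {
    String addOrder(String customerId, String sku, int quantity);

    // A currency parameter was needed later. Existing callers are bound to the
    // old signature, so a second operation is published next to the first.
    String addOrder2(String customerId, String sku, int quantity, String currency);
}

class SimpleRpcOrderService implements RpcOrderService {
    private final List<String> orders = new ArrayList<>();

    @Override
    public String addOrder(String customerId, String sku, int quantity) {
        // The old operation must keep working, so it has to guess a currency.
        return addOrder2(customerId, sku, quantity, "USD");
    }

    @Override
    public String addOrder2(String customerId, String sku, int quantity, String currency) {
        orders.add(customerId + ":" + sku + ":" + quantity + ":" + currency);
        return "order-" + orders.size();
    }

    List<String> orders() { return orders; }
}
```

Each further change to the parameter list adds another operation to the interface, and every one of them has to stay published for as long as a single caller depends on it.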
One alternative approach is to treat the integration endpoints as loosely typed, as shown in Figure 6:
Figure 6. Using loose typing at integration endpoints
By passing the interesting information to the endpoint inside a document, you can leave the endpoint definition unchanged across both major and minor upgrades to either side of the conversation. Rather than relying on WSDL to define stringently what is expected, you have the option of flexibility. Now, the endpoint always takes in a document that encapsulates the types of things the endpoint needs.
To handle versioning of the endpoint, the first resolution step of the endpoint is to unpackage the document, determine what has been passed, and reconcile that with what is expected. I generally use a combination of the Factory and Strategy design patterns (see Resources) to determine if I'm getting what I expect, as shown in Figure 7:
Figure 7. Unpackaging content just inside the endpoint to determine types
The endpoint's first job is to look at the document's manifest and determine what it contains. Then, it uses a factory to instantiate the proper strategy for pulling that information out of the document. Once all the parts have been verified (using WSDL if necessary), the deserialized objects are passed on for business processing.
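A minimal sketch of that flow follows, under several stated assumptions: the document is modeled as a type and version (the manifest) plus a map of string fields, whereas a real system would more likely carry XML or JSON; the Order fields, the "USD" default for version 1, and all class names are hypothetical; and verification is reduced to checking that required fields are present rather than validating against WSDL or a schema.

```java
import java.util.HashMap;
import java.util.Map;

// Loosely typed document: a manifest (type and version) plus the payload.
final class Document {
    final String type;
    final int version;
    final Map<String, String> body;

    Document(String type, int version, Map<String, String> body) {
        this.type = type;
        this.version = version;
        this.body = body;
    }
}

// Deserialized object that is passed on for business processing.
final class Order {
    final String sku;
    final int quantity;
    final String currency;

    Order(String sku, int quantity, String currency) {
        this.sku = sku;
        this.quantity = quantity;
        this.currency = currency;
    }
}

// Strategy: knows how to pull an Order out of one document version.
interface OrderExtractor {
    Order extract(Document doc);
}

// Factory: picks the strategy that matches the document's manifest.
final class ExtractorFactory {
    private final Map<String, OrderExtractor> strategies = new HashMap<>();

    ExtractorFactory() {
        // Version 1 documents carry no currency, so a default is assumed.
        strategies.put("order:1", doc -> new Order(
                required(doc, "sku"), Integer.parseInt(required(doc, "quantity")), "USD"));
        // Version 2 documents must state the currency explicitly.
        strategies.put("order:2", doc -> new Order(
                required(doc, "sku"), Integer.parseInt(required(doc, "quantity")),
                required(doc, "currency")));
    }

    OrderExtractor forDocument(Document doc) {
        OrderExtractor strategy = strategies.get(doc.type + ":" + doc.version);
        if (strategy == null) {
            throw new IllegalArgumentException(
                    "Unsupported document: " + doc.type + " v" + doc.version);
        }
        return strategy;
    }

    // Verification step: reject documents that lack an expected field.
    private static String required(Document doc, String field) {
        String value = doc.body.get(field);
        if (value == null) {
            throw new IllegalArgumentException("Missing field: " + field);
        }
        return value;
    }
}

// The single published endpoint. Its signature stays the same across versions.
final class DocumentEndpoint {
    private final ExtractorFactory factory = new ExtractorFactory();

    Order receive(Document doc) {
        return factory.forDocument(doc).extract(doc);
    }
}
```

Supporting a new document version means registering one more strategy in the factory. The receive signature, and therefore every existing caller, is left alone.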
A couple of benefits appear in this approach. First, it is a bad idea to have one mechanism with two orthogonal jobs, yet that's what traditional RPC assumes: the endpoint is responsible both for providing the published API for integration and for verifying typing. Because it has two behaviors, you tend to intermingle the code, making it harder to understand and maintain. Second, you can now have any number of users of this endpoint, all using slightly different versions of it. As long as you have a strategy, you can support any version (including old versions for applications that are slow to update) with the same endpoint. This allows you to make changes as you need to and not worry about forcing the rest of the applications in the enterprise to keep up with your changes. They can change and use new document versions on their own time scales.
No tools or frameworks are available (yet) that allow you to implement this approach trivially, but a little extra up-front work provides the aforementioned benefits. You can implement this style using either SOAP or REST (although it is easier in REST because REST is inherently document-centric). By creating a loosely typed ecosystem, you can enable disparate development groups to move at their own pace, allowing the overall enterprise use of applications to move forward with the least friction. That is the essence of evolutionary architecture: putting a foundation in place that allows for frictionless change at the fastest possible pace without compromising capabilities.
Architecture is a big and complex topic in software; in this installment I tried to touch on many different facets, ranging from politics to implementation details for versioning endpoints in SOA. In future installments, I'll flesh out more of these ideas around architecture in general and some new architectural approaches for building an evolvable SOA without paying millions to vendors.
Resources
- The Productive Programmer (Neal Ford, O'Reilly Media, 2008): Neal Ford's most recent book expands on a number of the topics in this series.
- Design Patterns: Elements of Reusable Object-Oriented Software (Erich Gamma et al., Addison-Wesley, 1994): The Gang of Four's classic work on design patterns.
- Apache Struts: Struts is a popular open source framework for building Web applications in the Java™ language.
- Peopleware: Productive Projects and Teams (Tom DeMarco and Timothy Lister, Dorset House Publishing, 1999): This book offers excellent advice on people issues in software development.
- "Resource-oriented vs. activity-oriented Web services" (James Snell, developerWorks, October 2004): Read a discussion of the REST and SOAP approaches to SOA.
- developerWorks Java technology zone: Find hundreds of articles about every aspect of Java programming.
- developerWorks SOA zone: A wealth of hands-on technical content and a comprehensive accounting of SOA standards.