In construction, the civil engineer must resolve two primary forces: tension and compression. In software, the developer must resolve a different set of forces, primarily because the laws of physics don't apply directly. Indeed, software is such a fluid medium that the development team can, in a sense, create new materials and new laws. In that context, for every software-intensive system, the team must resolve the forces of cost, functionality, and compatibility. Depending upon the specific domain, the team must also resolve the forces of capacity, availability, performance, throughput, fail-safeness, and fault-tolerance (varying in intensity by the specific requirements of the problem under consideration). Furthermore, for mission-critical, long-lived, and continuously evolving systems, the team must resolve the forces of technology churn and resilience.
However, the presence of the Web changes everything. I completely agree with Mr. Gerstner, who has observed that "every day, it becomes more clear that the Net is taking its place alongside the other great transformational technologies that first challenged, and then fundamentally changed, the way things are done in the world."
And yet, in other ways, the Web has changed nothing. Edward Tufte once remarked that the Web "combines the worst of TV with the worst of CB radio."
All that being said, most days I land on middle ground: the Web has changed some things. In particular, the presence of the Web has changed things in some specific ways:
- New patterns of visualization and access to information are made possible.
- Object technology is now mainstream.
- Component-based development is now mainstream.
- UML has emerged as the language of blueprints for the Web.
I could write a book about each of these subjects (well, OK, I did write books about some of these subjects), but instead, in this article, I want to focus on the technology behind successful Web systems.
From Web sites to Web applications
In an interview for Wired magazine, Steve Jobs, of Apple, Inc., observed that the Web is moving through several stages of evolution. Stage zero represented the most primitive state of the Web, wherein most sites were document-centric and essentially static. With the coming of scripting and then Java applets, the world moved to stage minus one -- a step backwards, because many naive projects used these technologies initially for little more than "eye candy." Most reasonable organizations quickly grew out of that stage and into stage two, wherein the client and the server were made smarter. This meant evolving mechanisms to link the user experience to databases in the back-end, essentially starting to make every application accessible through a browser. Today, we see many sites moving to stage three, with greater semantics attached to data (in particular, via XML) and with greater exploitation of the potential of distributed computing. Stage four is likely to occur as agent technology matures and as mechanisms for pervasive distributed computing (such as Jini) become more stable.
Around stage two, moving toward stage three, successful sites that provide real business value morph into what Jim Conallen, of Rational Software, calls a true Web application. A Web application is a Web site where user input, including navigation through the site and data entry, affects the state of the business. In such applications, there are consistently three critical points the development team must address:
- Integration of legacy
- Continuous evolution
- Architecting for peak traffic
Secondary critical points include security and the presentation of a compelling user experience.
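The distinction between a Web site and a Web application can be made concrete with a minimal sketch. Everything named here (the `INVENTORY` table, `serve_static_page`, and `place_order`) is hypothetical and invented for illustration; the point is only that the second function lets user input change the state of the business, while the first cannot.

```python
# Hypothetical business state: stock on hand per product.
INVENTORY = {"widget": 3}


def serve_static_page(path):
    """Web site: the response depends only on the path; nothing changes."""
    pages = {"/about": "<h1>About us</h1>"}
    return pages.get(path, "<h1>Not found</h1>")


def place_order(product, quantity):
    """Web application: user input changes the state of the business."""
    available = INVENTORY.get(product, 0)
    if quantity <= 0 or quantity > available:
        return "rejected"
    INVENTORY[product] = available - quantity
    return "accepted"
```

Requesting the same static page twice always yields the same document, whereas placing the same order twice may not yield the same answer, because the first order has already altered the inventory.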
Bill Gurley, of Hummer Winblad Venture Partners, calls these things "big fat Web sites." He goes on to note that such sites have a number of common characteristics similar to Jim's view:
- There's one copy, usually deployed in house or to a hosting company.
- That one copy is deployed to a farm of Web servers.
- The system serves millions of customers.
- Its code base dwarfs traditional client/server systems.
- Changes are made continuously.
- Speed is essential: speed of the user experience and speed of development.
Building a Web site is easy: learn a little HTML, find yourself a server, and you are in business. However, a static site is a dead site, and so every meaningful business will quickly realize the need for a more sophisticated solution. Whereas those simple sites can be built by a single person with minimal modeling, simple tools, and a simple process, creating a complex Web site requires a team using a greater degree of modeling, more sophisticated tools, and a well-defined, repeatable process. Unfortunately, it's easy to take a successful Web site and try to morph it into a successful Web application by adding more and more moving parts (Perl is ultimately the duct tape of the software world). Many sites today are essentially high rises that are a combination of many hundreds of dog houses strapped together. However, those simple sites just don't scale to Web applications without some energy applied. As Scott Peterson, of PC Week, once observed, "most of us are running Web sites or networks built of straw or sticks." What worked for a small site doesn't usually scale, and so a large application will often come crashing down around your head if you don't apply that energy.
The importance of architecture
No ethical civil engineer would build a high rise without first having a solid architecture in place. Strangely enough, many organizations undertake morphing their Web sites into Web applications with little thought for architecture. In my experience, the presence (or absence) of a meaningful architecture is an essential predictor of the success (or failure) of a project.
Building upon the work of Mary Shaw and David Garlan, I define architecture as encompassing the set of significant decisions about the organization of a software system, including:
- Selection of the structural elements and their interfaces by which a system is composed
- Behavior as specified in collaborations among those elements
- Composition of these structural and behavioral elements into larger subsystems
- An architectural style that guides this organization
As someone much wiser than I observed, a system's architecture is not finished until there's nothing left to take away. In other words, a system's architecture represents the necessary strategic design decisions sufficient to form that system. A stable architecture is essential to every successful system for two reasons. First, the creation of a stable architecture helps drive the highest risks out of the project. Second, the presence of a stable architecture provides the basis upon which the system may be continuously evolved with minimal scrap and rework.
In some of his writings, Jim Conallen offers the following representation of a canonical Web architecture:
The right-most elements -- the file system, the application server, data, and external systems -- are essentially the same as found in traditional client/server systems. The left-most elements -- the browser, the Web server, and again the file system (in this case, a distributed one) -- are elements unique to the Web space. From the perspective of the user experience, this otherwise physically distributed back-end looks like traditional mainframe computing.
However, there are significant architectural differences owing to the very different mechanisms that tie these elements together. For example, from the browser to the Web server, communication is generally stateless, involving the request for, and then the delivery of, a Web page. Here's the first architectural challenge: how do you preserve the user's session state? There are a number of alternatives, of which cookies and communication via IIOP (the Internet Inter-ORB Protocol) are the most common.
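As a minimal sketch of the cookie alternative, the following shows how a server might tie otherwise stateless requests to one user's session. It assumes an in-memory `SESSIONS` dictionary and a `handle_request` function, both hypothetical names invented here; a real system would also have to deal with expiry, persistence, and a farm of servers sharing the store.

```python
import secrets
from http.cookies import SimpleCookie

# Hypothetical in-memory session store: session id -> per-user state.
SESSIONS = {}


def handle_request(cookie_header=""):
    """Handle one stateless request; return (response_headers, session_state)."""
    # Read the session id, if any, from the request's Cookie header.
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get("session_id")
    session_id = morsel.value if morsel is not None else None

    # Unknown or missing id: start a fresh session with a random id.
    if session_id not in SESSIONS:
        session_id = secrets.token_hex(16)
        SESSIONS[session_id] = {"page_views": 0}

    state = SESSIONS[session_id]
    state["page_views"] += 1

    # Send the id back so the browser returns it on the next request.
    out = SimpleCookie()
    out["session_id"] = session_id
    out["session_id"]["httponly"] = True
    headers = {"Set-Cookie": out["session_id"].OutputString()}
    return headers, state
```

Each request carries nothing but the opaque identifier; the state itself stays on the server, which is what lets the protocol remain stateless while the user experience does not.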
The placement of the application's business logic represents another architectural challenge: should it live in the server (the thin client model), should it live in the client (the fat client/thin server model), or should it be spread across both? In the spectrum of thin to thick client, each alternative has its own advantages and disadvantages: a thin client offers simplicity of security and distribution but makes the browser look more like a dumb terminal; a thick client offers greater locality of reference and better interactivity but at the cost of distribution. Most systems today tend to push business logic to the server.
Business logic must touch the state of the business. This notion presents the next architectural challenge. Should there be stateless communication from the logic to the data via mechanisms such as JavaServer Pages (JSP), or should it be more stateful, such as through servlets? Again, there are advantages and disadvantages to each approach. Scripting is easier to change but comes with computational overhead, and servlets are potentially faster but more challenging to develop and deploy.
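The trade-off between a scripted page and a compiled handler can be illustrated with a loose analogy. This is not the JSP or servlet API: `PAGE_SOURCE`, `render_scripted`, and `GreetingHandler` are hypothetical names, and re-parsing the template on each call merely stands in for the overhead of scripting.

```python
from html import escape
from string import Template

# Scripted page: markup with placeholders, editable without touching code.
PAGE_SOURCE = "<p>Hello, $name. You have $count items.</p>"


def render_scripted(name, count):
    """Page-template style: the template is re-parsed on every request."""
    return Template(PAGE_SOURCE).substitute(name=escape(name), count=count)


class GreetingHandler:
    """Handler-class style: created once, reused across requests.

    The instance can hold state between requests, but changing the
    markup means changing and redeploying code.
    """

    def __init__(self):
        self.requests_served = 0

    def handle(self, name, count):
        self.requests_served += 1
        return "<p>Hello, " + escape(name) + ". You have " + str(count) + " items.</p>"
```

Both produce the same page for the same input; what differs is where a change is made (the template text versus the class) and what survives from one request to the next.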
Connection to the application's persistent data, which may be bound in legacy systems, also involves many architectural challenges. First, how does one give the illusion of objects to the user while data continues to live in relational tables? Second, how should the connection from the system's business logic to its data be manifest? For example, a coupling via JDBC is more direct but requires that the application developer have intimate knowledge of the data's form. Alternatively, a messaging architecture is less direct but is more scalable.
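Both questions can be sketched in a few lines, using Python's built-in `sqlite3` module as a stand-in for a JDBC-style connection. The `customers` table, the `Customer` class, and the function names are all assumptions made for illustration. The first function couples the caller directly to the table's form; the second hides that form behind messages.

```python
import queue
import sqlite3
from dataclasses import dataclass


@dataclass
class Customer:
    """Object view of one row in the hypothetical customers table."""
    customer_id: int
    name: str


def make_database():
    """Create an in-memory relational table standing in for legacy data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Ada')")
    conn.commit()
    return conn


def find_customer_direct(conn, customer_id):
    """Direct coupling: this code must know the table and column names."""
    row = conn.execute(
        "SELECT id, name FROM customers WHERE id = ?", (customer_id,)
    ).fetchone()
    return Customer(*row) if row else None


def data_service(conn, requests, replies):
    """Messaging: only this service knows the schema; callers send messages."""
    while True:
        message = requests.get()
        if message is None:  # sentinel: stop the service
            break
        replies.put(find_customer_direct(conn, message["customer_id"]))
```

In the messaging form, business logic only places `{"customer_id": 1}` on the request queue and reads a `Customer` from the reply queue. The schema can then change behind the service, and the service can be moved or replicated, at the price of an extra hop.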
Philippe Kruchten, of Rational Software, has observed that complex systems cannot be understood from just a single viewpoint. Indeed, the previous diagram is really only a simple view of a system's deployment. Philippe has pioneered the concept of a 4+1 view model, as illustrated below:

A system's design view encompasses the classes, interfaces, and collaborations that form the vocabulary of the problem and its solution. This view primarily supports the functional requirements of the system, meaning the services that the system should provide to its end-users.
The process view of a system encompasses the threads and processes that form the system's concurrency and synchronization mechanisms. This view primarily addresses the performance, scalability, and throughput of the system.
The implementation view of a system encompasses the components and files that are used to assemble and release the physical system. This view primarily addresses the configuration management of the system's releases. The releases are composed of somewhat independent components and files that can be assembled in various ways to produce a running system.
The deployment view of a system encompasses the nodes that form the system's hardware topology on which the system executes. This view primarily addresses the distribution, delivery, and installation of the parts that make up the physical system.
The use case view of a system encompasses the use cases that describe the behavior of the system as seen by its end-users, analysts, and testers. This view exists to specify the forces that shape the system's architecture.
Note that this is where the UML fits in: the UML is a graphical language for visualizing, specifying, constructing, and documenting the artifacts of a software-intensive system. Thus, it is well suited to express each of these five views.
One of the most important developments in software engineering in the past few years has been the realization that all well-structured systems are full of patterns. In its classical definition, drawn from the work of the architect Christopher Alexander, a pattern is a solution to a problem in a context. A pattern codifies specific knowledge from experience in a given domain and offers a common solution that resolves the forces that press upon a system.
Ultimately, we come back to our starting point. For all kinds of computing systems, and for Web applications in particular, a number of common patterns are emerging for resolving the forces in that domain. Mechanisms, such as those described in Gamma, Vlissides, Johnson, and Helm's seminal book, Design Patterns, represent the codification of collaborations among societies of classes that must work together. Architectural patterns are emerging as well, giving us the means to express the common architecture of everything from simple Web sites to what is essentially an "Amazon-in-a-box."
I'll leave an expression of some of these common architectural patterns to my next column.

Grady Booch is the Chief Technical Officer and vice-president of catapulse. Prior to catapulse, Grady served as Chief Scientist for Rational Software Corporation shortly after its founding in 1981. He is recognized internationally for his work on software architecture, modeling, and software engineering processes, all of which have contributed to improving the effectiveness of developers worldwide. He is one of the original developers of the Unified Modeling Language (UML) and was also one of the original developers of several of Rational's products, including Rational Rose, the industry-leading visual development tool. He is the author of six best-selling books, has published several hundred technical articles on software engineering, and has lectured and consulted worldwide. You can contact him at egb@rational.com.