When developing large and complex applications using an enterprise-class infrastructure, great discipline, care, and planning are required in order to ensure the greatest likelihood of success. There are a number of aspects to creating such a disciplined environment including, but not limited to, rigorous software engineering, strong architecture, detailed design, quality staffing, careful planning, and risk management. This article focuses on one aspect that is often overlooked: a set of well-designed stages where applications can be rigorously developed, tested, and deployed.
While this article focuses on a WebSphere environment, it is important to realize that these issues are not specific to WebSphere, but rather relate to the general problem of disciplined development.
An overview of the ideal environment is presented in Figure 1.
Figure 1. The ideal WebSphere environment
Figure 1 represents an "ideal" environment that includes all of the necessary stages for a first-class rigorous environment. In this environment, each sub-environment is as complete as it needs to be to serve the intended task.
The extent to which you adopt the environment described here depends on the amount of risk you can accept for your applications. An environment such as that in Figure 1 requires a significant investment in both time and money. Organizations that develop simpler applications, or that are willing to assume increased risk, may scale down some aspects of this ideal environment. Later in this article, we will describe how this can be done and the risks involved. First, however, we will discuss each environment in more detail.
1. Development environment
The development environment is where developers live and work every day; this is where developers work hard and need to be productive. Thus, they need the best tools and the fewest barriers to progress. The ideal development environment for WebSphere is shown in Figure 2. This environment is composed of a number of development workstations (one for each developer), a source code management (SCM) tool, and an integration workstation.
Figure 2. The ideal WebSphere development environment
Ideally, application developers use WebSphere Studio. There are many great advantages to using WebSphere Studio, including tight integration with the WebSphere product family. (WebSphere Application Server and WebSphere Portal Server, for example, are very tightly integrated into WebSphere Studio.) The unit test environments, in general, have the same code base as their run time equivalents, so developers can test their application within their IDE and have confidence that the behavior of the application in that environment will be very similar to the behavior of the application in the production environment.
Although WebSphere Studio is the development tool of choice, others can be used. If WebSphere Studio is not used, we recommend that WebSphere Application Server be installed on the developer's desktop to support unit testing. In addition, even if developers work on, for example, a shared UNIX server, it is possible for individual developers to run their own instance of WebSphere Application Server for testing purposes.
The majority (90%) of all testing should occur in the unit test environment. Testing should occur on an ongoing basis -- developers will typically run their application many times an hour within this environment as they incrementally develop and then test their work. In order to make repeated testing effective, some form of automated unit testing can be used. See the appendix for more information about unit testing.
As developers complete their work on individual features, they should immediately consolidate their changes against what is currently in the repository. Developers should be integrating their work with other changes before committing their own work to the repository. Such integration should occur frequently throughout the day.
1.1 Integration Workstation
It is often useful to dedicate one special machine to the purpose of integration. This means that, ideally, the integration workstation runs WebSphere Studio (or another development configuration) and is connected to the SCM.
Periodically, perhaps daily, an integration team or development lead should load the code onto the integration workstation and run the entire suite of unit tests. This sort of testing will help to uncover bugs that may be a result of developer workspaces getting out of sync with the baseline. More importantly, this adds formality to the integration process and helps to uncover problems with the integration process (like developers not doing regular integrations). The process is both formal and controlled. Feedback to the developers should be less formal, however: this is not a development stage where copious amounts of documentation and bug tracking are required.
It is technically possible to do this integration from an individual developer's workstation. However, having the integration process executed on physically separate hardware helps to formalize the process. Further, the use of an integration environment encourages frequent integration. By extension, this generally means that each integration involves a relatively small number of changes which makes it generally easier to do. With a smaller number of changes, fewer things can break as part of the process and remediation of problems is generally less difficult and time consuming.
When the integration workstation is not being used for integration, it contains the most up-to-date version of the application, and one that is known to run. This makes the machine a very handy tool for doing quick demos when required.
1.2 Development Workstations
The desktop machines used by developers need to be high-caliber desktop machines. Development productivity is crucial, and slow, underpowered machines (lacking in CPU power, disk space, or memory) will only impede your team's success. In extreme cases, underpowered systems can trigger bugs in the underlying products that would never be seen elsewhere. This can waste precious time.
Refer to the official product documentation of the tools you are using for the recommended (not minimum) system requirements and ensure that your machines are of at least that caliber. More capable machines are better still.
1.3 Configuration Management Systems
The hardware running your source code management software must perform well and be highly reliable (daily backups are strongly recommended). Developers will move files to and from the version management server frequently, and any breakdown of the system will likely be very costly in terms of developer productivity. In order to perform well, the machine must be fast and have plenty of memory and disk space. Do not use some leftover PC that no one else wants. Also, avoid using a development desktop as a version management server for all but the smallest projects. Refer to your configuration management vendor for more detailed advice.
2. Development integration run time
The development integration run time environment is used by developers to test their application on hardware and software that resembles the target production environment. Testing in this environment is concerned with uncovering issues related to subtle differences between the development and production systems as well as testing the deployment procedures. This may include such things as use of various operating system services, WebSphere Application Server security, backend systems, and others. Developers use this environment to perform integration tests among all system components. This environment is also used to test installation and operational procedures which are often operating system specific.
Figure 3. The ideal development integration run time environment
The development integration run time environment is configured to mirror the production environment at the smallest possible scale and complexity. In general, this environment does not include network devices such as load-balancers, routers, or firewalls.
Systematic testing on this environment does not typically occur on a daily basis, but does occur regularly, perhaps bi-weekly, as significant change is introduced into the application. It is a recommended practice that the application is run on these production-like platforms by developers on a regular basis.
This environment is controlled by the development team; it is used informally by developers and updated as often as necessary by developers while performing their tests. Periodically, this environment is refreshed using a formal build, deploy, and test procedure thereby removing any inconsistency and testing the full build and install procedures.
In general, this environment does not include any development tools. As such, testing depends on the use of test scripts and tracing to determine correctness and identify problems.
The development desktop is generally different from the production platform: typically, Windows or Linux is used on development desktops, and more robust Windows, Linux, UNIX, or z/OS systems are used for production. Most of the code base for WebSphere Application Server is shared across all platforms and so applications built on one platform should behave the same way on all other platforms. In general, J2EE artifacts such as servlets, JSPs and EJBs should all work the same way on all platforms.
However, there may be subtle differences between the unit test and production environments that can be exposed during this testing. It is best that these differences be discovered as early in the development process as possible. In general, it is easier to find and remediate a small number of bugs at key points in your development cycle than to find and remediate a much larger number of bugs as you try to move your application into production.
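One illustrative source of such differences is file path handling: code that hard-codes a Windows separator on a development desktop can fail on a UNIX production system. This is an assumed example, not one drawn from a particular project. The sketch below uses the standard `java.nio.file` API from a current Java runtime; the class name, directory, and file name are hypothetical.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: builds a configuration file location without
// hard-coding a platform-specific separator such as "\\" or "/".
final class PortableConfigPath {

    // Joins the parts using the separator of the platform the JVM runs on.
    static Path configFile(String baseDir, String fileName) {
        return Paths.get(baseDir, "config", fileName);
    }

    public static void main(String[] args) {
        // "appdata" and "app.properties" are placeholder names.
        Path p = configFile("appdata", "app.properties");
        System.out.println(p);
    }
}
```

Running the same class on the development desktop and on the development integration run time machine shows the platform-specific result without any change to the code.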
The development integration run time environment mirrors the production environment at the smallest possible scale. For a typical project, this is a single machine running the same operating system and version of WebSphere Application Server as is used in the production environment.
2.3 Sizing the development integration run time environment
One topic that comes up frequently is how to properly size the development integration run time environment. We won't discuss specifics like how many CPUs or how much memory, as these issues depend on far too many variables to enumerate. The issue here is to identify in broad strokes what is appropriate.
There is little point in spending a lot of money on hardware you don't need, but there is a catch. With a small team of developers (10 or fewer), it is feasible to have a single shared environment. Developers do most of their work on their desktop systems and only occasionally use the development integration run time environment. When they use this environment, since it is shared, they run the risk of interfering with each other's work, but this should be manageable.
As the size of the development team grows, or if multiple projects share the same machine, a single shared integration environment becomes extremely difficult to manage. In that case, multiple environments should be created so that developers can experiment independently of each other. This is where things get tricky. For complete separation, developers need completely independent WebSphere Application Server domains or cells. Thus, either a single development integration run time environment needs to support a large number of cells, or there need to be multiple environments.
As a rule of thumb, no more than about 5 to 10 developers should share a single "integration" environment. The right number for you might be higher or lower, but hopefully this gives you an idea. In addition, at least some of the developers may need to perform more complex experiments that may damage the cell. They may need a dedicated WebSphere Application Server instance, or entire cell, for their experiments. The essential point is that you should plan to create multiple cells and multiple application server instances and that there should be sufficient hardware to support this.
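The rule of thumb above reduces to simple arithmetic. The following is a rough planning sketch only; the class name, method name, and the per-environment limit passed in are illustrative assumptions, not fixed recommendations.

```java
// Hypothetical planning helper based on the rule of thumb that roughly
// 5 to 10 developers can share one integration environment.
final class IntegrationEnvironmentSizing {

    // Returns how many shared environments (or cells) a team needs when at
    // most maxDevelopersPerEnvironment developers share each one.
    static int environmentsNeeded(int developers, int maxDevelopersPerEnvironment) {
        if (developers < 0 || maxDevelopersPerEnvironment <= 0) {
            throw new IllegalArgumentException(
                "developers must be >= 0 and the limit must be > 0");
        }
        // Integer ceiling division: round up so no environment is oversubscribed.
        return (developers + maxDevelopersPerEnvironment - 1) / maxDevelopersPerEnvironment;
    }

    public static void main(String[] args) {
        // A 23-person team with at most 10 developers per environment.
        System.out.println(environmentsNeeded(23, 10)); // prints 3
    }
}
```

Developers whose experiments may damage a cell would need dedicated instances in addition to this count.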
3. System test environment
The system test environment is a carefully controlled formal test environment. Development teams run their applications on this environment on a relatively infrequent basis -- perhaps every six to eight weeks. A system test environment mirrors the production environment more closely than does a development integration environment, but it still does so at the smallest possible scale. Figure 4 shows an example system test environment.
Figure 4. The ideal system test environment
A key aspect of the system test environment is formality. The purpose of this environment is to ensure that the application will truly deploy and run as required in production. Thus, the system test team is responsible for testing all aspects of the application, including both functional and non-functional requirements. Functional requirements are generally obvious: does the application execute the business rules as defined; does the application behave as required from the user's perspective, etc. However, it is important to remember that testing here should also encompass non-functional requirements, including installation, backup, and failure procedures. Any failure during system test should result in a rejection of the build. This is not a place to experiment. Development experiments are executed informally in the development integration run time environment.
A system test environment may serve multiple masters. In addition to being used formally by testers, other groups may use it as well depending on what is appropriate in your environment. For example, the administration staff may use this environment to test new patches and configuration changes before they are rolled into the pre-production and production environments. User Acceptance Testing, if this is a formal stage in your development processes, may also occur here. A formal user acceptance process usually means that the system is left running for some well defined period (without developers touching it) and users simply use the system to validate that it is working as they wish. In many ways this overlaps with the concept of functional testing. However, having the end user community actually agree that the system meets their needs is valuable. And, if those tests are done early enough, problems can be addressed.
It should be clear that a system test environment will have many different activities that need to be carefully controlled, scheduled, and managed. The reality of system test is that it is slow. Every problem found requires formal procedures for resolution; otherwise crucial information may be lost. Ideally, system test environments should be owned and run by a separate team that is composed of people other than the development team.
This environment typically involves more than one application server instance running on more than one piece of physical hardware. The environment should also use the same operating system and software versions as production. The system test team uses this environment to observe how the application works in a load-balanced environment. This testing might include, for example, testing to see how the application responds in a failover situation.
4. Performance and load test
Performance and load testing is performed to find load-related problems in applications. Performing this testing well requires highly specialized skills and equipment. Hence, it calls for a dedicated environment and team.
Like the system test environment, the performance test environment is a carefully controlled formal test environment. Development teams run their applications on this environment on an even less frequent basis. A performance test environment mirrors the production environment in complexity, but it does so at the smallest possible scale. (User acceptance testing might also occur in the pre-production environment or the performance/load test environment.) Figure 5 shows an example of an ideal performance test environment.
Figure 5. The ideal performance test environment
Load testing addresses several needs: validating that the application meets its response and scalability criteria, determining scaling factors, and searching for latent bugs. For more information on performance testing, see [Joines].
During load testing runs, the testing team will carefully monitor various aspects of performance: CPU, memory usage, disk usage, response time, etc. They will use this information to determine if the system meets the response and scalability criteria and determine how well the system scales. This second piece of data is useful for predicting future hardware needs as the system is scaled in production. Finally, the testing team will work to push applications as close as possible to the breaking point to find latent bugs. Sadly, many applications contain subtle bugs that either occur very rarely or are caused by concurrent access. Only through large scale load testing can these bugs be found before your customers find them.
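Dedicated monitoring tools collect these measurements in practice. As a minimal sketch of the idea only, the standard `java.lang.management` API can sample heap usage and system load from inside a running JVM. The class name, sampling interval, and output format are illustrative assumptions, and the sketch reports on the JVM it runs in rather than on a remote application server.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;
import java.lang.management.OperatingSystemMXBean;

// Hypothetical sampler that prints a few resource readings during a test run.
final class ResourceSample {

    // Heap memory currently in use by this JVM, in bytes.
    static long usedHeapBytes() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return heap.getUsed();
    }

    // System load average over the last minute; negative when the
    // platform does not report it.
    static double systemLoadAverage() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        return os.getSystemLoadAverage();
    }

    public static void main(String[] args) throws InterruptedException {
        int processors = ManagementFactory.getOperatingSystemMXBean().getAvailableProcessors();
        // Take three samples, one second apart.
        for (int i = 0; i < 3; i++) {
            System.out.printf("heapUsed=%d bytes, loadAvg=%.2f, processors=%d%n",
                    usedHeapBytes(), systemLoadAverage(), processors);
            Thread.sleep(1000);
        }
    }
}
```

Recording such samples alongside response times for each run gives the baseline against which later builds can be compared.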
Ideally, this environment is owned and operated by a dedicated performance testing group whose members have specialized load testing skills. Each development team schedules time with this group. Typically, an application is load tested less frequently than it is run on the system test environment. As with a system test environment, one application is tested at a time.
The performance test environment is very much like a system test environment with a specialized purpose. This environment also mirrors the production environment, but at a generally larger scale than a system test environment. This environment should use the same hardware and software configuration as the production environment (including operating system versions) and runs on a dedicated network (or subnet) containing dedicated firewalls, Web servers, load balancers and other required resources (like databases).
In addition, the load testing environment will include tools for generating the load needed, such as Mercury Interactive's LoadRunner. These tools require dedicated high performance client machines for generating load.
5. Pre-production
The purpose of pre-production is to mimic production as closely as possible; ideally, the two are identical. This is the final chance to ensure that things will really work in production.
This environment serves three purposes:
- It gives the operations team a final place to familiarize themselves with the application and its procedures.
- It provides the opportunity to test unrelated applications running together. This is crucial with shared deployment environments. Prior to this point, the applications have been tested and built independently.
- It provides the operations team with a chance to test their operational procedures (backup, failover, problem resolution, etc.).
As mentioned earlier, pre-production might also be used for User Acceptance Testing. In any case, testing on a pre-production staging environment generally coincides with an application's release schedule. Each external release of the application is tested on the pre-production system before it is finally moved into production. This environment is generally used to prove that the application works well with other applications.
This environment may be used by quality assurance testers as part of your release cycle.
Ideally, the pre-production environment exactly mirrors the production environment in complexity and in scale (including dedicated firewalls, Web servers, load balancers and other required resources). This environment is used to test multiple -- possibly unrelated -- applications running together as may be done in production.
6. Production
There isn't much to say here. Production is, of course, production. This is where you really run your applications. The key point is that if you have carefully followed procedures up to this point, the actual roll into production will be boring and predictable, rather than exciting and scary, since everything will have already been tested.
As the system test, performance test and pre-production environments are used for testing, there is a set of tools that no ideal environment should be without. These tools provide the ability to collect data that assists in defining baseline characteristics of the application. As changes and new versions are applied in these environments, the tools both recreate the scenarios and provide data showing whether or not there was an improvement. Some of these tools are also helpful in troubleshooting and problem determination.
No organization can predict how its application will perform under load without performing some load testing in the performance test environment. Load tools are the way to apply volume. There are open source tools like Apache's JMeter and IBM alphaWorks' Web Performance Tools, and high end commercial tools like Mercury Interactive's LoadRunner. The high end tools generally provide features such as graphing, response validation, and generators; the open source tools do not provide these and leave you to correlate data manually.
Response validation is very important. We have encountered applications that under load return the wrong answer perhaps 0.1% of the time. It is not feasible to scan output logs looking for such results, so the automated tool needs to check that the response is the response that is expected.
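The idea can be sketched in plain Java. This is not how any particular load tool is implemented; it only illustrates checking every response against its expected value while requests run concurrently. The `handler` function stands in for a call to the application, and all class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.IntUnaryOperator;

final class ResponseValidation {

    // Sends `requests` inputs through `handler` on `threads` worker threads and
    // returns how many responses differ from what `expected` computes.
    static int countWrongResponses(IntUnaryOperator handler, IntUnaryOperator expected,
                                   int threads, int requests) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Boolean>> results = new ArrayList<>();
            for (int i = 0; i < requests; i++) {
                final int input = i;
                // Each task validates its own response instead of only logging it.
                Callable<Boolean> task =
                        () -> handler.applyAsInt(input) == expected.applyAsInt(input);
                results.add(pool.submit(task));
            }
            int wrong = 0;
            for (Future<Boolean> result : results) {
                if (!result.get()) {
                    wrong++;
                }
            }
            return wrong;
        } catch (Exception e) {
            throw new IllegalStateException("load run failed", e);
        } finally {
            pool.shutdown();
        }
    }

    // Deliberately unsafe handler: it passes its input through a shared field,
    // so another thread can overwrite the value between the write and the read.
    static volatile int shared;

    static int racyDouble(int x) {
        shared = x;
        return shared * 2;
    }

    public static void main(String[] args) {
        int wrongSafe = countWrongResponses(x -> x * 2, x -> x * 2, 8, 10_000);
        int wrongRacy = countWrongResponses(ResponseValidation::racyDouble, x -> x * 2, 8, 10_000);
        System.out.println("stateless handler, wrong responses: " + wrongSafe);
        // The count for the racy handler varies from run to run and may be zero.
        System.out.println("racy handler, wrong responses: " + wrongRacy);
    }
}
```

Because the racy handler fails only occasionally, a count taken over thousands of validated responses is far more likely to reveal it than a manual scan of logs.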
To understand how your application behaves under load, you need to be able to see into the application server environment itself. This is accomplished with application monitoring tools that provide statistics, such as servlet response time and the number of JDBC connections currently open. A variety of tools exist in this space from the low end (and free) Resource Analyzer from IBM to the high end, such as Wily's Introscope and Tivoli's performance tools. Again, the high end tools tend to provide extensive features, such as providing a view of the entire cluster, which tools like Resource Analyzer do not.
While you will be using both your load tools and application monitors in typical troubleshooting exercises, other lower level tools also help in problem determination and root cause analysis. Some tools, such as an HTTP proxy tunnel (WebSphere Studio includes a built-in function like this, but there are also freeware tools that perform similar functions), allow quick analysis of HTTP traffic between a browser and the Web server. For really difficult problems, sophisticated tools like Ethereal can be used to provide network level packet sniffing.
In an ideal world, all organizations would strictly follow the recommendations made in this document. However, most organizations have very real cost constraints. Besides, not all applications are "enterprise class." Thus, there are legitimate reasons for trying to reduce the costs implied by the previous sections. In this section, we briefly describe possible ways of reducing these costs and the risks involved in doing so.
While certainly desirable, a staging environment is quite expensive to build and maintain. For smaller companies and environments with low complexity, it is quite common not to have a staging environment.
While system testing should not be eliminated, there are several things that can be done to reduce the cost.
First, the system test environment can be shared by multiple teams. As described earlier, each application will probably only use the environment for a few days during each test cycle. Thus, if multiple applications are deploying to similar hardware, sharing this makes a great deal of sense. In large organizations with many concurrent development projects underway, it may make sense to have shared system test environments. In any case, access to these environments should be carefully managed and scheduled. However, keep in mind that sharing inevitably leads to scheduling conflicts. It is simply impossible to guarantee that applications won't have conflicting needs. As a result, while substantial money will be saved on hardware and possibly staffing, application test cycles will be longer.
Second, there is no need to exactly mimic production. In particular, there is no need to equip the system test environment with large scale high performance hardware. In fact, it is somewhat desirable to use lower powered hardware in system test in order to force load related bug discovery earlier.
For small applications with low performance requirements, there is no need for formal performance testing. Some simple load tests can be performed elsewhere. However, keep in mind that without at least some basic load testing, latent bugs may never be found.
Also, while we recommend a dedicated performance testing environment and team, this is practical only for some organizations. Where it is not, performance testing could be done on system test hardware using the system test team. However, bear in mind that your system test environment then needs the hardware (meaning that system test will require large scale machines mimicking production, firewalls, and load balancers) and skills to support this type of testing. In particular, do not forget the importance of a dedicated (or at least stable) network; otherwise, performance test numbers are meaningless.
Development integration test
It is quite common for a development team to be provided with a large scale and expensive machine for development. This is often a waste of money. There is no need for development to have a top of the line machine with all of the bells and whistles. An older or slower machine is quite appropriate, as long as the operating system and basic features are the same. For example, if production will use a large AIX box with 20 CPUs, development should be able to get by with a smaller box with 4 CPUs (do not use a uniprocessor). Of course, bear in mind that if the development team is large, a large machine will be needed just to support all of the developers, as mentioned earlier.
Ideally, every development team has their own development integration run time environment. If money is very tight, with appropriate scheduling, multiple development teams can share the environment. However, this can result in fairly serious conflicts if different development projects on different schedules have conflicting hardware and software requirements, conflicting versions being the most serious problem. Also, keep in mind that if sharing is taken too far, rigorous and cumbersome change control procedures will be required. This is inappropriate for development. Developers need to get work done ... now. Later stages require formality. Earlier stages should be as loose as feasible. Where appropriate, multiple teams can share the same physical hardware but use separate WebSphere cells to alleviate this issue. However, there are limits.
Finally, and in the most extreme case, the development integration run time and system test environments can be combined. However, doing this is extraordinarily risky and borders on reckless. It is very difficult to ensure that system test is really testing a proper build when development has access to the same machine. We have seen instances where developers, completely innocently, modified configuration files that were being used by the system test organization. A high degree of physical separation helps to prevent these types of errors.
This article discussed the various sub-environments, or stages, that are appropriate when developing complex systems using enterprise class software. We have described why each stage is necessary and have also described ways of reducing costs when appropriate. Applying these recommendations to your enterprise may cost more up front, but in the long-term we believe it will result in fewer sleepless nights and fewer projects that suddenly fall months behind schedule without warning -- or even worse, systems that fail in production when your customers can see the problems.
Remember the old adage: it costs far more money to fix a bug found by your customer than to fix (or prevent) one earlier.
Appendix: Automated Unit Testing
No article on development is complete without a discussion of automated unit testing. Our experience has shown that unit testing, and in particular automated unit testing, helps greatly during development. Automated unit testing, using a tool such as JUnit, accelerates the iterative development process by giving developers confidence to make changes to the application with the knowledge that those changes and their effects can be adequately tested.
Ideally, the application developers themselves build the test cases for the code they are writing. In the best scenario, unit tests are actually built before the code that is to be tested. This has a number of benefits. To start, if unit tests are built first, then there is some guarantee that the unit tests will actually be built. Further, unit tests can capture how code is supposed to work (what the code is supposed to do), which makes creating that code easier; it is easier to build the code when you have a solid mental picture of what it is supposed to do. Creation of unit tests is a time consuming process. Typically, upwards of 25% of all development time can be spent building these tests. The payoff can be enormous. Unit tests make it easier to find and resolve problems early, which can translate into huge savings over the life of a project.
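As a minimal sketch of writing tests first, the checks in `ShippingCostTest` below state the expected behavior before the method they exercise is filled in. The sketch uses plain Java rather than JUnit so that it runs without extra libraries; with JUnit, each check would become an assertion in a test method. The `ShippingCost` class and its pricing rule are hypothetical.

```java
// Code under test. Hypothetical rule: orders of 100 or more ship free;
// all other orders pay a flat fee of 5.
final class ShippingCost {

    static int shippingFor(int orderTotal) {
        if (orderTotal < 0) {
            throw new IllegalArgumentException("order total must not be negative");
        }
        return orderTotal >= 100 ? 0 : 5;
    }
}

// Hand-rolled test class standing in for a JUnit test case.
final class ShippingCostTest {

    // Fails loudly with a description of the broken expectation.
    static void check(boolean condition, String description) {
        if (!condition) {
            throw new AssertionError(description);
        }
    }

    // Each check states one expectation about the behavior.
    static void runAll() {
        check(ShippingCost.shippingFor(0) == 5, "small orders pay the flat fee");
        check(ShippingCost.shippingFor(99) == 5, "just under the threshold still pays");
        check(ShippingCost.shippingFor(100) == 0, "threshold orders ship free");
    }

    public static void main(String[] args) {
        runAll();
        System.out.println("all checks passed");
    }
}
```

Writing the three checks first pins down the boundary at 100 before any implementation decision is made, which is the "solid mental picture" the paragraph above describes.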
Writing good tests is hard work. If you are just starting to look at unit testing, get some help. Unit tests give developers confidence in their work; unfortunately, bad unit testing also gives confidence, often with disastrous results. Unit tests must be managed as first-class application code. Unmaintained or incorrect unit tests are worse than no testing at all. The chapter on testing in Martin Fowler's "Refactoring" provides a good introduction to unit testing.
We would like to thank Alex Polozoff for his input.
- Refactoring: Improving the Design of Existing Code by Martin Fowler (Addison-Wesley, 1999).
- Performance Analysis for Java Web Sites by Stacy Joines, Ruth Willenborg, and Ken Hygh (ISBN 0-201-84454-0, 2002).