It was very common in the early days of e-business and IBM WebSphere Application Server for users to deploy a single application in each application server instance -- or even to cluster multiple application server instances -- all dedicated to host a single application. However, with the popularity and growth of e-business, more and more users now have many applications that they want to host on Web application servers, with a growing trend of running multiple (relatively small) independent applications within a single application server instance or application server cluster.
This server consolidation or application co-location approach has both advantages and pitfalls:
Lower cost. The main advantage of this approach is the economy of scale that it permits. A small number of physical server machines and server processes hosting a large number of applications will typically cost less in terms of hardware resources, infrastructure, software costs, and system administration costs.
Resource sharing. The notion of sharing the resources within one server instance between multiple applications opens the possibility of better resource utilization. An idle period for one application may be a peak period for another. Of course, this is also possible with multiple independent servers sharing a common hardware resource pool, but it can be done to an even greater degree within a single application server process (for example, sharing memory pool, database connection pools, and so on). This is, in effect, the essence of the on demand computing principle.
Problem identification. If a server crashes or starts to exhibit some abnormal behavior (such as poor response times, error messages, and so on), it might be difficult to determine if one particular application, among the several currently deployed in the same server, is responsible for the problem, either directly or indirectly. In fact, defects or mis-configurations within one application might not be your only concern; a defect or mis-configuration in the underlying WebSphere Application Server runtime might affect different applications differently -- or might even be triggered by the presence of a particular application, and then affect other applications. In these cases also, the presence of multiple co-located applications might make it more difficult to drill down to the root cause.
Problem isolation. Because multiple applications share many common resources in the environment, any problem that starts within one application could end up affecting other applications running on the same server. This might range from subtle interactions related to performance tuning to the complete failure of a server, and would generally require application deployers and administrators to coordinate their activities as they each work on different applications.
In this article, we will examine techniques and best practices for maintaining the advantages of application co-location, while reducing the impact of the potential pitfalls.
Troubleshooting approach and practices
The necessary approach for dealing with many of the issues associated with co-located applications is to troubleshoot problems when they occur, so it might be useful to begin with a general overview for troubleshooting problems in WebSphere Application Server.
There are, of course, many details and variations that are specific to a particular problem. Still, it is useful to recognize two distinct types of troubleshooting approaches that are used in nearly every case:
Analysis. You start by obtaining diagnostic information from the system (an error message, trace, or state dump) and attempt to analyze and understand its meaning. If necessary, you iterate by obtaining more detailed diagnostic information and "drilling down" into the analysis until it becomes so specific that the cause of the problem is obvious. This is essentially a "white box" approach, based on a deep understanding of the internals of the system being investigated.
Isolation. You try to identify the smallest possible subset of the overall system under investigation that still exhibits the problem. This subset might be among the physical components of the system ("the problem only occurs when this part of the application is enabled..."), or it might be among the execution flows in the system ("the problem only occurs when we attempt to perform this particular function"). If necessary, you iterate through this process by testing and eliminating variables until you've narrowed down the problem to the smallest possible area, at which point hopefully the actual cause of the problem is much easier to see. In contrast to the analysis approach, where you try to understand in deeper and deeper detail what the system does, this approach has you focusing only on the interaction between the separable components and flows, often without needing to know much about how any of these components or flows actually work. This is mostly a "black box" approach.
In practice, of course, a complete investigation of a problem might often incorporate elements from both approaches (such as isolating the problem to one component, and then analyzing the diagnostics produced by that one component), but it is useful to recognize these two types of investigation steps, as each is supported by various tools, coding techniques, deployment strategies, and administration techniques.
Although both approaches are used in the presence of co-located applications, as well as in the case of a single application per server, the isolation approach may become particularly relevant in the case of co-located applications, since much of the initial goal of troubleshooting might be to isolate which application is causing (or is primarily affected by) a problem. For example:
- Try to enable or disable one application at a time in a test server, to see which applications need to be active for the problem to occur.
- Temporarily redistribute the applications across different servers, to see which application(s) the problem follows.
- Try modifying WebSphere Application Server configuration parameters in one application at a time (such as HTTP cache, dynacache, and so on) to see how the changes affect the problem.
These steps, though occasionally challenging to implement in some real-world environments, are sometimes the only practical recourse when other, simpler analysis or isolation techniques prove insufficient.
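The "disable one application at a time" step can be sketched in Java as follows. This is only an illustration of the isolation logic: the class name IsolationSketch and the problemOccurs predicate are hypothetical, and the predicate stands in for whatever manual or scripted test run tells you whether the problem still reproduces on a test server with a given set of applications enabled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical helper illustrating the isolation approach.
final class IsolationSketch {

    // Disables one application at a time and records every application
    // whose removal makes the problem disappear.
    // 'problemOccurs' is an assumption: it represents a test run against
    // a server on which only the given applications are enabled.
    static List<String> suspects(List<String> apps,
                                 Predicate<List<String>> problemOccurs) {
        List<String> result = new ArrayList<>();
        for (String candidate : apps) {
            List<String> enabled = new ArrayList<>(apps);
            enabled.remove(candidate);           // disable just this one
            if (!problemOccurs.test(enabled)) {  // problem gone without it?
                result.add(candidate);
            }
        }
        return result;
    }
}
```

If more than one application is returned, the problem probably depends on an interaction between them rather than on a single application.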
There are several sources of information from which to find instructions for how to troubleshoot various types of problems, whether specific to co-located applications, or more general. See The Support Authority for a complete list of troubleshooting tools and resources.
Common techniques and best practices
Here are some common techniques and best practices that can help you resolve or avoid problems associated with co-located applications. Although some of these might seem trivial or obvious and others very sophisticated, all are equally valid and worth considering in all cases:
- Use a different name for every object and resource associated with each application
- Implement clear and easily understandable application logs and error messages
- Carefully assess all resources and other objects that might be shared between applications
- Perform end-to-end monitoring on each application
- Separate administrative functions between applications
- Always consider the big picture, system-wide
1. Use a different name for every object and resource associated with each application
In the course of troubleshooting, you will usually encounter various named objects or resources that are involved in the problem. One key task will be to determine which application these objects or resources belong to or are associated with. Therefore, whenever possible, you should use a clear naming convention that assigns different and easily recognizable names to everything associated with each application.
There are many useful things that can be associated with an identifiable name within a running application server:
Java™ packages for all classes that implement each application. Whenever possible, different applications should use different Java package names. This is useful in a number of situations, for example:
When the server reports an unexpected exception, you can examine the classes and methods of the associated stack trace to determine which application caused, or was involved in, the exception.
If a server crashes or hangs, you often need to examine one or more stack traces in thread dumps obtained from javacores or system dumps. Again, the classes and methods on these stack traces can be mapped back to an application.
If you have a memory leak, you can often (but not always) determine the source of the leak by identifying the class of objects that contribute significantly to the growth of the Java heap, and trace these objects back to a given application.
The WebSphere Application Server runtime uses a similar strategy to help distinguish between the various internal components that could be involved in a given failure scenario or defect. For example, most implementation classes for the WebSphere Application Server system management component are in packages such as com.ibm.ws.sm; implementation classes for the WebSphere Application Server connection management component are in packages such as com.ibm.ws.j2c, and so on.
Administrative names for each application, each module, each servlet, each EJB component that belongs to a given application, as well as any resources that are specifically associated with a given application. The administrative name is the name specified when creating an application component or resource using the J2EE™ development tools or the WebSphere Application Server administration tools. This name might appear in logs and trace files during execution, and is also typically used in commands and displays from various administrative and monitoring tools.
JNDI names associated with components or resources that belong to each application. These names might appear in various trace files during execution, and can also be seen when using special problem determination tools or scripts to list the contents of the JNDI name space, and to monitor the health of the various items that it refers to.
The last step after having assigned well-defined, well-differentiated names for each element in each application is to create and continuously maintain a central document that lists these names and their associated application(s). When investigating a problem, a system administrator can then refer to this document to quickly identify the affected application(s), and contact the appropriate people who can help further.
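As an illustration of how distinct package names pay off, here is a minimal Java sketch that maps the frames of a stack trace back to an application. The class AppOwnerLookup and the idea of a registry of package prefixes are assumptions made for this example; the registry would simply mirror the central naming document described above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: maps stack frames to applications by package prefix.
final class AppOwnerLookup {

    // Package prefix -> application name (illustrative registry).
    private final Map<String, String> prefixToApp = new LinkedHashMap<>();

    void register(String packagePrefix, String appName) {
        prefixToApp.put(packagePrefix, appName);
    }

    // Returns the application that owns the class, or "unknown".
    String ownerOf(String className) {
        for (Map.Entry<String, String> e : prefixToApp.entrySet()) {
            if (className.startsWith(e.getKey() + ".")) {
                return e.getValue();
            }
        }
        return "unknown";
    }

    // Walks the trace from the top frame down and returns the first
    // application found, skipping frames that match no registered prefix
    // (for example, runtime or JDK frames).
    String firstOwner(StackTraceElement[] frames) {
        for (StackTraceElement frame : frames) {
            String owner = ownerOf(frame.getClassName());
            if (!owner.equals("unknown")) {
                return owner;
            }
        }
        return "unknown";
    }
}
```

For example, after register("com.example.billing", "BillingApp"), calling firstOwner(exception.getStackTrace()) would identify the application whose code is closest to the point of failure. The package name shown is a placeholder.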
2. Implement clear and easily understandable application logs and error messages
The logs that reflect the operation of the WebSphere Application Server runtime are usually contained in a few well-defined files: SystemOut.log, SystemErr.log, native_stdout.log, native_stderr.log, and so on. But most applications also need to be able to produce a stream of logging information to help monitor their normal operation and to diagnose problems encountered at the application level.
For the application log messages to be as useful as possible for monitoring and problem determination, each message should include several key pieces of information in addition to the text of the message itself:
An unambiguous indication of which application emitted the message. This can be achieved by sending the logs from each application to a different file, or by prefixing each message with a tag corresponding to the application. The WebSphere Application Server runtime uses this approach; every message emitted by WebSphere Application Server starts with a unique prefix and message identifier that directly reflects which subsystem produced it. This information is routinely used by IBM Support to track down the source of the messages and problems reported in the runtime logs.
A precise timestamp of when the message was emitted. This makes it possible to relate the occurrence of the message to other external events that may have been observed (for example, some particularly heavy load period, or a system crash), and to other messages emitted by WebSphere Application Server and possibly by other applications.
An indication of which thread in the server was responsible for emitting this message. This information is automatically provided with all messages emitted by the WebSphere Application Server runtime; having the same information with application log messages may make it easier to correlate some events reported from WebSphere Application Server with events reported by the application to track down the source of a problem.
An indication in each application message to help identify a particular request that the application was processing when the message was emitted, when possible -- and this is not always possible. This is particularly useful when the application emits multiple messages during the processing of a single request, making it possible to identify which messages do in fact relate to the same request, and which don't, and thus "follow" the path of a request through the system.
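The four pieces of information listed above can be combined into a single line layout. The following Java sketch shows one possible layout; the class name AppLogLine, the bracketed format, and the use of "-" for a missing request identifier are all assumptions for illustration, not a format prescribed by WebSphere Application Server.

```java
import java.time.Instant;

// Hypothetical formatter for application log lines.
final class AppLogLine {

    // Builds: [timestamp] [thread] [application tag] [req=id] text
    static String format(Instant when, String threadName, String appTag,
                         String requestId, String text) {
        // A request identifier is not always available.
        String req = (requestId == null) ? "-" : requestId;
        return "[" + when + "] [" + threadName + "] [" + appTag + "] "
                + "[req=" + req + "] " + text;
    }

    // Convenience variant that captures the current time and thread.
    static String now(String appTag, String requestId, String text) {
        return format(Instant.now(), Thread.currentThread().getName(),
                appTag, requestId, text);
    }
}
```

An application could then write AppLogLine.now("BILLING", requestId, "order saved") to its own log file. The tag "BILLING" is a placeholder.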
The question is often asked whether it is preferable to use a separate log file for each application, or to combine the logging from all applications into a single output file. There is no strong preference either way, provided that all log messages contain all the information suggested above. But if you're looking for guidelines:
Using a single combined log file makes it somewhat simpler to collect and scan data for anomalies when something happens, and to manage the growth and rotation of log files. A single log also provides an immediate view of the timeline of all events that occurred in the server. If it is necessary to extract the set of events associated with one particular application, this can be done by matching all entries against a particular tag or prefix associated with that application.
Conversely, using multiple separate log files makes it somewhat more difficult to keep track, manage, and collect all the data in different logs, but it also provides a ready-made view of events per-application. If it is necessary to determine a complete timeline of events server-wide, the logs can be merged based on the timestamps associated with each entry in each log.
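Both conversions between the two layouts are mechanical, as this Java sketch suggests. It assumes, purely for illustration, that every log line begins with an ISO-8601 UTC timestamp followed by a space, and that each application marks its lines with a tag such as "[BILLING]". The class name LogViews is hypothetical.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper for switching between per-application logs
// and a single server-wide timeline.
final class LogViews {

    // Server-wide timeline: merge separate logs, ordered by timestamp.
    // The sort is stable, so lines with equal timestamps keep their order.
    static List<String> mergeByTimestamp(List<List<String>> logs) {
        List<String> all = new ArrayList<>();
        for (List<String> log : logs) {
            all.addAll(log);
        }
        all.sort(Comparator.comparing(
                (String line) -> Instant.parse(timestampOf(line))));
        return all;
    }

    // Per-application view: keep only the lines that carry the given tag.
    static List<String> filterByTag(List<String> combined, String tag) {
        return combined.stream()
                .filter(line -> line.contains(tag))
                .collect(Collectors.toList());
    }

    // Assumption: the timestamp is everything before the first space.
    private static String timestampOf(String line) {
        int space = line.indexOf(' ');
        return space < 0 ? line : line.substring(0, space);
    }
}
```

Timestamps are parsed rather than compared as text, because ISO-8601 strings with and without fractional seconds do not sort correctly as plain strings.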
In practice, it is often easiest to use the built-in logging facilities provided by WebSphere Application Server to also write application log messages. These facilities can be invoked easily by using the standard Java primitives to write to the System.out or System.err stream, or by using the methods in the java.util.logging package with the default logger defined by WebSphere Application Server. In both cases, the application log messages will appear in the standard server SystemOut or SystemErr files, already pre-formatted with a timestamp and thread identifier (but the application is still responsible for providing a unique application name prefix in each message and, if applicable, a request identifier).
Alternatively, each application can use the java.util.logging facilities with its own custom-defined logger and handler, which, for example, could write to a different output file for each application. In that case, the application itself is responsible for ensuring that all log messages have the proper formatting, as suggested above, and for ensuring that the output files are properly managed.
For performance reasons, whichever mechanism is used, the application should be coded to first check if a given level of logging is enabled before constructing the actual log messages. Creating a complex log message and writing it to a logger or some other facility only to have it thrown away can have a negative impact, both directly on the execution time of each application request, and indirectly on the garbage collection behavior of the server.
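With java.util.logging, this check is typically written with Logger.isLoggable. The sketch below is a minimal illustration: the logger name com.example.billing, the "[BILLING]" tag, and the describeOrder method are hypothetical, and the expensiveCalls counter exists only to make the effect observable. The Supplier-based variant requires Java 8 or later, which is newer than the JDK levels shipped with WebSphere Application Server V6.x.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical application class illustrating guarded logging.
final class GuardedLogging {

    // Assumed logger name; one logger per application package.
    static final Logger LOG = Logger.getLogger("com.example.billing");

    // Counts how often the costly message text was actually built.
    static int expensiveCalls = 0;

    // Stands in for any costly message construction.
    static String describeOrder(int orderId) {
        expensiveCalls++;
        return "order " + orderId;
    }

    static void process(int orderId) {
        // Classic guard: build the message only if FINE is enabled.
        if (LOG.isLoggable(Level.FINE)) {
            LOG.fine("[BILLING] processing " + describeOrder(orderId));
        }
        // Java 8+ alternative: the supplier is evaluated only if
        // FINE is enabled for this logger.
        LOG.fine(() -> "[BILLING] processed " + describeOrder(orderId));
    }
}
```

With the logger at its usual default level of INFO, process never calls describeOrder, so no message strings are created and discarded.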
3. Carefully assess all resources and other objects that might be shared between applications
As noted in the introduction, deploying and managing co-located applications is an exercise in trade-offs between sharing various resources associated with each application to maximize utilization and performance, and minimizing this same sharing to enhance problem isolation and problem determination.
Although one of the stated goals of co-locating applications is to share a variety of server resources, there are a number of items associated with an application where there is little, if any, benefit to sharing or combining that item with another application.
On the other hand, many of the troubleshooting techniques mentioned above, including the general analysis approach and the isolation approach, can benefit from as much separation between applications as possible. For example:
The ability to identify quickly that a given resource belongs to one particular application or another can quickly focus the error message analysis and behaviors associated with that resource.
The ability to enable, disable, or reconfigure a given resource that affects one application only greatly facilitates the isolation approach.
Generally, any shared item creates a channel through which a problem in one application could affect the correct behavior of another application. For example, applications might compete for threads, backend connections, and so on. From this perspective, then, it pays to limit sharing whenever possible.
Let's review some of the main resources and other items that exist within each application server from this application sharing perspective.
Thread pools
In a typical WebSphere Application Server V6.x server, there are four main types of thread pools to be considered:
Web Container Thread Pool. Each thread in this pool is responsible for handling one inbound HTTP request, and for performing all the processing associated with that request, such as executing servlets, JSPs, and any other objects that are invoked indirectly from these servlets or JSPs, such as EJBs that are located in the same server, and so on. Except in very special configurations, there is a single Web container thread pool server-wide, shared between all applications.
ORB Thread Pool. Each thread in this pool is responsible for handling one inbound request coming through RMI/IIOP, which is typically directed at one EJB. This same thread will execute the code in the first EJB and any other EJBs or other objects and resources that it invokes. There is a single ORB thread pool server-wide, shared between all applications and all EJBs deployed on that server.
JCA Resource Adapter Thread Pool(s). Each thread in one of these pools is responsible for handling one inbound message received on a JCA Resource Adapter configured in the server, executing the code in an associated message-driven bean (MDB) and any other EJBs or other objects invoked indirectly from that MDB. By default, all JCA Resource Adapters are created to share a single thread pool; however, it is possible to explicitly define a different thread pool for each JCA Resource Adapter.
Message Listener Thread Pool. Each thread in this pool is responsible for handling one inbound message received on a listener port, typically associated with a JMS provider that does not use the JCA specification, such as the WebSphere MQ provider. As in the JCA case, the thread will execute the code in an associated MDB and any other EJBs or other objects invoked indirectly from that MDB. There is a single message listener service and a single corresponding thread pool in each server for all the listener ports defined in that server and used by all the applications.
Although they are distinct, all these thread pools are implemented with a common pooling facility in WebSphere Application Server, and thus they can each be controlled with the same standard set of configuration parameters (minimum size, maximum size, inactivity timeout and "growable" flag).
Proper management of the thread pools is important, because threads are fairly expensive resources in terms of memory consumption in the server, as well as other low-level operating system resources. So we normally want to keep the number of threads small, though large enough to meet the performance and capacity requirements of the server. This implies that it is often desirable to enable the same group of threads to sometimes service requests associated with one application, and sometimes with another, rather than reserve a full distinct group of threads for each application. But at the same time, this approach can lead to serious performance problems: if two applications need a large number of threads from the same pool at the same time, they will clearly affect each other and you will see performance degradation. In extreme cases, if one application takes up most or all of the threads in the pool during a particular period of time, all other applications may effectively be "starved," leading to very slow performance or even "hangs."
Given that most thread pools in WebSphere Application Server are server-wide, their threads are in fact generally shared between all applications. To minimize competition between applications, which can potentially lead to poor performance and starvation, each thread pool should be sized to be large enough to accommodate the combined average steady-state requirements of all the applications in the server. When there is a temporary above-average spike in load for one application, there might be competition between applications. Threads will simply be allocated from the pool on a first-come-first-served basis; there are no standard mechanisms available in WebSphere Application Server to give priority to one application over another.
However, to avoid potentially severe thread starvation between applications, it is possible to configure a thread pool to allow thread allocation beyond maximum thread size. This means that, in situations when there is heavy activity, the thread pool will temporarily create additional threads, so that no application will ever be prevented from running on as many threads as it needs. This mechanism, however, should be used with caution:
It effectively removes any upper bound on the number of threads that the server is allowed to create, so that under conditions of extreme load, the server might create so many threads that it may be unable to function effectively, run out of memory, or even crash.
The additional threads created under these conditions are not pooled at all; they are created and destroyed each time a new request is processed. This may cause performance degradation, when compared to the normal operation of the server.
For these reasons -- although this mechanism can provide an extra level of safety against thread starvation -- it is usually preferable to carefully tune the minimum and maximum pool sizes to deal with most situations. Also, be aware that even when the number of threads in a pool is configured to be very large or unlimited, there may be other constraints that effectively limit how many requests the server can process in parallel, such as the number of HTTP connections, ORB connections, or JCA or JMS connections available to receive inbound requests.
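The behavior of a "growable" pool can be imitated with the standard java.util.concurrent classes. The sketch below is an analogy only, not the WebSphere Application Server thread pool implementation: the class name GrowablePoolSketch is hypothetical, and the overflow policy (start a plain, unpooled thread whenever all pooled threads are busy) is an assumption chosen to mirror the description above.

```java
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical analogy of a thread pool that may grow beyond its maximum.
final class GrowablePoolSketch {

    // Number of requests that ran on extra, unpooled threads.
    final AtomicInteger overflowThreads = new AtomicInteger();

    final ThreadPoolExecutor pool;

    GrowablePoolSketch(int maxPooled) {
        pool = new ThreadPoolExecutor(
                maxPooled, maxPooled, 60, TimeUnit.SECONDS,
                // No queuing: a task needs a free thread immediately.
                new SynchronousQueue<Runnable>(),
                // All pooled threads are busy: run the task on a brand-new
                // thread that is discarded afterwards. There is no upper
                // bound here, which is exactly the risk described above.
                (task, executor) -> {
                    overflowThreads.incrementAndGet();
                    new Thread(task, "overflow").start();
                });
    }

    void submit(Runnable task) {
        pool.execute(task);
    }
}
```

With a maximum of two pooled threads and four long-running requests submitted at once, two requests run on pooled threads and two on throwaway threads. Call pool.shutdown() when finished so that the idle pooled threads do not keep the JVM alive.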
In newer versions of WebSphere Application Server, the asynchronous I/O facility (AIO) also provides one additional level of relief to absorb spikes in traffic. Incoming requests (especially Web requests) can be queued in the AIO facility, rather than be rejected immediately if there are not enough threads available to service them. This helps the system react more smoothly during excessive load periods. However, as in the case of "growable thread pools" above, this solution is not perfect: queuing in the AIO facility introduces additional overhead and latency. This might be acceptable during short periods of heavy activity, but it should not be relied on for the steady-state operation of the server and its applications. The queues in the AIO facility, like the thread pools to which it dispatches requests, are also server-wide, and are thus shared between all applications.
Finally, in the case of thread pools associated with JCA resource adapters, it is in fact possible to use a different thread pool for different resource adapters. For example, if two applications access different JCA resources, and therefore use two different resource adapters, they could use separate thread pools or a single shared thread pool. Even if two applications access the same JCA resource, they could also define two distinct resource adapters for that same JCA resource, and thus be able to use two distinct thread pools. The trade-off on whether or not to use multiple thread pools will have to be evaluated on a case-by-case basis, subject to the same considerations about thread-related resource consumption and possible thread starvation discussed above.
The entire discussion in this section has been focused only on keeping the server functioning as efficiently and robustly as possible, and not on problem determination. For the large majority of problems that involve an investigation of the threads in the server (hangs, crashes, and so on), your ability to troubleshoot these problems is largely independent of whether these threads are shared between applications. The key troubleshooting technique for these problems is typically to obtain a stack trace of the various threads involved, and to determine, from looking at the methods and classes involved in that stack trace, which application is currently executing on that thread and exactly what that application is doing. This depends on your ability to recognize the methods and classes in question, not to recognize the thread itself, per se.
JCA and JDBC connection pools
Each J2C connection factory and each JDBC datasource has its own pool of connections to the corresponding remote system (Enterprise Information System or database). These connections are shared between all applications that make use of the same J2C connection factory or JDBC datasource, and they are allocated from each pool on a first-come-first-served basis across all requesting applications.
The considerations and trade-offs associated with connections are very similar to those associated with threads and thread pools discussed in the previous sections. Each connection typically consumes a significant number of resources in the application server itself, but also in the backend system that is at the other end of each connection. Therefore, it is desirable to pool and share connections as often as possible. But just as with threads, connections are also a finite resource, subject to issues of contention and starvation when they are shared between multiple applications.
The techniques for mitigating these concerns are also the same as they are for thread pools: configure the size of each shared connection pool carefully, based on the expected combined demands from all applications, or arrange for each application to use a separate J2C connection factory or JDBC datasource, even if they all ultimately connect to the same backend system. Many customers tend to allocate too many connections "just to be safe." Most applications, however, fit well with the traditional "funnel threading" model (applications spend more time processing data and building the presentation layer than they do retrieving the data). Therefore, most applications require more threads in the Web container than they do in the JDBC/JCA layer. Of course, the most accurate measurements come from a load test that simulates production conditions to correctly evaluate the size of these pools.
In the case of using multiple J2C connection factories or JDBC datasources that connect to the same backend system, special attention must be given to the possibility that a single transaction, perhaps spanning multiple applications in the server, might end up using multiple connections to the same backend system. This situation is not necessarily incorrect, but it may force more expensive two-phase commit coordination of the transaction. When all the connections allocated from one or several applications come from a single pool, the WebSphere Application Server connection management subsystem can, if desired, be configured to enable the J2EE shareable connections facility, which cannot be used across multiple independent connection pools.
When using shareable connections, WebSphere Application Server will ensure that, even if a given transaction requests multiple connections to the same backend system, it is in fact always given a copy of the same (single) connection. For some application usage patterns, this can obviously reduce the total number of connections in use, and thus the potential for contention and starvation. But know that, conversely, shareable connections also carry the risk of increased connection contention in other application usage patterns. Since the system cannot know in advance if a given transaction will request the same connection several more times before it completes, that connection is kept allocated for that transaction for the entire duration of the transaction, even if the application appears to explicitly release it. Therefore, use of the shareable connection facility might reduce the total number of connections used per transaction, but may increase the interval during which each connection is in use, and thus make these connections unavailable for allocation to other applications or transactions. On balance, whether the use of shareable connections will in fact result in an overall decrease or increase in contention for connections will depend considerably on the particular connection usage pattern of each specific application.
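The trade-off can be made concrete with a rough back-of-the-envelope model, sketched here in Java. The model and the class name ConnectionHoldModel are assumptions for illustration: pool occupancy is measured in "connection-milliseconds", unshareable connections are assumed to be held only while in use, and a shareable connection is assumed to be held for the whole transaction.

```java
// Hypothetical model comparing pool occupancy per transaction.
final class ConnectionHoldModel {

    // Unshareable: each of the 'handles' connections is held only
    // for the time it is actually in use.
    static long unshareableConnectionMillis(int handles, long millisPerUse) {
        return handles * millisPerUse;
    }

    // Shareable: a single connection, but it stays allocated until
    // the transaction completes.
    static long shareableConnectionMillis(long transactionMillis) {
        return transactionMillis;
    }

    // True when sharing occupies the pool for less time overall.
    static boolean sharingReducesOccupancy(int handles, long millisPerUse,
                                           long transactionMillis) {
        return shareableConnectionMillis(transactionMillis)
                < unshareableConnectionMillis(handles, millisPerUse);
    }
}
```

Under this model, a 200 ms transaction that uses three connections for 100 ms each benefits from sharing (200 versus 300 connection-milliseconds), whereas a 500 ms transaction that touches the database once for 20 ms does not (500 versus 20).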
Note that WebSphere Application Server V6.x also provides a facility for defining application-scoped resources, which are defined from the beginning to be associated with -- and deployed with -- a single application. In this case, any questions of sharing between applications are obviously moot. This may greatly simplify the management of many resources.
Common libraries and other common code
Sometimes, multiple applications make use of the same library of utilities, packaged either as a JAR file (for Java code) or as a native library (for code accessed through JNI). These can be embedded in the application EAR file (in which case they are necessarily private to that application), or they can be managed through shared library objects in WebSphere Application Server. Shared libraries can be configured either at the server level (in which case a single copy of the library is accessible to all the applications on the server), or at the application or module level (in which case each application or module that references the shared library has its own copy of the library, through a different Java classloader).
Similarly, in some other cases, there might be common functions provided by a Web or EJB module that are used in more than one application. We then have a choice of deploying a separate copy of the common module in each application that requires it, or arranging for all applications to remotely access a single copy of the common module that is physically deployed in a single application (perhaps in a separate application whose sole purpose is to host the common modules for the benefit of all other applications on the server).
This presents a non-trivial trade-off. In favor of using a single copy of each common code component, the memory used by many common components (code and data) is often significant, so using multiple copies of the same component can substantially increase the memory footprint of the server. On the other hand, using multiple independent copies of a common component could in some cases prevent a failure from affecting multiple applications. Most common components include not just code, but also state information, in the form of Java static variables, singleton objects, and native memory buffers, as well as any other resources that are indirectly referenced by that common component. If any of that state information becomes corrupted (typically because of a defect or unexpected circumstance in the implementation of the component), all applications that share the same copy of that component will most likely be affected. In addition, if the code in a common component contains a bottleneck (such as a single monitor or critical section that must be traversed by many concurrent application requests), all applications that share the same copy will compete against each other, thereby magnifying the effect of the bottleneck. Finally, of course, having multiple separate copies makes it easier to replace the component with a different implementation or version when required for a particular application, without affecting the stability of others.
In practice, this trade-off will have to be evaluated on a case-by-case basis, for each application and each common library or module.
The WebSphere Application Server runtime creates a sophisticated hierarchy of Java classloaders to orchestrate the loading of all Java classes in the various Web modules, EJB modules, and dependent libraries of each application. Classloaders are important from the perspective of sharing or isolation between various classes loaded inside the application server JVM. Two classes that have been loaded by different classloaders have strictly limited visibility to each other, and therefore limited potential for interfering with each other in undesired ways. Furthermore, it is possible to load multiple copies of the same class (same implementation or different implementation) if they are loaded by different classloaders. The previous section on managing common libraries and other common code provides further elements for consideration about the advantages and disadvantages of sharing class-related objects.
A complete discussion of all the subtleties of the various classloader policies is outside the scope of this article, but from the perspective of co-located applications, we should note one particular aspect of the classloader policy that directly affects sharing between multiple applications: In the global configuration for each application server, you must choose a specific server-wide classloader policy of Single or Multiple. When the Single server classloader policy is in effect, a single classloader is used to load most of the classes for all the applications that are deployed on the server. Conversely, when the Multiple server classloader policy is in effect, a separate classloader is used for each application.
Therefore, to enable any kind of separation between classes in different applications, including shared libraries as discussed above as well as any other potential interactions between applications, the server classloader policy needs to be set to Multiple. As a rule of thumb, unless there are strong constraints due to a specific application or environment, it is usually preferable to use the Multiple server classloader policy.
JVM heap memory
Except in some very specialized new JVMs used for real time processing, there is a single JVM heap in each server that contains all the instances of all Java objects allocated by all applications and by all components of the WebSphere Application Server runtime. This heap is thus, by definition, shared between all applications, and one application that allocates an excessive number of objects can easily affect other applications by not leaving enough space on the heap for them to function, or by causing frequent garbage collections.
There is no intrinsic mechanism in a standard WebSphere Application Server JVM to limit access to the heap by each application. Your strategy, therefore, must rely on ensuring that each application is well behaved, with respect to its use of JVM heap resources. Each application should be carefully tested to ensure that it does not suffer from memory leaks.
If something does go wrong and the heap does become overloaded, the best strategy to facilitate the analysis of its contents and determine the application responsible is to rely on the names of application specific classes whose instances take up space in the heap. If each application uses different classes, it will be easier to pinpoint the source of the problem.
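The idea of attributing heap usage to an application by class name can be sketched as follows. This is a minimal illustration, assuming that a class histogram has already been extracted from a heap dump as rows of (class name, instance count, bytes). The package prefixes, application names, and sample figures are all hypothetical.

```python
from collections import defaultdict

# Hypothetical mapping from package prefix to owning application.
APP_PREFIXES = {
    "com.example.orders.": "OrdersApp",
    "com.example.billing.": "BillingApp",
}

def owner_of(class_name, prefixes=APP_PREFIXES):
    """Return the application that owns class_name (longest matching prefix wins)."""
    best = None
    for prefix in prefixes:
        if class_name.startswith(prefix) and (best is None or len(prefix) > len(best)):
            best = prefix
    return prefixes[best] if best is not None else "(shared or runtime)"

def bytes_by_application(histogram, prefixes=APP_PREFIXES):
    """Sum the bytes per application from (class_name, instances, bytes) rows."""
    totals = defaultdict(int)
    for class_name, _instances, size in histogram:
        totals[owner_of(class_name, prefixes)] += size
    # Largest consumer first.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)

# Made-up sample rows, for illustration only.
sample = [
    ("com.example.orders.model.OrderLine", 120000, 5760000),
    ("com.example.billing.Invoice", 3000, 144000),
    ("java.lang.String", 500000, 12000000),
    ("com.example.orders.cache.Entry", 90000, 2880000),
]

for app, total in bytes_by_application(sample):
    print(app, total)
```

Classes that every application uses, such as java.lang.String, cannot be attributed this way, which is why distinct application-specific class names matter.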
Application deployment artifacts
These include artifacts such as EAR files, WAR files, Web and EJB modules, and so on. Except in very special cases, there is usually no reason to combine multiple, logically-different applications within a single deployment artifact, and no great benefit in terms of server resource usage and performance. Each application should have its own EAR file and its own collection of modules within that EAR file. Doing so will enable you to deploy, update, administer, and monitor each application independently, and to start and stop each application independently, which can be very handy when trying to isolate the source of a problem.
Virtual hosts associated with each Web application
There are few or no savings to be gained by sharing a virtual host between applications (other than possibly using fewer ports for HTTP connections). On the other hand, assigning a different virtual host to each application, along with a distinct and readily-identifiable context root, can make it much easier to track down the flow of a Web request through the system in trace files and match it to a given application. Therefore, if the organization of the Web sites that host the various applications allows it, it is a good idea to assign a different virtual host to each application.
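As a rough illustration of why distinct virtual hosts and context roots help, the following sketch matches a request URL taken from a log or trace file to its owning application. The host names, context roots, and application names are hypothetical, and the routing table is an assumption made for this example rather than something read from the server configuration.

```python
from urllib.parse import urlsplit

# Hypothetical routing table: (virtual host name, context root) -> application.
ROUTES = {
    ("orders.example.com", "/orders"): "OrdersApp",
    ("billing.example.com", "/billing"): "BillingApp",
}

def application_for(url, routes=ROUTES):
    """Return the application that owns the URL, or None if nothing matches."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    path = parts.path or "/"
    for (vhost, context_root), app in routes.items():
        # The path must equal the context root or continue below it.
        under_root = path == context_root or path.startswith(context_root + "/")
        if host == vhost and under_root:
            return app
    return None

print(application_for("http://orders.example.com:9080/orders/cart?id=7"))
```

When every application has its own host and context root, each URL maps to at most one application, so a lookup like this is unambiguous.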
Messaging resources include service integration buses, JMS providers, queues, topics, destinations, and so on. These types of resources are allocated server-wide (possibly attached to some other resource that is also server-wide, such as a JMS provider), and can be freely accessed by one or several applications within the server.
There is typically little concern for contention and starvation with respect to these types of resources, but like any other type of object in the server, they do consume some amount of memory, which might dictate whether it is preferable to create a single shared copy of some entities or multiple independent copies.
It might be especially beneficial to define separate messaging resources for each application, to enable them to be started, stopped, and monitored separately to facilitate various troubleshooting tasks.
4. Perform end-to-end monitoring on each application
Careful planning of application packaging and messaging resources pays off most visibly in the monitoring environment. Monitoring tools, such as the IBM Tivoli® ITCAM suite of tooling, can report overall system health with the ability to show the performance of individual elements in the environment. Good, enforced naming conventions enable the administrator and system team to easily identify the problem component and its owner.
Basic problem determination with Tivoli Performance Viewer
The Tivoli Performance Viewer, provided with WebSphere Application Server, supports a basic display of PMI data from within an application server. The Tivoli Performance Viewer enables the administrator to view statistics on internal WebSphere Application Server resources, such as the JDBC connection pools, the Web container threads, and other resources controlled by the application server and not readily accessible by more traditional system tools.
The metrics displayed by Tivoli Performance Viewer range from lightweight measurements, such as pool activity, to more intensive measurements, like EJB method-level response times. While not all measurements are appropriate for a production, high-volume application, they do provide useful insight into application behavior, especially in a test environment.
For example, in the test environment, the tester or administrator might enable method-level metrics to show response times within the components of an application (or set of applications) under test. If the servlets, JSPs, and EJBs all use consistent naming standards, the administrator can easily match an element with a long response time to a particular owner.
This same principle of consistent naming also provides benefits with more sophisticated tooling. The Tivoli ITCAM for WebSphere products enable you to monitor your applications at a high level, with subsequent drill-down for more detailed problem determination analysis. The detailed analysis functions of this tool enable the administrator to see detailed flows and metrics, including a leak detection analysis that points out the line of code responsible for the leak. Clearly, at this level of analysis, a good class naming scheme supports debugging and problem reporting for environments shared by multiple applications. This tool also supports filtering of monitors by class name. For example, during detailed analysis, you might want to simplify the data gathering and reduce the overhead by focusing the metrics on the classes belonging to one specific application. Again, an enforced package naming scheme makes this easier.
WebSphere Application Server also supports the Application Response Measurement (ARM) standard. Many of the internal elements of WebSphere Application Server (the Web container, the EJB container, and others) are ARM-instrumented. Tools, such as the ITCAM for Response Time Tracking, can report the various components of a transaction's response time as it arrives at the HTTP Server, through the application server, and even into the database tier. Typically, ARM instrumentation is filtered to a synthetic transaction fired at regular intervals to reduce measurement overhead.
You can increase the granularity of the metrics reported by instrumenting your code to the ARM standard. IBM provides an easy solution for this through the Build To Manage (BTM) Toolkits, now available on developerWorks. BTM provides a simple Eclipse interface that enables users to select the classes for instrumentation. BTM then generates the ARM code for those classes automatically (Figure 1). Again, by differentiating the classes by a strong naming standard, ARM helps you better pinpoint the problem classes and their owners.
Figure 1. Build To Manage instrumentation example
5. Separate administrative functions between applications
In environments where there are several fairly distinct applications -- possibly developed and maintained by different teams -- there is often a requirement to separate the administrative functions so that each team can administer the deployment and operation of its own application without having or requiring access to other applications. This type of separation is only possible to a limited degree in a configuration where the multiple applications are co-located on a single application server. There are a number of server-wide resources that must, by their nature, be administered by someone who has administrative access to at least the entire server. However, it is possible to separate some administrative functions, such as starting, stopping, and configuring an application, on a per-application basis. The techniques available for doing so are essentially the same in a single-server environment as in a traditional multi-server environment:
Create a number of wsadmin scripts for each of the most common administrative operations. Ensure that each script is configured to access sufficient credentials to operate on the server, but restrict access to the scripts themselves to each appropriate group of users. This approach, though relatively complex, is possible in all current versions of WebSphere Application Server.
Alternatively, WebSphere Application Server V6.1 includes a feature called fine-grained administrative security, with which it is possible to define a different authorization group for each application, and grant access to that authorization group to only certain users in the WebSphere Application Server authentication domain. Be aware that this facility only works for access through the wsadmin tool. For effective use of the admin console, a user needs broad access to the entire system.
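The first approach, per-application wsadmin scripts, might look something like the sketch below. It is written in the Python-compatible style used by wsadmin Jython scripts and relies on the AdminControl queryNames and invoke calls against the ApplicationManager MBean. The allow-list, the function name, the server and application names, and the FakeAdminControl stand-in (included only so the sketch can be exercised outside wsadmin) are hypothetical. Credential handling and file permissions on the script itself are not shown.

```python
# Hypothetical per-team allow-list; each team's copy of the script carries its own.
ALLOWED_APPS = ("OrdersApp",)

def set_application_state(admin_control, server_name, app_name, action,
                          allowed_apps=ALLOWED_APPS):
    """Start or stop one application, refusing anything outside the allow-list."""
    if app_name not in allowed_apps:
        raise ValueError("application not permitted for this script: " + app_name)
    operations = {"start": "startApplication", "stop": "stopApplication"}
    if action not in operations:
        raise ValueError("unknown action: " + action)
    # Locate the ApplicationManager MBean of the target server.
    pattern = "type=ApplicationManager,process=" + server_name + ",*"
    found = admin_control.queryNames(pattern)
    if not found:
        raise RuntimeError("no ApplicationManager found for server " + server_name)
    app_manager = found.splitlines()[0]
    return admin_control.invoke(app_manager, operations[action], app_name)

class FakeAdminControl:
    """Stand-in used only to exercise the sketch outside wsadmin."""
    def __init__(self):
        self.calls = []
    def queryNames(self, pattern):
        return "WebSphere:type=ApplicationManager,process=server1,name=ApplicationManager"
    def invoke(self, name, operation, params):
        # Record the request instead of contacting a real server.
        self.calls.append((operation, params))
        return ""

fake = FakeAdminControl()
set_application_state(fake, "server1", "OrdersApp", "stop")
print(fake.calls)
```

Inside wsadmin, the real AdminControl object would be passed in place of the stand-in. The allow-list is only a convenience check; the actual protection still comes from restricting who can read and run each script.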
6. Always consider the big picture, system-wide
Finally, although most of the discussion in this article has focused on ways to identify and isolate information pertaining to one particular application, no article on troubleshooting is complete without a reminder that troubleshooting always remains a holistic exercise. Even in the best designed system, there is always the potential for unforeseen interactions between applications, between servers, and between each server and various external services or resources. Therefore, a good troubleshooter should never forget to "keep an eye on the big picture" when attempting to puzzle out the cause of a problem and to understand the behavior of the system.
Deploying multiple co-located applications within a single instance of WebSphere Application Server provides valuable benefits in some environments, but it can also create some unique difficulties in the areas of problem isolation and problem determination. In most cases, these difficulties can be effectively addressed by carefully designing and managing the sharing of each type of resource within the server, by keeping good track of the identification of each application-related object or resource, and by using good discipline and sound techniques for system monitoring and troubleshooting.