IBM WebSphere Virtual Enterprise helps to optimize IBM WebSphere Application Server Network Deployment (hereafter referred to as Network Deployment) environments by intelligently managing the workload in a Network Deployment topology. In addition, WebSphere Virtual Enterprise provides capabilities to easily manage the deployment and health of Network Deployment applications, thereby creating a more resilient and efficient WebSphere Application Server environment.
WebSphere Virtual Enterprise primarily acts as a Quality of Service (QoS) enhancer for an existing Network Deployment environment. One of the areas in which WebSphere Virtual Enterprise enhances QoS is describing and enforcing service level agreements. An on demand router (ODR) component introduced in WebSphere Virtual Enterprise acts as a smart reverse proxy for managing the flow of traffic. The ODR also helps to enforce the service level agreements defined for the applications.
WebSphere Virtual Enterprise improves the efficiency of Network Deployment environments by shifting capacity between nodes in a cell to meet workload demands. This feature enables WebSphere Virtual Enterprise to tap into spare capacity on idle nodes, and to dynamically make that idle capacity available to applications that are in need of CPU resources to meet their service level agreements -- thereby increasing your hardware return on investment (ROI).
WebSphere Virtual Enterprise also has features to easily manage the update of Network Deployment applications while minimizing downtime. An application edition manager can manage multiple versions of the same application in a Network Deployment cell and manage the rollout of application updates across a Network Deployment cell, all while being sensitive to incoming traffic flow and the application’s service level agreement.
Health management features in WebSphere Virtual Enterprise improve resiliency of an application server environment. WebSphere Virtual Enterprise can automatically detect health conditions (such as memory leaks) and take automated actions, like notifying administrators or shifting traffic past unhealthy servers, once a health situation is detected.
WebSphere Virtual Enterprise V6.1 is typically deployed on top of WebSphere Application Server Network Deployment V6.1 or 7.0, although it is capable of managing other middleware servers as well. Leveraging WebSphere Virtual Enterprise features requires no code changes to existing Network Deployment applications, and no new APIs need to be implemented by applications to benefit from WebSphere Virtual Enterprise’s QoS features.
This article describes best practices and limitations learned from working with many users who have deployed WebSphere Virtual Enterprise into existing Network Deployment infrastructures. See Resources for more information on the capabilities of WebSphere Virtual Enterprise.
To begin, it is useful to understand the requirements for applications that are going to be deployed on dynamic clusters in WebSphere Virtual Enterprise. Some of the requirements listed here apply to the ability to deploy an application in a cluster. Usually, those who migrate to WebSphere Virtual Enterprise already run their application in clusters. If this describes you, then your applications already meet most of these requirements.
Application stability and performance testing
One of the common misconceptions about WebSphere Virtual Enterprise is that it can improve the performance characteristics of an application. This is not the case. Suppose, for example, that an application is assigned an unrealistic service level agreement parameter, such as an aggressive response time. WebSphere Virtual Enterprise cannot do anything to make the application respond faster than it inherently can, and so setting an aggressive response time can lead to an inefficient environment. Conversely, if the response time goals are too lax, WebSphere Virtual Enterprise will be less responsive in allocating resources to the target application. This is because WebSphere Virtual Enterprise adjusts resources that are available to an application by shaping the flow of traffic into a Network Deployment cell. By prioritizing the flow of requests into a cell, WebSphere Virtual Enterprise can control the amount of CPU and memory resources an application gets in relation to other applications.
Additionally, WebSphere Virtual Enterprise can tap under-utilized nodes in a Network Deployment cell to meet workload demands for an application by starting additional JVMs that host application instances (this is a feature of WebSphere Virtual Enterprise dynamic clusters).
However, if an application performs poorly or has an unrealistic service level agreement defined, then no amount of resources will make it perform to the desired level. In fact, it could lead to a less efficient environment as WebSphere Virtual Enterprise expends resources trying to achieve goals that are not possible.
Therefore, deploying an application into WebSphere Virtual Enterprise does not absolve you from load testing an application, or from ensuring an application’s scalability and stability. In fact, stable applications with known and well understood performance characteristics are the best candidates for a WebSphere Virtual Enterprise deployment. In addition, less stable applications can be better managed through the WebSphere Virtual Enterprise Health Management subsystem.
Reliable data from performance tests, in combination with historical statistics collected from live runs of the application where applicable, provides accurate input to the process of defining realistic service level agreements for applications deployed to WebSphere Virtual Enterprise.
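As an illustration of how load-test data can feed a realistic response time goal, consider the following minimal sketch. It is not part of WebSphere Virtual Enterprise; the function names, the choice of the 90th percentile, and the 20% headroom are assumptions made for this example only.

```python
def percentile(samples, pct):
    """Nearest-rank percentile (pct is an integer 1..100) of response times in ms."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    # Integer ceiling of pct * n / 100 avoids floating-point rounding surprises.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def suggest_goal_ms(samples, pct=90, headroom=0.20):
    """Suggest a goal: the measured percentile plus some headroom (assumed 20%)."""
    return percentile(samples, pct) * (1.0 + headroom)


def goal_is_realistic(goal_ms, samples, pct=90):
    """A goal tighter than what the application achieved under test is unrealistic."""
    return goal_ms >= percentile(samples, pct)


# Hypothetical response times (ms) collected during a load test.
load_test_ms = [120, 135, 150, 160, 180, 200, 220, 250, 300, 450]

print(percentile(load_test_ms, 90))
print(goal_is_realistic(100, load_test_ms))
```

A goal that fails this kind of check is one the application cannot meet regardless of how many resources are made available to it.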
Network Deployment provides session failover capabilities through its support for distributed sessions. You can choose from database session persistence or memory-to-memory session replication for applications running in a cluster. (You could also use IBM WebSphere eXtreme Scale to provide a highly scalable and cost effective mechanism for session persistence.) When moving from a Network Deployment cluster environment to a WebSphere Virtual Enterprise dynamic cluster environment, session failover becomes an even more important requirement. In a WebSphere Virtual Enterprise environment, the application placement controller can make run time decisions on how many dynamic cluster members need to run and where. When dynamic clusters are running in automatic mode, the application placement controller can bring down a cluster member without the direct involvement of an administrator. Because applications in a WebSphere Virtual Enterprise environment can therefore lose their sessions without an explicit application server failure, the frequency of session loss can be much higher.
Although this is an important observation, keep in mind that not all applications use sessions. Most Web services applications are stateless and do not depend on sessions; this makes them ideal candidates for deployment in a WebSphere Virtual Enterprise environment.
Self-contained EAR files
A self-contained EAR is one that does not require any modification after being installed. Some users do directly modify the expanded artifacts (such as properties files) of an EAR in the installedApps directory after application installation. Though not recommended, there are situations in a Network Deployment environment when this can be acceptable.
In a WebSphere Virtual Enterprise dynamic cluster environment, however, those manual modifications to expanded EAR files become a challenge to manage. WebSphere Virtual Enterprise normally installs the application binaries in the appropriate profile of every node that is a member of the target dynamic cluster. Federating a new node into the cell could make that node a new member of the dynamic cluster (assuming that the membership policy matches). WebSphere Virtual Enterprise would automatically deploy the application to the new node, but any manual changes to the expanded EAR would not be made. Given this issue, be sure to use only self-contained EAR files in a WebSphere Virtual Enterprise environment.
Availability of application specific resources on nodes
Applications that rely on specific resources, such as a particular shared file system or a resource manager definition, must have those resources available on all nodes that are part of the membership policy of the dynamic cluster. This becomes particularly important when new nodes are brought into the dynamic cluster due to a change in the membership policy. For example, if an application connects to IBM WebSphere MQ or IBM CICS® Transaction Gateway, then WebSphere MQ client libraries or a CICS Resource Adapter, respectively, must be available on every node in the application’s target dynamic cluster.
WebSphere Application Server resource definitions, such as data sources and JDBC providers, can be defined at a cell or cluster level, which works well with dynamic clusters. However, some applications depend on external resources to be available on the nodes where they run. Examples include the installation of native libraries for access to IBM DB2® or WebSphere MQ, but also shared application libraries that the application depends on. The system administrator has to ensure that all nodes that match the dynamic cluster membership policy have those resources installed.
Although WebSphere Virtual Enterprise does not help automate this process, you should set custom node properties for external resources. The membership policy of the dynamic cluster can then be changed to ensure that only nodes with those custom properties set will match the membership policy. This ensures that no dynamic cluster members will be defined on those nodes that do not have the required external resources installed.
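The effect of such a membership policy can be pictured with a small model. This sketch only imitates the matching logic in plain Python; it does not use the actual membership policy expression syntax, and the node names and property names (`mq.client.installed`, `db2.native.libs`) are hypothetical.

```python
def matching_nodes(nodes, required_properties):
    """Return the names of nodes whose custom properties contain every required pair."""
    return [
        name
        for name, props in sorted(nodes.items())
        if all(props.get(key) == value for key, value in required_properties.items())
    ]


# Hypothetical cell: custom node properties record which external resources are installed.
cell_nodes = {
    "nodeA": {"mq.client.installed": "true", "db2.native.libs": "true"},
    "nodeB": {"mq.client.installed": "true"},
    "nodeC": {},
}

# The application needs both resources, so only nodes carrying both properties qualify.
required = {"mq.client.installed": "true", "db2.native.libs": "true"}
eligible = matching_nodes(cell_nodes, required)
print(eligible)
```

In this model, a newly federated node would stay out of the dynamic cluster until an administrator installs the resources and sets the corresponding properties.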
Support for hardware virtualisation
WebSphere products today are often deployed on virtualised hardware. Even though WebSphere Virtual Enterprise provides a different quality of service through application virtualisation, many users wish to deploy on a virtualised hardware platform. In this context, it is important to understand the support that WebSphere Virtual Enterprise provides for virtualised hardware.
- AIX/System p: WebSphere Virtual Enterprise V6.1.1 and higher provide full support on this platform, including the support of uncapped LPARs using micro-partitioning.
- VMware: WebSphere Virtual Enterprise V6.1 uses the VMware Infrastructure SDK (VI SDK) to communicate with the VMware Infrastructure 3 platforms through Web services. Hence, any such platform that exposes the VI SDK as a Web service can work with WebSphere Virtual Enterprise. Examples include VMware ESX Version 3.5, Virtual Center 2.5 and vSphere 4.0.
See the list of supported server virtualization environments in the Information Center.
Temptation of large cells
Application virtualization in WebSphere Virtual Enterprise environments is most effective within the boundaries of a cell, which might seem to favor large cells. For example, application prioritization and application placement work within the context of a cell. Therefore, to fully benefit from goals-driven application virtualization -- and increased ROI on hardware -- it is tempting to deploy a large number of applications in one large cell across common hardware. However, basic principles of operational risk should not be ignored. Large cells can increase operational risk and become single points of failure for components or applications that reside within them.
To mitigate the risk of large cells, typical Network Deployment techniques for high availability and disaster recovery should be applied to WebSphere Virtual Enterprise as well. For example, if multiple identical cells are part of your high availability plan for Network Deployment environments, the same technique will also apply to WebSphere Virtual Enterprise environments.
Core group considerations
By default, all WebSphere Application Server processes within a single cell are members of the same core group. Core group best practices that apply to Network Deployment also apply to WebSphere Virtual Enterprise. Key points to consider when deploying (relatively) large Network Deployment or WebSphere Virtual Enterprise cells include:
- Do not exceed a total of 50 processes in a single core group. If you anticipate more processes within your Network Deployment or WebSphere Virtual Enterprise cell, you should define additional core groups and distribute the processes among all core groups.
- The processes that belong to a cluster (or dynamic cluster) cannot span multiple core groups. In other words, all members of a cluster must also be members of the same core group.
- Each core group should contain at least one WebSphere Application Server administrative process; for example, a nodeagent or deployment manager process.
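The three rules above lend themselves to an automated check of a planned topology. The sketch below is a minimal illustration in plain Python; the data layout, the process type names, and the function name are assumptions made for the example, not a WebSphere API.

```python
from collections import defaultdict

MAX_PROCESSES_PER_CORE_GROUP = 50
ADMIN_TYPES = {"nodeagent", "dmgr"}  # assumed labels for administrative processes


def check_core_groups(processes):
    """Check a planned topology against the three core group rules.

    processes: list of dicts with keys name, type, core_group, cluster (or None).
    Returns a list of human-readable problems; an empty list means no rule is broken.
    """
    problems = []
    by_group = defaultdict(list)
    groups_per_cluster = defaultdict(set)
    for proc in processes:
        by_group[proc["core_group"]].append(proc)
        if proc.get("cluster"):
            groups_per_cluster[proc["cluster"]].add(proc["core_group"])

    for group, members in sorted(by_group.items()):
        if len(members) > MAX_PROCESSES_PER_CORE_GROUP:
            problems.append(f"{group}: {len(members)} processes exceeds the limit")
        if not any(m["type"] in ADMIN_TYPES for m in members):
            problems.append(f"{group}: no administrative process")

    for cluster, groups in sorted(groups_per_cluster.items()):
        if len(groups) > 1:
            problems.append(f"{cluster}: members span core groups {sorted(groups)}")
    return problems


# Hypothetical plan: one member of dynCluster1 was placed in a second core group.
plan = [
    {"name": "nodeagent1", "type": "nodeagent", "core_group": "cg1", "cluster": None},
    {"name": "member1", "type": "appserver", "core_group": "cg1", "cluster": "dynCluster1"},
    {"name": "member2", "type": "appserver", "core_group": "cg2", "cluster": "dynCluster1"},
]
for problem in check_core_groups(plan):
    print(problem)
```

For the hypothetical plan shown, the check reports that cg2 has no administrative process and that dynCluster1 spans two core groups.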
Prior to WebSphere Virtual Enterprise V6.1.1, there was an additional requirement that all core groups within the same cell be bridged together. WebSphere Virtual Enterprise V6.1.1 introduced an option to use the Bulletin Board over Service Overlay Network (BBSON) implementation. Although not the default, you should use BBSON because it can dramatically lower the CPU overhead of WebSphere Virtual Enterprise, plus it removes the requirement for the core groups to be bridged together, which simplifies WebSphere Virtual Enterprise deployment.
Avoid cells that span two data centers
From a topological point of view, best practices for Network Deployment topologies also apply to WebSphere Virtual Enterprise. Cells that span data centers are not recommended for Network Deployment. Tom Alcott describes in detail why in the third and fifth articles in his WebSphere Application Server series of frequently asked questions.
By the same token, you should not have WebSphere Virtual Enterprise cells that span multiple data centers. The topology shown in Figure 1 provides the best stability and reduces the operational risk involved in case of a data center outage. Be aware that the cells are completely independent; no core group bridging has been set up between the cells. Hence, the ODR tiers only route requests to application servers in their own data center.
In this topology, it’s obvious that the application placement controller in one data center will not be able to tap into idle capacity in the other data center. However, the improved resilience of the solution in case of a data center outage should not be underestimated.
Figure 1. Avoiding WebSphere Virtual Enterprise cells that span two data centers
Another option contemplated by users is that of cross-data center routing. That is, should HTTP plug-ins only route to ODRs in the same data center, or should they be allowed to route to foreign ODRs?
On principles of operational independence of data centers, HTTP plug-ins should not be routing to foreign ODRs. However, there are cases where such an option might be beneficial. For example, if all ODRs local to the HTTP server’s data center are down, then foreign ODRs could be a valid target for the purposes of failover to a functioning data center. Therefore, routing to foreign ODRs could be considered an exception for the purpose of making applications highly available in case all ODRs in a data center fail. This can be achieved by making foreign ODRs secondary targets in the HTTP plug-in.
Another reason for routing to foreign ODRs might be session affinity. Some users experience scenarios where session-sensitive traffic arrives at an HTTP server in the wrong data center (that is, the session exists in another data center). In this case, it also becomes necessary for the plug-in to route traffic to a foreign ODR for the purpose of maintaining session affinity.
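The routing preferences described in this section can be summarized as decision logic. The following sketch is only a model of that logic in plain Python; the real behavior is configured in the HTTP plug-in rather than coded, and the function name, data layout, and ODR names are hypothetical.

```python
def choose_odrs(odrs, local_dc, session_dc=None):
    """Return candidate ODR names in order of preference.

    odrs: list of dicts with keys name, data_center, up.
    session_dc: data center that holds the session, if the request belongs to one.
    """
    available = [odr for odr in odrs if odr["up"]]

    # Session affinity: a session held in another data center is served there.
    if session_dc is not None and session_dc != local_dc:
        affinity = [o["name"] for o in available if o["data_center"] == session_dc]
        if affinity:
            return affinity

    # Normal case: stay inside the HTTP server's own data center.
    local = [o["name"] for o in available if o["data_center"] == local_dc]
    if local:
        return local

    # Exception: every local ODR is down, so fail over to foreign ODRs.
    return [o["name"] for o in available if o["data_center"] != local_dc]


odrs = [
    {"name": "odr1-dc1", "data_center": "dc1", "up": False},
    {"name": "odr2-dc1", "data_center": "dc1", "up": False},
    {"name": "odr1-dc2", "data_center": "dc2", "up": True},
]
print(choose_odrs(odrs, local_dc="dc1"))
```

Foreign ODRs appear in the result only for the two exceptions discussed above; in every other case the request stays within its own data center.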
Avoid topologies that involve cross-cell bridging
Though technically possible, do not create topologies where ODRs route traffic across multiple WebSphere Application Server cells. In practice, the required cross-cell bridging often results in an unstable solution, not to mention increased complexity. In the case where the ODR routes to generic server clusters -- that is, non-WebSphere Application Server clusters or foreign WebSphere Application Server clusters whose lifecycle is not fully managed by the cell in which the ODR resides -- cross-cell bridging is not required. In fact, it is possible to route to multiple cells if the target servers are defined as generic server clusters.
Avoid co-locating ODRs and application servers
While technically there are no restrictions on co-locating ODRs and application servers, you should not implement this configuration if the shared machine does not have sufficient CPU resources to handle peak ODR and peak application loads.
The ODR runs the application request flow manager (ARFM), which uses CPU resources that can no longer be used by the co-located application servers. At the same time, ARFM collects CPU (and memory) statistics from the shared machine in order to make decisions on flow control. As the processing power consumed by the ODR varies greatly with workload, CPU consumption on the shared machine will vary with the workload driven to the ODR. In other words, the ARFM statistics of the co-located application server are distorted by ODR workload. ARFM will treat the ODR workload as highly variant background noise on the shared application server machine, which could lead to oscillations in application server throughput if the shared machine does not have the CPU resources to handle both the ODR load and application load. However, if the shared machine has been sized appropriately, such a scenario could work well.
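The sizing question reduces to simple arithmetic on measured peaks. The sketch below assumes CPU figures expressed as percentages of the shared machine and a 10% safety margin; both the function name and the margin are assumptions for illustration, not product guidance.

```python
def can_colocate(peak_odr_cpu, peak_app_cpu, capacity=100.0, headroom=10.0):
    """True if peak ODR load plus peak application load fits within the
    machine's CPU capacity, leaving the given headroom free (all in percent)."""
    return peak_odr_cpu + peak_app_cpu <= capacity - headroom


# Hypothetical measurements from load tests.
print(can_colocate(peak_odr_cpu=25.0, peak_app_cpu=60.0))
print(can_colocate(peak_odr_cpu=35.0, peak_app_cpu=60.0))
```

The important point is that both peaks are added together, because the ODR and the application servers can reach peak load at the same time.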
Dedicated ODRs for administrative traffic
WebSphere Virtual Enterprise provides a highly available deployment manager feature out of the box. This works on an active/passive model, where a hot standby deployment manager takes over if the active deployment manager fails.
In a highly available deployment manager setup, administrative traffic should also flow through an ODR. The reason is that the ODR is aware of which deployment manager is active at a given time; hence, an admin client (such as an admin console user or wsadmin script) only needs to know the ODR address, and the ODR forwards admin requests to the active deployment manager.
The next question is: Should application and administrative traffic share ODRs?
The answer is that it depends. Some users consider it a security risk to have administrative requests and application requests flow through the same JVM; for security reasons, they do not want threads with administrative credentials in the same JVM as threads with non-administrative or application specific authority. If this is the case, then having dedicated ODRs for administrative traffic is the solution. Know that at least two ODRs will be needed for high availability, as shown in Figure 2.
Figure 2. Dedicated ODR tier for administrative traffic in an HA deployment manager environment
Challenges resulting from cell consolidation
Consolidating cells across organizations
After moving to a virtualized infrastructure, application teams or organizations will end up sharing resources. In particular, multiple teams might end up sharing the same WebSphere Application Server cell. These challenges can arise as a result:
- Since multiple teams will share the same cell, administrative responsibility will have to be changed. If cell administration is not centralized (via a central operations team), then administrators from each organization will have full access to each other’s applications, which isn’t always acceptable. An administrator from organization A, for example, can stop organization B’s application servers.
WebSphere Application Server Network Deployment V7.0 provides a partial solution to this situation, in the form of a fine-grained security model that enables administrative roles to be set at the cell, node, node group, cluster, server, and application level. This feature can be used to disallow personnel from other organizations to control your application servers or applications. Documentation on this feature is available in the WebSphere Application Server Information Center.
Remember that only specific WebSphere Virtual Enterprise V6.1 fix levels are supported on Network Deployment V7.0. Check the complete list of supported software for the exact versions.
- One of the components of a service policy (service level agreement) in WebSphere Virtual Enterprise is relative business priority. In a consolidated cell, service level agreements determine which applications get priority on a shared environment. Determining relative business priority of applications can be politically challenging. To deal with this issue, service level agreements should be determined by the business teams, rather than application teams, so that priorities can be set by business level decision makers based on the relative priority of different parts of the business.
Security considerations in large consolidated cells
Many organizations possess several user repositories (such as LDAP). This could be a direct result of technical strategy or, perhaps, a side effect of previous mergers with other companies. In the context of cell consolidation, multiple user repositories can become a problem.
With the federated repository feature in Network Deployment V6.1, multiple registries can be federated and viewed as a single registry. This can be a solution when you are trying to consolidate cells. However, the problem is that with a federated repository there can only be one security realm, which could be unacceptable. For example, if you intend to maintain user isolation and different security policies for different user populations, you might want two separate security realms, one for internal users (employees) and one for external users (customers).
Network Deployment V7.0 provides a solution to this with a feature that enables multiple independent user registries, with independent security realms, to be connected to the same cell. If applications between two realms need to communicate, trust can be established between the two security realms. This feature enables cells with two separate registries to be consolidated -- while still maintaining principles of user and security policy isolation.
For more information, see the WebSphere Application Server V7.0 Information Center.
Application deployment in large consolidated cells
Cell consolidation typically leads to more applications sharing the same cell. Users with high application deployment or update volumes might find that application deployment becomes a bottleneck because Network Deployment only permits one application to be deployed at a time.
To alleviate this, you must strike a compromise so that parallel deployment is minimized; for example, your application teams could agree upon a deployment schedule. In a worst case scenario, applications that tend to get updated or deployed at the same time will need to be placed in different cells under the pretext of isolation requirements.
Other considerations for WebSphere Virtual Enterprise deployments
Dealing with secondary requests routed through ODR
Today’s enterprise applications tend to touch multiple components or services when handling a single request. For example, a client (browser) might perform an HTTP POST, which first hits a servlet in a Web container, which in turn makes a Web services call to an application hosted in a separate application server or cluster. Figure 3 illustrates this example.
Figure 3. HTTP request routing in a Network Deployment environment
When deployed in a WebSphere Virtual Enterprise environment, this example would look similar to Figure 4. The client hits the ODR, which routes the request to an application server hosting the Web application. The Web application makes a call to the Web services application (in what is referred to as a secondary request).
Figure 4. Routing HTTP requests through the ODR in a WebSphere Virtual Enterprise environment
When deploying WebSphere Virtual Enterprise, it can be tempting to deploy both the Web application and the Web services application in separate dynamic clusters. This would enable traffic to the Web services application to be prioritised and load-balanced. However, this would require all secondary requests to be routed through the ODR as shown in Figure 5.
Figure 5. Routing secondary HTTP requests through the ODR in a WebSphere Virtual Enterprise environment
This is not a recommended architecture and should be avoided. Implementing this configuration requires that all secondary requests be associated with their own service policy, that the Web services application be running on its own cluster, and that the service policy have the highest priority configured. However, even in a best-case scenario, this architecture can cause deadlock issues.
Autonomic request flow manager rejection policies
The autonomic request flow manager (ARFM) is responsible for ensuring that requests are classified, prioritized, and queued before they are dispatched to their target application servers. ARFM will try to avoid overloading the nodes that host the dynamic clusters running those applications. However, there are peak load scenarios where ARFM cannot both meet the service policy goals and avoid overloading the nodes. WebSphere Virtual Enterprise provides you with different rejection policies that determine the behaviour of ARFM under such conditions. You can find these policies under Operational Policies > Autonomic Managers > Autonomic Request Flow Manager in the Integrated Solutions Console.
Figure 6. ARFM settings panel in a WebSphere Virtual Enterprise enabled administrative console
The two basic rejection policies are:
- Reject no incoming request, regardless of the impact they may have on service level agreements.
When enough workload is driven to stress the system to the configured CPU threshold and cause resource contention between transaction classes, then service policies will be breached. ARFM will continue to accept requests regardless of the impact on those policies; in other words, you are in danger of violating some of your application service level agreements.
However, ARFM ensures that lower priority service policies are breached to a greater extent than higher priority ones. That is, lower priority work will suffer more than higher priority work. ARFM will also honor the configured CPU threshold.
- Reject incoming requests if their acceptance will have a negative impact on service level agreements.
When enough workload is driven to stress the system to the configured CPU threshold and reach a point where service policies might potentially be breached, then ARFM will start rejecting requests that are not part of an existing session. By rejecting certain requests, ARFM keeps enough capacity available for the requests it does accept to be handled within the service policy goals.
A critical side effect of this rejection policy is that it can also start rejecting requests associated with the highest priority service policies. While ARFM will reject more low priority requests than higher priority ones, many business scenarios simply cannot tolerate the rejection of any request associated with the highest priority service policies. Unfortunately, no ARFM policy currently exists that guarantees that the highest priority requests will never suffer rejection. Therefore, ARFM rejection policies should only be used if some rejection of high priority requests is acceptable to the business.
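The difference between the two policies can be modeled with a deliberately simplified admission check. This is not how ARFM is implemented: the real decision also weighs service policy goals and priorities, whereas this sketch looks only at the CPU threshold and session membership. The policy constants and function name are hypothetical.

```python
REJECT_NONE = "reject_none"
REJECT_ON_SLA_IMPACT = "reject_on_sla_impact"


def admit(policy, cpu_utilization, cpu_threshold, has_session):
    """Simplified model: decide whether a request is accepted.

    cpu_utilization and cpu_threshold are percentages; has_session is True
    when the request belongs to an existing session.
    """
    if policy == REJECT_NONE:
        # Every request is accepted; service policies may be breached instead.
        return True
    if policy == REJECT_ON_SLA_IMPACT:
        overloaded = cpu_utilization >= cpu_threshold
        # Under overload, only requests tied to an existing session get in.
        return has_session or not overloaded
    raise ValueError("unknown policy: %r" % (policy,))


print(admit(REJECT_ON_SLA_IMPACT, cpu_utilization=95, cpu_threshold=90, has_session=False))
print(admit(REJECT_NONE, cpu_utilization=95, cpu_threshold=90, has_session=False))
```

Note that nothing in the second policy exempts high priority work: a new request without a session can be turned away whatever its service policy, which is the side effect described above.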
Using ODR to serve custom error pages
The ODR in WebSphere Virtual Enterprise is capable of intercepting standard HTTP error codes, which can be generated by the ODR or received from target application servers, and then forwarding the HTTP error code and offending URL to a custom error page. Custom error pages can be deployed on application servers or Web servers; the latter is shown in Figure 7.
Figure 7. Custom error pages hosted on Web servers behind the ODR tier
If you already serve static custom error pages from your Web servers or have an existing custom error page strategy in place, you will not derive much benefit from using this ODR feature.
Rolling out applications with Edition Manager
The rollout feature of the WebSphere Virtual Enterprise Edition Manager is a powerful tool for deploying updates to an application. It is possible to deploy an updated application without interrupting the service, meaning the application is continuously available.
However, sometimes application updates also involve changes in external resources at the same time, such as database schema changes, a new shared library, or changes in queue manager definitions. In these cases, the Edition Manager cannot provide continuous application availability because these are changes outside the context of the application EAR file. However, Edition Manager does simplify the process by providing a simple mechanism to roll back to the previous release. Edition Manager can also be used to roll out the new edition to a test cluster in the production environment, which improves the chances of a successful application upgrade.
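The general idea behind an uninterrupted rollout is that only part of a cluster is updated at any one time. The sketch below shows that generic rolling-update idea in plain Python; it is not the Edition Manager's actual algorithm, and the function and server names are hypothetical.

```python
def rollout_batches(members, batch_size):
    """Split cluster members into update batches so that some members
    always keep serving the current edition while a batch is updated."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    if batch_size >= len(members):
        raise ValueError("batch must leave at least one member serving requests")
    return [members[i:i + batch_size] for i in range(0, len(members), batch_size)]


cluster = ["server1", "server2", "server3", "server4", "server5"]
for batch in rollout_batches(cluster, batch_size=2):
    # In a real rollout each batch would be quiesced, updated, and restarted here.
    print("updating", batch)
```

Smaller batches keep more capacity online during the update at the cost of a longer rollout.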
Monitoring a WebSphere Virtual Enterprise environment
The additional capabilities and intelligence of WebSphere Virtual Enterprise require that you perform a thorough review of how to monitor the new environment. Figure 8 shows the different types of monitoring that can be found in a WebSphere Virtual Enterprise solution. As you will notice, some of them are unique to a WebSphere Virtual Enterprise environment (purple), and several can also be found in Network Deployment environments (blue).
Figure 8. Monitoring in context of a WebSphere Virtual Enterprise environment
- End-to-end application monitoring
WebSphere Application Server administrators often have monitoring in place for application availability and response time. This helps you detect problems before your customers do. Although the health management capabilities in WebSphere Virtual Enterprise can detect response time violations, true end-to-end monitoring can obviously cover additional parts of the infrastructure (for example, firewalls, load balancers, and so on). Therefore, you should continue end-to-end monitoring with your WebSphere Virtual Enterprise solution.
- Resource monitoring
The IBM Tivoli® Performance Viewer in Network Deployment enables you to monitor many WebSphere Application Server resources. For production environments, you should have a more robust solution that can store data for a prolonged period of time, such as IBM Tivoli Composite Application Manager.
- Operating system monitoring
Given the dynamic nature of WebSphere Virtual Enterprise, simply monitoring the operating system processes is no longer sufficient. Process monitoring still makes sense for components that are not controlled by the application placement controller, such as node agents, ODRs, and possibly the deployment manager processes. In addition, monitoring system resources such as CPU utilisation, disk and network I/O, and file systems is still highly recommended in a WebSphere Virtual Enterprise environment.
- WebSphere Virtual Enterprise run time tasks
When WebSphere Virtual Enterprise generates run time tasks, you should know about it. Rather than logging on to the Integrated Solutions Console and waiting for tasks to appear, you can configure WebSphere Virtual Enterprise to send out e-mail notifications when tasks are generated. Other interfaces such as SNMP are not currently supported.
- WebSphere Virtual Enterprise health management
The health management component in WebSphere Virtual Enterprise enables you to define specific conditions and actions in health policies. During run time, the health management component will automatically monitor the system for the conditions you have defined and take the necessary actions when required. This can be configured to be completely autonomic such that no administrator intervention is required. WebSphere Virtual Enterprise will generate a run time task whenever the health manager performs a specific action. If the health policy is configured to take action automatically, then the run time task acts as a notification of the action that was taken; if the health policy is configured to ask the system administrator for approval before taking action (known as supervised mode), then the run time task will wait for approval before performing the action.
Health policies are very useful for gracefully managing known health problems (such as automating the restart of servers at certain intervals), as well as for managing unexpected problems (such as detecting a memory leak in a newly deployed application and generating a heap dump before the JVM becomes unresponsive).
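The relationship between conditions, actions, and run time tasks can be pictured with a small model. This is a conceptual sketch in plain Python, not the health management API; the class, the metric names (`heap_used_pct`, `uptime_hours`), and the thresholds are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class HealthPolicy:
    name: str
    condition: Callable[[Dict[str, float]], bool]  # evaluated against server metrics
    action: str
    supervised: bool  # True: wait for administrator approval before acting


def evaluate(policies: List[HealthPolicy], server: str, metrics: Dict[str, float]):
    """Return one run time task for every policy whose condition is met."""
    tasks = []
    for policy in policies:
        if policy.condition(metrics):
            status = "awaiting approval" if policy.supervised else "action taken"
            tasks.append({"policy": policy.name, "server": server,
                          "action": policy.action, "status": status})
    return tasks


policies = [
    HealthPolicy("memory leak", lambda m: m["heap_used_pct"] > 90,
                 "generate heap dump and restart server", supervised=True),
    HealthPolicy("weekly restart", lambda m: m["uptime_hours"] >= 168,
                 "restart server", supervised=False),
]

for task in evaluate(policies, "member1", {"heap_used_pct": 95, "uptime_hours": 10}):
    print(task)
```

In automatic mode the task merely records what was done, while in supervised mode it is the point at which an administrator approves or declines the action.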
For more on WebSphere Virtual Enterprise health management policies, see the WebSphere Virtual Enterprise Information Center.
IBM WebSphere Virtual Enterprise has powerful features that can help you optimize an IBM WebSphere Application Server Network Deployment environment. Techniques and best practices on how to use those features were summarized here, along with information about what environment and application prerequisites need to be in place for a successful deployment, and what architectural decisions need to be made when considering a WebSphere Virtual Enterprise deployment.
The authors thank Brian K. Martin, Keith B. Smith, and John P. Cammarata for their input and feedback on this article.
- WebSphere Extended Deployment Information Center
- Information Center articles
- Everything you always wanted to know about WebSphere Application Server but were afraid to ask
- List of supported software for WebSphere Extended Deployment 6.1.1 and all separately orderable component products
- IBM developerWorks WebSphere