In this article, Jason answers your questions about WebSphere® Extended Deployment 5.1, including details about the On Demand dynamic operations features and the application partitioning features provided through the WebSphere Partitioning Facility (WPF). WebSphere Extended Deployment is a new product that was introduced in 4Q2004. It extends the capabilities of the WebSphere software platform to include new notions of dynamic operations, high performance computing, and extended management.
Question: Can the On Demand Router run a cluster for scalability/high availability?
Answer: Certainly. The On Demand Router's (ODR) job is to manage traffic into a WebSphere cluster. The ODR is more intelligent than the base WebSphere plug-in because it manages traffic against a set of performance and priority policies. The ODR attempts to control the flow of traffic to meet the set of goals that you have defined for the applications running on the WebSphere cluster. It does this by controlling many factors about the traffic, including the flow rate for different classes of work, the priority or order of request dispatch, and the concurrency of a given class of work against the back-end servers. Additionally, the ODR provides back-end protection, such as overload protection, which prevents a single application from consuming the entire set of cluster resources. The end result of the ODR's actions is a scalable and highly available environment that responds to real-world traffic in line with your business goals and priorities.
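The ODR's actual queuing and flow-control algorithms are internal to the product, but the core idea of dispatching requests by service-class importance under a concurrency cap can be sketched as follows (class names, importance values, and the cap are invented for illustration):

```java
import java.util.*;

// Illustrative sketch only: a toy priority dispatcher in the spirit of the
// ODR's request classification. Lower importance number = more important.
public class ToyRequestDispatcher {
    private final PriorityQueue<Request> queue =
        new PriorityQueue<>(Comparator.comparingInt(r -> r.importance));
    private final int maxConcurrent;   // concurrency cap toward the back-end servers
    private int inFlight = 0;

    public ToyRequestDispatcher(int maxConcurrent) {
        this.maxConcurrent = maxConcurrent;
    }

    public void enqueue(Request r) {
        queue.add(r);
    }

    // Dispatch the most important queued request, but only if the
    // back end still has capacity; otherwise hold traffic back.
    public Request dispatchNext() {
        if (inFlight >= maxConcurrent || queue.isEmpty()) return null;
        inFlight++;
        return queue.poll();
    }

    // A back-end server finished a request, freeing a slot.
    public void complete() {
        inFlight--;
    }

    public static class Request {
        final String uri;
        final int importance;
        public Request(String uri, int importance) {
            this.uri = uri;
            this.importance = importance;
        }
    }
}
```

Even this toy version shows the two levers the answer describes: request ordering by importance, and a bound on concurrent work against the back end.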
Question: Can I limit certain applications to certain nodes?
Answer: Yes. WebSphere Extended Deployment introduces a couple of key new concepts to the WebSphere environment. The first is the notion of a Node Group. A Node Group is a collection of machines that form a resource pool. The second concept is the Dynamic Cluster. A Dynamic Cluster (DC) is similar to a regular WebSphere cluster, except that the members of the DC are created and controlled by WebSphere Extended Deployment. When you create a regular cluster in WebSphere, you have to create all of the cluster members on particular nodes in the cell. That topology is then static. With WebSphere Extended Deployment, you create a DC and assign it to run on a Node Group. WebSphere Extended Deployment then creates all of the cluster members and controls how many are active at any given moment in time. The DC members are constrained to run on only the machines in the assigned Node Group. When you install an application, you assign it to run on a DC. If you wish to constrain an application to run only on a particular subset of nodes, define a Node Group that includes just that set of nodes. You then create a DC on that Node Group and assign your application to run on that DC. WebSphere Extended Deployment then chooses from only that node list when making resource allocation decisions.
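The constraint described above can be pictured as a simple intersection: the placement manager may only consider nodes that belong to both the cell and the Dynamic Cluster's assigned Node Group. The following sketch is purely illustrative (node names are invented; the real placement logic is internal to WebSphere Extended Deployment):

```java
import java.util.*;

// Illustrative model only: a Dynamic Cluster's placement candidates are
// limited to the nodes in its assigned Node Group.
public class PlacementSketch {
    public static Set<String> candidateNodes(Set<String> nodeGroup,
                                             Set<String> allCellNodes) {
        // Start from every node in the cell...
        Set<String> candidates = new HashSet<>(allCellNodes);
        // ...and keep only those that are also in the Node Group.
        candidates.retainAll(nodeGroup);
        return candidates;
    }
}
```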
Question: It looks like the configuration of the WebSphere plug-in has to change. Is this the case? How does it change?
Answer: Yes, this is the case. In a standard WebSphere Extended Deployment topology, where the On Demand Router (ODR) sits between the WebSphere plug-in and the application servers, the configuration of the plug-in has to be modified to tell the plug-in to route all traffic to the ODR tier instead of directly to the application server tier. This is accomplished by rewriting the rules in the plugin-cfg.xml file that drives the plug-in. In WebSphere Extended Deployment, the ODR contains a feature that enables it to take over plugin-cfg.xml file generation. Essentially, the ODR generates a new XML file that contains rules to route all WebSphere traffic to the ODR tier. The ODRs themselves do not have a plugin-cfg.xml file. They automatically learn the configuration of the WebSphere cells to which they are connected and are notified of any configuration changes in the application server tier.
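For illustration, a simplified plugin-cfg.xml written for the ODR tier might route every URI to a cluster made up of the ODRs rather than to individual application servers. All of the names, hostnames, and ports below are invented, and a real generated file contains many more elements; this fragment just shows the shape of the routing rules:

```xml
<Config>
  <!-- The ODR tier appears to the plug-in as a single server cluster. -->
  <ServerCluster Name="odr_cluster">
    <Server Name="odr1">
      <Transport Hostname="odr1.example.com" Port="80" Protocol="http"/>
    </Server>
    <Server Name="odr2">
      <Transport Hostname="odr2.example.com" Port="80" Protocol="http"/>
    </Server>
  </ServerCluster>
  <!-- A catch-all URI group sends all WebSphere traffic to the ODRs. -->
  <UriGroup Name="all_uris">
    <Uri Name="/*"/>
  </UriGroup>
  <Route ServerCluster="odr_cluster" UriGroup="all_uris"/>
</Config>
```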
Figure 1. Standard WebSphere Extended Deployment topology
Question: What are the recommendations, with regard to limits, when a customer wants to share their applications across all servers? We have had customers using almost 200 enterprise applications (.ear files).
Answer: I believe you are asking about limits on how many applications can run on a single machine and, therefore, how many machines you need in an environment. Of course, the answer is "it depends". There are a number of factors at play. The first is the static characteristics of the application. How much memory does the application require? What type of isolation assumptions does it make? What is the capacity of the server machine? These answers will help you understand how many applications would fit on the server.
The second consideration is the dynamic characteristics of the application. What type of load is the application expected to receive? How much compute capacity is required for each request? What is the processing power of the machine? These answers will help you understand how many applications can be processed with an acceptable response time on a server. WebSphere Extended Deployment attempts to manage many of these decisions for you. The application placement function of WebSphere Extended Deployment decides how many applications to run on a box, based on the static and dynamic characteristics of both the applications and the servers. Of course, you still need enough boxes to process your average load. WebSphere Extended Deployment's monitoring and visualization functions can help you determine where that lower bound is and, therefore, how many machines you should provide in your environment.
Question: Is it a full integration between WebSphere XD and TIO, or do we need human intervention to add new resources?
Answer: WebSphere Extended Deployment ships with full integration with IBM® Tivoli® Intelligent Orchestrator (TIO). The supported scenario is as follows. When WebSphere Extended Deployment is managing the WebSphere environment, it is limited to the set of machines allocated to it, as defined by the membership of the Node Group. WebSphere Extended Deployment will use the functions of the On Demand Router (ODR) and the Application Placement Manager to attempt to meet the set of performance policies defined by the administrator. However, when the load is sufficiently high, WebSphere Extended Deployment may not be able to meet the defined goals with the machines it has access to.
Without TIO, WebSphere Extended Deployment will gracefully degrade the performance of the applications in accordance with the defined importance levels for the applications. With TIO, WebSphere Extended Deployment can inform TIO (through an Objective Analyzer provided with WebSphere Extended Deployment), that WebSphere Extended Deployment is going to breach its performance goals. TIO can then look across all of the machines it is managing and decide to move a machine into the WebSphere Extended Deployment environment. That provisioning action may require a full install of the box. The logic to handle the install is provided by TIO and is outside the domain of WebSphere Extended Deployment. The only WebSphere Extended Deployment part of the workflow is adding the machine to the Node Group.
WebSphere Extended Deployment provides the TIO scripts to accomplish the addToNodeGroup action. So, with WebSphere Extended Deployment and TIO together, TIO basically decides, with the help of WebSphere Extended Deployment, how many machines to allocate to WebSphere Extended Deployment. WebSphere Extended Deployment decides what to do with them. The TIO integration is optional, but powerful if you are a TIO customer.
Question: When we need to add new machines as nodes, we must load JVMs, configure the system and the applications, set network addresses, and so on. How long does this usually take? What can be pre-installed and what needs to be configured in real time?
Answer: There are two parts to the answer to this question. The first is the configuration part. In a WebSphere Extended Deployment environment, when you add a node (machine) to a Node Group, WebSphere Extended Deployment will handle the configuration of the application servers (JVMs) on that node. This includes definition of the server, installation and distribution of the applications, setup of the network information, and so on. WebSphere Extended Deployment does this through a server template that is stored with each Dynamic Cluster (DC). This DC template allows the user to make changes to the environment in a single location and have all of those changes propagated to the cluster members. In WebSphere Extended Deployment, we configure servers on all machines in the Node Group, essentially defining the maximum possible size for the Dynamic Cluster. This is done when you define the Node Group or Dynamic Cluster. However, we do not start any of the DC members until needed.
This brings us to the second part. When the load determines that WebSphere Extended Deployment needs to bring a new server online, it simply starts an already configured server. So the delay is simply the server startup time delay, not configuration time plus server startup time. Of course, the server startup time is mostly dependent on the initialization time of your application.
Question: What is asymmetric clustering in WebSphere Extended Deployment?
Answer: The traditional J2EE model works well for a large class of applications. This class can be broadly categorized as applications that run in a stateless cluster in front of a database. I call this a symmetric cluster:
- All the cluster members can perform any task at any time.
- The application is stateless.
- The application is modal, which means it only performs work synchronously in response to a client request that is received using HTTP/IIOP or JMS.
There are other applications that do not work well in such an environment, for example, an electronic trading system in a bank. Such applications typically use tricks that can greatly improve performance, such as partitioning, multi-threading, and write through caching. These are applications that can exploit asymmetric clustering. An asymmetric cluster is practically the opposite of a symmetric cluster:
- Applications can declare named partitions at any point while running. Partitions are highly available, mobile within the cluster, and usually run on a single cluster member at a time.
- Incoming work for a partition is routed to the cluster member hosting the partition.
- The application is amodal. Partitions have a lifecycle of their own and can start background threads/alarms as well as respond to incoming events whether they are IIOP/HTTP or JMS/foreign messages.
WebSphere Extended Deployment offers a new set of programming APIs called the WebSphere Partitioning Facility. To my knowledge, these APIs allow applications that require an asymmetric cluster to be deployed on a J2EE server for the first time.
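The WPF programming model itself is beyond the scope of this answer, but its central routing idea, that each named partition is active on exactly one cluster member and that work for the partition is routed there, can be sketched as follows. This is an illustrative model only, not the actual WPF API, and all names are invented:

```java
import java.util.*;

// Illustrative sketch only: route work for a named partition to the single
// cluster member currently hosting it. In WPF, partition activation is
// decided by the runtime (see the HAManager discussion below), which
// guarantees exactly one active host per partition.
public class PartitionRouter {
    private final Map<String, String> partitionToMember = new HashMap<>();

    // Record that a partition has been activated on a member.
    public void activate(String partition, String member) {
        partitionToMember.put(partition, member);
    }

    // Route an incoming request for a partition to its hosting member.
    public String route(String partition) {
        String member = partitionToMember.get(partition);
        if (member == null) {
            throw new IllegalStateException("partition not active: " + partition);
        }
        return member;
    }
}
```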
Question: In the WebSphere Partitioning Facility, how are the partitions made highly available?
Answer: WebSphere has a new component called the High Availability Manager, or HAManager. This component is included in both WebSphere Extended Deployment 5.1 and WebSphere 6.0, and provides high availability services in both products. WebSphere Extended Deployment uses the HAManager to manage partitions and ensure that a partition runs on exactly one cluster member. Administrators can use policies to specify preferred server lists and fail back options for partitions. For example, an administrator can specify that the partition for trading IBM stock runs primarily on server A, with failover to server B, and that if server A restarts after a failover, the partition is failed back to server A manually. We can demonstrate failover times in seconds using this technology.
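As a rough illustration of such a policy, a "one of N" choice over an ordered preferred-server list might look like the sketch below. Server names are invented, the real policy engine is the HAManager, and note that this sketch fails back automatically; with a manual fail back option, the partition would stay on the failover server until an administrator moves it:

```java
import java.util.*;

// Illustrative sketch of a preferred-server HA policy: the partition is
// activated on the first preferred server that is currently available,
// and fails over down the list when servers go down.
public class OneOfNPolicy {
    private final List<String> preferred;   // ordered preference, e.g. [A, B]

    public OneOfNPolicy(List<String> preferred) {
        this.preferred = preferred;
    }

    // Choose exactly one active host from the currently available servers,
    // or null if no candidate is up (the partition cannot be activated).
    public String chooseActive(Set<String> available) {
        for (String s : preferred) {
            if (available.contains(s)) return s;
        }
        return null;
    }
}
```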
Question: How can partitioning improve application performance?
Answer: A stateless cluster will only scale so far before cluster members start competing for database access. If the workload is read-mostly, then solutions like caching with optimistic locking work well. However, as the write rate increases, this starts to break down because of collisions. Some workloads also require incoming work to be executed in order. For example, buy and sell orders for a stock symbol need to be processed in order. When the work arrives spread over a cluster, this becomes more complex; while it can be done, it's not easy. Partitioning allows the application to partition itself and then route incoming work exclusively to a single partition. This incoming work needs to be classified into partitions and then routed to the cluster member hosting that partition, using either IIOP or messaging. Once the work arrives, the cluster member can aggressively cache read/write data specific to that partition because the developer knows this data is only modified on this cluster member. The cluster member can also order the work and ensure that it is processed in the correct sequence in memory, independently of any other cluster members. The database is now offloaded: it only receives writes from the cluster, while all reads are satisfied using the cache in the cluster member. So long as there are more partitions than boxes, adding boxes will make this application run faster.
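A minimal sketch of the classification-and-ordering idea, assuming a simple hash of the stock symbol onto a fixed partition count (the symbol names and partition count are invented for illustration; a real system would route across cluster members rather than local queues):

```java
import java.util.*;

// Illustrative sketch only: classify orders by stock symbol into partitions
// so that each symbol's orders are processed in arrival order by the one
// member hosting that partition.
public class OrderPartitioner {
    private final int partitions;
    private final List<Deque<String>> queues = new ArrayList<>();

    public OrderPartitioner(int partitions) {
        this.partitions = partitions;
        for (int i = 0; i < partitions; i++) {
            queues.add(new ArrayDeque<>());
        }
    }

    // All work for a given symbol maps to the same partition.
    public int partitionFor(String symbol) {
        return Math.abs(symbol.hashCode()) % partitions;
    }

    public void submit(String symbol, String order) {
        queues.get(partitionFor(symbol)).addLast(order);
    }

    // Within one partition, orders come back in the order they arrived.
    public String next(int partition) {
        return queues.get(partition).pollFirst();
    }
}
```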
Question: Why is it best for a WPF partition to run on one node versus multiple ones?
Answer: The key to horizontal scalability is eliminating cross talk and contention between servers. The ideal situation is that partitions do not need to interact with each other at all. If this is the case, then you get linear scalability. All applications that use a database will experience some cross talk within the database. This happens because of index locks or latches within the database (such as those surrounding the transaction log), unless the application uses a partitioned database. Application architects should strive for this approach for best performance.
Question: I have a client whose entire platform is IBM (WebSphere Application Server 5.2). They have many projects in WebSphere, but want to define a standard platform for all new projects. Their technology team told me they were considering Struts for the presentation layer, Hibernate for the data access layer, and WebSphere Authentication Service for the security module (authentication and authorization). I need to make a proposal for the new standard architecture. Your IBM presentations always spoke about using JSF and JDO in the new IBM Rational tools, and I need to explain which technology to use and why. This is a very important decision because it is a really big and important company, and I want to know what you think about this. What is better for the future? What is better according to IBM's new technologies? What do you think about Spring?
Answer: While this is a very interesting discussion, it is not related to WebSphere Extended Deployment and therefore, not applicable to this Q&A article. We can take it offline.
The author would like to thank Billy Newport for his help in preparing this article.
- Access to the WebSphere Extended Deployment product page. WebSphere Extended Deployment site.
- Access to online help for WebSphere Extended Deployment. WebSphere Extended Deployment 5.1 Information Center.
- Why you need WebSphere Extended Deployment. IBM WebSphere Developer Technical Journal: Comment lines from Kyle Brown.
- Article on high-level overview of the autonomic features of WebSphere Extended Deployment. Understand WebSphere Extended Deployment.
- Download a free trial version of Application Edition Manager for WebSphere Extended Deployment. Application Edition Manager Technical Preview.
- Redpaper on when to use z/OS based solutions or distributed solutions. Redpaper: Scaling for High Availability: WebSphere XD and WebSphere Application Server for z/OS.
- Access to technical resources for the WebSphere platform. developerWorks WebSphere Application Server zone.
Meet the experts is a monthly feature on the developerWorks WebSphere Web site. We give you access to the best minds in IBM WebSphere, product experts who are waiting to answer your questions. You submit the questions, and we post answers to the most popular questions.
Jason McGee is a Distinguished Engineer and Chief Architect for WebSphere Extended Deployment (XD). Previously, Jason was the Chief Architect of the Base and Network Deployment versions of WebSphere Application Server. He is a senior architect on the WebSphere Foundation Architecture Board and an associate member of the IBM Software Group Architecture Board. Jason serves as the Director of WebSphere Advanced Technologies, responsible for the productization of new technologies into the WebSphere platform. Jason joined IBM in 1997 and has been a member of the WebSphere Application Server product team since its inception. He helped to define the concepts of Servlets and JavaServer Pages (JSP) for processing Web presentation logic on WebSphere. Jason was responsible for the design and implementation of the Web Container in WebSphere Application Server. He has been heavily involved in leading the architecture for key parts of the WebSphere Application Server, including the server runtime and the XML-based systems management architecture. Jason graduated with a B.S. degree in computer engineering from Virginia Tech in 1995.