Implementing workload management principles across the enterprise
IBM WebSphere Application Server Network Deployment provides workload management-type capabilities for multiple protocols. This enables the distribution of an application’s workload across Network Deployment application servers. While the majority of new Java™ EE applications employ HTTP protocol workloads, there are still many Java EE applications that employ IIOP protocol workloads, with EJB applications deployed in dedicated application servers or clusters. Sometimes the client even resides in a different cell, which complicates the deployment and configuration.
This article looks at the mechanics of EJB workload management (WLM) in WebSphere Application Server Network Deployment where the client runs in a different cell. The configuration options that control the behaviour of the runtime will be described. Because it can be difficult to find all applicable best practices for this type of scenario, this article also discusses several application development recommendations for EJB client applications to both consolidate and update these recommendations which may have changed over time.
Before looking at the EJB workload management mechanism, let’s revisit a typical flow for a (remote) EJB call. Figure 1 shows two different WebSphere Application Server cells, one hosting the EJB client application and the other hosting the EJB (server) application. Both applications run in a (WebSphere Application Server) Java EE container, but in this case the client application cannot rely on the naming service in its own cell to look up the EJB.
This is what happens when the client makes its first remote EJB call:
- The EJB client application first needs to create an InitialContext before it can perform a lookup. Creating the InitialContext requires the use of a CORBA object URL, which includes one or more servers and their corresponding bootstrap ports. When the new InitialContext is created, the client’s ORB sends a request to the server’s bootstrap port, and a JNDI context for the server’s naming service is returned.
- The actual lookup will use the JNDI context of the server and return a home object of the bean. This is an indirect IOR pointing to the location service daemon (LSD) of the nodeagent on the same node.
- Now, the client can create a new bean using the home interface. A request is sent to the LSD, where the EJB workload management plug-in selects one of the cluster members.
- A direct IOR pointing to the selected cluster member is returned to the client, which can now be used to call a method on the bean’s remote interface.
- When the client calls a method on the bean’s remote interface, a request is sent to the actual bean instance on the cluster member selected by the LSD.
- The response of the above request contains the cluster configuration information. The EJB client workload management plug-in stores this information so it can use this data for subsequent requests.
Figure 1. Different steps for a remote EJB call that involves the EJB workload management mechanism
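The steps above can be illustrated with a small, self-contained toy model. This is only a sketch: the class names (`Lsd`, `Client`), the server names, and the round-robin selection are assumptions made for illustration and do not reflect how the WebSphere workload management plug-ins are implemented. It shows only the shape of the flow: the first call is resolved through the LSD, and later calls are routed by the client using the cluster data it received.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RemoteEjbCallFlowSketch {

    /** Toy location service daemon: picks a cluster member for an indirect reference. */
    static class Lsd {
        private final List<String> members;
        private int next = 0;

        Lsd(List<String> members) {
            this.members = new ArrayList<>(members);
        }

        /** Round robin is an assumption; the real plug-in uses weights and feedback. */
        String selectMember() {
            String member = members.get(next % members.size());
            next++;
            return member;
        }

        List<String> clusterData() {
            return new ArrayList<>(members);
        }
    }

    /** Toy client: uses the LSD once, then routes with its own copy of the cluster data. */
    static class Client {
        private final Lsd lsd;
        private List<String> knownMembers = null;
        private int next = 0;
        final List<String> trace = new ArrayList<>();

        Client(Lsd lsd) {
            this.lsd = lsd;
        }

        String call() {
            if (knownMembers == null) {
                // First call: the indirect reference is resolved by the LSD.
                String member = lsd.selectMember();
                trace.add("LSD -> " + member);
                // In the real flow the cluster data arrives with the first response
                // from the cluster member; here it is simply copied from the toy LSD.
                knownMembers = lsd.clusterData();
                next = knownMembers.indexOf(member) + 1;
                return member;
            }
            // Later calls: the client-side plug-in selects the member itself.
            String member = knownMembers.get(next % knownMembers.size());
            next++;
            trace.add("client WLM -> " + member);
            return member;
        }
    }

    public static void main(String[] args) {
        Client client = new Client(new Lsd(Arrays.asList("serverA", "serverB")));
        for (int i = 0; i < 3; i++) {
            client.call();
        }
        for (String step : client.trace) {
            System.out.println(step);
        }
    }
}
```

Only the first entry in the trace involves the LSD; every later entry is decided on the client side.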
EJB client development best practices
Ensuring high availability of EJB client applications is best achieved by taking a number of application development best practices into account. Below is a list of best practices specific to EJB client applications:
- Use cluster members when bootstrapping from outside the cell
When making remote EJB calls to another cell, the EJB client application needs to create an InitialContext before it can perform a lookup. Creating the InitialContext involves the use of a CORBA object URL. To avoid a single point of failure, you should specify multiple servers of the remote cluster in the corbaloc provider URL that is passed to the InitialContext() constructor.
Unless you are running WebSphere Application Server on z/OS®, do not use nodeagents for the InitialContext call; bootstrap to cluster members instead. Bootstrapping to a nodeagent returns a non-WLM-enabled initial context, while bootstrapping to cluster members returns a WLM-enabled initial context. This matters because the client-side naming cache is built on a per host/port basis. If the client bootstraps to the nodeagent and the nodeagent subsequently fails, a re-bootstrap will reuse the naming cache, which now contains an unusable initial context. If the client bootstraps to more than one cluster member, it will find a usable one as long as at least one cluster member is alive.
The cluster used for bootstrapping can be the EJB target cluster, or a separate cluster just for the purpose of bootstrapping. Using a dedicated cluster for bootstrapping can simplify system administration. The cluster members hosting the EJB server components can now be restarted individually (that is, ripple start) without impacting the bootstrapping of the EJB clients. However, using a dedicated cluster obviously does require additional system resources.
An example of a corbaloc URL with multiple addresses is shown in Listing 1.
    import java.util.Hashtable;
    import javax.naming.Context;
    import javax.naming.InitialContext;
    ...
    Hashtable env = new Hashtable();
    env.put(Context.INITIAL_CONTEXT_FACTORY,
        "com.ibm.websphere.naming.WsnInitialContextFactory");
    // All of the servers in the provider URL below are members of
    // the same cluster.
    env.put(Context.PROVIDER_URL,
        "corbaloc::myhost1:9810,:myhost1:9811,:myhost2:9810");
    Context initialContext = new InitialContext(env);
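When the list of bootstrap addresses comes from configuration rather than a hardcoded string, the provider URL can be assembled programmatically. The following is a minimal sketch; the class and method names (`CorbalocUrlSketch`, `buildProviderUrl`, `buildEnvironment`) are hypothetical helpers and not part of any WebSphere API. Only the standard `javax.naming.Context` constants and the factory class name from Listing 1 are used.

```java
import java.util.Arrays;
import java.util.Hashtable;
import java.util.List;
import javax.naming.Context;

class CorbalocUrlSketch {

    /** Joins "host:port" bootstrap addresses into a single corbaloc URL. */
    static String buildProviderUrl(List<String> hostPorts) {
        if (hostPorts == null || hostPorts.isEmpty()) {
            throw new IllegalArgumentException("At least one bootstrap address is required");
        }
        StringBuilder url = new StringBuilder("corbaloc:");
        for (int i = 0; i < hostPorts.size(); i++) {
            if (i > 0) {
                url.append(',');
            }
            // Each address is written as ":host:port", as in Listing 1.
            url.append(':').append(hostPorts.get(i));
        }
        return url.toString();
    }

    /** Builds the JNDI environment that would be passed to new InitialContext(env). */
    static Hashtable<String, Object> buildEnvironment(List<String> hostPorts) {
        Hashtable<String, Object> env = new Hashtable<>();
        env.put(Context.INITIAL_CONTEXT_FACTORY,
                "com.ibm.websphere.naming.WsnInitialContextFactory");
        env.put(Context.PROVIDER_URL, buildProviderUrl(hostPorts));
        return env;
    }

    public static void main(String[] args) {
        List<String> members = Arrays.asList("myhost1:9810", "myhost1:9811", "myhost2:9810");
        System.out.println(buildProviderUrl(members));
    }
}
```

For the three addresses used in `main`, the helper produces the same provider URL as Listing 1. Creating the InitialContext itself is left out because it requires the WebSphere client libraries on the classpath.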
- Leverage WebSphere Application Server JNDI cache
When establishing a new InitialContext, you have to make sure that the WebSphere Application Server JNDI cache is enabled. As long as this cache on the EJB client is enabled, there is no need to cache home or bean references in the EJB client application, which greatly simplifies the EJB client application.
Although the WebSphere Application Server JNDI cache is enabled by default, EJB client applications can explicitly disable the WebSphere Application Server JNDI cache. Below are examples of how this cache can be disabled; all of these items should be avoided:
- Java commandline argument:
- Explicit call from application to set the following system property:
System.setProperty( "com.ibm.websphere.naming.jndicache.cacheobject", "none" );
- Using jndi.properties anywhere in the classpath of the client:
- Explicit call when creating a new InitialContext:
    env = new Hashtable();
    env.put(PROPS.JNDI_CACHE_OBJECT, PROPS.JNDI_CACHE_OBJECT_NONE);
    // env.put("com.ibm.websphere.naming.jndicache.cacheobject", "none");
    env.put(Context.INITIAL_CONTEXT_FACTORY,
        "com.ibm.websphere.naming.WsnInitialContextFactory");
    env.put(Context.PROVIDER_URL,
        "corbaloc::myhost1:9810,:myhost1:9811,:myhost2:9810");
    ctx = new InitialContext(env);
At one time, caching home or bean references brought substantial performance benefits. However, with the introduction of the runtime JNDI cache in WebSphere Application Server V4, this no longer enhances application performance and only increases complexity.
In summary, with the WebSphere Application Server JNDI cache enabled, there is no need to cache home or bean references in the EJB client application.
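A client application can guard against accidentally disabling the cache with a simple check before it creates the InitialContext. The sketch below is an illustration only: `JndiCacheCheck` and `isCacheDisabled` are hypothetical names, and the check covers just the environment Hashtable and the system property shown above. It does not inspect any jndi.properties files on the classpath.

```java
import java.util.Hashtable;

class JndiCacheCheck {

    /** Property name taken from the examples above. */
    static final String CACHE_OBJECT_PROPERTY =
            "com.ibm.websphere.naming.jndicache.cacheobject";

    /**
     * Returns true when the cache object setting is "none".
     * A value in the environment takes precedence over the system property,
     * which follows the usual JNDI rule for environment properties.
     */
    static boolean isCacheDisabled(Hashtable<String, Object> env) {
        Object value = env.get(CACHE_OBJECT_PROPERTY);
        if (value == null) {
            value = System.getProperty(CACHE_OBJECT_PROPERTY);
        }
        return "none".equals(value);
    }

    public static void main(String[] args) {
        Hashtable<String, Object> env = new Hashtable<>();
        env.put(CACHE_OBJECT_PROPERTY, "none");
        if (isCacheDisabled(env)) {
            System.out.println("Warning: the JNDI cache is disabled for this context");
        }
    }
}
```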
Configuration best practices
Running EJB client applications that are in line with the best practices above is important. However, there are situations where certain configuration best practices also need to be applied. Below is a list of the most important parameters and their impact. Figure 2 shows these configuration parameters and where they are set.
Figure 2. Overview of important configuration parameters
Configuration recommendations include:
- Change the EJB WLM unusable interval
When one of the EJB target cluster members is stopped or crashes, the EJB workload management plug-in on the client will mark that server as unavailable. It will not send any requests to that server for a period of time. This period is also referred to as the unusable interval and is set by default to 300 seconds (5 minutes).
It is vital that administrators be aware of the existence of this unusable interval. For example, when restarting an EJB cluster member it can take up to the unusable interval before it starts processing EJB client requests from outside the cell. The default unusable interval of 300 seconds will keep the number of EJB client calls to an unavailable server at a minimum. In many cases, however, the interval can be safely lowered to ensure a faster discovery of newly restarted cluster members.
You can adjust the unusable interval by setting the JVM custom property com.ibm.websphere.wlm.unusable.interval to a value more suitable to your environment. This would need to be set on each application server that hosts the EJB client application.
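To make the effect of the unusable interval concrete, here is a toy model of the bookkeeping involved. It is a sketch under stated assumptions, not the plug-in's actual code: the class `UnusableIntervalSketch` and its methods are hypothetical, and the property value is assumed to be expressed in seconds, matching the 300-second default described above.

```java
import java.util.HashMap;
import java.util.Map;

class UnusableIntervalSketch {

    static final String INTERVAL_PROPERTY = "com.ibm.websphere.wlm.unusable.interval";

    private final long intervalMillis;
    private final Map<String, Long> unusableUntil = new HashMap<>();

    UnusableIntervalSketch(long intervalSeconds) {
        this.intervalMillis = intervalSeconds * 1000L;
    }

    /** Reads the JVM property, falling back to the 300-second default. */
    static long configuredIntervalSeconds() {
        return Long.getLong(INTERVAL_PROPERTY, 300L);
    }

    /** Records that a server failed at the given time. */
    void markUnusable(String server, long nowMillis) {
        unusableUntil.put(server, nowMillis + intervalMillis);
    }

    /** A server is skipped until the interval has fully elapsed. */
    boolean isUsable(String server, long nowMillis) {
        Long until = unusableUntil.get(server);
        return until == null || nowMillis >= until;
    }

    public static void main(String[] args) {
        UnusableIntervalSketch tracker =
                new UnusableIntervalSketch(configuredIntervalSeconds());
        tracker.markUnusable("serverA", 0L);
        System.out.println("usable after 60 s: " + tracker.isUsable("serverA", 60_000L));
    }
}
```

In this model, a cluster member that is restarted within a minute of failing is still skipped until the interval expires, which is why lowering the value speeds up the discovery of restarted members.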
- Client ORB configuration
Configuration best practices for remote EJB calls obviously involve the Object Request Broker (ORB). Below are a number of client ORB timeouts that should be reviewed. The default ORB timeouts are rather generous; many clients prefer to configure more aggressive ones, taking into account typical application response times.
- Connect timeout
Before the client ORB can even send a request to a server, it needs to establish an IIOP connection (or re-use an existing one). Under normal circumstances, the IIOP and underlying TCP connect operations should complete very quickly. However, contention on the network or another unforeseen factor could slow this down. By default the connect operation has no timeout, but the ORB custom property com.ibm.CORBA.ConnectTimeout (in seconds) can be used to set one.
- Locate request timeout
Once a connection has been established and the client sends an RMI request to the server, LocateRequestTimeout can be used to limit the time allowed for the CORBA LocateRequest (a CORBA “ping”) for the object. Because a LocateRequest exchanges far less data than a full request, the LocateRequestTimeout should be less than or equal to the RequestTimeout. Like the RequestTimeout, the LocateRequestTimeout defaults to 180 seconds.
- Request timeout
Once the client ORB has an established TCP connection to the server, it will send the request across. However, it will not wait indefinitely for a response; by default it will wait for 180 seconds. This is the ORB request timeout interval. It can typically be lowered, but it should be in line with the expected application response times from the server.
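The relationships between the three client ORB timeouts can be checked mechanically. The following sketch reviews a set of properties against the guidance above. The class `OrbTimeoutCheck` is hypothetical, and the full property names for the locate request and request timeouts are assumed to follow the same com.ibm.CORBA prefix as ConnectTimeout. A connect timeout of 0 is treated here as "no timeout", which is also an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

class OrbTimeoutCheck {

    static final String CONNECT = "com.ibm.CORBA.ConnectTimeout";
    // Assumed full names; the text above only gives the short names.
    static final String LOCATE = "com.ibm.CORBA.LocateRequestTimeout";
    static final String REQUEST = "com.ibm.CORBA.RequestTimeout";

    /** Reads a timeout in seconds, using the given default when the key is absent. */
    static int seconds(Properties props, String key, int defaultValue) {
        return Integer.parseInt(props.getProperty(key, String.valueOf(defaultValue)).trim());
    }

    /** Returns human-readable findings; an empty list means nothing to review. */
    static List<String> review(Properties props) {
        List<String> findings = new ArrayList<>();
        int connect = seconds(props, CONNECT, 0);
        int locate = seconds(props, LOCATE, 180);
        int request = seconds(props, REQUEST, 180);
        if (connect == 0) {
            findings.add(CONNECT + " is unset or 0, so connect attempts are not bounded");
        }
        if (locate > request) {
            findings.add(LOCATE + " (" + locate + "s) exceeds " + REQUEST + " (" + request + "s)");
        }
        return findings;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty(CONNECT, "10");
        props.setProperty(LOCATE, "200");
        props.setProperty(REQUEST, "60");
        for (String finding : review(props)) {
            System.out.println(finding);
        }
    }
}
```

The numeric values in `main` are arbitrary examples; suitable values depend on the response times of the application being called.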
- Server ORB configuration
Configuration best practices for remote EJB calls also involve the ORB on the server side.
Use the com.ibm.websphere.orb.threadPoolTimeout custom property to specify the length of time in which the ORB waits for an available thread from the ORB thread pool before rejecting a request. Unless this custom property has been set, the client ORB will wait until the request timeout threshold is reached. This custom property is available in WebSphere Application Server Network Deployment V184.108.40.206 and later.
- Enable preload of workload management cluster data and disable callback timeout
New EJB client workload management behaviour was introduced with WebSphere Application Server Network Deployment V6.0. This can cause undesirable behaviour when none of the target EJB cluster members are running. The EJB client will wait for 30 seconds to enable any starting EJB cluster members to complete their startup cycle. This timeout interval was initially hardcoded but can now be set through a cell custom property called IBM_CLUSTER_CALLBACK_TIMEOUT.
When the target EJB cluster is down, many clients prefer to have the EJB client fail immediately instead of waiting for 30 seconds. This would involve setting the IBM_CLUSTER_CALLBACK_TIMEOUT to 0. Unfortunately, the workload management code in the EJB cluster cell uses a lazy mechanism to load the workload management cluster information. In order to use an aggressive timeout of 0, you would need to ensure that the workload management cluster information is loaded immediately upon startup of the cluster member. This can be achieved by setting the cell custom property IBM_CLUSTER_ENABLE_PRELOAD to true.
Both of these cell custom properties should be in place for the cell hosting the EJB servers. The minimum version required for this is WebSphere Application Server Network Deployment V220.127.116.11 or 18.104.22.168. Take special care when setting IBM_CLUSTER_CALLBACK_TIMEOUT to 0 in V7.0. There is a known issue that can cause nodeagent instability; PM08450 resolves this and is included with V22.214.171.124 and later. Note that there are no specific requirements for V8.0.
- Control EJB workload management cluster feedback mechanism
The EJB workload management plug-in distributes requests across the different EJB target cluster members. By default, the plug-in uses a combination of the cluster member weights and the number of outstanding requests for each cluster member.
This feedback mechanism can be changed by setting the cell custom property IBM_CLUSTER_FEEDBACK_MECHANISM on the target cell. The following options (and their values) are available, even though they are typically not necessary:
- 0: Use only the configured weights to determine routing.
- 1: Use blending of weights and outstanding requests (default behavior).
- 2: Use only the outstanding requests to determine routing.
- 3: No extra feedback mechanism, does not take configured weights or outstanding requests into account. This is functionally equivalent to routing based on all servers having equal weights; any changes to the configured weights would be ignored.
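The four values can be captured in a small enum, which is convenient for tooling that validates cell custom properties before they are applied. This is only an illustrative sketch: the `Mode` enum and its method names are hypothetical, and it describes which inputs each mode takes into account rather than how the plug-in blends them.

```java
class FeedbackMechanismSketch {

    enum Mode {
        WEIGHTS_ONLY(0),
        BLENDED(1),
        OUTSTANDING_ONLY(2),
        NONE(3);

        final int value;

        Mode(int value) {
            this.value = value;
        }

        /** Parses an IBM_CLUSTER_FEEDBACK_MECHANISM value; unset means the default (1). */
        static Mode fromValue(String raw) {
            if (raw == null || raw.trim().isEmpty()) {
                return BLENDED;
            }
            int parsed = Integer.parseInt(raw.trim());
            for (Mode mode : values()) {
                if (mode.value == parsed) {
                    return mode;
                }
            }
            throw new IllegalArgumentException("Unsupported feedback mechanism: " + raw);
        }

        boolean usesWeights() {
            return this == WEIGHTS_ONLY || this == BLENDED;
        }

        boolean usesOutstandingRequests() {
            return this == BLENDED || this == OUTSTANDING_ONLY;
        }
    }

    public static void main(String[] args) {
        Mode mode = Mode.fromValue("2");
        System.out.println(mode + " uses weights: " + mode.usesWeights());
    }
}
```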
- Configure core group bridges when using multiple core groups in the EJB server cell
Beginning with WebSphere Application Server V6, EJB workload management uses the HA manager bulletin board to aggregate and propagate the run time cluster description information. Each cluster member posts data to the bulletin board about its cluster description information and all servers with access to the bulletin board information are informed of these posts. The bulletin board is scoped to a core group, so when the cell contains multiple core groups, core group bridges are needed to share these posts across them. Without core group bridges, some portion of the node agents will not have cluster data for the EJB clusters, and EJB client invocations will result in a CORBA NO_IMPLEMENT No Cluster Data Available Exception. This is discussed in more detail in the article Best Practices for Large WebSphere Topologies.
Achieving high availability for EJB applications running on WebSphere Application Server Network Deployment depends on both application development and configuration. First, the EJB client applications should be developed with the best practices above in mind. In addition, a number of configuration best practices may be required to fine-tune the environment and achieve the best possible result.
I would like to thank Tom Alcott, Jason Durheim, and Michael Cheng for their input and feedback for this article.
- WebSphere Application Server 7.0 Information Center: Using a CORBA object URL with multiple name server addresses
- Setting Java Virtual Machine custom properties
- PK20304: NO_IMPLEMENT ON FIRST REQUEST TO A CLUSTER
- PM08450: WLM CALLBACK TIMEOUT INFINITE WAIT
- IBM developerWorks WebSphere