IBM Support

Recommended values for web server plug-in config

Question & Answer


Question

In the web server plug-in, what do the LoadBalanceWeight, MaxConnections, ConnectTimeout, ServerIOTimeout, RetryInterval, IgnoreAffinityRequests, and GetDWLMTable options mean and what are the recommended settings for these options? What effect does Session Affinity have? How are connections handled during plug-in fail-over? What is the effect of using more than one web server child process?

Answer

To understand how load balancing works in the web server plug-in, see Understanding IBM HTTP Server plug-in Load Balancing in a clustered environment.

To understand how fail-over works in the web server plug-in, see Understanding HTTP plug-in failover in a clustered environment.


The property LoadBalanceWeight is a starting "weight". The value is dynamically changed by the plug-in at run time. The "weight" of a server (or clone) is decreased each time a request is assigned to that clone. When the weights for all servers drop to 0 or less, the plug-in has to readjust all of the weights so that they are greater than 0. Using a starting value of only 2 (the default) means that the weights reach 0 quickly and the plug-in constantly recalculates the weights. Therefore, it is recommended to start with a higher LoadBalanceWeight. The IBM WebSphere Application Server administrative console allows a value up to 20. However, it is possible to manually edit the plugin-cfg.xml file and specify a LoadBalanceWeight value that is higher than 20.

Note: At run time, the LoadBalanceWeight values of the application servers in a cluster are normalized by their highest common factor. For example, 100, 90, 80 have a common factor of 10. So, these configured weights would be divided by 10 at run time, resulting in actual starting weights of only 10, 9, 8. Setting all clones to the same starting LoadBalanceWeight (for example: 20, 20, 20) results in an actual starting weight of only 1 for each because of normalization. So, it is recommended to set the weight of at least one of the clones to be off by a value of 1. For example, if there are 3 clones, you might choose the starting LoadBalanceWeights to be: 20, 20, 19. Because these values share no common factor greater than 1, the weights are unchanged after normalization.
Recommended values = all clones the same, except one clone off by one (for example: 20, 20, 19)
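
The normalization described in the note can be reproduced with a few lines of Python. This is only a sketch of the arithmetic, not the plug-in's own code; the function name normalize_weights is made up for illustration.

```python
from functools import reduce
from math import gcd


def normalize_weights(weights):
    """Divide configured weights by their highest common factor.

    Hypothetical helper: it mirrors the arithmetic described in the note,
    not the plug-in's internal implementation.
    """
    if not weights:
        return []
    common = reduce(gcd, weights)
    if common == 0:
        # All weights are 0; there is nothing to divide by.
        return list(weights)
    return [w // common for w in weights]


for configured in ([100, 90, 80], [20, 20, 20], [20, 20, 19]):
    print(configured, "->", normalize_weights(configured))
```

Running it shows why 20, 20, 19 keeps its high starting weights while 20, 20, 20 collapses to 1, 1, 1.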


The property MaxConnections is used to gauge when a server is "starting to become overwhelmed". It is not used to determine when to fail over (mark the server "down"). When a request is sent from the plug-in to the WebSphere® Application Server, it is called a "pending request" until the response comes back. If the application running in WebSphere Application Server is handling requests quickly, each request is pending for only a short time. Therefore, under ideal conditions, MaxConnections is not needed and the default of -1, meaning unlimited, is sufficient. However, sometimes an application is overwhelmed and cannot handle the requests quickly, so pending requests start to build up. The MaxConnections property can be used to put a limit on the number of pending requests per server. When the MaxConnections limit is reached, the plug-in stops sending requests to that application server, but the server is not marked down. The plug-in waits until responses come back for the current pending requests and the pending request count drops below the MaxConnections limit. The optimal value for MaxConnections depends on how quickly the application and application server respond to each request. If normal responses are returned in less than one second, it can be appropriate to set a low value for MaxConnections, such as 20. However, if it normally takes several seconds to get a response from the application, then it is prudent to use a higher value for MaxConnections, such as 100.
Recommended value = 20 - 100 depending on application response times

Best Practices: With MaxConnections="-1" use LogLevel="Stats" to monitor the pending requests numbers in the plug-in log, under normal conditions. Then, choose a value for MaxConnections that is significantly higher than the highest number shown in the log. This method helps you to determine a MaxConnections value that is right for your specific environment.
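
The pending-request limit can be pictured as a simple counter per application server. The following Python sketch is a simplified model of that behavior, assuming one counter per server and treating only -1 as "unlimited" as described above; the class ServerSlot and its methods are hypothetical and are not part of the plug-in.

```python
class ServerSlot:
    """Toy model of one application server as seen by one plug-in instance."""

    def __init__(self, name, max_connections=-1):
        self.name = name
        self.max_connections = max_connections  # -1 means unlimited
        self.pending = 0

    def can_accept(self):
        # The server is skipped (not marked down) while it is at its limit.
        if self.max_connections < 0:
            return True
        return self.pending < self.max_connections

    def send(self):
        """Try to dispatch a request; return True if it was sent."""
        if not self.can_accept():
            return False
        self.pending += 1
        return True

    def complete(self):
        """A response came back, so one request is no longer pending."""
        if self.pending > 0:
            self.pending -= 1


slot = ServerSlot("member1", max_connections=2)
print(slot.send(), slot.send(), slot.send())  # third request is refused
slot.complete()
print(slot.send())  # accepted again once the count drops below the limit
```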


The property ConnectTimeout means "how long does the plug-in wait to open a socket to the Application Server?" If there are streams open and available to the Application Server, the plug-in uses one. However, sometimes the plug-in needs to open a new stream to the Application Server. Opening a socket does not normally take long, so the value for ConnectTimeout needs to be small. A ConnectTimeout value of 0 means never time out. In that case, the timeout is left up to the OS TCP layer, which is NOT ideal. It is much better to specify a small positive number (like 5 seconds).
Recommended value = 5


The property ServerIOTimeout means "how long does the plug-in wait for a response from the application?" After the socket is opened, the plug-in sends the request to the WebSphere® Application Server. The application processes the request and a response is sent back to the client, through the plug-in. The right value for this property depends on the application. If the application is quick to respond, then you can use a lower value for ServerIOTimeout. However, if the application requires more time to process the request (maybe to retrieve data from a database), then use a higher number for ServerIOTimeout. Using a value of 0 means that the request NEVER expires from a timeout. A positive value means that the plug-in does NOT mark down the application server after a ServerIOTimeout pops. So, if you want the plug-in to continue sending requests to the timed-out application server, use a positive value. A negative value means that the plug-in marks down the application server after a ServerIOTimeout pops. So, if you want the plug-in to immediately mark down the application server and fail over to another application server in the same cluster, use a negative value.
Recommended value = -900 (that is negative 900)

Note: The ability to use a negative ServerIOTimeout value was introduced in plug-in apar PK72097.

Best Practices: Use traces to determine the amount of time it takes for your application to respond to requests under normal conditions. Be sure to include the longest running requests that take the most time to respond. Choose a value for ServerIOTimeout that is much larger (2X or 3X or more) than the longest response time. This method ensures that your ServerIOTimeout is high enough to allow adequate time for the application to respond normally. Make it a negative value so that if the ServerIOTimeout pops, the plug-in immediately marks down the server and retries the request on a different application server.
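
As a rough illustration of the sign convention and of the sizing advice above, the following Python sketch decodes a ServerIOTimeout value and derives a candidate value from a measured response time. Both function names are hypothetical, and the default multiplier of 3 is simply one of the factors suggested above.

```python
import math


def interpret_server_io_timeout(value):
    """Return (timeout_seconds, marks_server_down) for a configured value.

    timeout_seconds is None when the value is 0 (the request never times out).
    """
    if value == 0:
        return (None, False)
    return (abs(value), value < 0)


def suggest_server_io_timeout(longest_response_seconds, multiplier=3):
    """Suggest a negative ServerIOTimeout sized from the longest response.

    Assumption: the measurement comes from your own traces; the multiplier
    follows the "2X or 3X or more" guidance and is not a fixed rule.
    """
    return -math.ceil(longest_response_seconds * multiplier)


print(interpret_server_io_timeout(-900))  # (900, True)
print(interpret_server_io_timeout(900))   # (900, False)
print(suggest_server_io_timeout(300))     # -900
```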


The property ServerIOTimeoutRetry can be used to decrease the number of retries after ServerIOTimeout fires. By default, the plug-in attempts a request as many times as there are members in the cluster. For example, if the cluster has four members, the ServerIOTimeout is set to 900, and the first attempt times out, the plug-in tries the request a second time. If the retry also fails because ServerIOTimeout fired, the plug-in retries a third time, and a fourth time if needed. After four attempts, the plug-in stops retrying. With four timeouts occurring, the user would not receive the error response until an hour after the request began (4 * 900s = 3600s = 1h). If you want to override this default behavior and reduce the number of retries after ServerIOTimeout, you can set ServerIOTimeoutRetry to a value that is less than the number of members in the cluster. The default value is -1 when the property is created in the plug-in configuration file.
Recommended value = 1
Notes: 
  • If the ServerIOTimeoutRetry setting is NOT present in the plugin-cfg.xml configuration file, no retries are attempted (value of 0).
  • If ServerIOTimeout is set to a negative value such that a server is marked down when the timeout occurs, retrying all cluster members (-1, the default value) can cause all servers in the cluster to be marked down.
  • This property was introduced by plug-in apar PM70559.
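
The worst-case wait in the four-member example can be computed with a short Python sketch. It assumes that a non-negative ServerIOTimeoutRetry counts the retries made after the first attempt and that the total never exceeds the number of cluster members; both points are assumptions made for illustration, and the function names are hypothetical.

```python
def total_attempts(cluster_size, server_io_timeout_retry=-1):
    """Number of attempts made for one request that keeps timing out.

    Assumption: -1 means one attempt per cluster member; a value n >= 0
    means the first attempt plus n retries, capped at the cluster size.
    """
    if server_io_timeout_retry < 0:
        return cluster_size
    return min(cluster_size, 1 + server_io_timeout_retry)


def worst_case_wait_seconds(cluster_size, server_io_timeout,
                            server_io_timeout_retry=-1):
    """Longest time a client could wait before receiving the error response."""
    attempts = total_attempts(cluster_size, server_io_timeout_retry)
    return attempts * abs(server_io_timeout)


print(worst_case_wait_seconds(4, 900))     # 3600 seconds, as in the example
print(worst_case_wait_seconds(4, 900, 1))  # 1800 seconds with one retry
```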

The property RetryInterval is the time that the plug-in waits before it retries an application server that is marked down. The optimal value for RetryInterval depends on the number of application servers in the cluster, and the value used for ServerIOTimeout. You can use the following formula to determine the maximum RetryInterval value for your plug-in config:

(number of appservers in cluster - 1) x (absolute ServerIOTimeout) - 1

For example, if there are two application servers in the cluster, and the value of ServerIOTimeout is -900, then the maximum RetryInterval setting would be:
(2 - 1) x (900) - 1 = 899 seconds or less

Another example, if there are four appservers in the cluster, and the value of ServerIOTimeout is -900, then the maximum RetryInterval setting would be:
(4 - 1) x (900) - 1 = 2699 seconds or less

Warning: Setting RetryInterval to a value higher than the recommended maximum, based on the formula provided, can lead to an undesirable situation. All of the application servers in the cluster can be marked down simultaneously resulting in all requests temporarily failing.
Recommended value = 60 (default)
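
The formula above is easy to check in Python. The function name is hypothetical, and rejecting clusters with fewer than two members is a choice made for this sketch, because the formula yields a negative number in that case.

```python
def max_retry_interval(cluster_size, server_io_timeout):
    """Apply (number of appservers - 1) x (absolute ServerIOTimeout) - 1."""
    if cluster_size < 2:
        # Choice made for this sketch: the formula is not meaningful here.
        raise ValueError("the formula needs at least two cluster members")
    return (cluster_size - 1) * abs(server_io_timeout) - 1


print(max_retry_interval(2, -900))  # 899
print(max_retry_interval(4, -900))  # 2699

# The default RetryInterval of 60 is well inside both limits.
print(60 <= max_retry_interval(2, -900))
```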


Affinity requests are requests that contain a session cookie (example: JSESSIONID). The session cookie is set by the Session Manager in WebSphere® to ensure that all subsequent requests from the same client return to the same application server in the cluster. The session cookie contains the clone ID (or partition ID) of that specific application server. The web server plug-in looks for the session cookie and uses the clone ID to send the request to that specific WebSphere® application server. An affinity request is not load balanced.

In the plug-in config, there is a property called IgnoreAffinityRequests. This property determines whether affinity requests affect the load balance weights. The default value for IgnoreAffinityRequests is true, which means that affinity requests do not have any effect on the load balance weights. A value of true is best for most environments and does not need to be modified. When IgnoreAffinityRequests is set to false, the plug-in attempts to even the overall distribution among the servers over the entire plug-in process execution time. This can cause all traffic to be directed to a small subset of servers at a particular time, leading to server overload.


Fail-over occurs when the plug-in marks a cluster member application server (or clone) as "down", and then sends the pending requests to other members of the same cluster. If the plug-in is unable to open a new connection to the application server within the ConnectTimeout, the request fails over to another server. If the plug-in sends the request to the application server, but a response from the application is not received within ServerIOTimeout, a fail-over occurs. When the plug-in marks down a cluster member application server, it handles the pending requests in one of two ways. Before plug-in apar PM12112, the plug-in would send all of the pending requests to the next application server in the cluster. With plug-in apar PM12112, the plug-in randomly sends the pending requests to any of the available application servers in the cluster. While the application server is marked "down", the plug-in no longer sends any requests to it. When the RetryInterval expires, the plug-in checks the availability of the application server. If the server is available, the "down" flag is removed and the application server is used again.
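
The mark-down and RetryInterval cycle can be modeled as a small state machine. The Python sketch below is a simplified picture, not the plug-in's implementation: the class ClusterMember, the probe callback, and the choice to restart the wait after a failed availability check are all assumptions made for illustration. Times are plain numbers of seconds supplied by the caller.

```python
class ClusterMember:
    """Toy model of the "down" flag kept by one plug-in instance."""

    def __init__(self, name, retry_interval=60):
        self.name = name
        self.retry_interval = retry_interval
        self.marked_down_at = None  # None means the member is considered up

    def mark_down(self, now):
        self.marked_down_at = now

    def is_usable(self, now, probe):
        """Return True if requests may be sent to this member at time `now`.

        `probe` is a hypothetical callable that reports whether the server
        is reachable; it is only consulted once RetryInterval has expired.
        """
        if self.marked_down_at is None:
            return True
        if now - self.marked_down_at < self.retry_interval:
            return False
        if probe(self):
            self.marked_down_at = None  # remove the "down" flag
            return True
        # Assumption: a failed check starts a new RetryInterval wait.
        self.marked_down_at = now
        return False


member = ClusterMember("member1", retry_interval=60)
member.mark_down(now=0)
print(member.is_usable(30, probe=lambda m: True))   # still inside the interval
print(member.is_usable(60, probe=lambda m: True))   # interval expired, server is back
```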

Note: By default, the number of attempts to handle a request is limited by the number of application servers in the cluster. For example, if there are only two application servers in the cluster, and the request fails once, the plug-in attempts that request one more time (total of two attempts). Or another example, if there are five application servers in the same cluster, and the request fails once, then the plug-in attempts to retry that same request up to four more times (total of five attempts). That number includes retries sent to the same application server (session affinity), or attempts sent to different application servers (fail-over).

Update: The plug-in apar PM70559 introduced a new setting called "ServerIOTimeoutRetry" that can be used to control the number of retries due to ServerIOTimeout.


If Memory-to-Memory (M2M) session replication is enabled in WebSphere Application Server, then the GetDWLMTable setting in the plug-in config must be changed to "true". Memory-to-Memory replication uses partition IDs rather than clone IDs. If GetDWLMTable is set to false (default), broken session affinity is experienced. Set the property GetDWLMTable="true" when M2M replication is used in WebSphere® Application Server.
Recommendation = GetDWLMTable="true" when M2M is used in WebSphere® Application Server.
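
To see how the recommended values from the preceding sections might sit together in plugin-cfg.xml, the following Python sketch builds a ServerCluster fragment with the standard xml.etree.ElementTree module. Treat it as an assumption-laden illustration: the cluster and member names are invented, MaxConnections="50" is just a placeholder inside the 20 - 100 range, the placement of each attribute on ServerCluster or Server should be verified against your own generated plugin-cfg.xml, and a real file contains further elements (such as transports) that are omitted here.

```python
import xml.etree.ElementTree as ET


def build_cluster(name, members, m2m_replication=False):
    """Build an illustrative <ServerCluster> element.

    members is a list of (server_name, load_balance_weight) tuples.
    All names and the MaxConnections value are placeholders.
    """
    cluster = ET.Element("ServerCluster", {
        "Name": name,
        "RetryInterval": "60",
        "IgnoreAffinityRequests": "true",
        "ServerIOTimeoutRetry": "1",
        # Only switch this on when Memory-to-Memory replication is used.
        "GetDWLMTable": "true" if m2m_replication else "false",
    })
    for server_name, weight in members:
        ET.SubElement(cluster, "Server", {
            "Name": server_name,
            "LoadBalanceWeight": str(weight),
            "MaxConnections": "50",
            "ConnectTimeout": "5",
            "ServerIOTimeout": "-900",
        })
    return cluster


cluster = build_cluster(
    "cluster1",
    [("member1", 20), ("member2", 20), ("member3", 19)],
    m2m_replication=True,
)
print(ET.tostring(cluster, encoding="unicode"))
```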


Each web server child process loads a separate instance of the web server plug-in, and multiple running instances of the web server plug-in do not share information with each other. For example, if the IBM HTTP Server web server is configured to start 3 child processes (StartServers 3), then there are 3 instances of the web server plug-in running (one for each IBM HTTP Server child process). The dynamically changing LoadBalanceWeight of each cluster member is not shared between the plug-in instances, and neither is the up or down status. So, in one instance of the plug-in "member1" might be considered UP with a weight of 5, while in another instance of the plug-in "member1" might be considered DOWN and unusable. The result is different depending on the child process / plug-in instance handling the incoming request. It is recommended that you configure the web server to use only a few web server child processes with many threads on each. See Tuning IBM HTTP Server to maximize the number of client connections to WebSphere Application Server.

If you choose to use more than one web server child process, keep in mind that the plug-in settings are handled on a per instance basis. For example, MaxConnections means the number of pending requests that are allowed on that server, for each plug-in instance. If MaxConnections = 20, and there are 3 web server child processes (3 plug-in instances), then each instance allows 20 pending connections to that application server for a total of 60 pending connections.
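
The per-instance arithmetic can be written down as a short Python helper. The function name is hypothetical; the calculation is the one from the example above.

```python
def total_pending_limit(max_connections, child_processes):
    """Total pending requests one application server can receive.

    Returns None when MaxConnections is -1 (unlimited), because no
    finite total exists in that case.
    """
    if max_connections < 0:
        return None
    return max_connections * child_processes


print(total_pending_limit(20, 3))  # 60, as in the example above
print(total_pending_limit(-1, 3))  # None: unlimited
```

If a particular total is the goal, divide it by the number of child processes to get the per-instance MaxConnections value.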


Related information
Understanding plug-in Load Balancing
Understanding plug-in Fail-over
Tuning IBM HTTP Server processes and threads
web server plug-in configuration
Modifying plug-in properties from the WebSphere Application Server administrative console
How do the properties ServerIOTimeout and PostBufferSize affect plug-in behavior?

Product: WebSphere Application Server
Component: Plug-in
Platform: AIX, HP-UX, Linux, Solaris, Windows
Version: 8.5, 8.0, 7.0

Document Information

Modified date:
10 December 2020

UID

swg21318463