Web server plug-in tuning tips

Tuning the web server plug-in involves balancing workloads and improving performance in a high stress environment. Balancing workloads among application servers in a network fronted by a web server plug-in helps improve request response time.

[z/OS] This product uses the z/OS® native Workload Management (WLM) functionality to dynamically balance the workload of application servers defined to a z/OS HTTP Server, Version 5.3, or an IBM® HTTP Server for WebSphere® Application Server on z/OS. See the z/OS publication HTTP Server Planning, Installing and Using for more information about the z/OS HTTP Server, Version 5.3. Information about the IBM HTTP Server for WebSphere Application Server on z/OS is contained in this documentation.

Balancing workloads

During normal operation, the backlog of connections pending to an application server is bound to grow. Therefore, balancing workloads among application servers in a network fronted by a web server plug-in helps improve request response time.

You can limit the number of connections that an application server handles. To do this:

Go to Servers > Server Types > WebSphere application servers > server_name.
In the Additional Properties section, click Web Server Plug-in properties.
Select Use maximum number of connections for the Maximum number of connections that can be handled by the Application server field.
Specify in the Connections field the maximum number of connections that you want to allow.
Then click Apply and Save.

When an application server reaches this maximum number of connections, the plug-in skips that server when it establishes new connections and tries the next available application server. If no application servers are available, an HTTP 503 response code is returned to the client. This code indicates that the server is currently unable to handle the request because it is temporarily overloaded or because maintenance is being performed.
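The skipping behavior described above can be modeled roughly as follows. This is an illustrative sketch only, not the plug-in's actual implementation: the `AppServer` class, the `pick_server` and `route` functions, and the simple wrap-around search order are all assumptions made for the example.

```python
from dataclasses import dataclass

HTTP_SERVICE_UNAVAILABLE = 503


@dataclass
class AppServer:
    name: str
    max_connections: int          # value entered in the Connections field
    active_connections: int = 0   # connections currently open to this server


def pick_server(servers, start=0):
    """Return the first server, searching from `start` and wrapping around,
    that is still below its connection limit, or None if all are full."""
    count = len(servers)
    for offset in range(count):
        candidate = servers[(start + offset) % count]
        if candidate.active_connections < candidate.max_connections:
            return candidate
    return None


def route(servers, start=0):
    """Return (server_name, None) when a connection can be established,
    or (None, 503) when every server has reached its limit."""
    server = pick_server(servers, start)
    if server is None:
        return None, HTTP_SERVICE_UNAVAILABLE
    server.active_connections += 1
    return server.name, None


servers = [AppServer("Appsvr_1", 1), AppServer("Appsvr_2", 1)]
print(route(servers))  # first connection goes to Appsvr_1
print(route(servers))  # Appsvr_1 is full, so it is skipped
print(route(servers))  # both servers are full
```

In this model the third request finds no server below its limit, which corresponds to the HTTP 503 case.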

The capacity of the application servers in the network determines the value you specify for the maximum number of connections. The ideal scenario is for all of the application servers in the network to be optimally utilized. For example, if you have the following environment:

There are 10 application servers in a cluster.
All of these application servers host the same applications (that is, Application_1 and Application_2).
This cluster of application servers is fronted by five IBM HTTP Servers.
The IBM HTTP Servers get requests through a load balancer.
Application_1 takes approximately 60 seconds to respond to a request.
Application_2 takes approximately 1 second to respond to a request.

Depending on the request arrival pattern, all requests to Application_1 might be forwarded to two of the application servers, say Appsvr_1 and Appsvr_2. If the arrival rate is faster than the processing rate, the number of pending requests to Appsvr_1 and Appsvr_2 can grow.

Eventually, Appsvr_1 and Appsvr_2 are busy and are not able to respond to future requests. It usually takes a long time to recover from this overloaded situation.
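To see why slow requests overload a server, it can help to estimate how quickly the backlog grows. The sketch below is a rough approximation that assumes a steady arrival rate and a server whose worker threads are already all busy. The arrival rate of 2 requests per second, the 60 worker threads, and the function name `backlog_growth` are hypothetical values chosen for illustration; they do not come from the product.

```python
def backlog_growth(seconds, arrival_per_sec, service_time_sec, worker_threads):
    """Estimate how many requests pile up over `seconds` on a saturated server."""
    # A saturated server completes at most this many requests per second.
    max_throughput = worker_threads / service_time_sec
    # Anything arriving faster than that is added to the backlog.
    surplus_per_sec = arrival_per_sec - max_throughput
    return max(0.0, surplus_per_sec * seconds)


# Application_1 (about 60 seconds per request), hypothetical load for 10 minutes.
print(backlog_growth(600, 2.0, 60, 60))
# Application_2 (about 1 second per request) under the same hypothetical load.
print(backlog_growth(600, 2.0, 1, 60))
```

Under these assumed numbers, the server running Application_1 accumulates about 600 pending requests in 10 minutes, while the server running Application_2 keeps up with the load.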

If you want to maintain 2500 connections, and optimally utilize the Application Servers in this example, set the number of maximum connections allowed to 50. (This value is arrived at by dividing the number of connections by the result of multiplying the number of Application Servers by the number of web servers; in this example, 2500/(10x5)=50.)
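The calculation above can be written as a small helper. The function name and parameter names are illustrative only.

```python
def max_connections_per_server(total_connections, app_servers, web_servers):
    """Divide the total connections by (application servers x web servers)."""
    if app_servers <= 0 or web_servers <= 0:
        raise ValueError("server counts must be positive")
    return total_connections // (app_servers * web_servers)


# Values from the example: 2500 connections, 10 application servers, 5 web servers.
print(max_connections_per_server(2500, 10, 5))  # 50
```

The division accounts for every web server opening its own connections to every application server, so the per-server limit is applied once for each web server and application server pair.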

Limiting the number of connections that can be established with an application server works best for web servers that use a single, multithreaded process for serving requests.

[Windows] IBM HTTP Server uses a single, multithreaded process for serving requests. No configuration changes are required.

[AIX HP-UX Solaris] [z/OS] IBM HTTP Server typically uses multiple multithreaded processes for serving requests. Specify the following values for the properties in the web server configuration file (httpd.conf) to prevent the IBM HTTP Server from using more than one process for serving requests.

```
ServerLimit           1
ThreadLimit           1024
StartServers          1
MaxClients            1024
MinSpareThreads       1
MaxSpareThreads       1024
ThreadsPerChild       1024
MaxRequestsPerChild   0
```
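Before restarting the web server, you might want to confirm that the directive values above are consistent with one another. The following sketch parses the directives from a text fragment and reports problems with a single-process setup. The helper names are hypothetical, and the checks cover only the relationships assumed here: one server process, ThreadsPerChild no larger than ThreadLimit, and MaxClients no larger than ServerLimit multiplied by ThreadsPerChild.

```python
def parse_directives(conf_text):
    """Collect 'Name value' lines with integer values into a dict.
    Blank lines, comments, and other directives are ignored."""
    values = {}
    for line in conf_text.splitlines():
        parts = line.strip().split()
        if len(parts) == 2 and not parts[0].startswith("#") and parts[1].isdigit():
            values[parts[0]] = int(parts[1])
    return values


def single_process_problems(d):
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems = []
    if d.get("ServerLimit") != 1:
        problems.append("ServerLimit should be 1")
    if d.get("StartServers") != 1:
        problems.append("StartServers should be 1")
    if d.get("ThreadsPerChild", 0) > d.get("ThreadLimit", 0):
        problems.append("ThreadsPerChild exceeds ThreadLimit")
    if d.get("MaxClients", 0) > d.get("ServerLimit", 0) * d.get("ThreadsPerChild", 0):
        problems.append("MaxClients exceeds ServerLimit x ThreadsPerChild")
    return problems


SAMPLE = """
ServerLimit           1
ThreadLimit           1024
StartServers          1
MaxClients            1024
MinSpareThreads       1
MaxSpareThreads       1024
ThreadsPerChild       1024
MaxRequestsPerChild   0
"""

print(single_process_problems(parse_directives(SAMPLE)))  # []
```

This sketch matches directive names exactly as written above. A real httpd.conf can spell them with different capitalization, which the example does not handle.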

Improving performance in a high stress environment

[Windows] If you use the default settings for a Microsoft Windows operating system, you might encounter web server plug-in performance problems when running in a high stress environment. To avoid these problems, consider tuning the TCP/IP settings for this operating system. Two of the key settings to tune are TcpTimedWaitDelay and MaxUserPort.

To tune the TcpTimedWaitDelay setting, change its value from the default of 240 seconds to 30 seconds:

Locate in the Windows Registry:
```
HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\tcpip\Parameters\TcpTimedWaitDelay
```
If this entry does not exist in your Windows Registry, create it as a new DWORD item.
Specify, in seconds, a value between 30 and 300 inclusive for this entry. (It is recommended that you specify a value of 30.)

To tune the MaxUserPort setting:

Locate in the Windows Registry:
```
HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\tcpip\Parameters\MaxUserPort
```
If this entry does not exist in your Windows Registry, create it as a new DWORD item.
Set the maximum number of ports to a value between 5000 and 65534 ports, inclusive. (It is recommended that you specify a value of 65534.)
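As an alternative to editing the registry by hand, both values can be set from an elevated command prompt with the Windows `reg add` command. The sketch below only builds the command lines and checks each value against the ranges given above. It does not modify the registry. The helper name `reg_add_command` is hypothetical, and the use of the `HKLM` abbreviation for HKEY_LOCAL_MACHINE is an assumption about how you choose to invoke `reg`.

```python
REG_KEY = r"HKLM\System\CurrentControlSet\Services\tcpip\Parameters"

# Allowed ranges (inclusive) as stated in the text above.
ALLOWED = {
    "TcpTimedWaitDelay": (30, 300),
    "MaxUserPort": (5000, 65534),
}


def reg_add_command(name, value):
    """Build a 'reg add' command that creates or updates a DWORD value."""
    low, high = ALLOWED[name]
    if not low <= value <= high:
        raise ValueError(f"{name} must be between {low} and {high}, got {value}")
    return f'reg add "{REG_KEY}" /v {name} /t REG_DWORD /d {value} /f'


# Recommended values from the text above.
for name, value in (("TcpTimedWaitDelay", 30), ("MaxUserPort", 65534)):
    print(reg_add_command(name, value))
```

Review the generated commands before running them, and back up the registry first if you are unsure.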

See the Microsoft website for more information about these settings.