RHEL 5.0 scalability

The purpose of these test runs was to measure the scalability of RHEL 5.0 with one, two, four, and eight CPUs on the WebSphere® Application Server LPAR. For each CPU configuration, we scaled the number of clients to reach the maximum throughput. The database LPAR remained constant at eight CPUs for every run.

Note: The WebSphere Application Server CPUs and the database CPUs were in separate LPARs and were never used as a shared resource.

Test case description

We ran the Trade workload in our three-tier environment using the tuning parameters that we had determined produced our best throughput with the different databases (see RHEL 4.5 Trade tuning variations). The tuning variations we used included the following (an illustrative example of applying such settings follows the list):
  • Increased network buffer count
  • Hardware checksumming
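
As a minimal illustration only, assuming a Linux guest with a System z qeth network interface, the following sketch shows how hardware checksum offload and a larger inbound buffer count might be applied. The interface name, device bus ID, and buffer count shown are placeholder assumptions, not the values used in these runs.

```python
import subprocess
from pathlib import Path

# Placeholder identifiers -- the real interface name and qeth device bus ID
# used in these runs are not given here.
IFACE = "eth0"
QETH_DEV = Path("/sys/bus/ccwgroup/drivers/qeth/0.0.f5f0")

def enable_hw_checksumming(iface: str) -> None:
    """Enable RX/TX checksum offload with ethtool (requires root)."""
    subprocess.run(["ethtool", "-K", iface, "rx", "on", "tx", "on"], check=True)

def set_buffer_count(dev: Path, count: int = 128) -> None:
    """Raise the qeth inbound buffer count; the attribute can only be
    changed while the device is offline."""
    (dev / "online").write_text("0\n")
    (dev / "buffer_count").write_text(f"{count}\n")
    (dev / "online").write_text("1\n")

if __name__ == "__main__":
    enable_hw_checksumming(IFACE)
    set_buffer_count(QETH_DEV)
```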

We ran CPU scaling runs using RHEL 5.0 on both the WebSphere Application Server LPAR and the DB2® for Linux® v9.2 LPAR. For each CPU scaling step, we scaled the number of workload-generating clients to reach the maximum throughput.

Table 1. CPU scaling runs with WebSphere Application Server and database on RHEL 5.0

WebSphere Application Server  Trade Results              DB2 Database
Distro         #CPU     %CPU  #Users   Resp   %ETR   %ITR   #CPU    %CPU
RHEL 5.0          1    99.96       5  11 ms    100    100      8    6.52
RHEL 5.0          2    99.60      10  11 ms    199    200      8   13.11
RHEL 5.0          4    99.30      20  11 ms    391    395      8   25.58
RHEL 5.0          8    99.07      60  17 ms    750    757      8   47.40
Figure 1. RHEL 5.0 ETR, WebSphere Application Server CPU percent, and database CPU percent scaling results
This chart shows the maximum throughput value (ETR) reached at each CPU count, together with the WebSphere Application Server and database CPU utilization.

Observations

Figure 1 shows the maximum throughput value (ETR) while scaling the number of CPUs. WebSphere Application Server CPU utilization and database CPU utilization were measured during our scaling runs. Our WebSphere Application Server CPU utilization was within 1% of full utilization for every run. The ETR curve is very nearly linear, as the ETR scaling factors below show (a short example after the list reproduces the calculation):

1 WebSphere Application Server CPU = 1.0x
2 WebSphere Application Server CPUs = 1.99x
4 WebSphere Application Server CPUs = 3.9x
8 WebSphere Application Server CPUs = 7.5x
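
These factors follow directly from the %ETR column in Table 1: each value is divided by the one-CPU result. A minimal sketch of that normalization, using the numbers from the table:

```python
# ETR scaling factors: each %ETR value from Table 1 normalized against
# the one-CPU run.
etr = {1: 100, 2: 199, 4: 391, 8: 750}   # WebSphere CPUs -> %ETR (Table 1)

baseline = etr[1]
for cpus, value in etr.items():
    # Prints 1.00x, 1.99x, 3.91x, 7.50x -- the rounded factors listed above.
    print(f"{cpus} WebSphere Application Server CPU(s) = {value / baseline:.2f}x")
```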

The database CPU utilization also increases linearly while the WebSphere Application Server CPUs remain fully utilized. This indicates a very consistent number of database transactions per database CPU used across the Trade runs, from one to eight WebSphere Application Server CPUs.

Conclusion

These results show linear scaling for ETR and database CPU utilization for our environment on RHEL 5.0 with the DB2 for Linux database. We were able to saturate the WebSphere Application Server CPUs for each scaling point. The RHEL 5.0 system with the DB2 for Linux database was also able to handle the increase in transactions in a linear manner. This workload scaled very well on the RHEL 5.0 distribution with the hardware and software used.

Figure 2. RHEL 5.0 ITR scaling results for Trade 6 with DB2 for Linux, with a regression line to depict the linear behavior
This chart shows the ITR measured during our scaling runs.

Observations

Figure 2 shows the ITR measured during our scaling runs. The ITR results can be expressed as scaling factors to illustrate their linearity. We started with the transactions per second measured with one WebSphere Application Server CPU and gave it a weight of 1x. The remaining measurement points are calculated by dividing each ITR by the transaction rate measured for the one-CPU run (reproduced in the sketch after the list):

1 WebSphere Application Server CPU = 1x
2 WebSphere Application Server CPUs = 2x
4 WebSphere Application Server CPUs = 3.95x
8 WebSphere Application Server CPUs = 7.57x
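
Applying the same normalization to the %ITR column of Table 1 reproduces these factors; dividing each factor by its CPU count also gives the scaling efficiency discussed in the conclusion below. A minimal sketch using the table's numbers:

```python
# ITR scaling factors from the %ITR column of Table 1, plus each run's
# shortfall relative to perfect scaling (a factor equal to the CPU count).
itr = {1: 100, 2: 200, 4: 395, 8: 757}   # WebSphere CPUs -> %ITR (Table 1)

baseline = itr[1]
for cpus, value in itr.items():
    factor = value / baseline                # e.g. 757 / 100 = 7.57 for 8 CPUs
    shortfall = (1 - factor / cpus) * 100    # percent below perfect scaling
    print(f"{cpus} CPU(s): factor {factor:.2f}x, {shortfall:.1f}% below ideal")
```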

Conclusion

The CPU scaling results for WebSphere Application Server on RHEL 5.0 with the DB2 for Linux database show a linear scaling rate for the CPU cost of the transactions with the Trade benchmark. The ideal behavior is a scaling factor identical to the number of CPUs. Our eight-CPU run is only about 5% below the perfect scaling factor of eight. This demonstrates that the processing power of the additional CPUs can be used to drive the workload without adding significant SMP overhead.