Summary

This Part 2 white paper shows the advantages of an ECM cluster in a WebSphere® Application Server environment for Linux™ on z Systems™. Spectrum Scale was used as the underlying parallel filesystem to enable cluster-wide access to the shared FileNet® File Storage Area, which was populated with documents. The basic tuning of the ECM application, WAS, and Spectrum Scale for a single-node ECM cluster is described in the Scale-Out Case Study (Part 1) - Single ECM Node with XFS and Spectrum Scale 4.2.

This white paper focuses on the performance of a four-node ECM cluster.

At the start of the case study, the first question was which load balancing algorithm to use. For this study, IBM® HTTP Server was used as the Web server to route requests in Round Robin mode to the four ECM WAS cluster members. This resulted in slightly better load balancing among the WAS cluster members than the other load balancing option, Random.
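
As a rough illustration of why Round Robin tends to distribute requests more evenly than Random, the following Python sketch counts how many requests each of four cluster members would receive under each policy. The member names, the request count, and the fixed seed are assumptions made for illustration; this is a toy model and not the routing logic of IBM HTTP Server or of the WebSphere plug-in.

```python
import random
from collections import Counter
from itertools import cycle

# Hypothetical names for the four ECM WAS cluster members.
MEMBERS = ["ecm1", "ecm2", "ecm3", "ecm4"]

def round_robin_counts(n_requests, members=MEMBERS):
    """Assign requests to members in strict rotation and count them."""
    rotation = cycle(members)
    return Counter(next(rotation) for _ in range(n_requests))

def random_counts(n_requests, members=MEMBERS, seed=0):
    """Assign each request to a uniformly random member and count them."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    return Counter(rng.choice(members) for _ in range(n_requests))

def spread(counts, members=MEMBERS):
    """Difference between the busiest and the least busy member."""
    values = [counts.get(m, 0) for m in members]
    return max(values) - min(values)

if __name__ == "__main__":
    for name, counts in (("round robin", round_robin_counts(4000)),
                         ("random", random_counts(4000))):
        print(name, dict(counts), "spread:", spread(counts))
```

With a request count that is a multiple of four, strict rotation gives every member the same count, whereas random selection typically leaves a small imbalance between members.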
Another important point to consider was the Spectrum Scale cluster configuration mode. We compared the two basic Spectrum Scale cluster configurations:
- Shared Disk (SD): The filesystem disks were directly attached to each of the four ECM nodes. As a result, any application data disk I/O was done directly over the SAN.
- Network Shared Disk (NSD): The filesystem disks were directly attached to dedicated NSD servers that propagate the filesystem via TCP/IP to the four ECM nodes. This Spectrum Scale cluster configuration theoretically allows a very high number of cluster members, independent of the scope of the SAN.
The outcome was that a tuned Spectrum Scale NSD setup performed as well as an SD setup for our workload. Only the initial out-of-the-box NSD configuration showed a performance degradation.
Spectrum Scale Network Shared Disk (NSD) tuning
- To balance the load across the available NSD servers (where there is more than one NSD server), it is important to vary the order of the servers in the NSD server list of the NSD disk definitions. When all NSD disks list the NSD servers in the same order, only the first NSD server in the list is used and receives all NSD disk I/O requests. As a result, it can be fully loaded while the others wait in failover mode.
- The page pool plays a minor role for the NSD servers and can even be relatively small.
- The page pools on the NSD clients are more important. This is where the ECM application data processing occurs and cache usage is effective for the NSD cluster configuration.
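
The server-order point above can be sketched as a small helper that rotates the NSD server list for each disk, so that every server appears first for a share of the disks. The device names, NSD names, and server host names below are hypothetical, and the stanza layout is an assumption that should be checked against the Spectrum Scale documentation for `mmcrnsd` before use.

```python
def rotated_server_lists(servers, n_disks):
    """Return one server list per disk, rotating the order by one per disk."""
    lists = []
    for i in range(n_disks):
        shift = i % len(servers)
        lists.append(servers[shift:] + servers[:shift])
    return lists

def nsd_stanzas(devices, servers):
    """Build stanza text (assumed layout) with a varied server order per disk."""
    orders = rotated_server_lists(servers, len(devices))
    stanzas = []
    for i, (device, order) in enumerate(zip(devices, orders), start=1):
        stanzas.append(
            f"%nsd: device={device}\n"
            f"  nsd=nsd{i:02d}\n"
            f"  servers={','.join(order)}"
        )
    return "\n\n".join(stanzas)

if __name__ == "__main__":
    # Hypothetical devices and NSD servers, for illustration only.
    devices = ["/dev/mapper/mpatha", "/dev/mapper/mpathb",
               "/dev/mapper/mpathc", "/dev/mapper/mpathd"]
    servers = ["nsdsrv1", "nsdsrv2"]
    print(nsd_stanzas(devices, servers))
```

With two servers and four disks, each server is listed first for two of the disks, so neither server is left idle in a pure failover role.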
Linux Network MTU sizes
- An important NSD tuning parameter is the MTU size.
- When using the default MTU size of 1492 for the entire SUT, the NSD setup showed a significant degradation compared to the SD setup.
- When using jumbo frames (MTU 8192) for the entire SUT, the response times for the NSD setup were only slightly worse than those of the SD setup with jumbo frames.
- Because it is often difficult to run an enterprise LAN environment exclusively with jumbo frames, a mixed MTU setup was also tested. In this setup, the remote machines in the LAN used a default MTU of 1492 to access the ECM environment, but the virtual network inside the z Systems server used jumbo frames. This is relatively straightforward to implement and provided nearly the same performance as an environment that ran exclusively with jumbo frames.
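
To give a feel for why the MTU size matters for NSD traffic, the following back-of-the-envelope sketch estimates how many TCP segments are needed to move one block of data at each of the two MTU sizes. It assumes 40 bytes of IPv4 and TCP header per packet (no options) and a 1 MiB block; neither value comes from the study, and the sketch ignores everything else that affects real throughput.

```python
import math

IP_TCP_HEADER_BYTES = 40  # assumption: IPv4 (20) + TCP (20), no options

def segments_needed(payload_bytes, mtu):
    """Number of TCP segments needed to carry payload_bytes at a given MTU."""
    mss = mtu - IP_TCP_HEADER_BYTES  # payload bytes that fit into one packet
    return math.ceil(payload_bytes / mss)

if __name__ == "__main__":
    block = 1024 * 1024  # assumption: 1 MiB transferred per request
    for mtu in (1492, 8192):
        print(f"MTU {mtu}: {segments_needed(block, mtu)} segments")
```

Under these assumptions, the same block needs 723 segments at MTU 1492 but only 129 at MTU 8192, which means far fewer packets for the network stacks of the NSD servers and clients to process.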
Scale-out study with Spectrum Scale SD cluster configuration
The most relevant performance metrics for this performance test were the response time and the z/VM® CPU load (hypervisor load), which reflected the total load caused by all virtual machines for the SUT. To analyze the scaling behavior, the workload was scaled from a lower load level (around 5 IFL processors used) to a high load level (more than 20 IFL processors used).
- The single-node SD setup with Spectrum Scale mostly performed better than XFS. Only at the highest workload level was it worse than XFS, and even there it still performed well. Spectrum Scale for Linux on z Systems demonstrated excellent performance for an ECM-like workload, even in a non-clustered node configuration.
- When scaling out to a four-node ECM WAS cluster with Spectrum Scale as the underlying parallel filesystem, the response times were significantly better across all load levels than those of the single ECM WAS node. The higher the load, the larger the difference in response times. At the highest load level, the four-node ECM WAS cluster with Spectrum Scale was 40% faster than the single ECM WAS node with XFS.
- At the lower load levels, the CPU load for the four-node ECM WAS cluster was higher than for the single ECM WAS node. However, the four-node ECM WAS cluster generally provided better response times. The higher the load level, the smaller the difference in CPU load between the two setups. At the highest load level, the four-node ECM WAS cluster provided 40% better response times than the single ECM WAS node at the same CPU load.

Overall, the four-node ECM WAS cluster, combined with a disk-I/O-intensive workload and Spectrum Scale as the parallel filesystem for the FileNet File Storage Area, was a very successful combination in terms of overall performance. The CPU load overhead of the clustered setup compared to the single-node environment decreased as the load level increased.

For ECM systems with high workload levels, a clustered ECM setup is highly recommended when transaction response times are important. In addition to the performance gains, a clustered ECM setup also provides high availability.