Summary

This paper describes how to set up the cryptographic environment on IBM® System z® to benefit from the additional power of two special-purpose features: Central Processor Assist for Cryptographic Functions (CPACF) and IBM Crypto Express2 Accelerator (CEX2A). The workload is a client-server Java™ application that communicates through SSL with different cipher suites, using the IBMPKCS11Impl provider (openCryptoki). The workload was built from self-developed Java classes, based on Java 5, that load the security provider directly. This setup is not possible for standard applications.
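To illustrate what loading a security provider directly can look like, here is a minimal sketch rather than the code used in the paper. The class name in PKCS11_CLASS is an assumption about the IBM SDK, as is the availability of a no-argument constructor. The PKCS#11 configuration step is omitted, and ProviderLoader and loadProvider are hypothetical names. On a JVM without the IBM provider, the method returns null and the default providers stay in effect.

```java
import java.security.Provider;
import java.security.Security;

class ProviderLoader {
    // Assumption: fully qualified class name of the IBM PKCS#11 provider.
    // Verify it against the SDK in use; it is not taken from the paper.
    static final String PKCS11_CLASS =
            "com.ibm.crypto.pkcs11impl.provider.IBMPKCS11Impl";

    /**
     * Instantiates the named provider class by reflection and registers it
     * at the highest priority. Returns the provider, or null when the class
     * is missing, cannot be instantiated, or is not a Provider.
     */
    static Provider loadProvider(String className) {
        try {
            Object instance =
                    Class.forName(className).getDeclaredConstructor().newInstance();
            if (!(instance instanceof Provider)) {
                return null;
            }
            Provider provider = (Provider) instance;
            // Position 1 means this provider is consulted first.
            Security.insertProviderAt(provider, 1);
            return provider;
        } catch (ReflectiveOperationException e) {
            return null; // class not found or not instantiable on this JVM
        } catch (LinkageError e) {
            return null; // class present but failed to link or initialize
        }
    }

    public static void main(String[] args) {
        Provider p = loadProvider(PKCS11_CLASS);
        System.out.println(p == null
                ? "PKCS#11 provider not available; default providers remain in use"
                : "Registered provider " + p.getName());
    }
}
```

Loading by reflection keeps the sketch compilable on any JVM; an application tied to the IBM SDK could reference the provider class directly instead.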

Note: At the time of publishing this paper, full support for the IBM System z cryptographic hardware for Java is generally available with:
  • SUSE Linux® Enterprise Server (SLES) Version 10 SP 3
  • IBM WebSphere® Application Server Version 7.0.0.7
  • IBM Software Development Kit (SDK) 1.6 SR 6

For more information and support enhancements from Java and WebSphere, see:

https://www.ibm.com/support/pages/enabling-and-configuring-cryptographic-technology-websphere-application-server-linux-system-z-hardware

Java 5 has a number of known issues that are fixed in Java 6. If you are using PKCS11, move to Java 6.

The impact of the cryptographic hardware depends on the size of the data transferred. In all cases, a significant reduction in CPU utilization (up to 50%) was observed. Throughput increases significantly at packet sizes of approximately 20 KB, with up to a fourfold throughput increase compared to software encryption. It is recommended to run the system without the polling thread, which is the default setting. The polling thread checks whether data is available on the CEX2A card. An alternative to the polling thread is the AP interrupts mechanism, which is not currently available for all configurations. AP interrupts are available with Red Hat Enterprise Linux (RHEL) 5.4. For more information, go to this Web site, and refer to the section 'Using AP adapter interrupts':

http://download.boulder.ibm.com/ibmdl/pub/software/dw/linux390/docu/lk31dd03.pdf

With small packet sizes of 2 KB and 20 KB, cryptographic hardware provides a 10% higher throughput, but the CPU cost per unit of transferred data also increases. When throughput is more important than CPU load, the polling thread is helpful.

During the normal logon process, the server authenticates to the client using certificates. To increase the security level, client authentication can be added. This CPU-intensive process also shows a significant reduction in CPU load when using the cryptographic hardware on IBM System z.
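As an illustration of how client authentication is requested on the server side in standard JSSE, the following sketch configures a server-mode SSLEngine to require a client certificate. It is not the test code from the paper. ClientAuthSketch and serverEngine are hypothetical names, and the default SSLContext stands in for a context that would normally be initialized with the server's own key and trust material.

```java
import java.security.NoSuchAlgorithmException;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLEngine;

class ClientAuthSketch {
    /**
     * Creates a server-side SSLEngine. When requireClientCert is true, the
     * handshake fails unless the client presents a certificate the server
     * trusts.
     */
    static SSLEngine serverEngine(boolean requireClientCert) {
        try {
            // Assumption: the JVM default context; a real server would
            // supply its own key managers and trust managers.
            SSLContext context = SSLContext.getDefault();
            SSLEngine engine = context.createSSLEngine();
            engine.setUseClientMode(false);               // act as the server
            engine.setNeedClientAuth(requireClientCert);  // mandatory client cert
            return engine;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("No default SSLContext available", e);
        }
    }

    public static void main(String[] args) {
        SSLEngine engine = serverEngine(true);
        System.out.println("Client authentication required: "
                + engine.getNeedClientAuth());
    }
}
```

The same setting exists on SSLServerSocket as setNeedClientAuth. The softer setWantClientAuth requests a certificate but lets the handshake continue without one.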

Caching of SSL sessions means that for consecutive requests from one specific client, the SSL handshake is performed only for the first request. Because the client and server have already been identified and the keys have already been exchanged, the server does not perform a handshake for subsequent requests. This is the default behavior of an SSL session: the server accepts an established session for a specific time interval (often about 10 minutes; the value can be configured in the server). For that period, additional handshakes are avoided. This optimization improved the speed of the tests by a factor of two, and saved approximately 30% of the CPU utilization.
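The session lifetime described above can be set through the standard JSSE SSLSessionContext. The following is a minimal sketch, not the configuration used in the tests. SessionCacheSketch and configure are hypothetical names, the "TLS" protocol name and the values 600 seconds and 1000 entries are illustrative assumptions, and the context is initialized with default key and trust material.

```java
import java.security.GeneralSecurityException;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLSessionContext;

class SessionCacheSketch {
    /**
     * Builds an SSLContext whose server-side session cache keeps sessions
     * for timeoutSeconds and holds at most cacheSize entries. A client that
     * reconnects within the timeout can resume its session instead of
     * performing a full handshake.
     */
    static SSLContext configure(int timeoutSeconds, int cacheSize) {
        try {
            SSLContext context = SSLContext.getInstance("TLS");
            // Assumption: default key managers, trust managers, and RNG.
            context.init(null, null, null);
            SSLSessionContext sessions = context.getServerSessionContext();
            sessions.setSessionTimeout(timeoutSeconds); // lifetime in seconds
            sessions.setSessionCacheSize(cacheSize);    // maximum cached sessions
            return context;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("Could not create SSLContext", e);
        }
    }

    public static void main(String[] args) {
        SSLContext context = configure(600, 1000); // about 10 minutes
        System.out.println("Session timeout (s): "
                + context.getServerSessionContext().getSessionTimeout());
    }
}
```

Server sockets or engines created from the returned context share this cache, so the timeout applies to all of them.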

Finally, the results from Java Version 5.0 SR 9 were compared with those from Java Version 5.0 SR 7 and Java Version 6.0 SR 4. All results are very similar. The only difference was that Java Version 6.0 SR 4 did not support the hardware encryption, so for this test only software encryption was used.