... and the question was, does the CPACF run at the speed of the CP (i.e. it runs sub-capacity if the CP is sub-capacity) or does it run at full speed like an IFL, zIIP or zAAP? If the latter, the result after the upgrade should be the same as before -- that would indicate the speed of crypto operations does not change with the CP capacity, and that CPACF is always full speed. If the former, we should see an improvement between pre- and post-upgrade, indicating that the speed of CPACF follows the speed of the CP.
Place your bets... Okay, no more bets... Here's the chart:
The graph compares the results from the first chart in blue (when the machine was at capacity setting F01) with the full-speed (capacity setting Z01) results in red.
Okay, so did you get it right? If you know your z/Architecture you would have! As the name suggests, the Central Processor Assist for Cryptographic Function (or CPACF) is pretty much an adjunct to each CP, just like any standard execution unit (like the floating point unit, say). It is not like the Crypto Express cards, which are actually I/O devices and totally separate from the CP. Because the CPACF is directly associated with each CP, on a sub-capacity CP it is bound to the speed of that CP.
If you look closer, further evidence that CPACF performance scales with capacity setting can be seen in the respective growth rates of each set of data points. To show this a little more clearly (because I don't know the right mathematical terms to describe the shape of the curve, so I'll just show you) I drew a couple more graphs:
Looking at the left graph (which is the same as the bar graph above, just drawn in lines) you can see that in both the software and the CPACF case the lines for before and after the upgrade follow the same trend with respect to the block size. If these lines followed different trends -- for example if the Z01 CPACF line was flat across the block size range instead of a gently falling slope like the F01 line -- I'd suspect something else was affecting the result. Looked at a different way, the right-hand graph above shows the "times-X" improvement between software and CPACF. You can see that the performance multiplier (i.e. the relative performance improvement between software and hardware; for example, CPACF speed is 16x software at 8192-byte blocks) was the same before and after the upgrade at each block size.
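For anyone who wants to run the same "times-X" check on their own measurements, here is a minimal sketch in Python. The throughput numbers are made-up placeholders, NOT the figures from my charts (only the 16x multiplier at 8192-byte blocks is taken from the text above), and the function names, the scale factor of 4 between capacity settings, and the 5% tolerance are all my own assumptions for illustration.

```python
def speedup_by_block_size(software, cpacf):
    """Return {block_size: CPACF throughput / software throughput}."""
    return {size: cpacf[size] / software[size] for size in software}


def multipliers_match(before, after, tolerance=0.05):
    """True if the multiplier at every block size agrees within a relative tolerance."""
    return all(
        abs(before[size] - after[size]) <= tolerance * before[size]
        for size in before
    )


# Placeholder throughput values in MB/sec, keyed by block size in bytes.
# These are hypothetical and only illustrate the shape of the calculation.
f01_software = {16: 1.0, 64: 1.5, 256: 2.0, 1024: 2.0, 8192: 2.0}
f01_cpacf = {16: 4.0, 64: 12.0, 256: 24.0, 1024: 30.0, 8192: 32.0}

# Assumption for the sketch: the upgrade scales software and CPACF
# throughput by the same (hypothetical) factor.
SCALE = 4.0
z01_software = {size: mb * SCALE for size, mb in f01_software.items()}
z01_cpacf = {size: mb * SCALE for size, mb in f01_cpacf.items()}

f01_multiplier = speedup_by_block_size(f01_software, f01_cpacf)
z01_multiplier = speedup_by_block_size(z01_software, z01_cpacf)

for size in sorted(f01_multiplier):
    print(f"{size:>5} bytes: F01 {f01_multiplier[size]:.1f}x, "
          f"Z01 {z01_multiplier[size]:.1f}x")

print("Multipliers match:", multipliers_match(f01_multiplier, z01_multiplier))
```

If the CPACF ran at full speed regardless of the capacity setting, the software numbers would rise after the upgrade while the CPACF numbers stayed put, so the multipliers would shrink and the check would fail.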
Now, just to confuse things... Although I've used OpenSSL on Linux as the testing platform for this experiment, most Linux customers will never see the effects I've demonstrated here. Why? Because Linux is usually run on IFLs, and the IFL always runs at full speed! Even if there are sub-capacity CPs installed in a machine with IFLs, the IFLs run at full speed and so too does the CPACF associated with the IFLs. I'll say again: CPACF follows the speed of the associated CP, so if you're running Linux on IFLs the CPACF on those IFLs will be full capacity just like the IFLs themselves. If you have sub-capacity CPs for z/OS workload on the same machine as IFLs, the CPACF on the CPs will appear slower than CPACF on the IFLs.
As far as the actual peak number is concerned, it looks like a big number! If I understand it right, 250MB/sec would be more than enough speed to have a server doing SSL/TLS traffic driving a Gigabit Ethernet at line speed (traffic over connected sessions, NOT the certificate exchange for connection establishment; the public key crypto for certificate verification takes more hardware than just CPACF, at least on the z9 anyway). And that's just one CP! Enabling more CPs (or IFLs, of course) gives you that much more CPACF capacity again. Keep in mind that these results are using hardware that is two generations old -- I would expect z10 and z196 hardware to get higher results on any of these tests. Regardless, these are not formal, official measurements and should not be treated as such -- do NOT use any of these figures as input to system sizing estimates or other important business measurements! Always engage IBM to work with you for sizing or performance evaluations.
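The back-of-the-envelope arithmetic behind the Gigabit Ethernet comparison can be sketched as follows. This assumes 1 MB means 10^6 bytes and ignores Ethernet framing, TCP/IP and TLS record overhead, and any MAC computation, so treat it as a rough sanity check and nothing more; the variable and function names are my own.

```python
GIGABIT_ETHERNET_BITS_PER_SEC = 1_000_000_000
CPACF_PEAK_MB_PER_SEC = 250  # peak figure quoted in the text above


def line_rate_mb_per_sec(bits_per_sec):
    """Convert a raw link speed in bits/sec to MB/sec (1 MB = 10**6 bytes)."""
    return bits_per_sec / 8 / 1_000_000


link_mb_per_sec = line_rate_mb_per_sec(GIGABIT_ETHERNET_BITS_PER_SEC)
headroom = CPACF_PEAK_MB_PER_SEC / link_mb_per_sec

print(f"Gigabit Ethernet raw line rate: {link_mb_per_sec:.0f} MB/sec")
print(f"Quoted CPACF peak is {headroom:.1f}x that rate")
```

So on paper one CP's worth of CPACF has about twice the bulk-encryption throughput needed to fill the link, before any of the overheads listed above are counted.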