*** STOP PRESS *** - See the update at the bottom.
I was asked this lots of times at the recent POWER Technical Universities in Orlando and Athens.
Compared to a low number of local disks
The LUNs on a SAN disk subsystem with caching are very fast - even in a simple test like AIX install times, which halved. In addition, you don't have to fiddle about with striping AIX Logical Volumes across disks, or retrofit this to rootvg Volume Group filesystems after the install.
Compared to a system with a hot, bottlenecked NPIV LUN
In this case the SSP is a lot faster. A single NPIV LUN is likely to be spread across a single Rank of disks in the disk subsystem (something like 8 disks, depending on the internal details), but a single SSP LU (virtual disk) is spread across all the LUNs and underlying disks in the Shared Storage Pool, i.e. hundreds of disk spindles.
This assumes a sensible approach to the number of LUNs in the pool. Here is my rule of thumb: a minimum of 16 LUNs; above 2 TB, have 8 LUNs per TB. Not just a handful of very large LUNs. Education and demonstration configurations tend to have small numbers of disks for simplicity, but I recommend more.
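That rule of thumb can be sketched as a quick calculation - a minimal sketch in Python, under my reading of the rule above (the function name is mine, not an SSP tool):

```python
def recommended_lun_count(pool_tb):
    """Rule of thumb for SSP pool LUN counts:
    a minimum of 16 LUNs, and above 2 TB use 8 LUNs per TB."""
    if pool_tb <= 2:
        return 16
    return max(16, int(8 * pool_tb))

# A small 1 TB demo pool still wants 16 LUNs.
print(recommended_lun_count(1))    # 16
# A 10 TB pool works out at 80 LUNs (128 GB each).
print(recommended_lun_count(10))   # 80
```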
Compared to VIOS LUN disk over vSCSI
The VIOS has to convert a virtual disk id and block number into a LUN id and block number. This mapping is simple and quick, and made simpler by the underlying 64 MB chunk size (which reduces the map size). It takes a tiny amount of compute time compared to the actual I/O.
Does the Thin Provisioning add overhead?
If you are sequentially writing a large file in 4 KB pages, then at every 64 MB chunk boundary the VIOS SSP has to do space management and allocate a new chunk. Yes, this takes time, but it is all metadata management and assigning chunks from the free list. It does not generate extra disk I/O. If this is a problem, or you know the workload will quickly write large volumes of data, you should create the LU virtual disk with Thick Provisioning.
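To put a number on how rare that allocation work is, here is a back-of-envelope sketch using the 64 MB chunk size mentioned above (assumption: one allocation per chunk):

```python
CHUNK_MB = 64    # SSP allocation chunk size
PAGE_KB = 4      # write size in the example above

# How many 4 KB writes land in one chunk before a new one is needed.
pages_per_chunk = CHUNK_MB * 1024 // PAGE_KB
print(pages_per_chunk)    # 16384

# Sequentially writing a 4 GB file triggers only this many allocations.
allocations = 4 * 1024 // CHUNK_MB
print(allocations)        # 64
```

So the allocation path runs once every 16,384 writes - which is why it is lost in the noise next to the real disk I/O.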
Does performance get worse, due to unbalanced disk use, if you later add disks to the pool?
No. The SSP will, in the background, redistribute the blocks of virtual disks across the additional disks in the pool - automatically. No user interaction is needed; in fact, you can't intervene, as there is no command to start this action.
The SSP Developers tell me "simple to manage" and "fast" are the two prime goals of Shared Storage Pools.
If my Logical Partitions / Virtual Machines have very large disk space requirements (4 TB plus) or very high disk I/O rates, then I might use other technology like NPIV, or even invest in dedicated adapters for complete isolation.
For small LPARs / VMs or low / moderate disk I/O - I now prefer SSP due to the speed of implementation and flexibility.
I hope this helps, Nigel Griffiths
Update: after testing the SSP4 Performance
This was part of the Early Ship Program (ESP) for beta testing new releases. I was actually using the VIOS code level that later became the GA release.
There are a billion disk I/O permutations so I limited testing to:
80% Read & 20% Write
Client AIX 7 LPAR using DIO at the filesystem level to remove AIX caching effects
Sixteen 4 GB files
Sixteen slave programs using 1 file each
At the VIOS back end I have 4 Gbit FC adapters to two V7000 Thick provisioned LUNs
Also the SSP Logical Unit (LU) virtual disks were Thick Provisioned, so the tests can be rerun without strange side effects.
The tests ran for 1 minute but the performance was consistent throughout that minute as viewed on the VIOS by nmon (of course). I also created and mounted a LUN directly to the VIOS and ran the same test there for a comparison with directly attached FC disks.
A) Direct I/O from the VIOS to the LUN = 118 MB/s roughly 5.2 ms.
B) SSP from the client over vSCSI = 81 MB/s roughly 7.7 ms.
C) vSCSI to VIOS whole LUN = 68 MB/s roughly 9.2 ms.
I tried other combinations of block sizes, read:write ratios etc., but the limiting factor is the random-seek disk response time, which is largely independent of the block size. Bigger blocks mean more data but roughly the same IOPS. Sequential I/O is much faster but fairly rare these days.
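That observation - same IOPS, more data with bigger blocks - is just throughput = IOPS x block size. A quick illustration (the 10,000 IOPS ceiling is a hypothetical figure for the sketch, not a measurement from my tests):

```python
iops = 10000    # hypothetical random-I/O ceiling set by seek time
for block_kb in (4, 8, 16):
    mb_per_s = iops * block_kb / 1024.0
    print(f"{block_kb} KB blocks -> {mb_per_s:.1f} MB/s at the same {iops} IOPS")
```

Doubling the block size doubles the MB/s while the disks do exactly the same amount of seeking.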
Queue_depth concerns - particularly on AIX client:
The VIOS seems to set all hdisk queue_depth values to 20 - sensible, but it might need tuning on larger VIO servers.
The AIX 7 client VM (7100-02-02-1316) set the vSCSI LU hdisk to a queue_depth of 8, and the VIOS Pool LUN hdisks have a queue_depth of just 3, which is basically BONKERS (IMHO). Client VMs will tend to use just a few large LU virtual disks (there is no point having many LU virtual disks to spread out the I/O), so all the disk I/O from hundreds of client VMs ends up accessing the VIOS Pool LUNs - but each LUN allows only 3 concurrent I/Os.
Example: 100 clients, each with 10 I/Os outstanding = 1,000 I/Os that need to be performed on the VIOS. With, say, 16 Pool LUNs at 3 concurrent I/Os each = 48 physical I/Os at a time. Which means the I/Os are queued up.
I set the Client VM vSCSI LU hdisks to a queue_depth of 16 and the VIOS Pool LUN hdisks to a queue_depth of 16 for this test.
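The queue-depth arithmetic above, restated as a sketch (client count, outstanding I/Os, LUN count and queue depths all from the example in the text):

```python
clients = 100
outstanding_per_client = 10
pool_luns = 16

demand = clients * outstanding_per_client    # I/Os in flight from all clients
default_slots = pool_luns * 3                # default queue_depth of 3 per Pool LUN
tuned_slots = pool_luns * 16                 # after raising queue_depth to 16

print(demand, default_slots, tuned_slots)    # 1000 48 256
print(demand // default_slots)               # ~20 in-flight I/Os per service slot
```

Even after tuning, demand can exceed the concurrent slots, but a fivefold increase in service capacity takes the edge off the queuing.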
Everyone should watch out for this one on all SSP-based client VMs - particularly if you have gone for a small number of larger LUs, or even just one large LU for everything!
1) SSP disk I/O speeds are in line with expectations.
2) SSP outguns vSCSI to a LUN = that was a surprise. I presume it is the VIOS SSP spreading I/O across the pool LUNs that helps - full marks to the developers for the ~20% boost we get. Of course, you can add more vSCSI LUNs or more SSP LUs for higher disk I/O rates.
What about the performance costs of Shared Storage Pools phase 4 Failure group (failgrp) mirroring?
I added an SSP4 failgrp mirror. And the performance was a shocker!! The SSP ran at just about the same speed. My Storwize V7000 caches writes, so write I/O is extremely fast and does not wait for the data to hit the disks. Writes are much faster than uncached reads. This means the double write for the two mirror copies has near-zero performance cost. Note that in SSP4 the read I/O is only done on one mirror copy, so it is the same speed as no mirror.
You never read this here: it was hinted that the SSP developers already have the code to spread read activity across both mirrors, but it is not in this release - they want more time for testing. I guess this would mean something like double the read performance, as it would engage twice the number of spindles across the two back-end disk subsystems.
So SSP is a performance boost
The question is not "What is the performance cost when using SSP disks?" but "How large is the SSP performance boost?" And it's very nice to run our OS images on a single virtual disk, knowing the underlying technology is widely spreading the disk I/O and mirroring behind the covers, so the OS administrator's life is nice and simple - once you get the queue_depth right!
I was then going to test the NPIV performance on my configuration - until I noticed NPIV was impossible. Can you see why NPIV is impossible in my setup?
Until next time, cheers, Nigel Griffiths