VIOS Shared Storage Pool Single Repository Disk = Not a Problem
nagger
Odd. I thought I had put this on the blog before, but I can't find it now, so just in case ... here it is, or here it is again.
SSP4 can mirror its FC disks across failure groups to handle the failure of an adapter, FC cable, FC switch, entire FC disk sub-system, site or VIOS.
But there is still a single SSP Repository disk. Isn't this a single point of failure?
The answer is "No, because you can quickly rebuild the contents of the Repository Disk after it has failed."
You are not meant to know this, as it is internal to the VIOS Cluster Aware AIX Shared Storage Pool software, but there is a saved copy of the Repository Disk on every node, and even those are not needed most of the time. Actually, if there were more than one Repository disk, there would be greater problems. If the disks were different, how would you work out which is best or which is right? Even if you found a way to handle that, you have a further problem: what if the network breaks and each half has a different Repository disk and carries on running, thinking its half is the master?
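A minimal sketch of what that quick rebuild looks like, assuming the VIOS chrepos command in its documented form "chrepos -n <cluster> -r +<new disk>,-<old disk>" (check the command help on your own VIOS level before relying on it). The Python helper, the cluster name demo_ssp and the hdisk numbers are hypothetical and only for illustration; the helper builds the command line and does not run anything.

```python
import re

# Assumption: VIOS disk names have the usual "hdisk" + number form.
_HDISK = re.compile(r"^hdisk\d+$")


def chrepos_command(cluster, new_disk, old_disk):
    """Build the argv list that swaps old_disk for new_disk as the repository disk."""
    for disk in (new_disk, old_disk):
        if not _HDISK.match(disk):
            raise ValueError(f"not an hdisk name: {disk!r}")
    if new_disk == old_disk:
        raise ValueError("new and old repository disk must differ")
    # "+" marks the disk to add, "-" marks the disk to drop.
    return ["chrepos", "-n", cluster, "-r", f"+{new_disk},-{old_disk}"]


if __name__ == "__main__":
    # Hypothetical cluster and disk names.
    print(" ".join(chrepos_command("demo_ssp", "hdisk5", "hdisk1")))
    # prints: chrepos -n demo_ssp -r +hdisk5,-hdisk1
```

The printed line is what you would type as padmin on one node of the cluster; the new disk should be a spare LUN that every node in the cluster can see.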
To prove the point I took my "crash and burn" demo SSP cluster and tried to destroy it via the Repository disk.
Attempt 1: Pretend I was moving the SSP to new super fast IBM disks :-)
Attempt 2: Oops silly me, I had an accident with the dd command and dd-ed zeros all over my SSP Repository disk
Attempt 3: Replaced the Repository disk while a node was shut down = it can never start & join my SSP.
Attempt 4: Nuts, I unmapped the Repository LUN on my V7000 - how silly of me!!
Attempt 5: Completely deleted Repository LUN on the V7000
Attempt 6: Real intermittent FC laser failure on a VIOS FC adapter. I would like to claim we somehow injected this issue, but it was really a genuine GBIC failure.
Repository Disk / LUN Conclusions:
but track the VIOS errlogs for warnings