We are about to introduce SSDs into our SVC environment to improve IOPS for critical database vdisks. Due to current configurations, it appears that documented best practice cannot be fully complied with, so I am looking for confirmation on how to make the best of a non-ideal situation.
Environment: 4-node / 2 IOgroup SVC cluster (code level 126.96.36.199)
Planned SSD deployment: 4 x 200GB SSDs per node (i.e. fully populated) mirrored as 2 x 400GB Mdisks per IO Group
Planned Easy Tier deployment: 4 x Multi-tier Mdisk groups (2 per IO Group) each containing one of the SSD Mdisks.
Complication 1: Current mdisk groups contain vdisks from both IO Groups and Easy Tier extent migration across IO groups should be avoided.
Possible solution: Via Easy Tier predictive reporting over many weeks, a clear list of vdisks containing "hot" data has been generated and appears to be constant. Based on this info, the following steps are being considered - any comment or confirmation will be much appreciated:
1. Turn off Easy Tier at vdisk level for the 4 Mdisk groups that are to become multi-tiered and add SSD mdisks to them - resulting in 2 x Multi-tier MDGs per IO Group.
2. Migrate "hot" vdisks between these mdisk groups so that the IO Group assignment of each vdisk corresponds with the IO Group ownership of the SSDs in the MDG (not on code 6.4, so cannot change IO group assignment without disruption)
3. Selectively activate Easy Tier for the "hot" vdisks that are in the Multi-tier MDG with SSDs in the same IO Group (note: some vdisks from the second IO group may remain in the MDG, but will not have Easy Tier activated - could this be a problem?)
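For readers who want to see the three steps laid out mechanically, here is a minimal planning sketch in Python. It only builds command strings and does not touch a cluster. The vdisk and pool names are made up, the `Vdisk` record and `plan` function are hypothetical helpers, and the command strings (`chvdisk -easytier`, `migratevdisk -mdiskgrp ... -vdisk ...`) are written in the SVC 6.x CLI style as an assumption; check them against the CLI guide for your code level before using anything like this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vdisk:
    name: str
    io_group: int   # IO group that owns the vdisk
    pool: str       # mdisk group the vdisk currently lives in
    hot: bool       # flagged as "hot" by Easy Tier reporting


def plan(vdisks, ssd_io_group_by_pool):
    """Return command strings for steps 1-3.

    ssd_io_group_by_pool maps each multi-tier pool to the IO group
    whose nodes hold that pool's SSD mdisk.
    """
    # Pick one target pool per IO group (assumption: first pool by name).
    target_pool = {}
    for pool, iog in sorted(ssd_io_group_by_pool.items()):
        target_pool.setdefault(iog, pool)

    commands = []

    # Step 1: Easy Tier off for every vdisk in a pool that will receive SSDs.
    # (Adding the SSD mdisks to the pools happens after this and is not shown.)
    for v in vdisks:
        if v.pool in ssd_io_group_by_pool:
            commands.append(f"svctask chvdisk -easytier off {v.name}")

    # Step 2: move hot vdisks whose IO group differs from the IO group
    # that owns the SSDs of their current pool.
    for v in vdisks:
        if v.hot and ssd_io_group_by_pool.get(v.pool) != v.io_group:
            if v.io_group not in target_pool:
                raise ValueError(f"no multi-tier pool for IO group {v.io_group}")
            commands.append(
                f"svctask migratevdisk -mdiskgrp {target_pool[v.io_group]} "
                f"-vdisk {v.name}"
            )

    # Step 3: Easy Tier back on for hot vdisks only. In practice this should
    # wait until the migrations from step 2 have completed.
    for v in vdisks:
        if v.hot:
            commands.append(f"svctask chvdisk -easytier on {v.name}")

    return commands


if __name__ == "__main__":
    demo = [
        Vdisk("db01", 0, "MDG_A", True),
        Vdisk("db02", 1, "MDG_A", True),
        Vdisk("app01", 1, "MDG_A", False),
    ]
    for cmd in plan(demo, {"MDG_A": 0, "MDG_B": 1}):
        print(cmd)
```

In the demo, `app01` stays in `MDG_A` with Easy Tier off, which is exactly the leftover case raised in the note on step 3.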
Assuming the above solution can be implemented, the next area of concern is future code upgrades. Documentation recommends that this be done in manual mode to ensure sufficient time for the resync of SSD mirrors between node outages. As the automated code upgrade process is preferred, is there a means of "clearing out" the SSDs (even if this does result in a temporary impact on performance)? E.g., could Easy Tier be turned off on the selected subset of vdisks, and would this ensure that the extents are migrated back to the HDD tier? After the code upgrade, this process could then be reversed and Easy Tier allowed to effectively start from scratch.
Any comments or advice will be much appreciated.
SSD Easy Tier pitfalls? IO Groups and Code Upgrades
Updated on 2012-11-14T15:20:30Z by al_from_indiana
Re: SSD Easy Tier pitfalls? IO Groups and Code Upgrades 2012-10-28T16:46:46Z
I'm interested in knowing "complication 2" as well. The Redbooks as well as IBM Support make no mention of routine maintenance on the SVC regarding SSDs with Easy Tier enabled.
SystemAdmin 110000D4XK4779 Posts
Re: SSD Easy Tier pitfalls? IO Groups and Code Upgrades 2012-11-01T02:15:45Z
Feedback received from IBM PE (read "guru") has confirmed the following:
re: Depopulate SSDs prior to code upgrade by de-activating Easy Tier? Answer: NO. Extents will remain where they are. Deleting the SSD Mdisk is required to force migration of extents back to HDD. (Not an option I relish...)
re: Possible vdisk dependency complications during automated code upgrades? Answer: In the majority of cases the normal intervals between IO group node outages are sufficient for SSD mirror resyncs - especially if steps are taken to reduce changes to the data on the Easy Tier enabled vdisks during the upgrade process. In addition, the automated code upgrade process will check for vdisk dependency before taking the next node offline. If any are encountered (such as an out-of-sync SSD mirror), the upgrade process will stall and will need to be continued manually once the SSD resync has completed.
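To make the stall behaviour concrete, here is a toy model of it in Python. This is not how the SVC upgrade code is implemented; it only mirrors the description above. `run_upgrade` and the `dependent_vdisks` callback are hypothetical names, and the callback stands in for whatever dependency check the cluster performs for a node.

```python
def run_upgrade(nodes, dependent_vdisks):
    """Walk the nodes in order, stalling at the first one with dependencies.

    nodes: node names in the order the upgrade would take them offline.
    dependent_vdisks: callable returning the vdisks that would go offline
        if the given node were removed now (for example because an SSD
        mirror is still out of sync).

    Returns (upgraded_nodes, stalled_node, blocking_vdisks); stalled_node
    is None when every node was upgraded.
    """
    upgraded = []
    for node in nodes:
        blocking = dependent_vdisks(node)
        if blocking:
            # Stall here: someone has to wait for the resync to finish
            # and then continue the upgrade manually.
            return upgraded, node, list(blocking)
        upgraded.append(node)
    return upgraded, None, []


if __name__ == "__main__":
    out_of_sync = {"node3": ["db01"]}
    print(run_upgrade(["node1", "node2", "node3", "node4"],
                      lambda n: out_of_sync.get(n, [])))
```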
It was also confirmed that extent migration between HDD and SSD in different IO groups will work, but it will have a performance impact and is not recommended best practice. My suggested control of Easy Tier at the vdisk level is a possible way to ensure that extent migration is contained within a particular IO group, but it obviously implies administrative overhead.
Has anyone out there been through a code upgrade on an Easy Tier enabled array?
Re: SSD Easy Tier pitfalls? IO Groups and Code Upgrades 2012-11-02T02:52:23Z
- SystemAdmin 110000D4XK
We were under the assumption that we could turn off Easy Tier at the volume level so that the data could be migrated back from the SSDs - imo it doesn't seem like an elegant solution if one has to delete the SSD mdisk to force a copy back.
"In the majority of cases the normal intervals between IO group node outages are sufficient for SSD mirror resyncs - especially if steps are taken to reduce changes to the data on the Easy Tier enabled vdisks during the upgrade process. In addition, the automated code upgrade process will check for vdisk dependency before taking the next node offline. If any are encountered (such as an out-of-sync SSD mirror), the upgrade process will stall and will need to be continued manually once the SSD resync has completed."
When they mention SSD mirror resyncs, do they mean the mirroring of data between the SVC nodes within an IO group?
"Has anyone out there been through a code upgrade on an Easy Tier enabled array? "
We haven't, but after your thread about these "complications" we'd like to test it out within our cluster to see how it behaves. If there is a new release (hopefully with IP-link capability) within the next couple of weeks, we'll test it out and definitely report back with results.
On another note, Jacques, have you tried the IO group migrate functionality? It's been most helpful for us when migrating database servers between IO groups for better performance (less contention), but a pain in terms of cleaning up stale OS device entries.
Re: SSD Easy Tier pitfalls? IO Groups and Code Upgrades 2012-11-14T15:20:30Z
- al_from_indiana 2700052DDG
According to Barry, instead of destroying the SSD mdisks you could migrate the data off to another mdisk in another storage pool and then back: https://www.ibm.com/developerworks/mydeveloperworks/blogs/storagevirtualization/entry/open_forum_q_a_58?lang=en
"For Easy Tier volumes, you'd need to turn off easy tier on that volume, then the only way to migrate the extents back to HDD only would be to migrate the volume to another storage pool, and back to this pool if you wanted it back in the original place. Repeat for each volume on that host"