
Comments (72)

16 al_from_indiana commented Permalink

Hi Barry,

Thanks for the reply - do you know if it's possible to enable email alerting for any events (path failure, failing battery, hardware errors, etc.) on the SVC? We have "remote alerting" enabled on our SVC, but to be honest with you IBM Remote Support has been horrible about giving us a call back (usually over 24 hours) for any event, and they are very vague about giving the error or following up with an email with the PMR number for tracking purposes.

17 orbist commented Permalink

You can add multiple email servers / email users, and each one can be defined with its own notification level (info, warning, error), or any combination thereof.

See the advanced notification settings pane in the GUI, or the CLI:

svctask mkemailuser -h
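Something like the following is a minimal sketch of setting this up from the CLI. The server IP, recipient address and settings are placeholders, and the exact flags can vary by code level, so check the command help (svctask mkemailuser -h) on your own cluster first:

  # Define the SMTP server the cluster relays alerts through (IP is a placeholder)
  svctask mkemailserver -ip 192.168.1.25 -port 25

  # Add a recipient who gets error and warning events, but not informational ones
  svctask mkemailuser -address storage-team@example.com -error on -warning on -info off

  # Confirm what is configured
  svcinfo lsemailuser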

18 scott.fuhrman commented Permalink

Are there any plans to implement mdisk mirroring? With the use of Easy Tier, it can make sense to purchase one storage subsystem with all SSD, then divide that SSD back end up among multiple mdisk groups. This of course violates the general rule of one mdisk group per storage subsystem; however, purchasing a separate SSD subsystem for each mdisk group is unreasonable. Mdisk mirroring would mitigate the risk of a shared SSD subsystem failure bringing down multiple mdisk groups.

19 orbist commented Permalink

Scott,

For SSD, we'd "hope" they are more reliable than HDD, and in addition almost all SSD boxes are enterprise class. So while there is a risk of taking multiple mdiskgrps offline if the entire SSD controller fails, I'd hope this was a low risk.

We do sort of support this today, with the RAID layer - but only for internal SSDs in the SVC nodes. At present there is no plan to offer "RAID" over controller mdisks.
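To illustrate the internal-SSD case, the rough shape of building an array over node-internal drives from the CLI is sketched below. The drive IDs, RAID level and pool name are placeholders, and the supported options depend on your hardware and code level (see svctask mkarray -h):

  # List the internal drives the cluster can see
  svcinfo lsdrive

  # Build a RAID-10 array from four internal SSDs and put it in an existing pool
  # (drive IDs 0:1:2:3 and the pool name ssd_pool are placeholders)
  svctask mkarray -level raid10 -drive 0:1:2:3 ssd_pool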

20 al_from_indiana commented Permalink

Barry,

Since an SVC node's cache is dynamically partitioned by the number of mdisk groups it has, would it make sense to create a separate mdisk group for each SVC-attached database host?

-Al

21 devildod commented Permalink

Hi Barry,

Will we be able to monitor SVC cache usage in the near future, with or without TPC? Apart from SVC's automatic cache partitioning mechanism, will any cache management options be available to customers? I think SVC GM needs some cache management improvements where a huge number of volumes are replicated; am I right?

Thanks,
Omer

22 orbist commented Permalink

Al,

The cache partitioning was really designed to protect SVC from misbehaving controllers. Thus, going beyond 4 mdisk groups (storage pools) doesn't really benefit you, since each pool is allowed to grow its cache partition to 25%.

Have a read of the redpaper I wrote a few years ago; while some things have changed (and I need to update the paper some day!) the concept is still the same.

The main addition is that we now do "per partition destage calculations" - i.e. we ramp destage rates up or down for each partition, based on the performance feedback of I/O to that partition.

When the paper was written, we did this on the overall cache performance feedback, which could still allow a bad partition to contaminate the overall cache performance.

The paper is here: http://www.redbooks.ibm.com/abstracts/redp4426.html
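As a quick way to see this on your own cluster: each mdisk group (storage pool) gets its own cache partition, so simply counting your pools tells you how the cache is being partitioned. A trivial sketch:

  # One cache partition per storage pool; beyond four pools each
  # partition is capped at 25% of the cache (see the redpaper for details)
  svcinfo lsmdiskgrp -delim :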

23 orbist commented Permalink

Devildod,

Some of the cache statistics are now picked up by TPC (in the latest releases).

There are no customer tunables on the cache, other than turning it on or off per volume.

Internally we have the concept of "GM Fast Path" - so GM writes are prioritised, especially at the secondary.
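For reference, the per-volume cache setting is changed with chvdisk; a minimal sketch (the volume name is a placeholder, and later code levels may offer additional cache modes):

  # Turn the cache off for one volume, then re-enable read/write caching
  svctask chvdisk -cache none db_vol01
  svctask chvdisk -cache readwrite db_vol01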

24 al_from_indiana commented Permalink

Barry,

Why is it that when a battery fails in the SVC's UPS it completely powers off the node? The DS4000 and DS5000 series have internal controller batteries, but when they fail the controller doesn't go offline (the cache is disabled though).

Also, are there any plans to add IP ports to the SVC node for replication (i.e. like N series SnapMirror)?

25 KevinGil commented Permalink

Barry,

Are there any plans to allow an extra node not in the cluster to be used as a "hot spare"? For example, we could have an 8-node cluster but have 9 nodes in the rack; if any node failed, the 9th one would automatically be brought into the cluster as a replacement.

Thanks,
Kevin

26 eeqmc2 commented Permalink

Hi Barry,

Can you tell us something about the performance impact on the primary site when replicating to a remote site using Global Mirror? In asynchronous replication scenarios, performance on the primary site should not be seriously impacted, no matter what is happening on the remote. Is this the case with the V7000?

Thank you in advance.

27 orbist commented Permalink

Al,

SVC/V7000 are clustered systems, and much more than just the user cache data is held in memory. We have quite a lot of cluster metadata that is critical. This is held non-volatile by the UPS/battery, so if we lose power, we need the UPS/battery to hold the node up long enough to dump the entire memory contents to disk. The DS4000/DS5000 held minimal data and less redundant copy services etc., so a small NV space was enough there.

28 orbist commented Permalink

Kevin,

There are quite a few good technical reasons for having a spare node, or a complete set of spare nodes (for example when running a split cluster) - so yes, we've thought about it, but I can't discuss roadmap dates etc. on here.

29 orbist commented Permalink

eeqmc2,

The SVC/V7000 async replication is designed with a very short RPO. This means that if you don't have a big enough pipe between sites, you can end up slowing down the primary, as we can only buffer so many writes. The pipe needs to be sized to cater for maximum write peaks. The RPO objective is seconds.

If you want to reduce the impact when you have a poor / undersized pipe, then the new GM with change volumes function should be used. This takes snapshots and replicates them to the secondary, allowing the RPO to be minutes or hours, based on how quickly you can send the incremental changes to the secondary.
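For anyone who wants to try it, the rough shape of the CLI setup is sketched below. The volume, relationship and remote system names are placeholders, and the exact flags (cycling mode, cycle period, where the change volumes are attached) depend on your code level, so verify against the CLI help before using:

  # Create a Global Mirror relationship to the remote system in multi-cycling mode
  svctask mkrcrelationship -master prod_vol01 -aux dr_vol01 -cluster remote_sys -global -cyclingmode multi -name gm_rel01

  # Attach thin-provisioned change volumes (created beforehand) at each site;
  # the aux change volume is normally set from the remote system
  svctask chrcrelationship -masterchange prod_vol01_cv gm_rel01
  svctask chrcrelationship -auxchange dr_vol01_cv gm_rel01

  # Set how often a new incremental cycle is taken (in seconds), then start replication
  svctask chrcrelationship -cycleperiodseconds 300 gm_rel01
  svctask startrcrelationship gm_rel01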

30 al_from_indiana commented Permalink

Hi Barry,

Hmm - I understand that, but would it be possible for the node with the failed battery to latch on to the node with the active batteries for power? It seems like a lot of hassle to power down a node just for a failed battery...

Also, anything in the near future for replication over IP?