Topic
  • 5 replies
  • Latest Post - ‏2010-12-28T19:01:18Z by SystemAdmin
DS_lover
DS_lover
3 Posts

Pinned topic SVC 1920 Global Mirror

‏2010-07-31T03:50:01Z
Hi All,
Need advice on Global Mirror.

One of our IBM customers has been hitting frequent GM 1920 errors recently.
However, the same customer has Metro Mirror running fine with no issues.

TPC raw data is attached.

Can someone advise what I should look for?

Configuration
4F2-DWDM-1GB-4F2
Code level 4318

Shanker
Updated on 2010-12-28T19:01:18Z by SystemAdmin
  • DS_lover
    DS_lover
    3 Posts

    Re: SVC 1920 Global Mirror

    ‏2010-07-31T04:31:29Z  
    Additional information:
    The MDisks participating in the Copy Services relationships have high backend response times.
  • TMasteen
    TMasteen
    341 Posts

    Re: SVC 1920 Global Mirror

    ‏2010-07-31T10:37:03Z  
    Shanker,

    I have no direct answer for your question. I did have the same issue in the past. In our case it was intercluster bandwidth.
    You could take a look at this Redbook
    There is a chapter about Diagnosing and fixing 1920 errors.
  • SystemAdmin
    SystemAdmin
    4779 Posts

    Re: SVC 1920 Global Mirror

    ‏2010-08-01T07:05:14Z  
    I would suggest you place a service call.
    You should also review:

    http://publib.boulder.ibm.com/infocenter/svcic/v3r1m0/index.jsp?topic=/com.ibm.storage.svc.console.doc/svc_mirrorgmlinktolerance_3istus.html
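
    For anyone new to 1920s, that topic describes the link tolerance mechanism: if Global Mirror degrades host I/O for longer than the gm_link_tolerance period, the relationships are stopped and error 1920 is logged. Below is a simplified Python sketch of that behaviour; it is only an illustrative model, not SVC code (the 300-second value is the documented default, and the 10-second sampling interval is an assumption).

# Simplified model of the Global Mirror link tolerance check (illustrative only).
# gm_link_tolerance is a cluster setting; 300 seconds is the documented default.
GM_LINK_TOLERANCE_SECS = 300
SAMPLE_INTERVAL_SECS = 10  # assumed monitoring interval for this sketch

def would_raise_1920(degraded_samples):
    """degraded_samples: booleans, True where Global Mirror was delaying host
    writes beyond acceptable limits during that sample interval."""
    consecutive_secs = 0
    for degraded in degraded_samples:
        consecutive_secs = consecutive_secs + SAMPLE_INTERVAL_SECS if degraded else 0
        if consecutive_secs >= GM_LINK_TOLERANCE_SECS:
            return True  # relationships stopped, error 1920 logged
    return False

# Example: 30 consecutive degraded 10-second samples (300 s) trips the check,
# while intermittent degradation that never persists does not.
print(would_raise_1920([True] * 30))          # True
print(would_raise_1920([True, False] * 100))  # False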

    One thing that stands out to me is this metric:

    Port to Local Node Send Response Time (ms): 9.5, 7.5, 7.2, 6.9, 5.8, 3.6, 3, 3, 3, 3, 2.7, 2.7, 2.6, 2.2
    Your local nodes should be able to get responses in less than 1 ms; these values suggest local congestion.
    Then I look at the matching queue times:

    Port to Local Node Send Queue Time (ms): 14.7, 13.4, 8.1, 9.2, 10.9, 6, 2.9, 3.5, 2.8, 1.6, 4.3, 0.9, 1.2
    This shows buffer shortages in your local switching.
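
    To make the guideline concrete, here is a small Python sketch that flags the TPC samples above against the sub-1 ms expectation for local node traffic (the 1 ms threshold is the guideline mentioned above, not a hard SVC limit):

# Flag TPC samples where the Port to Local Node metrics exceed the ~1 ms guideline.
response_time_ms = [9.5, 7.5, 7.2, 6.9, 5.8, 3.6, 3, 3, 3, 3, 2.7, 2.7, 2.6, 2.2]
queue_time_ms = [14.7, 13.4, 8.1, 9.2, 10.9, 6, 2.9, 3.5, 2.8, 1.6, 4.3, 0.9, 1.2]

THRESHOLD_MS = 1.0  # local nodes should normally see responses in under 1 ms

def summarise(name, samples):
    over = [s for s in samples if s > THRESHOLD_MS]
    print(f"{name}: {len(over)}/{len(samples)} samples over {THRESHOLD_MS} ms, "
          f"worst {max(samples)} ms, average {sum(samples) / len(samples):.1f} ms")

summarise("Port to Local Node Send Response Time", response_time_ms)
summarise("Port to Local Node Send Queue Time", queue_time_ms)
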
    Regards, Anthony Vandewerdt

    Visit my blog: http://tinyurl.com/23wgvkw
    Follow me on twitter: http://twitter.com/anthonyvdwerdt
  • Jose.Burmester
    Jose.Burmester
    1 Post

    Re: SVC 1920 Global Mirror

    ‏2010-12-08T22:38:47Z  
    TMasteen wrote:
    Shanker,

    I have no direct answer for your question. I did have the same issue in the past. In our case it was intercluster bandwidth.
    You could take a look at this Redbook
    There is a chapter about Diagnosing and fixing 1920 errors.

    Thanks for sharing! It's very valuable. Now I understand more about it.
  • SystemAdmin
    SystemAdmin
    4779 Posts

    Re: SVC 1920 Global Mirror

    ‏2010-12-28T19:01:18Z  
    TMasteen wrote:
    Shanker,

    I have no direct answer for your question. I did have the same issue in the past. In our case it was intercluster bandwidth.
    You could take a look at this Redbook
    There is a chapter about Diagnosing and fixing 1920 errors.

    Thanks for sharing! It's very valuable. Now I understand more about it.
    We too are plagued with 1920s even when the FCIP link is fine and underutilized. I think we finally figured out that the inter-node communications were killing the GM relationships. It was quite difficult to diagnose. We moved some of the workload to a different I/O group after noticing that the I/O group in question was pushing 400 MB/s over ALL of its 4 Gbit FC connections!

    The really bad part is that it will randomly kill off relationships without factoring in the workload of the different GM relationships. We would see very active LUNs continue to replicate while low-activity ones would 1920. Or we would see heavy lifting going on (high writes) on a given LUN, and others with not nearly as much activity would get 1920s.

    Also, when an individual LUN exceeds write rates of 20 MB/s you will see the relationship die, even when there is plenty of bandwidth on the WAN side to accommodate the workload. There needs to be more information on how exactly Global Mirror works, at least on the SVC side of the house.
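
    As a rough illustration of the kind of sanity check involved, here is a short Python sketch that compares per-LUN write rates against the ~20 MB/s point where we have seen individual relationships stop, and the aggregate against the intercluster link. The LUN names and rates are made-up examples, and the 100 MB/s figure is just the approximate payload of a 1 Gbit/s link, not an SVC limit.

# Rough Global Mirror bandwidth sanity check (illustrative numbers only).
lun_write_rates_mb_s = {"lun01": 25.0, "lun02": 8.0, "lun03": 12.5, "lun04": 30.0}

PER_LUN_1920_POINT_MB_S = 20.0  # rate where we have observed relationships getting 1920s
LINK_CAPACITY_MB_S = 100.0      # approx usable payload of a 1 Gbit/s intercluster link

total = sum(lun_write_rates_mb_s.values())
status = "over" if total > LINK_CAPACITY_MB_S else "within"
print(f"Aggregate GM write rate: {total:.1f} MB/s ({status} the {LINK_CAPACITY_MB_S:.0f} MB/s link)")

for lun, rate in sorted(lun_write_rates_mb_s.items()):
    if rate > PER_LUN_1920_POINT_MB_S:
        print(f"{lun}: {rate:.1f} MB/s is past the ~{PER_LUN_1920_POINT_MB_S:.0f} MB/s "
              f"point where relationships have been seen to stop with a 1920")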

    We plan on migrating from 8G4s to CF8s and also from 48Ks to DCX 8 Gb switches; hopefully this will make the inter-node communication issues less common.