3 replies Latest Post - ‏2012-12-23T08:46:38Z by Novikov_Alexander
VeenaEdattale
2 Posts

Pinned topic DS3524 RDAC Error

‏2012-11-28T15:15:50Z |
I have three Blade centres, with 14 Blades in two of them and 7 Blades in the third. They are all connected to the 2 * DS3524 with GPFS replication on it. Since the new environment was set up, random Blades have been logging errors with the following messages:
Nov 27 12:49:16 chbs-bia1-39 chbs-bia1-39 kernel: [453132.377435] 122 [RAIDarray.mpp]PROD_DS3524_DC165_SN13K0LCK:1:1:7 Controller IO time expired. Delta 401 secs
Nov 27 12:49:16 chbs-bia1-39 chbs-bia1-39 kernel: [453132.377629] 497 [RAIDarray.mpp]PROD_DS3524_DC165_SN13K0LCK:1:1:7 Failed controller to 0. retry. vcmnd SN 2998403 pdev H2:C0:T3:L7 0x00/0x00/0x00 0x00080000 mpp_status:14
Nov 27 12:49:16 chbs-bia1-39 chbs-bia1-39 kernel: [453132.378045] 10 [RAIDarray.mpp]PROD_DS3524_DC165_SN13K0LCK:0 Failover command issued
Nov 27 12:49:22 chbs-bia1-39 chbs-bia1-39 kernel: [453138.382873] 801 [RAIDarray.mpp]Failover succeeded to PROD_DS3524_DC165_SN13K0LCK:0
Nov 27 12:49:18 chbs-bia1-53 chbs-bia1-53 kernel: [526535.308214] 494 [RAIDarray.mpp]PROD_DS3524_DC165_SN13K0LCK:1:0:11 Cmnd-failed try alt ctrl 0. vcmnd SN 6126908 pdev H2:C0:T3:L11 0x05/0x94/0x01 0x08000002 mpp_status:1
Nov 27 12:48:59 chbs-bia1-44 chbs-bia1-44 kernel: [415821.420928] 494 [RAIDarray.mpp]PROD_DS3524_DC165_SN13K0LCK:1:0:7 Cmnd-failed try alt ctrl 0. vcmnd SN 3077427 pdev H4:C0:T3:L7 0x05/0x94/0x01 0x08000002 mpp_status:1
Nov 27 12:49:26 chbs-bia1-43 chbs-bia1-43 kernel: [748305.151465] 494 [RAIDarray.mpp]PROD_DS3524_DC165_SN13K0LCK:1:0:11 Cmnd-failed try alt ctrl 0. vcmnd SN 13094046 pdev H1:C0:T3:L11 0x05/0x94/0x01 0x08000002 mpp_status:1
Nov 27 12:46:16 chbs-bia1-20 chbs-bia1-20 kernel: [862882.021256] 494 [RAIDarray.mpp]PROD_DS3524_DC165_SN13K0LCK:1:0:10 Cmnd-failed try alt ctrl 0. vcmnd SN 11110701 pdev H2:C0:T3:L10 0x05/0x94/0x01 0x08000002 mpp_status:1
Nov 27 12:50:00 chbs-bia1-24 chbs-bia1-24 kernel: [862867.570190] 494 [RAIDarray.mpp]PROD_DS3524_DC165_SN13K0LCK:1:0:10 Cmnd-failed try alt ctrl 0. vcmnd SN 14322040 pdev H2:C0:T3:L10 0x05/0x94/0x01 0x08000002 mpp_status:1
Nov 27 12:46:19 chbs-bia1-41 chbs-bia1-41 kernel: [853466.630992] 494 [RAIDarray.mpp]PROD_DS3524_DC165_SN13K0LCK:1:0:9 Cmnd-failed try alt ctrl 0. vcmnd SN 17767623 pdev H2:C0:T3:L9 0x05/0x94/0x01 0x08000002 mpp_status:1
Nov 27 12:49:37 chbs-bia1-09 chbs-bia1-09 kernel: [860292.552718] 494 [RAIDarray.mpp]PROD_DS3524_DC165_SN13K0LCK:1:0:10 Cmnd-failed try alt ctrl 0. vcmnd SN 1756437 pdev H1:C0:T0:L10 0x05/0x94/0x01 0x08000002 mpp_status:1
Nov 27 12:49:57 chbs-bia1-55 chbs-bia1-55 kernel: [855374.201136] 494 [RAIDarray.mpp]PROD_DS3524_DC165_SN13K0LCK:1:0:8 Cmnd-failed try alt ctrl 0. vcmnd SN 11964475 pdev H1:C0:T2:L8 0x05/0x94/0x01 0x08000002 mpp_status:1
Nov 27 12:49:07 chbs-bia1-14 chbs-bia1-14 kernel: [860479.249298] 494 [RAIDarray.mpp]PROD_DS3524_DC165_SN13K0LCK:1:0:8 Cmnd-failed try alt ctrl 0. vcmnd SN 6787826 pdev H1:C0:T0:L8 0x05/0x94/0x01 0x08000002 mpp_status:1
Nov 27 12:52:46 chbs-bia1-23 chbs-bia1-23 kernel: [862861.684977] 494 [RAIDarray.mpp]PROD_DS3524_DC165_SN13K0LCK:1:0:11 Cmnd-failed try alt ctrl 0. vcmnd SN 12020863 pdev H2:C0:T3:L11 0x05/0x94/0x01 0x08000002 mpp_status:1
Nov 27 12:47:01 chbs-bia1-08 chbs-bia1-08 kernel: [860517.132333] 494 [RAIDarray.mpp]PROD_DS3524_DC165_SN13K0LCK:1:0:7 Cmnd-failed try alt ctrl 0. vcmnd SN 1617695 pdev H1:C0:T1:L7 0x05/0x94/0x01 0x08000002 mpp_status:1
Nov 27 12:50:13 chbs-bia1-11 chbs-bia1-11 kernel: [860515.586985] 494 [RAIDarray.mpp]PROD_DS3524_DC165_SN13K0LCK:1:0:7 Cmnd-failed try alt ctrl 0. vcmnd SN 1493182 pdev H2:C0:T1:L7 0x05/0x94/0x01 0x08000002 mpp_status:1
Nov 27 12:51:14 chbs-bia1-07 chbs-bia1-07 kernel: [860480.813466] 494 [RAIDarray.mpp]PROD_DS3524_DC165_SN13K0LCK:1:0:6 Cmnd-failed try alt ctrl 0. vcmnd SN 1429170 pdev H1:C0:T1:L6 0x05/0x94/0x01 0x08000002 mpp_status:1
Can someone explain what this signifies? No errors have been generated in the SAN logs. The installed RDAC version is the newest one, rdac-09.03.0C05.0638. The Blades are running SLES 11 SP1. This has also resulted in a Blade unmounting the GPFS filesystem:
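
To see whether the messages cluster on particular Blades or LUNs, the log lines can be tallied with a short script. This is only a rough sketch: the function names are made up for illustration, and reading the three numbers after the array name as controller, path and LUN is an assumption based on how they line up with the pdev field, not something confirmed in this thread. The pattern accepts the module tag with or without the closing bracket.

```python
import re
from collections import Counter

# Matches MPP lines that carry a full array:ctrl:path:lun tuple.
# The names ctrl/path/lun are assumed meanings, not taken from RDAC documentation.
MPP_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s[\d:]{8})\s+(?P<host>\S+)\s.*?"
    r"RAIDarray\.mpp\]?(?P<array>[\w-]+):(?P<ctrl>\d+):(?P<path>\d+):(?P<lun>\d+)\s+"
    r"(?P<msg>.*)$"
)
# Three hex bytes separated by slashes, e.g. 0x05/0x94/0x01.
SENSE_RE = re.compile(r"(0x[0-9a-fA-F]{2})/(0x[0-9a-fA-F]{2})/(0x[0-9a-fA-F]{2})")


def parse_mpp_line(line):
    """Return a dict for one MPP log line, or None if the line does not match."""
    m = MPP_RE.match(line.strip())
    if m is None:
        return None  # e.g. failover lines, which have no ctrl:path:lun tuple
    event = m.groupdict()
    sense = SENSE_RE.search(event["msg"])
    event["sense"] = "/".join(sense.groups()) if sense else None
    # Use the text before the first period as a short event label.
    event["kind"] = event["msg"].split(".")[0]
    return event


def tally(lines):
    """Count events per (host, lun, kind)."""
    counts = Counter()
    for line in lines:
        event = parse_mpp_line(line)
        if event is not None:
            counts[(event["host"], event["lun"], event["kind"])] += 1
    return counts
```

Feeding the lines of /var/log/messages from each Blade into tally() and printing counts.most_common() gives a quick view of which host and LUN combinations log the most events.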

tus:7
Nov 27 19:09:20 chbs-bia1-18 kernel: [885699.401095] 122 [RAIDarray.mpp]PROD_DS3524_DC165_SN13K0LCK:1:0:8 Controller IO time expired. Delta 405 secs
Nov 27 19:09:20 chbs-bia1-18 kernel: [885699.401101] 497 [RAIDarray.mpp]PROD_DS3524_DC165_SN13K0LCK:1:0:8 Failed controller to 0. retry. vcmnd SN 14149223 pdev H2:C0:T3:L8 0x00/0x00/0x00 0x00070000 mpp_status:7
Nov 27 19:09:20 chbs-bia1-18 kernel: [885699.401159] 122 [RAIDarray.mpp]PROD_DS3524_DC165_SN13K0LCK:1:0:9 Controller IO time expired. Delta 405 secs
Nov 27 19:09:20 chbs-bia1-18 kernel: [885699.401165] 497 [RAIDarray.mpp]PROD_DS3524_DC165_SN13K0LCK:1:0:9 Failed controller to 0. retry. vcmnd SN 14149231 pdev H2:C0:T3:L9 0x00/0x00/0x00 0x00070000 mpp_status:7
Nov 27 19:09:20 chbs-bia1-18 kernel: [885699.401226] 122 [RAIDarray.mpp]PROD_DS3524_DC165_SN13K0LCK:1:0:8 Controller IO time expired. Delta 405 secs
Nov 27 19:09:20 chbs-bia1-18 kernel: [885699.401231] 497 [RAIDarray.mpp]PROD_DS3524_DC165_SN13K0LCK:1:0:8 Failed controller to 0. retry. vcmnd SN 14149168 pdev H2:C0:T3:L8 0x00/0x00/0x00 0x00070000 mpp_status:7
Nov 27 19:09:20 chbs-bia1-18 kernel: [885699.401306] 122 [RAIDarray.mpp]PROD_DS3524_DC165_SN13K0LCK:1:0:10 Controller IO time expired. Delta 405 secs
Nov 27 19:09:20 chbs-bia1-18 kernel: [885699.401312] 497 [RAIDarray.mpp]PROD_DS3524_DC165_SN13K0LCK:1:0:10 Failed controller to 0. retry. vcmnd SN 14149184 pdev H2:C0:T3:L10 0x00/0x00/0x00 0x00070000 mpp_status:7
Nov 27 19:09:20 chbs-bia1-18 kernel: [885699.401373] 122 [RAIDarray.mpp]PROD_DS3524_DC165_SN13K0LCK:1:0:11 Controller IO time expired. Delta 405 secs
Nov 27 19:09:20 chbs-bia1-18 kernel: [885699.401379] 497 [RAIDarray.mpp]PROD_DS3524_DC165_SN13K0LCK:1:0:11 Failed controller to 0. retry. vcmnd SN 14149190 pdev H2:C0:T3:L11 0x00/0x00/0x00 0x00070000 mpp_status:7
Nov 27 19:09:23 chbs-bia1-18 kernel: [885703.179935] 801 [RAIDarray.mpp]Failover succeeded to PROD-DS3524_DC110_SN13M012T:0
Nov 27 19:09:24 chbs-bia1-18 kernel: [885704.453886] 801 [RAIDarray.mpp]Failover succeeded to PROD_DS3524_DC165_SN13K0LCK:0
Nov 27 19:09:24 chbs-bia1-18 mmfs: Error=MMFS_DISKFAIL, ID=0x9C6C05FA, Tag=11862484: Disk failure. Volume gpfsFS1. rc = 5. Physical volume gpfs9nsd

Nov 27 19:12:47 chbs-bia1-18 kernel: [885907.083680] 492 [RAIDarray.mpp]PROD_DS3524_DC165_SN13K0LCK:1:0:10 IO FAILURE. vcmnd SN 14149184 pdev H2:C0:T3:L10 0x00/0x00/0x00 0x00070000 mpp_status:7
Nov 27 19:12:47 chbs-bia1-18 kernel: [885907.083693] sd 3:0:0:10: sdf Unhandled error code
Nov 27 19:12:47 chbs-bia1-18 kernel: [885907.083694] sd 3:0:0:10: sdf Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
Nov 27 19:12:47 chbs-bia1-18 kernel: [885907.083696] sd 3:0:0:10: sdf CDB: Write Verify(10): 2e 08 00 e9 6b 80 00 00 80 00
Nov 27 19:12:47 chbs-bia1-18 kernel: [885907.083701] end_request: I/O error, dev sdf, sector 15297408
Nov 27 19:12:47 chbs-bia1-18 mmfs: Error=MMFS_DISKFAIL, ID=0x9C6C05FA, Tag=11862485: Disk failure. Volume gpfsFS1. rc = 5. Physical volume nsd07gpfs
Nov 27 19:12:47 chbs-bia1-18 mmfs: Error=MMFS_SYSTEM_UNMOUNT, ID=0xC954F85D, Tag=11862486: Unrecoverable file system operation error. Status code 5. Volume gpfsFS1

Thank you.
  • Novikov_Alexander
    1404 Posts

    Re: DS3524 RDAC Error

    ‏2012-11-30T03:53:56Z  in response to VeenaEdattale
    Dear Veena,

    It looks like an unsupported configuration is in use. Let's check it. Could you provide the AMM Service Data, the DSA logs from the blades, and the output of the supportshow command from both Brocade FC blade switches?

    Regards,
    Alexander Novikov
    Russia, Moscow
    • VeenaEdattale
      2 Posts

      Re: DS3524 RDAC Error

      ‏2012-12-19T06:28:43Z  in response to Novikov_Alexander
      Hi, the firmware on the switch was updated. We saw CRC errors on some ports and then replaced the SFP module and a cable, but with no luck: the errors continue to be generated, which is causing a massive performance problem on the storage side.

      I currently have no clue as to what I could try next. Any help would be appreciated.

      Thank you
      • Novikov_Alexander
        1404 Posts

        Re: DS3524 RDAC Error

        ‏2012-12-23T08:46:38Z  in response to VeenaEdattale
        Dear Veena,

        Could you provide the DS Support Data, the AMM Service Data, the DSA logs from the blades, and the output of the supportshow command from both Brocade FC blade switches?

        Regards,
        Alexander Novikov
        Russia, Moscow