VeenaEdattale

DS3524 RDAC Error

2012-11-28T15:15:50Z
I have three Blade centres, with 14 Blades in two of them and 7 Blades in the third. They are all connected to 2 * DS3524 with GPFS replication on it. Since the new environment was set up, random Blades have been logging errors with the following messages:
Nov 27 12:49:16 chbs-bia1-39 chbs-bia1-39 kernel: [453132.377435] 122 [RAIDarray.mpp]PROD_DS3524_DC165_SN13K0LCK:1:1:7 Controller IO time expired. Delta 401 secs
Nov 27 12:49:16 chbs-bia1-39 chbs-bia1-39 kernel: [453132.377629] 497 [RAIDarray.mpp]PROD_DS3524_DC165_SN13K0LCK:1:1:7 Failed controller to 0. retry. vcmnd SN 2998403 pdev H2:C0:T3:L7 0x00/0x00/0x00 0x00080000 mpp_status:14
Nov 27 12:49:16 chbs-bia1-39 chbs-bia1-39 kernel: [453132.378045] 10 [RAIDarray.mpp]PROD_DS3524_DC165_SN13K0LCK:0 Failover command issued
Nov 27 12:49:22 chbs-bia1-39 chbs-bia1-39 kernel: [453138.382873] 801 [RAIDarray.mpp]Failover succeeded to PROD_DS3524_DC165_SN13K0LCK:0
Nov 27 12:49:18 chbs-bia1-53 chbs-bia1-53 kernel: [526535.308214] 494 [RAIDarray.mpp]PROD_DS3524_DC165_SN13K0LCK:1:0:11 Cmnd-failed try alt ctrl 0. vcmnd SN 6126908 pdev H2:C0:T3:L11 0x05/0x94/0x01 0x08000002 mpp_status:1
Nov 27 12:48:59 chbs-bia1-44 chbs-bia1-44 kernel: [415821.420928] 494 [RAIDarray.mpp]PROD_DS3524_DC165_SN13K0LCK:1:0:7 Cmnd-failed try alt ctrl 0. vcmnd SN 3077427 pdev H4:C0:T3:L7 0x05/0x94/0x01 0x08000002 mpp_status:1
Nov 27 12:49:26 chbs-bia1-43 chbs-bia1-43 kernel: [748305.151465] 494 [RAIDarray.mpp]PROD_DS3524_DC165_SN13K0LCK:1:0:11 Cmnd-failed try alt ctrl 0. vcmnd SN 13094046 pdev H1:C0:T3:L11 0x05/0x94/0x01 0x08000002 mpp_status:1
Nov 27 12:46:16 chbs-bia1-20 chbs-bia1-20 kernel: [862882.021256] 494 [RAIDarray.mpp]PROD_DS3524_DC165_SN13K0LCK:1:0:10 Cmnd-failed try alt ctrl 0. vcmnd SN 11110701 pdev H2:C0:T3:L10 0x05/0x94/0x01 0x08000002 mpp_status:1
Nov 27 12:50:00 chbs-bia1-24 chbs-bia1-24 kernel: [862867.570190] 494 [RAIDarray.mpp]PROD_DS3524_DC165_SN13K0LCK:1:0:10 Cmnd-failed try alt ctrl 0. vcmnd SN 14322040 pdev H2:C0:T3:L10 0x05/0x94/0x01 0x08000002 mpp_status:1
Nov 27 12:46:19 chbs-bia1-41 chbs-bia1-41 kernel: [853466.630992] 494 [RAIDarray.mpp]PROD_DS3524_DC165_SN13K0LCK:1:0:9 Cmnd-failed try alt ctrl 0. vcmnd SN 17767623 pdev H2:C0:T3:L9 0x05/0x94/0x01 0x08000002 mpp_status:1
Nov 27 12:49:37 chbs-bia1-09 chbs-bia1-09 kernel: [860292.552718] 494 [RAIDarray.mpp]PROD_DS3524_DC165_SN13K0LCK:1:0:10 Cmnd-failed try alt ctrl 0. vcmnd SN 1756437 pdev H1:C0:T0:L10 0x05/0x94/0x01 0x08000002 mpp_status:1
Nov 27 12:49:57 chbs-bia1-55 chbs-bia1-55 kernel: [855374.201136] 494 [RAIDarray.mpp]PROD_DS3524_DC165_SN13K0LCK:1:0:8 Cmnd-failed try alt ctrl 0. vcmnd SN 11964475 pdev H1:C0:T2:L8 0x05/0x94/0x01 0x08000002 mpp_status:1
Nov 27 12:49:07 chbs-bia1-14 chbs-bia1-14 kernel: [860479.249298] 494 [RAIDarray.mpp]PROD_DS3524_DC165_SN13K0LCK:1:0:8 Cmnd-failed try alt ctrl 0. vcmnd SN 6787826 pdev H1:C0:T0:L8 0x05/0x94/0x01 0x08000002 mpp_status:1
Nov 27 12:52:46 chbs-bia1-23 chbs-bia1-23 kernel: [862861.684977] 494 [RAIDarray.mpp]PROD_DS3524_DC165_SN13K0LCK:1:0:11 Cmnd-failed try alt ctrl 0. vcmnd SN 12020863 pdev H2:C0:T3:L11 0x05/0x94/0x01 0x08000002 mpp_status:1
Nov 27 12:47:01 chbs-bia1-08 chbs-bia1-08 kernel: [860517.132333] 494 [RAIDarray.mpp]PROD_DS3524_DC165_SN13K0LCK:1:0:7 Cmnd-failed try alt ctrl 0. vcmnd SN 1617695 pdev H1:C0:T1:L7 0x05/0x94/0x01 0x08000002 mpp_status:1
Nov 27 12:50:13 chbs-bia1-11 chbs-bia1-11 kernel: [860515.586985] 494 [RAIDarray.mpp]PROD_DS3524_DC165_SN13K0LCK:1:0:7 Cmnd-failed try alt ctrl 0. vcmnd SN 1493182 pdev H2:C0:T1:L7 0x05/0x94/0x01 0x08000002 mpp_status:1
Nov 27 12:51:14 chbs-bia1-07 chbs-bia1-07 kernel: [860480.813466] 494 [RAIDarray.mpp]PROD_DS3524_DC165_SN13K0LCK:1:0:6 Cmnd-failed try alt ctrl 0. vcmnd SN 1429170 pdev H1:C0:T1:L6 0x05/0x94/0x01 0x08000002 mpp_status:1
Can someone explain what this signifies? No errors have been generated in the SAN logs. The RDAC version installed is the newest one, rdac-09.03.0C05.0638. The Blades are running SLES 11 SP1. This has also resulted in a Blade unmounting the GPFS filesystem:

tus:7
Nov 27 19:09:20 chbs-bia1-18 kernel: [885699.401095] 122 [RAIDarray.mpp]PROD_DS3524_DC165_SN13K0LCK:1:0:8 Controller IO time expired. Delta 405 secs
Nov 27 19:09:20 chbs-bia1-18 kernel: [885699.401101] 497 [RAIDarray.mpp]PROD_DS3524_DC165_SN13K0LCK:1:0:8 Failed controller to 0. retry. vcmnd SN 14149223 pdev H2:C0:T3:L8 0x00/0x00/0x00 0x00070000 mpp_status:7
Nov 27 19:09:20 chbs-bia1-18 kernel: [885699.401159] 122 [RAIDarray.mpp]PROD_DS3524_DC165_SN13K0LCK:1:0:9 Controller IO time expired. Delta 405 secs
Nov 27 19:09:20 chbs-bia1-18 kernel: [885699.401165] 497 [RAIDarray.mpp]PROD_DS3524_DC165_SN13K0LCK:1:0:9 Failed controller to 0. retry. vcmnd SN 14149231 pdev H2:C0:T3:L9 0x00/0x00/0x00 0x00070000 mpp_status:7
Nov 27 19:09:20 chbs-bia1-18 kernel: [885699.401226] 122 [RAIDarray.mpp]PROD_DS3524_DC165_SN13K0LCK:1:0:8 Controller IO time expired. Delta 405 secs
Nov 27 19:09:20 chbs-bia1-18 kernel: [885699.401231] 497 [RAIDarray.mpp]PROD_DS3524_DC165_SN13K0LCK:1:0:8 Failed controller to 0. retry. vcmnd SN 14149168 pdev H2:C0:T3:L8 0x00/0x00/0x00 0x00070000 mpp_status:7
Nov 27 19:09:20 chbs-bia1-18 kernel: [885699.401306] 122 [RAIDarray.mpp]PROD_DS3524_DC165_SN13K0LCK:1:0:10 Controller IO time expired. Delta 405 secs
Nov 27 19:09:20 chbs-bia1-18 kernel: [885699.401312] 497 [RAIDarray.mpp]PROD_DS3524_DC165_SN13K0LCK:1:0:10 Failed controller to 0. retry. vcmnd SN 14149184 pdev H2:C0:T3:L10 0x00/0x00/0x00 0x00070000 mpp_status:7
Nov 27 19:09:20 chbs-bia1-18 kernel: [885699.401373] 122 [RAIDarray.mpp]PROD_DS3524_DC165_SN13K0LCK:1:0:11 Controller IO time expired. Delta 405 secs
Nov 27 19:09:20 chbs-bia1-18 kernel: [885699.401379] 497 [RAIDarray.mpp]PROD_DS3524_DC165_SN13K0LCK:1:0:11 Failed controller to 0. retry. vcmnd SN 14149190 pdev H2:C0:T3:L11 0x00/0x00/0x00 0x00070000 mpp_status:7
Nov 27 19:09:23 chbs-bia1-18 kernel: [885703.179935] 801 [RAIDarray.mpp]Failover succeeded to PROD-DS3524_DC110_SN13M012T:0
Nov 27 19:09:24 chbs-bia1-18 kernel: [885704.453886] 801 [RAIDarray.mpp]Failover succeeded to PROD_DS3524_DC165_SN13K0LCK:0
Nov 27 19:09:24 chbs-bia1-18 mmfs: Error=MMFS_DISKFAIL, ID=0x9C6C05FA, Tag=11862484: Disk failure. Volume gpfsFS1. rc = 5. Physical volume gpfs9nsd

Nov 27 19:12:47 chbs-bia1-18 kernel: [885907.083680] 492 [RAIDarray.mpp]PROD_DS3524_DC165_SN13K0LCK:1:0:10 IO FAILURE. vcmnd SN 14149184 pdev H2:C0:T3:L10 0x00/0x00/0x00 0x00070000 mpp_status:7
Nov 27 19:12:47 chbs-bia1-18 kernel: [885907.083693] sd 3:0:0:10: sdf Unhandled error code
Nov 27 19:12:47 chbs-bia1-18 kernel: [885907.083694] sd 3:0:0:10: sdf Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
Nov 27 19:12:47 chbs-bia1-18 kernel: [885907.083696] sd 3:0:0:10: sdf CDB: Write Verify(10): 2e 08 00 e9 6b 80 00 00 80 00
Nov 27 19:12:47 chbs-bia1-18 kernel: [885907.083701] end_request: I/O error, dev sdf, sector 15297408
Nov 27 19:12:47 chbs-bia1-18 mmfs: Error=MMFS_DISKFAIL, ID=0x9C6C05FA, Tag=11862485: Disk failure. Volume gpfsFS1. rc = 5. Physical volume nsd07gpfs
Nov 27 19:12:47 chbs-bia1-18 mmfs: Error=MMFS_SYSTEM_UNMOUNT, ID=0xC954F85D, Tag=11862486: Unrecoverable file system operation error. Status code 5. Volume gpfsFS1

Thank you.
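
For anyone triaging a log dump like the one above, it can help to tally the RDAC/MPP events per host and LUN before reading individual lines. The following is only a minimal sketch: the regular expression is inferred from the lines quoted in this post, the field names (array, ctrl, path, lun) and the function name summarize are labels made up for this example, and the list of event strings covers only the messages that appear in this thread.

```python
import re
from collections import Counter

# Syslog prefix plus the RAIDarray.mpp tag, as seen in the lines quoted above.
# Group names are assumed labels for this sketch, not official RDAC terms.
LINE_RE = re.compile(
    r"^\w{3}\s+\d+\s+[\d:]+\s+(?P<host>\S+).*?"
    r"RAIDarray\.mpp\]?(?P<array>[\w-]+):(?P<ctrl>\d+):(?P<path>\d+):(?P<lun>\d+)\s+"
    r"(?P<msg>.*)$"
)

# Only the event texts that occur in this thread; extend as needed.
EVENTS = (
    "Controller IO time expired",
    "Cmnd-failed try alt ctrl",
    "Failed controller to",
    "IO FAILURE",
)


def summarize(lines):
    """Count known MPP events per (host, lun, event) over an iterable of log lines."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line.rstrip("\n"))
        if not m:
            continue  # not a per-LUN MPP line (e.g. failover or mmfs messages)
        event = next((e for e in EVENTS if e in m["msg"]), None)
        if event:
            counts[(m["host"], int(m["lun"]), event)] += 1
    return counts
```

Feeding it the collected syslog lines from all blades (for example an open file object for the system log, whose location depends on the syslog configuration) and printing counts.most_common() gives a quick view of which hosts and LUNs log timeouts versus alternate-controller retries.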
Updated on 2012-12-23T08:46:38Z by Novikov_Alexander
  • Novikov_Alexander

    Re: DS3524 RDAC Error

    ‏2012-11-30T03:53:56Z  
    Dear Veena,

    It looks like an unsupported configuration is being used. Let's check it. Could you provide the AMM Service Data, the DSA logs from the blades, and the output of the supportshow command from both Brocade FC blade switches?

    Regards,
    Alexander Novikov
    Russia, Moscow
  • VeenaEdattale

    Re: DS3524 RDAC Error

    ‏2012-12-19T06:28:43Z  
    Hi, the firmware on the switch was updated. We saw CRC errors on some ports and replaced the SFP module and a cable, but with no luck: the errors continue to be generated, which is causing a massive performance problem on the storage side.

    I currently have no clue what I could try next. Any help would be appreciated.

    Thank you
  • Novikov_Alexander

    Re: DS3524 RDAC Error

    ‏2012-12-23T08:46:38Z  
    Dear Veena,

    Could you provide the DS Support Data, the AMM Service Data, the DSA logs from the blades, and the output of the supportshow command from both Brocade FC blade switches?

    Regards,
    Alexander Novikov
    Russia, Moscow