Topic
  • 16 replies
  • Latest Post - ‏2013-07-07T11:05:51Z by Guy Kempny
SystemAdmin
SystemAdmin
3234 Posts

Pinned topic BladeCenter S error on I/O Module 3 - 4 - SAS RAID CONTROLLER

‏2009-08-07T15:10:54Z |
Hi. I wonder if anyone can help me with these errors. Thanks.

CONFIGURATION:

BladeCenter S, two SAS RAID Controller Modules (I/O Modules 3 and 4),
VMware, Windows.

ERRORS:

We get these errors two or three times a week:
Recovery An error on I/O Module 3 was detected.
I/O module 4 has restarted.
An error on I/O Module 3 was detected.
Recovery An error on I/O Module 3 was detected.
I/O module 4 has restarted.
An error on I/O Module 3 was detected.
I/O module 4 has restarted.
Recovery An error on I/O Module 3 was detected.
An error on I/O Module 3 was detected
All the firmware is at the newest version.
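
To see which module is doing what in an event list like the one above, the messages can be tallied per module and per event type. This is only a sketch: the message patterns are taken from the lines quoted in this post, and `tally_events` is a made-up helper name, not an AMM tool.

```python
import re
from collections import Counter

# Event lines copied from the post above.
EVENT_LOG = """\
Recovery An error on I/O Module 3 was detected.
I/O module 4 has restarted.
An error on I/O Module 3 was detected.
Recovery An error on I/O Module 3 was detected.
I/O module 4 has restarted.
An error on I/O Module 3 was detected.
I/O module 4 has restarted.
Recovery An error on I/O Module 3 was detected.
An error on I/O Module 3 was detected
"""


def tally_events(log_text):
    """Count events per (module number, event kind)."""
    counts = Counter()
    for raw in log_text.splitlines():
        line = raw.strip()
        match = re.search(r"I/O module (\d+)", line, re.IGNORECASE)
        if not match:
            continue  # not a module event line
        module = int(match.group(1))
        lowered = line.lower()
        if lowered.startswith("recovery"):
            kind = "recovery"
        elif "restarted" in lowered:
            kind = "restart"
        elif "error" in lowered:
            kind = "error"
        else:
            kind = "other"
        counts[(module, kind)] += 1
    return counts


if __name__ == "__main__":
    for (module, kind), n in sorted(tally_events(EVENT_LOG).items()):
        print(f"I/O module {module}: {kind} x{n}")
```

In this excerpt every error and recovery is logged against module 3, while every restart is logged against module 4.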

Thanks,
Updated on 2010-03-05T14:54:28Z at 2010-03-05T14:54:28Z by SystemAdmin
  • cocas24
    cocas24
    7 Posts

    Re: BladeCenter S error on I/O Module 3 - 4 - SAS RAID CONTROLLER

    ‏2009-09-07T19:56:30Z  
    Can anyone from IBM reply to this? We have the same problem...
    All the blades lost connectivity with the disks in the Storage Module. I only have a SAS Connectivity Module.
  • SystemAdmin
    SystemAdmin
    3234 Posts

    Re: BladeCenter S error on I/O Module 3 - 4 - SAS RAID CONTROLLER

    ‏2009-09-23T07:15:35Z  
    Hi,
    I had the same problem. I replaced one SAS RAID Controller and now it is OK. You can try this.
    Best regards,
    Josef Sumsal
    Czech Republic
  • SystemAdmin
    SystemAdmin
    3234 Posts

    Re: BladeCenter S error on I/O Module 3 - 4 - SAS RAID CONTROLLER

    ‏2009-09-29T20:19:36Z  
    We had the same problem, where all the blades lost connection to both storage modules. We called IBM and were told to point the gateway of the SAS RAID module to the AMM. I don't think that's the cause: the chassis had been up and running for 2 weeks, and suddenly you tell me it's a gateway config problem? Anyway, we made that change and it has been good since.
  • SystemAdmin
    SystemAdmin
    3234 Posts

    Re: BladeCenter S error on I/O Module 3 - 4 - SAS RAID CONTROLLER

    ‏2009-10-02T13:40:23Z  
    Too happy too soon. I was right, the gateway config wasn't the problem.
    IT HAPPENED AGAIN THIS MORNING. ALL blades lost connectivity to storage.
    Reported to IBM.
  • SystemAdmin
    SystemAdmin
    3234 Posts

    Re: BladeCenter S error on I/O Module 3 - 4 - SAS RAID CONTROLLER

    ‏2009-10-04T08:19:05Z  
    On which RSSM firmware level?

    1.0.3.023?
  • eauque
    eauque
    19 Posts

    Re: BladeCenter S error on I/O Module 3 - 4 - SAS RAID CONTROLLER

    ‏2009-10-07T19:22:25Z  
    Hi all.

    I'm having the same problem with a BladeCenter S 8886E1U with 2 SAS RAID controllers. About one day each week, all blade servers lose their connection to the SAN and hang because disk access fails.

    I checked the SAS RAID gateway configuration and it is OK (the gateway is defined to point to the AMM's IP address).

    From the Storage Configuration Manager I can see both SAS RAID switches, but I get an error message when accessing the SAS RAID controllers; they do not respond to ping requests and show as disconnected in SCM.

    Each time it happens, I have to power both SAS switches off and on and restart all blade servers.
  • eauque
    eauque
    19 Posts

    Re: BladeCenter S error on I/O Module 3 - 4 - SAS RAID CONTROLLER

    ‏2009-10-07T19:35:36Z  
    I made a mistake in my previous post. Both SAS RAID modules do respond to ping requests.

    Bay  Name               Firmware Type       Build ID  Released    Revision
    3    SAS RAID Ctrl Mod  Boot ROM            S0CD01D   12/21/2007  0308
                            Main Application 1  S0SW01D   01/15/2009  R103
                            Main Application 2  S0CP00A   01/01/2000  C00A
                            Main Application 3  S0RC594   10/20/08    1688
                            Main Application 4  S0BT07A   09/25/2008  0119
                            Main Application 5  S0SE00C   09/04/2008  0102
    4    SAS RAID Ctrl Mod  Boot ROM            S0CD01D   12/21/2007  0308
                            Main Application 1  S0SW01D   01/15/2009  R103
                            Main Application 2  S0CP00A   01/01/2000  C00A
                            Main Application 3  S0RC594   10/20/08    1688
                            Main Application 4  S0BT07A   09/25/2008  0119
                            Main Application 5  S0SE00C   09/04/2008  0102
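
With two controllers it is worth checking that both bays report the same build ID and revision for every firmware component. A minimal sketch of that comparison, using the values from the listing above; the dictionary layout and the `firmware_mismatches` name are assumptions made for illustration, not something the AMM exports.

```python
# Firmware component -> (build ID, revision), copied from the listing above.
MODULE_3 = {
    "Boot ROM": ("S0CD01D", "0308"),
    "Main Application 1": ("S0SW01D", "R103"),
    "Main Application 2": ("S0CP00A", "C00A"),
    "Main Application 3": ("S0RC594", "1688"),
    "Main Application 4": ("S0BT07A", "0119"),
    "Main Application 5": ("S0SE00C", "0102"),
}
MODULE_4 = dict(MODULE_3)  # the post shows identical levels in bay 4


def firmware_mismatches(first, second):
    """Return {component: (level_in_first, level_in_second)} where they differ.

    A component missing from one module is reported with None on that side.
    """
    components = sorted(set(first) | set(second))
    return {
        name: (first.get(name), second.get(name))
        for name in components
        if first.get(name) != second.get(name)
    }


if __name__ == "__main__":
    diff = firmware_mismatches(MODULE_3, MODULE_4)
    if not diff:
        print("Both modules report the same firmware levels.")
    for name, (left, right) in diff.items():
        print(f"{name}: bay 3 = {left}, bay 4 = {right}")
```

For the levels posted here the result is empty, so both bays match.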
  • cocas24
    cocas24
    7 Posts

    Re: BladeCenter S error on I/O Module 3 - 4 - SAS RAID CONTROLLER

    ‏2009-10-13T12:53:22Z  
    We have exactly the same problem. IBM support is still working with us, without success so far......
  • SystemAdmin
    SystemAdmin
    3234 Posts

    Re: BladeCenter S error on I/O Module 3 - 4 - SAS RAID CONTROLLER

    ‏2009-10-13T15:01:37Z  
    Hi,

    I stumbled upon the same problem about two weeks ago.
    All blades suddenly lost contact with the storage, and the error message was as described here.

    I updated the SAS/RAID modules to the most recent firmware version, and it has worked well since then.
    Note: the update could only be installed successfully using the method described for Windows + Cygwin. For some reason, all attempts using the method for Linux failed in strange ways during the transfer of the firmware image to the controller (of course the Python version etc. were as described).

    Best regards

    rcg
  • eauque
    eauque
    19 Posts

    Re: BladeCenter S error on I/O Module 3 - 4 - SAS RAID CONTROLLER

    ‏2009-10-20T14:50:57Z  
    Hi, it happened again this morning.

    Looking at the SAS switches I see some error counters, but the AMM does not report any problem. Attached is a screenshot showing the errors. What could be happening?
  • SystemAdmin
    SystemAdmin
    3234 Posts

    Re: BladeCenter S error on I/O Module 3 - 4 - SAS RAID CONTROLLER

    ‏2009-10-28T16:20:35Z  
    IBM says it's the SAS RAID subsystem firmware.
    Look at Firmware VPD -> SAS RAID Controller Mod -> Application 3.
    I upgraded mine to Build ID: S0RC766, Revision 1763.

    The BC-S has now been running for over 2 weeks with no problems.

    IBM will tell you to use the Linux + Cygwin TFTP crap to do it. I got a headache just by looking at the instructions.
    This is the F**** problem with IBM: why can't you have ONE way and ONE place to keep track of your BC firmware??

    If you have 24x7 support, request that they come and do it for you.
  • tyrolit
    tyrolit
    1 Post

    Re: BladeCenter S error on I/O Module 3 - 4 - SAS RAID CONTROLLER

    ‏2009-11-09T14:46:24Z  
    Hi all,

    We've been testing a BC-S with vSphere 4 for more than 6 months and we've had exactly the same issues.
    IBM support was engaged but was unsuccessful for a long time. We tried several firmware patches etc.

    After a lot of pressure from our side they forwarded the issue to a lab somewhere in the US, and after 3 weeks they told us to set the path policy in VMware to MRU.

    Have a look: http://www-947.ibm.com/systems/support/supportsite.wss/docdisplay?lndocid=MIGR-5081899&brandind=500000

    After applying the latest firmware (we are on S0RC753, Revision 1762) and changing the path policy to MRU (the setup process sets it to Fixed), we are stable.
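
For anyone with many LUNs, the MRU change can be scripted rather than clicked through per device. The sketch below only builds the command lines and does not run them. It assumes the `esxcli nmp device setpolicy` syntax of ESX/ESXi 4.x (5.x and later moved this under `esxcli storage nmp device set`); the device ID in the example is a placeholder, and `mru_policy_commands` is a made-up helper name. Check the syntax against your own host before using it.

```python
def mru_policy_commands(device_ids, esx_major=4):
    """Build esxcli argument lists that set the MRU path policy per device.

    The commands are returned, not executed, so they can be reviewed first.
    """
    if esx_major <= 4:
        # Assumed ESX/ESXi 4.x namespace.
        prefix = ["esxcli", "nmp", "device", "setpolicy"]
    else:
        # Assumed ESXi 5.x and later namespace.
        prefix = ["esxcli", "storage", "nmp", "device", "set"]
    return [
        prefix + ["--device", device_id, "--psp", "VMW_PSP_MRU"]
        for device_id in device_ids
    ]


if __name__ == "__main__":
    # Placeholder device ID; list the real ones on the host first.
    for argv in mru_policy_commands(["naa.EXAMPLE0001"]):
        print(" ".join(argv))
```

Printing the commands first makes it easy to confirm that only the SAS RAID LUNs are touched.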

    Hope this helps you.

    regards
    gernot
  • SystemAdmin
    SystemAdmin
    3234 Posts

    Re: BladeCenter S error on I/O Module 3 - 4 - SAS RAID CONTROLLER

    ‏2010-02-24T07:13:56Z  
    Hi,

    Has anybody managed to resolve the issue of the 2 modules rebooting by themselves?

    I tried updating the firmware using the CLI and got the following error:

    ngklj@ngklj-PC /cygdrive/c/new
    $ ./ibm_fw_bcsw_s0cl-1.0.3.025_anyos_noarch.sh -i 192.168.70.139
    sh: clear: command not found
    sh: clear: command not found
    sh: clear: command not found
    sh: clear: command not found
    sh: clear: command not found
    sh: clear: command not found
    MSG : ./SbInst.py failed in function writeMessage, rc = 1

    We replaced the module but still have the same problem. Firmware is at 1.0.3.023.

    Is anybody able to help?
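
The repeated `sh: clear: command not found` lines show that the installer script calls `clear`, which this Cygwin install does not have (it normally comes with the ncurses package). Whether that is what makes `SbInst.py` fail is not established in this thread. Either way, a quick check for missing commands before running the script can rule it out. A minimal sketch: the list of required commands is an assumption based only on the output above, and `missing_commands` is a made-up helper name.

```python
import shutil

# Assumed prerequisites: "clear" and "sh" appear in the error output above,
# and "python" is implied by the SbInst.py step. Adjust to the firmware readme.
REQUIRED_COMMANDS = ["sh", "clear", "python"]


def missing_commands(required, which=shutil.which):
    """Return the commands from `required` that are not found on PATH."""
    return [name for name in required if which(name) is None]


if __name__ == "__main__":
    missing = missing_commands(REQUIRED_COMMANDS)
    if missing:
        print("Missing from PATH: " + ", ".join(missing))
    else:
        print("All listed commands were found.")
```

Run it from the same Cygwin shell that will run the installer, so the same PATH is checked.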
  • SystemAdmin
    SystemAdmin
    3234 Posts

    Re: BladeCenter S error on I/O Module 3 - 4 - SAS RAID CONTROLLER

    ‏2010-03-05T14:54:28Z  
    Hi all, I managed to update the firmware of the SAS RAID module to 1.0.3.025 using the CLI, and it resolved the issue of the module restarting.
  • ivfibm
    ivfibm
    13 Posts

    Re: BladeCenter S error on I/O Module 3 - 4 - SAS RAID CONTROLLER

    ‏2013-06-24T11:55:43Z  
    We had to replace the battery backup module in both SAS RAID controller modules.

  • Guy Kempny
    Guy Kempny
    18 Posts

    Re: BladeCenter S error on I/O Module 3 - 4 - SAS RAID CONTROLLER

    ‏2013-07-07T11:05:51Z  
    Wow, this was last updated in 2010? Why post at the end rather than open a new posting?

    IBM batteries do have a life span. However, you should see warning messages in the AMM prior to 100% failure. Some people have tried to fake the battery failure, but the end result is that it needs to be replaced by a genuine IBM unit.

    If you have dual controllers, it might be wise to consider replacing both at the same time.