Topic
  • 64 replies
  • Latest Post - ‏2015-06-02T14:53:03Z by CHofmeister
JeffDomogala
JeffDomogala
23 Posts

Pinned topic HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

‏2012-08-30T21:59:20Z |
I have had a pair of BladeCenter H chassis for 6 months now. They are populated with a total of 28 identically configured HS-22 (7870) blades. The CIOv slots have a QLogic QMI2582 8Gb Fibre Channel card. The Fibre Channel switches are all Brocade 20 port 8Gb switches (in slots 3 and 4 of each H chassis). The O/S is XenServer 6.0 with the DM multipath enabled. So for each chassis, the FC switches in slots 3 and 4 are redundant.

From the very beginning of using these blades I have always had a handful of blades that show CRC errors on each of these switches (so both chassis have the same problem). I can tell when this happens because the O/S (XenServer 6.0) shows aborts in the kernel message log, and there is a 30-second O/S "hang" on all of the VMs on that blade until the controller aborts the I/O. It is very annoying and users do notice the problem. Usually only one of the two switches for a particular blade will experience the CRC errors, so the workaround is to disable the port in the offending switch. That means the FC is no longer redundant for that blade, and a hardware failure could knock out the FC on the blade completely.

I went through the process of updating the firmware and EDC images on the QMI2582 cards, nailed up the connection speeds to 8Gb on the switch and HBA ports, and ensured that the EDC image selection was for the Brocade switch. I have also updated the firmware on the Brocade switches to the latest (Fabric O/S 6.4.2b) and have verified the FPGA versions on them. The combination of updates did settle down some of the CRC errors, but some still persist. I have also managed to eliminate some of the CRC-error-prone connections by playing shell games with the blades between slots. It is disturbing that this actually works. It tells me that there is still some marginal signal integrity issue in the overall scheme of things for which the EDC firmware on the QMI2582 cannot totally compensate.

Here is the "scli -i" output from XenServer for one port of one of the QMI2582 modules:

--------------------------------
Host Name                      : bceng2-xs8
HBA Instance                   : 0
HBA Model                      : QMI2582
HBA Description                : QMI2582 QLogic 8Gb Fibre Channel Expansion Card (CIOv) for IBM BladeCenter
HBA ID                         : 0-QMI2582
HBA Alias                      :
HBA Port                       : 1
Port Alias                     :
Node Name                      : 20-00-00-24-FF-26-CD-92
Port Name                      : 21-00-00-24-FF-26-CD-92
Port ID                        : 03-03-00
Serial Number                  : LFD1111L54074
Driver Version                 : 8.03.07.03.55.6-k2
BIOS Version                   : 2.13
Driver Firmware Version        : 5.06.05 (90d5)
Flash BIOS Version             : 2.13
Flash FCode Version            : 3.17
Flash EFI Version              : 2.38
Flash Firmware Version         : 5.06.03
Actual Connection Mode         : Point to Point
Actual Data Rate               : 8 Gbps
PortType (Topology)            : NPort
Target Count                   : 6
PCI Bus Number                 : 36
PCI Device Number              : 0
PCIe Max Bus Width             : x8
PCIe Max Bus Speed             : 5.0 Gbps
PCIe Negotiated Width          : x4
PCIe Negotiated Speed          : 5.0 Gbps
HBA Status                     : Online
--------------------------------
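
With 28 blades and two ports each, comparing these dumps by eye gets tedious. Below is a minimal Python sketch of one way to pull the "Key : Value" fields out of such a dump and flag any PCIe attribute whose negotiated value differs from the advertised maximum. The sample lines are copied from the output above; the helper names `parse_fields` and `pcie_link_mismatches` are made up for illustration, and whether an x4 link is simply normal for the CIOv slot is not something this thread establishes.

```python
# Sketch only: parse "Key : Value" lines such as the scli -i dump in the post.
# The sample text is copied from the post; the helper names are hypothetical.

SAMPLE = """\
HBA Model                      : QMI2582
Actual Data Rate               : 8 Gbps
PCIe Max Bus Width             : x8
PCIe Max Bus Speed             : 5.0 Gbps
PCIe Negotiated Width          : x4
PCIe Negotiated Speed          : 5.0 Gbps
HBA Status                     : Online
"""


def parse_fields(text):
    """Return a dict of field name -> value for every 'Key : Value' line."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(" : ")
        if sep:  # skip separator lines and fields with an empty value
            fields[key.strip()] = value.strip()
    return fields


def pcie_link_mismatches(fields):
    """List PCIe attributes whose negotiated value differs from the maximum."""
    mismatches = []
    for attr in ("Width", "Speed"):
        maximum = fields.get(f"PCIe Max Bus {attr}")
        negotiated = fields.get(f"PCIe Negotiated {attr}")
        if maximum and negotiated and maximum != negotiated:
            mismatches.append((attr, maximum, negotiated))
    return mismatches


fields = parse_fields(SAMPLE)
print(pcie_link_mismatches(fields))  # -> [('Width', 'x8', 'x4')]
```

Running the same check over the dumps from every blade would at least show whether the troublesome slots differ from the clean ones in anything the HBA itself reports.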


I have run out of ideas regarding this and am looking for help to further diagnose and ultimately solve the issue. The support folks are quite happy just changing out QMI2582 modules until something works, but that is not an end solution, it is just a band-aid. I am sure that I am not the only one experiencing this issue, and I want to help come up with a real end solution. So folks, please weigh in.
Updated on 2013-03-24T13:04:35Z at 2013-03-24T13:04:35Z by SystemAdmin
  • SystemAdmin
    SystemAdmin
    3234 Posts

    Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

    ‏2012-08-31T07:37:33Z  
    Perhaps you saw this already
    http://www-947.ibm.com/support/entry/portal/docdisplay?lndocid=MIGR-5087103&brandind=5000020
    It looks like all FW and driver versions are OK. Following the recommendation for configuration 3, you have to configure a fixed speed for the HBA and internal FCSM ports and check that the fill word is "1" for them.
  • SystemAdmin
    SystemAdmin
    3234 Posts

    Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

    ‏2012-08-31T08:06:47Z  
    Sorry, I reread your message more carefully and saw that you did the rest too.
    Try setting the speed to 4Gb, at least for slots 1-5
    http://www-947.ibm.com/support/entry/portal/docdisplay?lndocid=MIGR-5087397&brandind=5000020
    and set the CIOv to work at PCI-E gen1 speed
    http://www-947.ibm.com/support/entry/portal/docdisplay?lndocid=MIGR-5084852&brandind=5000019
    These RETAIN tips don't describe your case exactly, but that doesn't mean (IMHO) that the same or similar symptoms can't appear in analogous situations.
  • JeffDomogala
    JeffDomogala
    23 Posts

    Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

    ‏2012-08-31T13:20:52Z  
    Why should I have to cripple the connection speed? If it is just a test for the slots in question, that is fine. However, my problems are not limited to slots 1 through 5 in either chassis. Nor are they confined to connections to just the slot 3 or slot 4 FC switches. The only clean switch is slot 4 in one of the BladeCenters. By clean I mean that there are no CRC errors on any of the 14 bay ports according to the corresponding FC switch, and there are never any HBA errors reported by the O/S.

    Here is the compiled list of trouble slots at the moment:
    Chassis 1:
    I/O module 3 - slots 10 and 11
    I/O module 4 - clean
    Chassis 2:
    I/O module 3 - slot 3
    I/O module 4 - slot 9
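
    To put that list in terms of lost redundancy, here is a small Python sketch that counts the blades left on a single FC path if every listed port is disabled. The slot numbers come from the list above; the data layout and the function name `blades_without_redundancy` are illustrative assumptions only.

```python
# Trouble slots as listed in the post: chassis -> I/O module -> blade slots.
TROUBLE_SLOTS = {
    1: {3: [10, 11], 4: []},
    2: {3: [3], 4: [9]},
}
BLADES_PER_CHASSIS = 14  # 28 blades across the two chassis


def blades_without_redundancy(trouble):
    """Blades that lose at least one FC path if every listed port is disabled."""
    affected = set()
    for chassis, modules in trouble.items():
        for slots in modules.values():
            for slot in slots:
                affected.add((chassis, slot))
    return sorted(affected)


affected = blades_without_redundancy(TROUBLE_SLOTS)
total = BLADES_PER_CHASSIS * len(TROUBLE_SLOTS)
print(f"{len(affected)} of {total} blades on a single path: {affected}")
```

    A blade that appeared under both I/O modules would be counted once here, although it would have no healthy path left at all; none of the listed slots is in that situation.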
  • SystemAdmin
    SystemAdmin
    3234 Posts

    Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

    ‏2012-09-03T14:31:39Z  
    This is really strange and looks like a hardware issue. Did you contact support?
  • JeffDomogala
    JeffDomogala
    23 Posts

    Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

    ‏2012-09-04T14:57:14Z  
    I'm about to engage support. I'll report back once I have some information to share.
  • Fani-IBM
    Fani-IBM
    1 Post

    Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

    ‏2012-09-26T07:24:02Z  
    Any update?
  • JewelGuy
    JewelGuy
    10 Posts

    Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

    ‏2012-10-01T16:15:05Z  
    We are experiencing very similar issues.

    Have you reached a resolution to your situation?
  • JeffDomogala
    JeffDomogala
    23 Posts

    Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

    ‏2012-10-02T00:44:23Z  
    Hey folks. I've been pretty busy but still intend to pursue this issue. I can tell you that I swapped a pair of the switches between the two BladeCenters, and the problem now tracks different blades, both for the moved switch in the other chassis and for the other switch in the original chassis position. So this seems to be truly random behavior. Given my electrical engineering experience, I am still hanging my hat on this being a signal integrity issue that the EDC aboard the QLogic adapters cannot totally correct with the Brocade switches. I am wondering whether either the QLogic 8Gb switch 44X1905 or the QLogic 4/8Gb switch 88Y6406 has the same problems as the Brocade switches when connecting to the QLogic adapter cards.
  • JeffDomogala
    JeffDomogala
    23 Posts

    Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

    ‏2012-10-02T02:12:07Z  
    I now have cases open for both of the chassis involved. The first data they had me send were the management module service logs and the output of "supportshow" from all 4 FC switches. I'll update once I hear back.
  • JewelGuy
    JewelGuy
    10 Posts

    Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

    ‏2012-10-02T12:47:53Z  
    We are experiencing similar situation and are very curious to see what the resolution is.
  • HajoEhlers
    HajoEhlers
    72 Posts

    Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

    ‏2012-10-02T19:26:08Z  
    Please read
    * Supported configurations for 8 Gigabit Fibre Channel - IBM BladeCenter H (7989, 8852)
    https://www-947.ibm.com/support/entry/myportal/docdisplay?lndocid=MIGR-5087103

    * Bluescreen, Kernel panic, or PSoD with CIOv or CFFh adapter installed - IBM BladeCenter HS22, HS22V
    https://www-947.ibm.com/support/entry/myportal/docdisplay?brandind=5000019&lndocid=MIGR-5084852

    Note :
    * We had similar problems and switched to fixed 4Gb FC.
    * Hardware used:
    QLOGIC 8Gb Intelligent Pass-thru Module for IBM BladeCenter - 44X1907
    QLOGIC 8Gb Fibre Channel Expansion Card (CIOv) - 44X1945

    cheers
    Hajo
  • JeffDomogala
    JeffDomogala
    23 Posts

    Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

    ‏2012-10-02T19:49:53Z  
    Hajo,

    I have a supported configuration (#3) listed in the MIGR-5087103 document. I should not have to back down on performance due to some hardware design issue. IBM advertises that 8Gb is supported in that configuration, so given the huge amount of cash I have invested in the hardware, I expect to be able to use 8Gb without any problem. IBM has to figure out how to solve the issue. JewelGuy and I cannot be the only ones with this configuration who have these CRC issues. I can tell you that so far I already have a man-month of my time (spread over 10 calendar months) invested in trying to solve the issue. This includes swapping FC switch modules, shuffling blades, etc., to get down to a minimum set of problem FC connections. For those problematic connections, I have to disable the switch ports so that the O/S doesn't have to endure 30-second timeouts every time a CRC error occurs. By disabling these channels I have effectively disabled FC redundancy for those blades. So far I haven't been bitten by this, but it is only a matter of time.
  • JewelGuy
    JewelGuy
    10 Posts

    Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

    ‏2012-10-02T20:01:37Z  
    We are being asked to set the external ports to fillword = 3, along with setting all port speeds to fixed 4G.
  • JeffDomogala
    JeffDomogala
    23 Posts

    Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

    ‏2012-10-03T21:11:04Z  
    Update. They have not called me back since Monday, so I called them just now. I explained to the tech that I went through all of the firmware/BIOS/settings criteria listed in the "Supported Configurations" document. He went through the logs and verified that it was all correct, and then had me dump the Dynamic System Analysis logs from one of the blades to send to him. He verified that last piece of data he needed, and now the cases go to level 2 support.
    While chatting with the tech as we waited for "stuff", I learned that 8Gb Fibre Channel has been a common problem between the internal switch ports and the HBA. He could not tell me if QLogic switches had fewer problems connecting to their own HBAs on the blades. He just said that they have seen a lot of problems in that area. Unless some upper level tech can come up with something else, I am going to press them to let me try one of the QLogic 8Gb switches. I would think they would have done this before, but I'll find out. It may be a few days before I am contacted by the level 2 tech.
    Two things about support that I have not liked so far:
    1 - If a call gets dropped in the middle, they do not call you back. That happened today with the first guy. I called back and no one knew who took the call, so I had to start over again with another guy, whose name and contact information I wrote down immediately.
    2 - There is no way of tracking the cases online if you phoned in the case. You have to call to find out anything. All other companies I deal with have some way of checking the status of support requests online. I don't know why IBM is not on the cutting edge in this regard.

    I'll follow up again once I have another conversation with the level 2 tech. I was told that this will get solved. We'll see...
  • HajoEhlers
    HajoEhlers
    72 Posts

    Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

    ‏2012-10-04T08:15:52Z  
    > IBM advertises that 8Gb is supported in that configuration
    Advertising and delivering are two different words ;-)

    Regarding support from IBM
    You should get a Problem Management Record (PMR) Number where all the information should be logged.

    Best practice for me:
    Open a call via either phone or email.
    I give the following information:
    - Our customer number
    - HW serial no. and HW type
    - Short description of the problem

    Then I get a PMR number. Without one the case is not handled correctly by IBM (since IBM does not even know about it).
    cheers
    Hajo
  • JewelGuy
    JewelGuy
    10 Posts

    Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

    ‏2012-10-04T13:03:10Z  
    Jeff, yesterday we set all of our external blade ports (all ISL's to another Brocade switch) to fillword = 3. We haven't seen any CRC errors on the blade switches since we made the change. We are not closing the PMR quite yet, we'll monitor over the weekend.
  • JeffDomogala
    JeffDomogala
    23 Posts

    Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

    ‏2012-10-04T13:27:34Z  
    JewelGuy- Was this with the speed set to 8Gb or 4Gb?
  • HajoEhlers
    HajoEhlers
    72 Posts

    Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

    ‏2012-10-04T13:34:20Z  
    In case somebody would like to know what the "fillword=3" is all about
    Extract from - http://www.chris-g.co.uk/wordpress/wp-content/uploads/2011/06/FOS-8G-Link-Init-Fillword-Behavior-v1.pdf

    
    ... To comply with the published FC standards, Brocade introduced options for ARB/ARB and IDLE/ARB link initialization/fill word support. However, some 8G devices are not capable of properly establishing links with Brocade 8G Fibre Channel switches when ARB/ARB or IDLE/ARB primitives are used. These 8G devices require the legacy IDLE/IDLE sequence to achieve successful link initialization. To address this issue, Brocade has provided the ability to configure any of the three possible combinations (IDLE/IDLE, ARB/ARB, or IDLE/ARB) for link initialization and fill words. ...
    

    Other info:
    - https://www.ibm.com/developerworks/mydeveloperworks/blogs/anthonyv/entry/brocade_8_gbps_fibre_channel_switches_and_fill_words?lang=en
    - http://community.brocade.com/thread/6287?start=0&tstart=0
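
    For quick reference, the three combinations in the extract can be written down as a small lookup table. The Python sketch below maps fill word mode numbers to a (link init primitive, fill word) pair. The numbering 0 through 3, and the meaning of mode 3 as "try ARB/ARB, then fall back to IDLE/ARB", is my recollection of the Brocade FOS `portcfgfillword` documentation and is not stated in this thread, so please verify it against the command reference for your FOS level. The function name `describe` is made up.

```python
# Fill word modes as recalled from Brocade FOS documentation for
# portcfgfillword. ASSUMPTION: the numbering is not confirmed in this thread.
FILLWORD_MODES = {
    0: ("IDLE", "IDLE"),  # (link init primitive, fill word)
    1: ("ARB", "ARB"),
    2: ("IDLE", "ARB"),
    3: None,              # adaptive: try mode 1 first, fall back to mode 2
}


def describe(mode):
    """Return a human-readable description of a fill word mode number."""
    if mode not in FILLWORD_MODES:
        raise ValueError(f"unknown fill word mode: {mode}")
    pair = FILLWORD_MODES[mode]
    if pair is None:
        return "try ARB/ARB (mode 1), fall back to IDLE/ARB (mode 2)"
    init, fill = pair
    return f"{init} during link init, {fill} as fill word"


for mode in sorted(FILLWORD_MODES):
    print(mode, "->", describe(mode))
```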
  • JewelGuy
    JewelGuy
    10 Posts

    Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

    ‏2012-10-04T13:36:34Z  
    It was with the ports fixed at 4G. We had to go the 4G route because of our IBM XIV Gen2, which only has 4G ports, and there is a bug at our current code level where the XIV doesn't play well with ports that aren't fixed at 4G.
  • JewelGuy
    JewelGuy
    10 Posts

    Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

    ‏2012-10-09T18:07:33Z  
    Update: We are still experiencing CRC errors on one of our servers within the blade center, even after setting the fillword to 3. IBM support is still working the case.
  • JeffDomogala
    JeffDomogala
    23 Posts

    Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

    ‏2012-11-30T20:53:19Z  
    I have been going back and forth with level three support for over a month, and now there is a customer advocate involved who is moving things along. There seems to be good news on the horizon. For those with H chassis, a midplane replacement will most likely be in your future. There is an updated midplane that is currently undergoing acceptance testing in the IBM QA labs. There have been changes to the midplane to address the slot-to-slot variations between the QMI-2582 FC HBAs and the switch modules in I/O bays 3 and 4. I have a conference call on Tuesday morning to find out the specifics about the change. If all goes well, I will hopefully be installing the revised midplanes in my pair of chassis later this month. I'll reply back with details after the conference call.
  • HajoEhlers
    HajoEhlers
    72 Posts

    Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

    ‏2012-12-03T09:45:35Z  
    Hi Jeff,
    thanks a lot for keeping us updated.

    Cheers
    Hajo
  • JeffDomogala
    JeffDomogala
    23 Posts

    Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

    ‏2012-12-05T21:31:06Z  
    There is good news on the horizon. There is a new H chassis midplane in the final release process that specifically addresses signal integrity issues between the blade slots and I/O bays 3 and 4. They went into the gory details about it. Here's a quick summary:

    1 - Removed through vias, which were essentially high frequency antennae causing interference and reflections that the HBA EDC could not totally handle. The interplane vias are now back drilled so they do not come to the surfaces, which removes the antenna effect.
    2 - Better trace length matching and routing for the signals... the lengths varied from 1.5 to 18 inches and now they range from 3 to 12 (or maybe it was 14) inches. Anyway, this is for the better.
    3 - New midplane blade connectors specifically rated for high speed frequencies. Same form factor, just better signal isolation characteristics (shielding).
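
    To get a feel for the trace length numbers in item 2, here is a back-of-the-envelope Python sketch. It rests on two assumptions that are not from the conference call: that 8Gb Fibre Channel signals at 8.5 GBd on the wire, and that a signal takes very roughly 170 ps to travel one inch on FR-4 (a rule of thumb; the real midplane material and stackup are unknown to me). The function name `describe_range` is made up.

```python
# Back-of-the-envelope numbers for the midplane trace lengths in the post.
# ASSUMPTIONS (not from the post): 8GFC signals at 8.5 GBd, and a signal
# takes roughly 170 ps per inch on FR-4 (a common rule of thumb).
BAUD_RATE = 8.5e9
PS_PER_INCH = 170.0

unit_interval_ps = 1e12 / BAUD_RATE  # duration of one bit on the wire


def describe_range(label, shortest_in, longest_in):
    """Return (length ratio, flight-time spread expressed in bit times)."""
    ratio = longest_in / shortest_in
    spread_bits = (longest_in - shortest_in) * PS_PER_INCH / unit_interval_ps
    print(f"{label}: {shortest_in}-{longest_in} in, "
          f"longest/shortest = {ratio:.0f}x, "
          f"flight-time spread ~{spread_bits:.0f} bit times")
    return ratio, spread_bits


old = describe_range("old midplane", 1.5, 18.0)
new = describe_range("new midplane", 3.0, 12.0)
```

    The spread is not skew within a single link, since each slot-to-switch connection is its own channel. It only illustrates how different the channels are from slot to slot: a 12x range of lengths on the old midplane versus 4x on the new one (using the 12 inch figure), which fits the observation that the same HBA behaves differently depending on the slot it sits in.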

    Along with the chassis midplane change, there will be updates to the QMI2582 operational firmware, EDC firmware and device drivers. As well, there are firmware updates for the blade IMMs and AMM for the chassis. And finally, there will be a code update for the fibrechannel switches in the chassis (Brocades in my case).

    There will be a new variant of the 8852 H chassis sold with this updated midplane. The model number is 8852-5xU instead of 8852-4xU. If I were to venture a guess IBM will be changing the midplanes for those who are having the same symptoms.

    I am scheduled to have both of my midplanes changed on the weekend of December 22/23. I will be the first customer getting these in the field, so this should be interesting. Considering that I have two full chassis totalling 28 blades, the time to update all of the firmware/drivers is going to be by far the long pole in the tent. If this all works out, we should be completely CRC error free in all 28 slots. And there should no longer be the need to play the shell games to find out which HBA works best with which slot.

    I will report back with updates.
  • SystemAdmin
    SystemAdmin
    3234 Posts

    Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

    ‏2012-12-17T22:22:11Z  
    Has anyone come across this issue with the QLogic 8Gb pass-thru modules? I seem to be having connectivity issues on the BladeCenter H with HS22 blades, whereby the FC port goes down unexpectedly, and I am also seeing similar CRC errors in QLogic SANsurfer. It seems to be random and can affect FC-connected blades in any bay. We purchased the kit back in March 2012 and have only now gotten around to implementing it.

    I came across this post about the issues with Brocade switches, so I am wondering if we have the same issue with the backplane.

    Any assistance or information would be helpful.