59 replies Latest Post - ‏2013-10-24T18:02:42Z by FredAoui
JeffDomogala
JeffDomogala
21 Posts
ACCEPTED ANSWER

HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

‏2012-08-30T21:59:20Z |
I have had a pair of BladeCenter H chassis for 6 months now. They are populated with a total of 28 identically configured HS-22 (7870) blades. The CIOv slots have a QLogic QMI2582 8Gb Fibre Channel card. The Fibre Channel switches are all Brocade 20 port 8Gb switches (in slots 3 and 4 of each H chassis). The O/S is XenServer 6.0 with the DM multipath enabled. So for each chassis, the FC switches in slots 3 and 4 are redundant.

From the very beginning of using these blades I have always had a handful of blades that have CRC errors showing up on each of these switches (so both chassis have the same problem). I can tell when this happens because the O/S (XenServer 6.0) shows aborts in the kernel message log, and there is a 30 second O/S "hang" on all of the VMs on that blade until the controller aborts the I/O. It is very annoying and users do notice the problem. Usually only one of the two switches for a particular blade will experience the CRC errors, so the workaround is to disable the port in the offending switch. That means the FC is no longer redundant for that blade, and a hardware failure could knock out the FC on the blade completely.
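
One way to spot which switch port is degrading before users notice is to compare cumulative CRC counters taken at two points in time. The sketch below assumes the counters have already been collected into dictionaries keyed by (I/O module bay, blade slot). That data layout, the function name, and the sample numbers are all hypothetical and are not taken from any switch output.

```python
from typing import Dict, List, Tuple

PortKey = Tuple[int, int]  # (I/O module bay, blade slot): hypothetical key layout


def ports_with_new_crc_errors(
    before: Dict[PortKey, int], after: Dict[PortKey, int]
) -> List[Tuple[PortKey, int]]:
    """Return (port, delta) pairs whose CRC count grew, largest growth first."""
    growth = []
    for port, count in after.items():
        delta = count - before.get(port, 0)
        # A negative delta would mean the counters were cleared in between,
        # so only positive growth is reported.
        if delta > 0:
            growth.append((port, delta))
    return sorted(growth, key=lambda item: item[1], reverse=True)


# Invented sample values, for illustration only.
earlier = {(3, 10): 120, (3, 11): 40, (4, 10): 0, (4, 11): 0}
later = {(3, 10): 180, (3, 11): 45, (4, 10): 0, (4, 11): 0}

suspects = ports_with_new_crc_errors(earlier, later)
print(suspects)
```

Ports that keep appearing in the result across several sampling intervals would be the candidates for disabling or for a slot swap.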

I went through the process of updating the firmware and EDC images on the QMI2582 cards, nailed up the connection speeds to 8Gb on the switch and HBA ports, and ensured that the EDC image selection was for the Brocade switch. I have also updated the firmware on the Brocade switches to the latest (Fabric O/S 6.4.2b) and have verified the FPGA versions on them. The combination of updates did settle down some of the CRC errors, but some still persist. I have also managed to eliminate some of the CRC-error-prone connections by playing shell games with the blades between slots. It is disturbing that this actually works; it tells me that there is still some marginal signal integrity issue in the overall scheme of things for which the EDC firmware on the QMI2582 cannot totally compensate.

Here is the "scli -i" output from XenServer for one port of the QMI2582 modules:

--------------------------------
Host Name                      : bceng2-xs8
HBA Instance                   : 0
HBA Model                      : QMI2582
HBA Description                : QMI2582 QLogic 8Gb Fibre Channel Expansion Card (CIOv) for IBM BladeCenter
HBA ID                         : 0-QMI2582
HBA Alias                      :
HBA Port                       : 1
Port Alias                     :
Node Name                      : 20-00-00-24-FF-26-CD-92
Port Name                      : 21-00-00-24-FF-26-CD-92
Port ID                        : 03-03-00
Serial Number                  : LFD1111L54074
Driver Version                 : 8.03.07.03.55.6-k2
BIOS Version                   : 2.13
Driver Firmware Version        : 5.06.05 (90d5)
Flash BIOS Version             : 2.13
Flash FCode Version            : 3.17
Flash EFI Version              : 2.38
Flash Firmware Version         : 5.06.03
Actual Connection Mode         : Point to Point
Actual Data Rate               : 8 Gbps
PortType (Topology)            : NPort
Target Count                   : 6
PCI Bus Number                 : 36
PCI Device Number              : 0
PCIe Max Bus Width             : x8
PCIe Max Bus Speed             : 5.0 Gbps
PCIe Negotiated Width          : x4
PCIe Negotiated Speed          : 5.0 Gbps
HBA Status                     : Online
--------------------------------
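
With 28 blades, comparing this output by eye gets tedious. The following is a minimal sketch that turns "Key : Value" text like the dump above into a dictionary and reports whether the PCIe link negotiated below its maximum. The function names are hypothetical, the parser only assumes one field per line, and a reduced width may well be normal for the CIOv slot, so the result is a prompt for comparison between blades rather than a fault indicator.

```python
from typing import Dict

# A few fields copied from the dump above, one per line.
SAMPLE = """\
--------------------------------
HBA Model                      : QMI2582
Actual Data Rate               : 8 Gbps
PCIe Max Bus Width             : x8
PCIe Max Bus Speed             : 5.0 Gbps
PCIe Negotiated Width          : x4
PCIe Negotiated Speed          : 5.0 Gbps
HBA Status                     : Online
--------------------------------
"""


def parse_key_value_dump(text: str) -> Dict[str, str]:
    """Parse 'Key : Value' lines; lines without a colon (separators) are skipped."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def pcie_link_report(info: Dict[str, str]) -> Dict[str, bool]:
    """Flag a negotiated PCIe width or speed that differs from the reported maximum."""
    return {
        "width_reduced": info["PCIe Negotiated Width"] != info["PCIe Max Bus Width"],
        "speed_reduced": info["PCIe Negotiated Speed"] != info["PCIe Max Bus Speed"],
    }


hba = parse_key_value_dump(SAMPLE)
report = pcie_link_report(hba)
print(hba["HBA Model"], report)
```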


I have run out of ideas regarding this and am looking for help to further diagnose and ultimately solve the issue. The support folks are quite happy just changing out QMI2582 modules until something works, but that is not an end solution; it is just a band-aid. I am sure that I am not the only one experiencing this issue, and I want to help come up with a real end solution. So folks, please weigh in.
Updated on 2013-03-24T13:04:35Z at 2013-03-24T13:04:35Z by SystemAdmin
  • SystemAdmin
    SystemAdmin
    3234 Posts
    ACCEPTED ANSWER

    Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

    ‏2012-08-31T07:37:33Z  in response to JeffDomogala
    Perhaps you saw this already
    http://www-947.ibm.com/support/entry/portal/docdisplay?lndocid=MIGR-5087103&brandind=5000020
    It looks like all FW and driver versions are OK. Following the recommendation for configuration 3, you have to configure a fixed speed for the HBA and internal FCSM ports and check that the fill word is set to "1" for them.
    • SystemAdmin
      SystemAdmin
      3234 Posts
      ACCEPTED ANSWER

      Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

      ‏2012-08-31T08:06:47Z  in response to SystemAdmin
      Sorry, I reread your message more attentively and saw that you had already done the rest too.
      Try setting the speed to 4Gb, at least for slots 1-5
      http://www-947.ibm.com/support/entry/portal/docdisplay?lndocid=MIGR-5087397&brandind=5000020
      and set the CIOv to work at PCI-E gen1 speed
      http://www-947.ibm.com/support/entry/portal/docdisplay?lndocid=MIGR-5084852&brandind=5000019
      These RETAIN tips don't describe your exact case, but that doesn't mean (IMHO) that the same or similar symptoms can't appear in analogous situations.
      • JeffDomogala
        JeffDomogala
        21 Posts
        ACCEPTED ANSWER

        Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

        ‏2012-08-31T13:20:52Z  in response to SystemAdmin
        Why should I have to cripple the connection speed? If it is just a test for the slots in question, that is fine. However, my problems are not limited to slots 1 through 5 in either chassis. Nor are they confined to connections to just the slot 3 or slot 4 FC switches. The only clean switch is slot 4 in one of the bladecenters. By clean I mean that there are no CRC errors on any of the 14 bay ports according to the corresponding FC switch, and there are never any HBA errors reported by the O/S.

        Here is the compiled list of trouble slots at the moment:
        Chassis 1:
        I/O module 3 - slots 10 and 11
        I/O module 4 - clean
        Chassis 2:
        I/O module 3 - slot 3
        I/O module 4 - slot 9
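
To make the redundancy cost of the port-disabling workaround concrete, here is a small sketch that encodes the trouble list above and works out which blades are left on a single FC path, and whether any blade would lose both paths. The dictionary layout and function names are hypothetical; only the chassis, module, and slot numbers come from the list.

```python
from typing import Dict, List, Set, Tuple

# Trouble ports from the list above: chassis -> I/O module -> blade slots with CRC errors.
TROUBLE: Dict[int, Dict[int, List[int]]] = {
    1: {3: [10, 11], 4: []},
    2: {3: [3], 4: [9]},
}

Blade = Tuple[int, int]  # (chassis, slot)


def blades_with_disabled_port(trouble: Dict[int, Dict[int, List[int]]]) -> Set[Blade]:
    """Blades that lose at least one FC path if every trouble port is disabled."""
    return {
        (chassis, slot)
        for chassis, modules in trouble.items()
        for slots in modules.values()
        for slot in slots
    }


def blades_with_no_path(trouble: Dict[int, Dict[int, List[int]]]) -> Set[Blade]:
    """Blades whose slot is a trouble slot on every I/O module of their chassis."""
    result = set()
    for chassis, modules in trouble.items():
        slot_sets = [set(slots) for slots in modules.values()]
        for slot in set.intersection(*slot_sets):
            result.add((chassis, slot))
    return result


exposed = blades_with_disabled_port(TROUBLE)
isolated = blades_with_no_path(TROUBLE)
print(sorted(exposed), sorted(isolated))
```

With the list as posted, four of the 28 blades would be running without FC redundancy and none would be cut off entirely.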
        • SystemAdmin
          SystemAdmin
          3234 Posts
          ACCEPTED ANSWER

          Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

          ‏2012-09-03T14:31:39Z  in response to JeffDomogala
          This is really strange and looks like a hardware issue. Did you contact support?
          • JeffDomogala
            JeffDomogala
            21 Posts
            ACCEPTED ANSWER

            Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

            ‏2012-09-04T14:57:14Z  in response to SystemAdmin
            I'm about to engage support. I'll report back once I have some information to share.
            • Fani-IBM
              Fani-IBM
              1 Post
              ACCEPTED ANSWER

              Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

              ‏2012-09-26T07:24:02Z  in response to JeffDomogala
              Any update?
              • JewelGuy
                JewelGuy
                10 Posts
                ACCEPTED ANSWER

                Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

                ‏2012-10-01T16:15:05Z  in response to Fani-IBM
                We are experiencing very similar issues.

                Have you reached a resolution to your situation?
  • JeffDomogala
    JeffDomogala
    21 Posts
    ACCEPTED ANSWER

    Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

    ‏2012-10-02T00:44:23Z  in response to JeffDomogala
    Hey folks. I've been pretty busy but still intend to pursue this issue. I can tell you that I did swap a pair of the switches between the two bladecenters, and the problem now tracks different blades, both for the moved switch in the other bladecenter and for the other switch in the original chassis position. So this seems to be truly random behavior. Given my electrical engineering experience, I am still hanging my hat on this being a signal integrity issue that the EDC aboard the QLogic adapters cannot totally correct with the Brocade switches. I am wondering whether either the QLogic 8Gb switch 44X1905 or the QLogic 4/8Gb switch 88Y6406 has the same problems as the Brocade switches when connecting to the QLogic adapter cards.
    • JeffDomogala
      JeffDomogala
      21 Posts
      ACCEPTED ANSWER

      Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

      ‏2012-10-02T02:12:07Z  in response to JeffDomogala
      I now have cases open for both of the chassis involved. The first data they had me send were the management module service logs and the output of "supportshow" from all 4 FC switches. I'll update once I hear back.
      • JewelGuy
        JewelGuy
        10 Posts
        ACCEPTED ANSWER

        Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

        ‏2012-10-02T12:47:53Z  in response to JeffDomogala
        We are experiencing a similar situation and are very curious to see what the resolution is.
      • HajoEhlers
        HajoEhlers
        72 Posts
        ACCEPTED ANSWER

        Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

        ‏2012-10-02T19:26:08Z  in response to JeffDomogala
        Please read
        * Supported configurations for 8 Gigabit Fibre Channel - IBM BladeCenter H (7989, 8852)
        https://www-947.ibm.com/support/entry/myportal/docdisplay?lndocid=MIGR-5087103

        * Bluescreen, Kernel panic, or PSoD with CIOv or CFFh adapter installed - IBM BladeCenter HS22, HS22V
        https://www-947.ibm.com/support/entry/myportal/docdisplay?brandind=5000019&lndocid=MIGR-5084852

        Note :
        * We had similar problems and switched to fixed 4Gb FC.
        * Used Hardware:
        QLOGIC 8Gb Intelligent Pass-thru Module for IBM BladeCenter - 44X1907
        QLOGIC 8Gb Fibre Channel Expansion Card (CIOv) - 44X1945

        cheers
        Hajo
        • JeffDomogala
          JeffDomogala
          21 Posts
          ACCEPTED ANSWER

          Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

          ‏2012-10-02T19:49:53Z  in response to HajoEhlers
          Hajo,

          I have a supported configuration (#3) listed in the MIGR-5087103 document. I should not have to back down on performance due to some hardware design issue. IBM advertises that 8Gb is supported in that configuration, so I expect for the huge amount of cash I have invested in the hardware that I should be able to use 8Gb without any problem. IBM has to figure out how to solve the issue. Me and JewelGuy cannot be the only ones that have this configuration that have these CRC issues. I can tell you that so far I already have a man-month of my time (spread throughout 10 calendar months) invested trying to solve the issue. This includes swapping FC switch modules, shuffling blades, etc... to get a minimum problem set of FC connections. For those problematic connections, I have to disable the switch ports so that the O/S doesn't have to endure 30 second timeouts every time a CRC error occurs. By disabling these channels I have effectively disabled FC redundancy for those blades. So far I haven't gotten bitten by this, but it is only a matter of time.
          • JewelGuy
            JewelGuy
            10 Posts
            ACCEPTED ANSWER

            Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

            ‏2012-10-02T20:01:37Z  in response to JeffDomogala
            We are being asked to set the external ports to fillword = 3, and to set all the port speeds to fixed 4G.
      • JeffDomogala
        JeffDomogala
        21 Posts
        ACCEPTED ANSWER

        Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

        ‏2012-10-03T21:11:04Z  in response to JeffDomogala
        Update. They never called me back since Monday. So, I called them back just now. I explained to the tech that I went through all of the firmware/BIOS/settings criteria listed in the "Supported Configurations" document. He went through the logs and verified that it was all correct, and then had me dump the Dynamic System Analysis logs from one of the blades to send to him. He verified that last piece of data he needed, and now the cases go to level 2 support.
        While chatting with the tech and waiting for "stuff", I learned that 8Gb fibre channel has been a common problem between the internal switch ports and the HBA. He could not tell me if QLogic switches had fewer problems connecting to their own HBAs on the blades. He just said that they have seen a lot of problems in that area. Unless some upper level tech can come up with something else, I am going to press them to let me try one of the QLogic 8Gb switches. I would think they would have done this before, but I'll find out. It may be a few days before I am contacted by the level 2 tech.
        Two things I have experienced with support so far that I do not like:
        1 - If a call gets dropped in the middle, they do not call you back. That happened today with the first guy. I called back and no one knew who took the call. So I had to start over again with another guy, whose name and contact information I wrote down immediately.
        2 - There is no way of tracking the cases online if you phoned in the case. You have to call to find out anything. All other companies I deal with have some way of checking the status of support requests online. I don't know why IBM is not on the cutting edge in this regard.

        I'll follow up again once I have another conversation with the level 2 tech. I was told that this will get solved. We'll see...
        • HajoEhlers
          HajoEhlers
          72 Posts
          ACCEPTED ANSWER

          Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

          ‏2012-10-04T08:15:52Z  in response to JeffDomogala
          > IBM advertises that 8Gb is supported in that configuration
          Advertising and delivering are two different words ;-)

          Regarding support from IBM
          You should get a Problem Management Record (PMR) Number where all the information should be logged.

          Best practice for me:
          Either open a call via phone or email.
          I give the following information:
          - Our customer number
          - hw serial no. and hw type
          - short description of the problem.

          Then I get a PMR number. Without one, the problem is not handled correctly by IBM (since IBM does not even know about it).
          cheers
          Hajo
        • JewelGuy
          JewelGuy
          10 Posts
          ACCEPTED ANSWER

          Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

          ‏2012-10-04T13:03:10Z  in response to JeffDomogala
          Jeff, yesterday we set all of our external blade ports (all ISL's to another Brocade switch) to fillword = 3. We haven't seen any CRC errors on the blade switches since we made the change. We are not closing the PMR quite yet, we'll monitor over the weekend.
          • JeffDomogala
            JeffDomogala
            21 Posts
            ACCEPTED ANSWER

            Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

            ‏2012-10-04T13:27:34Z  in response to JewelGuy
            JewelGuy- Was this with the speed set to 8Gb or 4Gb?
            • JewelGuy
              JewelGuy
              10 Posts
              ACCEPTED ANSWER

              Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

              ‏2012-10-04T13:36:34Z  in response to JeffDomogala
              It was with the ports fixed at 4G. We had to go the 4G route because of our IBM XIV Gen2, which only has 4G ports, and there is a bug in the code level we are at where the XIV doesn't play well with ports that aren't fixed at 4G.
              • JewelGuy
                JewelGuy
                10 Posts
                ACCEPTED ANSWER

                Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

                ‏2012-10-09T18:07:33Z  in response to JewelGuy
                Update: We are still experiencing CRC errors on one of our servers within the blade center, even after setting the fillword to 3. IBM support is still working the case.
                • JeffDomogala
                  JeffDomogala
                  21 Posts
                  ACCEPTED ANSWER

                  Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

                  ‏2012-11-30T20:53:19Z  in response to JewelGuy
                  I have been going back and forth with level three support for over a month, and now there is a customer advocate involved who is moving things along. There seems to be good news on the horizon. For those with H chassis, a midplane replacement will most likely be in your future. There is an updated midplane that is currently undergoing acceptance testing in the IBM QA labs. There have been changes to the midplane to address the slot-to-slot variations between the QMI-2582 FC HBAs and the switch modules in I/O bays 3 and 4. I have a conference call on Tuesday morning to find out the specifics about the change. If all goes well, I will hopefully be installing the revised midplanes in my pair of chassis later this month. I'll reply back with details after the conference call.
                  • HajoEhlers
                    HajoEhlers
                    72 Posts
                    ACCEPTED ANSWER

                    Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

                    ‏2012-12-03T09:45:35Z  in response to JeffDomogala
                    Hi Jeff,
                    thanks a lot for keeping us updated.

                    Cheers
                    Hajo
                    • JeffDomogala
                      JeffDomogala
                      21 Posts
                      ACCEPTED ANSWER

                      Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

                      ‏2012-12-05T21:31:06Z  in response to HajoEhlers
                      There is good news on the horizon. There is a new H chassis midplane in the final release process that specifically addresses signal integrity issues between the blade slots and I/O bays 3 and 4. They went into the gory details about it. Here's a quick summary:

                      1 - Removed through vias, which were essentially high frequency antennae causing interference and reflections that the HBA EDC could not totally handle. The interplane vias are now back drilled so they do not come to the surfaces, which removes the antenna effect.
                      2 - Better trace length matching and routing for the signals... the lengths varied from 1.5 to 18 inches and now they range from 3 to 12 (or maybe it was 14) inches. Anyway, this is for the better.
                      3 - New midplane blade connectors specifically rated for high speed frequencies. Same form factor, just better signal isolation characteristics (shielding).
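
As a rough back-of-the-envelope illustration of items 1 and 2, the sketch below computes the spread of the quoted trace lengths before and after the change, and the quarter-wave resonance of an unused via stub using the standard estimate f = c / (4 * L * sqrt(Dk)). The trace lengths come from the summary above (taking 12 inches as the new maximum). The stub lengths and the dielectric constant are purely hypothetical values chosen for illustration, not figures from IBM. For context, 8Gb Fibre Channel signals at 8.5 Gbaud, so its fundamental frequency is 4.25 GHz.

```python
import math
from typing import Tuple

C_M_PER_S = 299_792_458.0  # speed of light in vacuum
INCH_M = 0.0254            # metres per inch


def length_spread(shortest_in: float, longest_in: float) -> Tuple[float, float]:
    """Return (span in inches, longest/shortest ratio) for a set of trace lengths."""
    return longest_in - shortest_in, longest_in / shortest_in


def stub_resonance_ghz(stub_len_in: float, dk: float) -> float:
    """Quarter-wave resonance of an open via stub, in GHz."""
    return C_M_PER_S / (4.0 * stub_len_in * INCH_M * math.sqrt(dk)) / 1e9


# Trace lengths quoted in the summary above.
old_span, old_ratio = length_spread(1.5, 18.0)
new_span, new_ratio = length_spread(3.0, 12.0)
print(f"old: span {old_span} in, ratio {old_ratio}x")
print(f"new: span {new_span} in, ratio {new_ratio}x")

# Hypothetical stub lengths and dielectric constant (assumptions, not IBM data).
ASSUMED_DK = 4.0
for stub_in in (0.05, 0.1, 0.2):
    print(f"{stub_in} in stub -> {stub_resonance_ghz(stub_in, ASSUMED_DK):.2f} GHz")
```

The longer the leftover stub, the lower its resonance falls toward the signal band, which is consistent with back drilling being described as part of the fix.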

                      Along with the chassis midplane change, there will be updates to the QMI2582 operational firmware, EDC firmware and device drivers. As well, there are firmware updates for the blade IMMs and AMM for the chassis. And finally, there will be a code update for the fibrechannel switches in the chassis (Brocades in my case).

                      There will be a new variant of the 8852 H chassis sold with this updated midplane. The model number is 8852-5xU instead of 8852-4xU. If I were to venture a guess IBM will be changing the midplanes for those who are having the same symptoms.

                      I am scheduled to have both of my midplanes changed on the weekend of December 22/23. I will be the first customer getting these in the field, so this should be interesting. Considering that I have two full chassis totalling 28 blades, the time to update all of the firmware/drivers is going to be by far the long pole in the tent. If this all works out, we should be completely CRC error free in all 28 slots. And there should no longer be the need to play the shell games to find out which HBA works best with which slot.

                      I will report back with updates.
                      • SystemAdmin
                        SystemAdmin
                        3234 Posts
                        ACCEPTED ANSWER

                        Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

                        ‏2012-12-17T22:22:11Z  in response to JeffDomogala
                        Has anyone come across this issue on the QLogic 8Gb pass-thru modules? I seem to be having connectivity issues on the BladeCenter H with HS22 blades, whereby the FC port will go down unexpectedly, and I am also seeing similar CRC errors in QLogic SANsurfer. It seems to be random and can affect FC-connected blades in any bay. We purchased the kit back in March 2012 and have only now got round to implementing it.

                        Came across the post making reference to the issues with Brocade switches, so wondering if we have the same issue with the backplane.

                        Any assistance or information would be helpful.
                        • JeffDomogala
                          JeffDomogala
                          21 Posts
                          ACCEPTED ANSWER

                          Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

                          ‏2012-12-18T01:35:31Z  in response to SystemAdmin
                          If the pass thru modules are in bays 3 or 4, then yes. I was told that the other bays connected to the CFF type cards were already optimized for high frequency use. This update of the midplane is supposed to bring the signals to bays 3 and 4 into compliance for high frequency.

                          I'll let you all know early next week what the verdict is as the new midplanes are being installed this Saturday. I really hope this works.
                          • JewelGuy
                            JewelGuy
                            10 Posts
                            ACCEPTED ANSWER

                            Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

                            ‏2013-01-14T13:23:23Z  in response to JeffDomogala
                            Jeff,
                            Since our busy season has passed, we are ready to re-engage IBM with our blade chassis issue. Can you provide an update on your situation and how the blades are operating for you?

                            Thanks.
          • HajoEhlers
            HajoEhlers
            72 Posts
            ACCEPTED ANSWER

            Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

            ‏2012-10-04T13:34:20Z  in response to JewelGuy
            In case somebody would like to know what the "fillword=3" is all about
            Extract from - http://www.chris-g.co.uk/wordpress/wp-content/uploads/2011/06/FOS-8G-Link-Init-Fillword-Behavior-v1.pdf

            
            ... To comply with the published FC standards, Brocade introduced options for ARB/ARB and IDLE/ARB link initialization/fill word support. However, some 8G devices are not capable of properly establishing links with Brocade 8G Fibre Channel switches when ARB/ARB or IDLE/ARB primitives are used. These 8G devices require the legacy IDLE/IDLE sequence to achieve successful link initialization. To address this issue, Brocade has provided the ability to configure any of the three possible combinations (IDLE/IDLE, ARB/ARB, or IDLE/ARB) for link initialization and fill words. ...
            

            Other info:
            - https://www.ibm.com/developerworks/mydeveloperworks/blogs/anthonyv/entry/brocade_8_gbps_fibre_channel_switches_and_fill_words?lang=en
            - http://community.brocade.com/thread/6287?start=0&tstart=0
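
For quick reference when reading "fillword = 1" or "fillword = 3" in this thread, here is a small lookup sketch. The three combinations are the ones named in the extract above. The numeric mode mapping, including the meaning of mode 3, reflects my understanding of the Brocade FOS `portcfgfillword` setting and should be treated as an assumption to verify against the command reference for your FOS release.

```python
# Assumed numbering for the FOS fill word setting; verify against the
# Fabric OS command reference for the release in use.
FILL_WORD_MODES = {
    0: "IDLE/IDLE: IDLE for link init, IDLE as fill word (legacy behaviour)",
    1: "ARB/ARB: ARB(FF) for link init, ARB(FF) as fill word",
    2: "IDLE/ARB: IDLE for link init, ARB(FF) as fill word",
    3: "ARB/ARB first, falling back to IDLE/ARB if the link does not come up",
}


def describe_fill_word(mode: int) -> str:
    """Return a human-readable description of a fill word mode number."""
    if mode not in FILL_WORD_MODES:
        raise ValueError(f"unknown fill word mode: {mode}")
    return f"mode {mode} = {FILL_WORD_MODES[mode]}"


for mode in (1, 3):
    print(describe_fill_word(mode))
```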
  • SystemAdmin
    SystemAdmin
    3234 Posts
    ACCEPTED ANSWER

    Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

    ‏2013-01-03T13:12:17Z  in response to JeffDomogala
    Hi Jeff,

    You are not alone with this!

    The issues I am experiencing are on an H class blade centre with HX5 blades, LPe1205 HBAs and 2 x 20 port Brocade FC switches in the chassis.

    Same creeping CRC count, same I/O freeze. We are using Hyper-V and this results in a "5120" error in the event log, effectively indicating poor I/O performance. SQL and Exchange servers will also for instance complain of delayed writes (up to 30 seconds plus!).

    Been through the mill with IBM support and taken everything up to the latest firmware, BIOS, and patch level per their supported config document (back end storage is an IBM V7000).

    I would very much appreciate it if you could tell me whether the newly installed midplanes killed off this issue for you. If so, I will be requesting much the same for my situation.

    Many thanks

    PhoenixTA
    • JeffDomogala
      JeffDomogala
      21 Posts
      ACCEPTED ANSWER

      Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

      ‏2013-01-14T14:19:52Z  in response to SystemAdmin
      I waited a bit before posting this just to see what would transpire after a couple of weeks with the new midplanes in place. Out of the two chassis, one chassis is completely error free. The other still has an issue with one slot. I've been working with IBM to narrow this one down. So far I can tell you the problem is not the HBA on the blade. I haven't had the opportunity to swap out FC switch modules yet to see if the problem sticks with the switch port or whether this can still be a midplane issue for that particular slot. But overall, there was a huge improvement. The other 27 blades are fine and happy. At this point I would recommend the new midplane. Even if it does not 100% cure the issue, I can say that it is still like night and day. Just be prepared to make sure everything is up to date using the Bootable Media Creator update system with the latest individual updates. The pre-canned packages are not totally up to date just yet. Regarding the HBA update, you'll have to go back and individually update the EDC firmware after the system update puts the new firmware on the HBAs. Please ask any questions if you have them. I'll be glad to field them.

      --Jeff
      • SystemAdmin
        SystemAdmin
        3234 Posts
        ACCEPTED ANSWER

        Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

        ‏2013-01-28T19:36:07Z  in response to JeffDomogala
        Hi Jeff,

        We have a similar issue with a V7000 showing an HS23 host as degraded, with 44X1920 switches and 44X1945 cards.

        We tried several things: setting fill word 1 and 3 on the switches, fixing the speed on the cards and switches at 8Gb and 4Gb, upgrading to the latest firmware with BoMC, and upgrading the EDC. Nothing works.

        Did IBM provide a FRU number for the new BCH midplane?

        Thanks in advance!
        • SystemAdmin
          SystemAdmin
          3234 Posts
          ACCEPTED ANSWER

          Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

          ‏2013-01-29T08:02:27Z  in response to SystemAdmin
          Hi all

          There is updated firmware, 6.4.2b4, for the Brocade 8Gb SAN Switch Modules (10 and 20 ports):

          http://www-947.ibm.com/support/entry/portal/docdisplay?lndocid=MIGR-5089861&brandind=5000020&myns=x108750&mync=E

          6.4.2b4
          Problem(s) Fixed:
          FOS v6.4.2a includes fixes for the following important defects.
          Defect 357780: Excessive encoding out errors on Brocade 300 ISL ports running at 8G
          Defect 364788: Unreliable speed negotiation may be encountered when BR5470 Cu ports are connected to a 3rd party HBA.
          • SystemAdmin
            SystemAdmin
            3234 Posts
            ACCEPTED ANSWER

            Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

            ‏2013-01-29T15:00:47Z  in response to SystemAdmin
            Thanks Alex,

            So there's no need for a new BCH midplane?
            • JeffDomogala
              JeffDomogala
              21 Posts
              ACCEPTED ANSWER

              Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

              ‏2013-01-29T15:28:53Z  in response to SystemAdmin
              AJM,

              There is still the need for a new BCH midplane. I actually had to install that 6.4.2b4 firmware on the Brocade switches as part of the midplane upgrade procedure. The switch firmware can't fix the physical signal integrity issues caused by the original layout of the midplane. I was told that bays 7, 8, 9 and 10 were originally designed for high speed devices like 10Gb ethernet and 8Gb fibrechannel. They are routed to the CFF modules on the blades. That was not the case with bays 3 and 4, which are routed to the CIOv modules on the blades. The midplane went back through a routing/layout cycle so that bays 3 and 4 can handle high speed.
              What they did to try to get around the lack of native high speed support was to put circuitry and firmware on the CIOv fibrechannel host bus adapters so that they could attempt to compensate for any signal integrity issues to bays 3 and 4. However, the signal integrity characteristics were bad enough that even the compensation circuitry could not totally eliminate all of the problems. Hence the new midplane.
              The Brocade switch firmware is one component of the fix because it also can control characteristics of the copper PHYs between the switch and the blade HBAs. I was told that those characteristics were adjusted in 6.4.2b4.

              For those interested in part/FRU numbers from my chassis...

              For the original BCH midplane:
              Part Number: 44X2293
              FRU Number: 44X2302
              For the new BCH midplane:
              Part Number: 00Y3428
              FRU Number: 46C9700

              This information came from the management module service.txt file.
              • SystemAdmin
                SystemAdmin
                3234 Posts
                ACCEPTED ANSWER

                Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

                ‏2013-01-29T15:44:30Z  in response to JeffDomogala
                Thanks a lot Jeff,

                I'm going to install the new firmware on the Brocade switches. I also have a pair of switches in the high speed bays, so I'll use those switches and see if it works.

                I also have an open PMR with IBM, and I'll ask for new BCH midplanes.

                Regards!
              • SystemAdmin
                SystemAdmin
                3234 Posts
                ACCEPTED ANSWER

                Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

                ‏2013-01-29T16:08:51Z  in response to JeffDomogala
                I looked through the latest BC HMM
                https://www-947.ibm.com/support/entry/myportal/docdisplay?lndocid=MIGR-63570&brandind=5000020
                and found an upsetting sentence:
                Removing and replacing the midplane
                Attention: The different midplane FRUs are not interchangeable. A failed midplane FRU should only be replaced with a midplane having the same FRU part number.
                HMM mentions 3 midplane versions:
                1. BC H chassis v6 and earlier - midplane FRU 25R5780
                2. BC H chassis v8 and later - midplane FRU 68Y6734 (or 44X2302)
                3. BC H chassis v11 - midplane FRU 46C9700.
                There are two versions of the media tray: the first is compatible with the v6 midplane, and the second with v8 and v11. Does that sentence really refer only to media tray compatibility?
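The three midplane generations listed from the HMM can be kept as a small lookup, which is handy when checking the FRU number reported in the management module's service.txt. This is only a sketch: the FRU numbers and chassis versions are the ones quoted in this post, and the names `MIDPLANE_FRUS` and `midplane_generation` are made up for illustration.

```python
# FRU numbers and chassis versions exactly as quoted from the HMM in this thread.
MIDPLANE_FRUS = {
    "25R5780": "v6 and earlier",
    "68Y6734": "v8 and later",
    "44X2302": "v8 and later",
    "46C9700": "v11",
}


def midplane_generation(fru: str) -> str:
    """Return the chassis version range for a midplane FRU, or 'unknown'."""
    return MIDPLANE_FRUS.get(fru.strip().upper(), "unknown")


if __name__ == "__main__":
    for fru in ("44X2302", "46C9700", "00X0000"):
        print(fru, "->", midplane_generation(fru))
```

Anything that is not in the table comes back as "unknown" rather than being guessed at.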
                • JeffDomogala
                  JeffDomogala
                  21 Posts
                  ACCEPTED ANSWER

                  Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

                  ‏2013-01-29T16:50:03Z  in response to SystemAdmin
                  Pavel,

                  I can tell you that I went from midplane #2 to midplane #3. In each chassis, I have 14 HS22's, an optical drive installed in the media tray, a single management module, bay 1 has a Cisco 3110G, all four high speed bays (7-10) have Cisco 4001s, and bays 3 and 4 have the Brocade FC switches. The HS22's have Broadcom 4 port 10G adapters in their CFF slot and the QLogic QMI2582 in the CIOv slot. The only hardware component changed during the upgrade was the midplane itself.
                  Be prepared: there are a ton of firmware updates to be done during the process.
                  • JewelGuy
                    JewelGuy
                    10 Posts
                    ACCEPTED ANSWER

                    Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

                    ‏2013-01-29T18:13:35Z  in response to JeffDomogala
                    Jeff,

                    Did you (or someone in your organization) do the upgrades to the firmware, or did IBM do the upgrades?
                    • JeffDomogala
                      JeffDomogala
                      21 Posts
                      ACCEPTED ANSWER

                      Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

                      ‏2013-01-29T18:40:14Z  in response to JewelGuy
                      JewelGuy,

                      I did all of the firmware upgrades myself. All of the new firmware is backward compatible with the old midplanes, so I actually had everything upgraded before the physical midplane replacement. I have a total of 28 HS-22 blades, so having someone else do it would have taken forever. It took me probably 24 man hours to do that with that many. Each blade required at least 2 reboots, and the reboot cycle on those is probably 5 minutes or so each. Then there is the process of actually performing the firmware updates in between reboots. Certain things like the MM, blade BIOSes and diags images can be blanket applied through the MM. The adapters on board the blades must be done from the blades themselves. I used the Bootable Media Creator to make a CD image to update everything on the blades. You mount the image on the MM then connect it to the blade as a virtual drive. Then the blade can boot from it.
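As a rough check on the effort estimate above, the reboot time alone can be worked out from the figures in the post (28 blades, at least 2 reboots each, about 5 minutes per reboot). This is a back-of-the-envelope sketch that assumes the blades are rebooted one after another; the function name is made up for illustration.

```python
BLADES = 28              # total HS-22 blades across both chassis
REBOOTS_PER_BLADE = 2    # "at least 2 reboots"
MINUTES_PER_REBOOT = 5   # "probably 5 minutes or so each"


def reboot_minutes(blades: int = BLADES,
                   reboots: int = REBOOTS_PER_BLADE,
                   minutes: int = MINUTES_PER_REBOOT) -> int:
    """Lower bound on time spent purely waiting for reboots, done serially."""
    return blades * reboots * minutes


if __name__ == "__main__":
    total = reboot_minutes()
    print(f"{total} minutes (~{total / 60:.1f} hours) of reboots alone")
```

That comes to 280 minutes, a little under 5 of the roughly 24 man-hours; the remainder would be the updates themselves and the manual steps on each blade.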
                      • JewelGuy
                        JewelGuy
                        10 Posts
                        ACCEPTED ANSWER

                        Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

                        ‏2013-01-30T14:22:08Z  in response to JeffDomogala
                        Jeff,

                        Did you get all the firmware upgrades from the IBM website, or did you have to hunt for them?
                        • SystemAdmin
                          SystemAdmin
                          3234 Posts
                          ACCEPTED ANSWER

                          Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

                          ‏2013-01-30T15:10:00Z  in response to JewelGuy
                          Hi JewelGuy, Jeff:

                          In my case I used IBM BOMC with the latest firmware.

                          For the QLogic card (44X1945) I tried the VMware ESXi 5.0 update kit. We run ESXi 5.1, and the page has no EDC firmware for ESXi 5.1, so I used the IBM firmware for the QLogic card.
                          http://driverdownloads.qlogic.com/QLogicDriverDownloads_UI/Product_detail.aspx?oemid=224
                          Yesterday I updated the Brocade switch with the latest firmware, but it didn't work: the V7000 still shows the host as degraded, so I guess we need new BCH midplanes.
                          • JeffDomogala
                            JeffDomogala
                            21 Posts
                            ACCEPTED ANSWER

                            Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

                            ‏2013-01-30T15:48:10Z  in response to SystemAdmin
                            JewelGuy,

                            If you update the firmware on the FC HBAs, you must make sure that the EDC firmware is set to the right image. They have separate images in there for Brocade and QLogic. I found that when I updated the firmware on the HBAs, the default image went back to QLogic. So that has to be manually set back to Brocade, and then the adapter will change over the image on the next initialization. You'll see on the screen that it is updating the EDC firmware image. There is no way that I know of to tell from the O/S which one is selected. The safe bet is to just go into the BIOS and force it to update the image to Brocade.
                            Another tip for you is to just nail up the speed settings on both ends. If you are 8Gb, nail it up to 8Gb on the HBA and switch port. The HBA speed setting can also be set through the BIOS. I found that some of my HBAs were actually fixed at 4Gb. So I did two things: 1 - Restore the HBA settings to factory defaults, and 2 - Reboot and force the EDC to Brocade and nail up the speed to 8Gb.
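The two checks described above (EDC image matching the switch vendor, and the speed fixed to the same value on both ends) can be written down as a small checklist helper. This is a sketch only: the post says the EDC selection cannot be read from the O/S, so the values here are assumed to be noted by hand from the HBA BIOS and the switch, and `LinkSettings` and `link_problems` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LinkSettings:
    """Hypothetical hand-recorded settings for one blade-to-switch link."""
    switch_vendor: str                   # e.g. "Brocade"
    edc_image: str                       # EDC image selected in the HBA BIOS
    hba_speed_gb: Optional[int]          # None means auto-negotiate
    switch_port_speed_gb: Optional[int]  # None means auto-negotiate


def link_problems(link: LinkSettings, target_gb: int = 8) -> List[str]:
    """List the settings that differ from the advice in the post."""
    problems = []
    if link.edc_image.lower() != link.switch_vendor.lower():
        problems.append("EDC image does not match the switch vendor")
    if link.hba_speed_gb != target_gb:
        problems.append(f"HBA speed is not fixed at {target_gb}Gb")
    if link.switch_port_speed_gb != target_gb:
        problems.append(f"switch port speed is not fixed at {target_gb}Gb")
    return problems


if __name__ == "__main__":
    # State described in the post after an HBA firmware update.
    after_update = LinkSettings("Brocade", "QLogic", 4, 8)
    for problem in link_problems(after_update):
        print(problem)
```

An empty list means the link matches the "EDC to Brocade, 8Gb on both ends" setup; it says nothing about whether CRC errors will actually stop.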
                            • SystemAdmin
                              SystemAdmin
                              3234 Posts
                              ACCEPTED ANSWER

                              Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

                              ‏2013-01-30T16:48:41Z  in response to JeffDomogala
                              Jeff,

                              Which firmware do you have?

                              Firmware 05.07.01 ?
                              firmware 05.06.03 ?

                              With BOMC, I have 05.06.03

                              In this page, IBM supported config is:
                              source: https://www-947.ibm.com/support/entry/myportal/docdisplay?lndocid=migr-5087103
                              QLogic 8 Gigabit Host Bus Adapter CIOv/CFFh configuration:
                              Update Driver and/or Bundled Firmware (see note 2)
                              EDC level based on Switch type (see notes 3,4)
                              Firmware 05.07.01 (see notes 3)
                              Speed Setting: Fixed
                              Regards
                              • JeffDomogala
                                JeffDomogala
                                21 Posts
                                ACCEPTED ANSWER

                                Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

                                ‏2013-02-01T14:41:59Z  in response to SystemAdmin
                                I have 05.07.01. There are two places it needs to be:
                                1 - In the HBA, they have a multi-image package. Be sure to also get the EDC package as well
                                2 - The Linux driver (assuming Linux). RHEL5 has the firmware in the driver package; RHEL6 has a separate firmware image package, which I don't think IBM has on Fix Central yet. I actually just got it from them a couple of days ago, and it will eventually be on the site. They didn't understand that there was supposed to be a separate firmware package for RHEL6.
                                • SystemAdmin
                                  SystemAdmin
                                  3234 Posts
                                  ACCEPTED ANSWER

                                  Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

                                  ‏2013-02-01T16:49:09Z  in response to JeffDomogala
                                  Thanks Jeff,

                                  I found this on IBM site

                                  For linux:
                                  https://www-947.ibm.com/support/entry/myportal/docdisplay?lndocid=migr-5092325#DOCOS

                                  QLogic FC 8 Gb Multiboot Update for Linux - IBM BladeCenter
                                  Version: 8g-f50701-b213-e238
                                  Release Date:2013-01-03

                                  For VMware:
                                  https://www-947.ibm.com/support/entry/myportal/docdisplay?lndocid=migr-5092336
                                  Version:8g-f50701-b213-e238
                                  Release Date:2013-01-04

                                  Still no support for VMware ESXi 5.1
                                  • JeffDomogala
                                    JeffDomogala
                                    21 Posts
                                    ACCEPTED ANSWER

                                    Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

                                    ‏2013-02-01T17:49:06Z  in response to SystemAdmin
                                    AJM,

                                    You're gonna have to reboot the blade anyway because at a minimum you have to force the update of the EDC image to match your fibrechannel switch. So, instead of looking for the ESX specific version of the update, just create the bootable media with the Bootable Media Creator and use it to perform the update. One of my previous posts has the link to it. Then you can update all of the other components on the blade at the same time.
                                    • JewelGuy
                                      JewelGuy
                                      10 Posts
                                      ACCEPTED ANSWER

                                      Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

                                      ‏2013-02-04T17:11:09Z  in response to JeffDomogala
                                      Jeff,

                                      I am not sure if you mentioned it or if it even applies to you. IBM's recommendation when we were going through this was to lock the ports at 4G. Did you do this pre-midplane replacement? If so, did you keep them at 4G or 8G?

                                      Thanks for all the good information.
                                      • JeffDomogala
                                        JeffDomogala
                                        21 Posts
                                        ACCEPTED ANSWER

                                        Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

                                        ‏2013-02-04T18:38:30Z  in response to JewelGuy
                                        JewelGuy,

                                        I never locked anything to 4Gb, but locked it all to 8Gb. It is simply an unacceptable "hack" to ask anyone to do that. Everything is nailed up to 8Gb.
                                        • SystemAdmin
                                          SystemAdmin
                                          3234 Posts
                                          ACCEPTED ANSWER

                                          Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

                                          ‏2013-02-06T13:17:33Z  in response to JeffDomogala
                                          It looks like something changed with the latest Brocade FC firmware.
                                          One guy reported that he updated the Brocade FC switch to 6.4.2b4, EDC to 3.37, and the QLogic CIOv to 05.07.01, and fixed the speed on both ends to 8Gb. At least in his case the errors disappeared.
                                          • SystemAdmin
                                            SystemAdmin
                                            3234 Posts
                                            ACCEPTED ANSWER

                                            Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

                                            ‏2013-03-24T13:02:08Z  in response to SystemAdmin
                                            Use the Brocade firmware 6.4.2b4 ONLY with the new BC-H midplane FRU 46C9700 !!! If you don't have this midplane, you have to use Brocade 6.4.2b.

                                            T.P.
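The rule in this post can be expressed as a one-line selection on the midplane FRU. This is only a sketch of this poster's statement, not verified IBM guidance, and the function name is made up for illustration.

```python
NEW_MIDPLANE_FRU = "46C9700"  # v11 BC-H midplane FRU named in this thread


def recommended_fos(midplane_fru: str) -> str:
    """Brocade Fabric OS level according to the rule stated in this post."""
    if midplane_fru.strip().upper() == NEW_MIDPLANE_FRU:
        return "6.4.2b4"
    return "6.4.2b"


if __name__ == "__main__":
    for fru in ("46C9700", "44X2302", "68Y6734"):
        print(fru, "->", recommended_fos(fru))
```

Note that an earlier reply reports errors disappearing with 6.4.2b4 without saying which midplane was installed, so treat the rule as one poster's recommendation.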
                                            • SystemAdmin
                                              SystemAdmin
                                              3234 Posts
                                              ACCEPTED ANSWER

                                              Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

                                              ‏2013-03-24T13:04:35Z  in response to SystemAdmin
                                              http://www-947.ibm.com/support/entry/portal/docdisplay?lndocid=migr-5087103
                                              • FredAoui
                                                FredAoui
                                                1 Post
                                                ACCEPTED ANSWER

                                                PLEASE LET US KNOW IF THE PROBLEM HAS BEEN SOLVED

                                                ‏2013-10-24T18:02:42Z  in response to SystemAdmin

                                                Hi,

                                                We have exactly the same configuration and we are facing the same problem.
                                                Please could you let us know if the problem has been solved and how?

                                                 

                                                 

                                                Thanks a lot
                                                Freddy

                                                 

                                          • StayGreen
                                            StayGreen
                                            7 Posts
                                            ACCEPTED ANSWER

                                            Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

                                            ‏2013-06-12T16:46:47Z  in response to SystemAdmin

                                            Hi, I have a problem with the EDC 3.37 update. The following is my blade configuration.

                                            BladeCenter H 8852
                                            BladeCenter HS23 7875 x3
                                            CIOv slot with Qlogic QMI2582 8GB FC HBA
                                            Brocade 8GB FCSM on IO slot 3 and 4.

                                            I used the image qlgc_8gb_ciov_fw_edc_337.iso, downloaded from the following site, to update the EDC FW to 3.37.

                                            https://www-947.ibm.com/support/entry/myportal/docdisplay?lndocid=MIGR-5087323

                                            Because of the following note in the EDC 3.37 readme, I powered off the FCSMs in slot 3 and slot 4 before running the EDC update.

                                             1.4    Dependencies:
                                             - For this update to work correctly, a link cannot exist from either port of the HBA to the chassis switch.   
                                            

                                            http://download.boulder.ibm.com/ibmdl/pub/systems/support/system_x/qlgc_fw_edc_337-bc.txt

                                            Before updating the EDC FW, I confirmed that the HBA FW is the latest one, 5.07.01. However, both the EDC update and the current version check failed due to the following error.

                                            Error Message → Unable to find part number,Device.EF0415

                                            So far we have powered off the FCSMs and run the EDC update, but it fails with the error message above.

                                            I wonder whether we need to power on the FCSMs and then take the FCSM internal ports offline before running the EDC 3.37 update, because I think the FC HBA may need to communicate with the FCSM during the EDC update. Any recommendations or advice about updating to EDC 3.37 on an HS23 with the QLogic 8Gb FC HBA and the Brocade 20 port FCSM?

                                             

                                             

                                             

                                             

                                             

                                            Updated on 2013-06-13T04:51:13Z at 2013-06-13T04:51:13Z by StayGreen
                        • JeffDomogala
                          JeffDomogala
                          21 Posts
                          ACCEPTED ANSWER

                          Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

                          ‏2013-01-30T15:23:07Z  in response to JewelGuy
                          JewelGuy,

                          Updates for anything blade related can be installed using the Bootable Media Creator (http://www-947.ibm.com/support/entry/portal/docdisplay?lndocid=TOOL-BOMC). The latest version is 9.30. With this tool, you select the machine type of the blades in your system and it creates an ISO image (or optionally a bootable USB stick) that you boot the blade with to install updates. With the resulting ISO image, I use the management module "remote drive" to mount the image on the blade, then boot the blade off of it. It eventually comes up with a menu of updates that can be installed. One annoying thing about it is that since there is no way of telling which EDC firmware is already loaded on the Fibre Channel HBAs, you must manually select the items for the Fibre Channel HBA to force the update of those components. Once everything is selected, all of the updates will be installed on the blade at once. If you haven't updated anything on your blades since your original purchase, expect the update installs to take about 20 minutes. In order to get the latest updates for everything, there is an option while creating the ISO image to check the IBM site for the latest available updates. You want to choose that option.
                          For the management module and switch updates, you have to manually retrieve these from Fix Central and then manually install them. The management module update is straightforward. The Brocade switch updates are easiest to do through the web interface. You specify an FTP server with the update image location.
                          When I was going through this, everything was already up on Fix Central except for the Brocade switch firmware image. They threw that over the wall to me. However, someone posted the link yesterday to that image (v6.4.2b4).
                          • HajoEhlers
                            HajoEhlers
                            72 Posts
                            ACCEPTED ANSWER

                            Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

                            ‏2013-01-30T15:56:30Z  in response to JeffDomogala
                            Just for the record.
                            The BOMC is also able to create a PXE bootable image.

                            The main site for the ToolsCenter is:

                            Welcome to the IBM ToolsCenter for System x and BladeCenter Information Center
                            - http://publib.boulder.ibm.com/infocenter/toolsctr/v1r0/topic/toolsctr/toolsctr_welcome.html

                            where you will find the Bootable Media Creator (BOMC) under "Deployment tools".

                            cheers
                            Hajo
  • SystemAdmin
    SystemAdmin
    3234 Posts
    ACCEPTED ANSWER

    Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

    ‏2013-02-01T10:13:33Z  in response to JeffDomogala
    Hello Jeff,

    We have the same config as you on ~10 BladeCenter H chassis:
    14 x HS22v
    Brocade SAN switches in slots 3 and 4
    CIOv slot with QMI2582 8 Gb
    Midplane part number 44X2293
    We use VMware (4.1 and 5.0) and we have exactly the same problem as you: VM freezes and CRC errors, sometimes when we start a blade and sometimes when the I/O is very high. We saw that when it happens, multiple hosts are impacted.
    Like you, we did the upgrade and applied the IBM recommendation (configuration 3), but the problem appeared again a few months later.
    We think we have to change our midplanes, but it's a big job because we have more than 10 BladeCenter H chassis in production.
    Can you tell me if your last problem with one blade is still present after the change? Is it still the same problem?
    We have a call open with IBM.
    Thanks Jeff. We hope that this is the solution, because we have had this problem for 10 months, and this post is like a star in the dark ;-)
    • JeffDomogala
      JeffDomogala
      21 Posts
      ACCEPTED ANSWER

      Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

      ‏2013-02-01T17:45:50Z  in response to SystemAdmin
      Labi,

      I still have one blade that is a problem. However, I have been dragging my heels because of other priorities. Regardless of whether this one blade is still a problem or not, a lot of other problems were solved with the new midplane.
  • jlotz
    jlotz
    3 Posts
    ACCEPTED ANSWER

    Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

    ‏2013-07-18T18:17:04Z  in response to JeffDomogala

    Can anyone tell me what the minimum required version for the blade system mgmt processor is for the v11 midplane?  We just had the midplane replaced in the first chassis (went from v6 to v11) and my first test blade with BSMP Build ID 1AOO34Z revision 1.85 works fine, but an older blade with BSMP Build ID YUOOC7E revision 1.30 hangs at the discovery phase, and I get error "Problem communicating with BSMP".  I have an email in to our IBM contact, but was hoping someone here knew the answer.  

    Also wanted to note that when you're going from v6 to v11 midplane, you'll also need to replace your media tray and DVD drive.  Not a big deal, but my FE wasn't aware of that and we had to have the new parts delivered, which caused the maintenance to take longer than it would have otherwise. The new part numbers are documented at: http://publib.boulder.ibm.com/infocenter/bladectr/documentation/topic/com.ibm.bladecenter.8852.doc/8852_pdsg.pdf

     

     

    • T_IBM2002
      T_IBM2002
      3 Posts
      ACCEPTED ANSWER

      Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

      ‏2013-10-16T21:07:37Z  in response to jlotz

      Hello All, 

      We have 6 of the following Blade Center Midplanes, can someone  please confirm  if this revision has the bay3/bay4 design flaw?  Thank You.

        Product Name    IBM BladeCenter-H Midplane  
        Description        BladeCenter-H  
        Machine Type/Model        8852HC1  
        Part Number        68Y6729  
        FRU Number        68Y6734  
        Hardware Revision        8  
        Manuf. Date        2812

      • jlotz
        jlotz
        3 Posts
        ACCEPTED ANSWER

        Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

        ‏2013-10-17T14:07:11Z  in response to T_IBM2002

        All versions of the midplane prior to v11 have the flaw.  You have v8.  The v11 part # is 46C9700. 

        We had huge issues with our 6 blade centers for years before finally getting all of the midplanes replaced this summer.  We have now been running perfectly clean for 3 months.  Night and day difference.

    • jlotz
      jlotz
      3 Posts
      ACCEPTED ANSWER

      Re: HS-22 w/QMI2582 / BladeCenter H / Brocade 20 port FC switch CRC errors

      ‏2013-10-17T14:09:50Z  in response to jlotz

      Minimum blade firmware levels for the v11 midplane:
      IMM firmware v1.37 YUOOE9C
      Unified Extensible Firmware Interface (UEFI) v1.19 P9E158A

      http://www-947.ibm.com/support/entry/portal/docdisplay?lndocid=migr-5092520
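The minimum levels above can be checked against a blade's reported versions with a simple numeric comparison. This is a sketch under assumptions: it compares only the "major.minor" version number and ignores the build ID, it assumes the BSMP revisions quoted earlier (1.30 and 1.85) are on the same scale as the IMM version, and the names `MIN_LEVELS`, `parse_version` and `meets_minimum` are made up for illustration.

```python
# Minimum levels for the v11 midplane as listed in this post.
MIN_LEVELS = {
    "IMM": (1, 37),
    "UEFI": (1, 19),
}


def parse_version(text: str) -> tuple:
    """Turn 'v1.37' or '1.37' into (1, 37) for numeric comparison."""
    return tuple(int(part) for part in text.strip().lstrip("vV").split("."))


def meets_minimum(component: str, version: str) -> bool:
    """True if the version is at or above the minimum listed for the component."""
    return parse_version(version) >= MIN_LEVELS[component]


if __name__ == "__main__":
    for reported in ("1.30", "1.37", "1.85"):
        print("IMM", reported, "ok" if meets_minimum("IMM", reported) else "too old")
```

Under those assumptions, the blade at revision 1.30 falls below the IMM minimum, which would be consistent with the hang reported earlier in the thread.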