Troubleshooting
Problem
The BladeCenter JS21 blade may be a new installation or already in production. The blade contains a QLogic 4 Gb HBA and has been configured to boot from a remote SAN disk. Sometimes, the HBA device port does not come up in the "available" state during the initial boot sequence to AIX, preventing the server from accessing the remote LUN. If an alternate path has been enabled, then the blade may still boot up, but will be connected to the remote disk through the alternate path. If the preferred path is to controller A, the blade will boot to controller B. If an alternate path has not been enabled, then the blade will fail to find a bootable device. The blade may be configured to boot from a local drive and to use a remote SAN disk as a data drive. The failure symptom in this case will either be that the blade can only access the data drive through the alternate fiber path, or, if only one path is enabled, the blade cannot access (mount) the data drive after booting up.
Resolving The Problem
Source
RETAIN tip: H193709
Symptom
The BladeCenter JS21 blade may be a new installation or already in production. The blade contains a QLogic 4 Gb HBA and has been configured to boot from a remote SAN disk. Sometimes, the HBA device port does not come up in the "available" state during the initial boot sequence to AIX, preventing the server from accessing the remote LUN.
If an alternate path has been enabled, then the blade may still boot up, but will be connected to the remote disk through the alternate path. If the preferred path is to controller A, the blade will boot to controller B. If an alternate path has not been enabled, then the blade will fail to find a bootable device.
The blade may be configured to boot from a local drive and to use a remote SAN disk as a data drive. The failure symptom in this case will either be that the blade can only access the data drive through the alternate fiber path, or, if only one path is enabled, the blade cannot access (mount) the data drive after booting up.
The AIX error report will show a LINK ERROR on the fcs# adapter.
The Brocade or Cisco 4 Gb chassis switch log will show that the blade port is logged in to the internal switch port, even though the server will not be able to access the LUN. This is because the bug is in the driver, not the QLogic HBA. Once the driver initializes correctly, then the connection to the LUN comes up and stays up. If the remote SAN drive is accessible on the preferred path after the blade boots up, it stays accessible.
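The error-report symptom described above could be checked with a short script. This is a minimal sketch, not a verified procedure: the adapter name fcs0 and the helper count_link_errors are illustrative, and it assumes the text "LINK ERROR" appears literally in the detailed errpt output for the adapter.

```shell
#!/bin/sh
# Sketch only: count error-report lines that mention LINK ERROR for one adapter.
# ADAPTER defaults to fcs0 (an assumption); pass another fcs# name as $1.
ADAPTER="${1:-fcs0}"

# Hypothetical helper: reads error-report text on stdin and prints the
# number of lines containing the (assumed) literal text "LINK ERROR".
count_link_errors() {
    grep -c "LINK ERROR"
}

# Query the error log only on AIX, where errpt is available.
if [ "$(uname -s)" = "AIX" ]; then
    hits=$(errpt -a -N "$ADAPTER" | count_link_errors)
    echo "$ADAPTER: $hits line(s) mentioning LINK ERROR in the error report"
fi
```

A count of zero does not rule the problem out; the adapter state reported by AIX remains the more direct indicator.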
Affected configurations
The system may be any of the following IBM servers:
- BladeCenter JS21, type 7988, any model
- BladeCenter JS21, type 8844, any model
The system is configured with one or more of the following IBM Options:
- QLogic 4 Gb SFF Fibre Channel Expansion Card for IBM BladeCenter, Option part number 26R0890
- QLogic 4 Gb Fibre Channel Expansion (CFFv) Card for IBM BladeCenter, Option part number 41Y8527
The system is configured with at least one of the following:
- AIX 5.3 TL6, any service pack
- AIX 5.3 TL7, any service pack
- AIX 6.1, any service pack
Note: This does not imply that the network operating system will work under all combinations of hardware and software.
Please see the compatibility page for more information:
Solution
Apply the QLogic driver fix for AIX 5.3 defect #647401.
Information for APAR IZ17012 may be found at the following URL:
Workaround
Set the initial link parameter for the QLogic HBA to point-to-point mode. It defaults to FC-AL (loop) mode.
- To list the current HBA attribute values, run the AIX command: lsattr -l fcs0 -E
- To change the init_link attribute to pt2pt, run the AIX command: chdev -l fcs0 -a init_link=pt2pt -P
Note: With the -P option, the change is written to the device configuration database only and takes effect at the next reboot.
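The workaround steps above can be combined into a small script that changes the attribute only when it is not already pt2pt. This is a sketch under stated assumptions: the adapter name fcs0 and the helper needs_pt2pt are illustrative, and the script relies on lsattr printing the bare attribute value when called with -F value.

```shell
#!/bin/sh
# Sketch only: set init_link to pt2pt if it is not already set that way.
# ADAPTER defaults to fcs0 (an assumption); pass another fcs# name as $1.
ADAPTER="${1:-fcs0}"

# Hypothetical helper: succeeds when the given init_link value is not pt2pt.
needs_pt2pt() {
    [ "$1" != "pt2pt" ]
}

# Touch the device configuration only on AIX.
if [ "$(uname -s)" = "AIX" ]; then
    current=$(lsattr -El "$ADAPTER" -a init_link -F value)
    if needs_pt2pt "$current"; then
        # -P defers the change until the next reboot.
        chdev -l "$ADAPTER" -a init_link=pt2pt -P
        echo "$ADAPTER: init_link was '$current'; reboot to activate pt2pt"
    else
        echo "$ADAPTER: init_link is already pt2pt"
    fi
fi
```

Blades with more than one HBA port would need the script run once per fcs# device.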
Additional information
The issue was originally identified between a QLogic HBA and a Brocade switch; however, it applies to a Cisco fibre switch module as well.
The QLogic HBA driver is written such that the ports on the card always come up in FC-AL (loop) mode first, then switch to point-to-point mode after negotiating with the switch. The negotiation does not always complete before the initialization command to the driver times out, in which case the driver reports the device as "defined" (unavailable) after AIX finishes booting up. The driver may be initialized again using the cfgmgr command after the port negotiation with the chassis switch has been completed. The driver will then report the fcs# device as "available".
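The recovery described above (re-running cfgmgr once the port has finished negotiating) might look like the following after boot. This is a hedged sketch: the adapter name fcs0 and the helper is_available are illustrative, and it assumes lsdev reports the state as the literal string "Available" when called with -F status.

```shell
#!/bin/sh
# Sketch only: re-run cfgmgr for an adapter that did not reach Available.
# ADAPTER defaults to fcs0 (an assumption); pass another fcs# name as $1.
ADAPTER="${1:-fcs0}"

# Hypothetical helper: succeeds when the reported device state is Available.
is_available() {
    [ "$1" = "Available" ]
}

# Query and reconfigure the device only on AIX.
if [ "$(uname -s)" = "AIX" ]; then
    status=$(lsdev -C -l "$ADAPTER" -F status)
    if ! is_available "$status"; then
        # Configure the adapter and its child devices again.
        cfgmgr -l "$ADAPTER"
        status=$(lsdev -C -l "$ADAPTER" -F status)
    fi
    echo "$ADAPTER is $status"
fi
```

This only helps when AIX itself booted (from a local drive or an alternate path); it does not change the boot-time behavior.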
The fix applies to several AIX levels, per the following release information:
- APAR IZ17012: AIX 5300-06-07-SP, currently scheduled to be released by 5/28.
- APAR IZ16430: AIX 5300-07-04-SP, currently scheduled to be released by 5/28.
- APAR IZ16410: AIX 5.3 TL8 / 53N, fix will ship with 5300-08 in 1H 2008.
- APAR IZ16395: AIX 5.3 TL9 / 53Q, fix will ship with 5300-09 in 2H 2008.
- APAR IZ16384: AIX 6.1 TL1 / 61B, fix will ship with 6100-01 in 1H 2008.
- APAR IZ16352: AIX 6.1 TL2 / 61D, fix will ship with 6100-02 in 2H 2008.
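To see which of the APARs above applies to a given system, the release list can be turned into a lookup keyed on the output of oslevel -s. This is a sketch, not an official mapping: it assumes each APAR in the list pairs with the release text that follows it, and the helper apar_for_level is illustrative.

```shell
#!/bin/sh
# Sketch only: map an AIX level string (as printed by "oslevel -s") to the
# APAR listed for that technology level. The pairing is an assumption
# based on reading each APAR together with the release text after it.
apar_for_level() {
    case "$1" in
        5300-06*) echo "IZ17012" ;;
        5300-07*) echo "IZ16430" ;;
        5300-08*) echo "IZ16410" ;;
        5300-09*) echo "IZ16395" ;;
        6100-01*) echo "IZ16384" ;;
        6100-02*) echo "IZ16352" ;;
        *) return 1 ;;   # level not covered by the list
    esac
}

# Look up the local level and ask instfix about the APAR only on AIX.
if [ "$(uname -s)" = "AIX" ]; then
    level=$(oslevel -s)
    if apar=$(apar_for_level "$level"); then
        echo "Level $level: checking APAR $apar"
        instfix -ik "$apar"
    else
        echo "Level $level: no APAR listed in this tip"
    fi
fi
```

The instfix output is printed as-is so that the reader can judge whether the fix is present.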
Document Location
Worldwide
Document Information
Modified date:
29 January 2019
UID
ibm1MIGR-5076991