Troubleshooting
Problem
Active PCI installation overview and troubleshooting for Novell NetWare v5.1/v6.0/v6.5. The Active PCI support provided by IBM complies with many specifications, including the NetWare System Bus Driver specification v1.01.
Resolving The Problem
1.0 Active PCI installation overview for Novell NetWare v5.1/v6.0/v6.5
1.1 Overview
The Active PCI support for NetWare v5.1/v6.0/v6.5 provided by IBM complies with many specifications and is fully compliant with the NetWare System Bus Driver specification v1.01, PCI v2.2, PCI-X 1.0, and the Hot-Plug PCI v1.1 specification.
More details on performing hot-plug operations in NetWare are available in the Active PCI-X readme file which comes with the driver diskette. Go to Servers - File and device driver downloads, select the server type, and select category hot plug or Active PCI. The readme file can be viewed on-line without downloading the diskette.
Note: Active PCI is sometimes referred to as hot-plug PCI.
The IBM Active PCI system support consists of an interlock switch and a set of two LEDs for each Active PCI slot. One LED remains on while power to the slot is On. The other LED indicates that the slot requires attention. Slots that do not have these devices are not Active PCI slots. All Active PCI operations must be controlled through a console provided by Novell.
Warning: Never perform Active PCI operations without first removing power from the slot through the NetWare Configuration Manager Console (NCMCON). If a slot does not have a latch, do not remove the adapter. Doing so can cause serious damage to the system and adapter.
The order of events in any Active PCI operation is:
- Load the Active PCI-X driver for IBM Systems to provide Active PCI support.
- Go to the NetWare Configuration Manager Console (NCMCON).
- Select an add or remove operation for the given slot.
- When the adapter power is Off, open the latch and remove, replace or add an adapter and any necessary cabling.
- Close the latch and return to the NCMCON console.
Note: The power to a slot cannot be turned On until the latch is closed for the given slot. The position of the latch (opened or closed) is not important if no adapter is in the given slot.
In the case of adding a new adapter or replacing an old one, the NCMCON console will prompt you to turn on the newly added adapter.
If selected, the adapter power is turned On and the adapter is configured. At this point, if you are using a LAN driver, HWDETECT.NLM will run and load the appropriate driver for you. If you are using any other type of adapter, you will have to go to the system console and manually load the appropriate driver.
Note: In the case of a SCSI controller, you may need to issue the console command SCAN FOR NEW DEVICES before the drive is recognized by NetWare.
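The ordering rules in this section can be summarized in a small model. The following Python sketch is illustrative only: the `Slot` class, its method names, and `HotPlugError` are hypothetical and are not part of NCMCON or the IBMXSBD driver. It encodes only the rules stated above: power must be Off before the latch is opened, an adapter is handled only while the latch is open, and power cannot be turned On until the latch is closed.

```python
from dataclasses import dataclass


class HotPlugError(Exception):
    """Raised when a step is attempted out of order (hypothetical)."""


@dataclass
class Slot:
    # Hypothetical model of one Active PCI slot; not a NetWare API.
    adapter_present: bool = False
    latch_closed: bool = True
    powered: bool = False

    def power_off(self) -> None:
        # Stands in for removing slot power through NCMCON.
        self.powered = False

    def open_latch(self) -> None:
        if self.powered:
            raise HotPlugError("remove slot power through NCMCON first")
        self.latch_closed = False

    def set_adapter(self, present: bool) -> None:
        # Adapters are inserted or removed only while the latch is open.
        if self.latch_closed:
            raise HotPlugError("open the latch before touching the adapter")
        self.adapter_present = present

    def close_latch(self) -> None:
        self.latch_closed = True

    def power_on(self) -> None:
        if not self.adapter_present:
            raise HotPlugError("slot is empty")
        if not self.latch_closed:
            raise HotPlugError("close the latch before turning power On")
        self.powered = True


def replace_adapter(slot: Slot) -> None:
    """Walk through a replace operation in the documented order."""
    slot.power_off()
    slot.open_latch()
    slot.set_adapter(False)  # remove the old adapter
    slot.set_adapter(True)   # insert the new adapter
    slot.close_latch()
    slot.power_on()
```

In this model, calling `open_latch()` on a powered slot raises `HotPlugError`, which mirrors the warning above about removing power first.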
1.2 Limitations
- On the IBM eServer xSeries 255, slots 2 and 3 only support hot-adding of devices capable of 133MHz PCI-X operation. Hot-adding any other device results in a bus speed mismatch error because of the embedded device residing on that bus. This embedded device causes the bus to operate in 100MHz PCI-X mode by default.
- You cannot load the driver multiple times within the same boot session. If the driver is loaded, adapters are hot-added, and then the driver is unloaded and reloaded, erroneous behavior will occur.
- Multifunction PCI devices are supported as long as the multifunction capability is not provided with or through a PCI-to-PCI bridge.
- Video adapters are not supported in the Active PCI environment due to I/O space limitations and restrictions.
- Devices that are not PCI v2.1 compliant or that do not implement the PCI presence detection pins are not supported.
- Device drivers that are not LAN ODI v3.31 compliant or SCSI NWPA v3.00b compliant are not supported.
- Non-PCI devices are not hot-pluggable.
- HWDETECT.NLM, a Novell provided module, may not find the correct driver if more than one driver is capable of loading for the given adapter.
- Unloading the IBMXSBD driver before unloading NCMCON causes the system to go down. The correct procedure is to unload NCMCON first and then unload IBMXSBD, if necessary.
- Because of errors encountered in testing, HWDETECT.NLM is not provided by IBM. If you want to auto-detect the correct driver for newly added adapters, download the latest HWDETECT.NLM from the Novell web site. The URL to find the latest version of HWDETECT.NLM from Novell's web site is:
2.0 Troubleshooting active PCI operations
Troubleshooting is divided into basic troubleshooting and advanced troubleshooting.
2.1 Basic troubleshooting
Several troubleshooting techniques can be used to determine why an Active PCI operation failed. Two LEDs are provided for each Active PCI-capable slot. One LED blinks to indicate that attention is required. The second LED indicates the power state. Messages are generated by the NCMCON screen, the IBMXSBD.NLM driver, and the various adapter drivers.
2.1.1 Attention indicator LED
The Attention Indicator LED is controlled by NetWare. At present, the Attention Indicator LED is used only by device drivers to indicate that an adapter in a given slot needs attention. Not all device drivers support the attention indicator messaging. The Attention Indicator will only be cleared upon successful replacement of the adapter in the slot or by the device driver clearing the condition that led it to turn on the Attention Indicator LED in the first place.
2.1.2 IBMXSBD.NLM messages
The IBMXSBD.NLM Active PCI-X driver for IBM systems generates messages to indicate a change of state in the Active PCI system. If the driver is not loaded, no system messages are generated. All messages generated by the IBMXSBD driver are preceded by the IBMXSBD: or IBMXSBD Error: tag. The following messages may be seen:
- "IBMXSBD: New Adapter added. Please use the Novell Configuration Manager console (NCMCON) to configure this new adapter."
- This indicates that a new adapter was added into a previously empty slot. The empty slot can occur as a result of a previous replace or add Active PCI operation. This is an information message directing you to use the console (NCMCON) to configure the newly added adapter.
- "IBMXSBD Error: Allocate NEB.AESTag() failed."
- This error message indicates that the IBMXSBD driver was unable to allocate an asynchronous event system (AES) tag. Because events happen asynchronously in the hot-plug PCI system, IBMXSBD must have an AES handle to function.
- Unloading other NLMs that have AES tags registered will free up AES resources so that the IBMXSBD driver can properly load.
- "IBMXSBD Error: Not enough memory to generate event for queue."
- This error indicates that memory could not be allocated to place an Active PCI event onto the internal resource queue. This normally occurs when the system runs out of available memory. To fix this problem, unload other NLMs or add additional memory to the system.
- "IBMXSBD Error: Add Adapter command failed because of empty slot. Slot is x"
- This message occurs when NCMCON is directed to add an adapter to a slot that does not currently have an adapter card in it.
- This error may also occur if a PCI adapter does not meet the PCI v2.1 specification requirement that the adapter use presence pins. The IBMXSBD module uses the presence pins to determine when an adapter has been inserted and removed from the system.
- "IBMXSBD: The IBM hot-plug PCI controller is not present."
- This error occurs when an attempt is made to run the IBMXSBD module on a system that does not have an Active PCI controller. This error may also occur if the Active PCI controller is malfunctioning. The -COMMAND -DEBUG switches can be used on the IBMXSBD driver to determine if the Active PCI controller is working. See Section 2.2 Advanced troubleshooting for more information about debug switches for the IBMXSBD device driver.
- "IBMXSBD: Current operation would exceed bus load limit."
- This error occurs when the bus cannot electrically support an additional card at the current speed and mode. To fix this problem, try adding the card to a slot on a different bus.
- "IBMXSBD: PCI-X mode mismatch for slot: x"
- This error occurs when the bus is operating in a particular mode and the card in slot x is not capable of operating in that mode. For example, the bus is operating in PCI-X mode and the card in slot x is a 33MHz PCI or 66MHz PCI card. To fix this problem, try adding the card to a slot on a different bus. Information about the buses is available from NCMCON.
All other messages are provided through the Novell Configuration Manager Console (NCMCON) screen. For information about problems in NCMCON, such as all slots showing "No" in the Hot-Plug field or a message asking whether to continue waiting another 10 seconds, see Section 2.2.4.
Note: NCMCON currently does not automatically update the status of the slots reported by IBMXSBD, such as the current bus speed. As a result, on the eServer xSeries 440 a message may sometimes report that the adapter "Failed" after the latch is simply closed, even if no power fault occurred. In this situation, unload NCMCON and load it again. This will produce the correct status of "Powered off".
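When reviewing a saved console log, the messages listed in this section can be matched to their suggested actions mechanically. The following Python sketch is one way to do that. The `HINTS` table and the `suggest_action` function are hypothetical helpers, not part of NetWare or the IBMXSBD driver. The message fragments and the actions are taken from the descriptions above.

```python
# Distinctive fragment of each IBMXSBD message, paired with the action
# suggested in this section. The table itself is a hypothetical helper.
HINTS = [
    ("New Adapter added",
     "Use NCMCON to configure the newly added adapter."),
    ("Allocate NEB.AESTag() failed",
     "Unload other NLMs that have AES tags registered, then load IBMXSBD."),
    ("Not enough memory to generate event",
     "Unload other NLMs or add memory to the system."),
    ("Add Adapter command failed because of empty slot",
     "Check that an adapter is in the slot and that it uses the PCI presence pins."),
    ("hot-plug PCI controller is not present",
     "Confirm the system has a working Active PCI controller."),
    ("would exceed bus load limit",
     "Try a slot on a different bus."),
    ("PCI-X mode mismatch",
     "Try a slot on a different bus; NCMCON shows bus information."),
]


def suggest_action(console_line):
    """Return the suggested action for an IBMXSBD message, or None."""
    if "IBMXSBD" not in console_line:
        return None  # not a message from the Active PCI driver
    for fragment, hint in HINTS:
        if fragment in console_line:
            return hint
    return None
```

For example, passing the line `IBMXSBD: PCI-X mode mismatch for slot: 3` returns the hint about trying a slot on a different bus, while a line from any other module returns `None`.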
2.1.3 Adapter driver error messages
Many non-Active PCI device drivers do not comply with the ODI v3.31 or NWPA v3.00b specifications and will not load correctly. If you have such a device driver, first determine if your adapter vendor has a certified driver available. If no such driver exists, you will need to either use a different adapter or lose at least some of the Active PCI functionality.
Drivers that do not support the Novell specifications might exhibit such problems as:
- Failure to see new adapters until unloaded and reloaded
- Failure to support single instance unload (Single instance unload is the ability to unload the driver for a single adapter even though the driver might be in use by many adapters.)
- Failure to be able to register PCI resources such as interrupt, I/O ports, memory or prefetchable memory ranges
2.2 Advanced troubleshooting
This section contains information that is more technically advanced than the Basic troubleshooting section. Information contained in this section includes:
- How to determine if a problem loading an adapter device driver is caused by the Hot-Plug Controller, IBMXSBD.NLM, or the device driver itself
- Debug switches available with the IBMXSBD.NLM driver
- How interrupts are assigned
- NCMCON issues, such as no Active PCI slots being shown or a query about waiting another ten seconds
2.2.1 How to determine if the problem is with the adapter driver or the Active PCI driver
Note: There are no changes to the physical adapter required to support Active PCI operations.
Many other adapter vendors will be providing drivers that meet the required Novell specifications for Active PCI in the future. However, there are times when you may have a driver and not know whether it is Active PCI compatible. The following checks help determine whether a given driver supports Active PCI operations.
- Almost any driver will allow a single instance in the Active PCI environment.
- The device can be powered Off through the NCMCON screen with a driver loaded, or the driver has a command-line option to remove a single instance of itself.
Note: A driver that forces all instances of itself to be unloaded is not considered an Active PCI-X driver.
- Can the driver detect newly added adapters after an Active PCI operation?
Note: Many drivers will not detect new slot numbers because they do a static scan for cards on their first load. If a new card is added, the driver will report that no slots are available. Unloading the driver (and sometimes the underlying support module, such as TOKENTSM.NLM or ETHERTSM.NLM) and reloading them will allow the new slots to be seen in most instances. However, this requires unloading the driver for all adapters in the system.
- Does the driver complain about PCI configuration problems?
Note: Some older drivers do not allow for PCI configuration resources to be reallocated or changed. These drivers will report that one or more of the PCI resource requirements (IRQs, I/O ports, Memory, or Prefetchable Memory) are not correct/unavailable.
- If new device drivers, new support modules, and installation of the latest Support Packs do not fix the problem, proceed to Section 2.2.2 to debug the IBMXSBD.NLM module.
2.2.2 Debug switches available with the IBMXSBD.NLM driver
There are a number of command line switches that can be used to create an IBMXSBD.LOG file in the specified directory on the hard drive. This file can be examined to verify that the IBMXSBD.NLM module is working properly. The switches are not case sensitive, but they require the leading hyphen. If multiple switches are given, separate them with spaces (for example, LOAD IBMXSBD -DBGALL -COMMAND). The following are the command line switches and a description of each:
- /?, -?, ? - These flags print the following list of switches.
- DETECT - This parameter makes the IBMXSBD.NLM driver check whether the Active PCI system and controller are present. The driver prints its finding to the system console and then terminates. This flag cannot be used with any other flag.
- DBGALL - This parameter must be listed for any debug actions to be logged to the IBMXSBD.LOG file. Various system information including version of NetWare, PCI BIOS discovery, Novell Event Bus (NEB) events, Novell HIN numbers, driver deregistration, memory release traces and the PCI Interrupt Routing Options table are documented.
If a problem arises where you must create the IBMXSBD.LOG file, the file can be submitted as part of the call to the IBM Support Center for further diagnostic help.
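The switch rules in this section (leading hyphen, space separation, and DETECT used alone) can be checked before the command is typed at the console. The following Python sketch builds the command line and rejects combinations that the text rules out. The function `build_load_command` and the `HELP_FORMS` set are hypothetical helpers. Comparing switch names in upper case is an assumption made for the DETECT check only.

```python
# The three help spellings listed in this section; they are accepted as-is.
HELP_FORMS = {"/?", "-?", "?"}


def build_load_command(switches):
    """Return a LOAD IBMXSBD console line for the given switches."""
    for switch in switches:
        if switch in HELP_FORMS:
            continue
        # Every other switch must carry the leading hyphen.
        if len(switch) < 2 or not switch.startswith("-"):
            raise ValueError(f"switch {switch!r} needs a leading hyphen")
    # Assumption: names are compared in upper case for the DETECT rule.
    names = [s.lstrip("-").upper() for s in switches]
    if "DETECT" in names and len(names) > 1:
        raise ValueError("DETECT cannot be combined with any other switch")
    # Multiple switches are separated by spaces.
    return " ".join(["LOAD", "IBMXSBD", *switches])
```

For example, `build_load_command(["-DBGALL", "-COMMAND"])` produces `LOAD IBMXSBD -DBGALL -COMMAND`, while passing `-DETECT` together with any other switch raises `ValueError`.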
2.2.3 Interrupt handling
Interrupts are handled in the following manner:
- Only interrupts 9, 10, and 11 will be assigned to Active PCI devices. These options cannot be set. If during BIOS setup, these three interrupts are reserved for ISA Legacy devices, all Active PCI operations will fail.
- IRQ 15 will never be assigned to Active PCI devices unless an adapter already installed in the server of the same type has already been assigned to IRQ 15.
- Interrupts are preserved from the MPS table located in the extended BIOS data area (EBDA) of the system.
- For slots that do not have an adapter in them at boot, interrupts are assigned according to the following formula:
- If another device that matches the vendorID and deviceID in the PCI Configuration header is found, then the newly added adapter will receive the same interrupt as the other device.
- An unused interrupt among 9, 10, and 11 is assigned.
- The least used interrupt among interrupts 9, 10, and 11 is assigned.
Note: There may be instances where a LAN and SCSI controller may be assigned the same interrupt. If this occurs, the LAN adapter may fail to handle interrupts correctly when the SCSI controller accesses the DOS partition and lose connections. If this occurs, it will be necessary to down the server, reboot it, go into System Setup, and manually assign the interrupts so the devices do not share interrupts.
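The assignment formula for slots that are empty at boot can be read as a chain of fallbacks. The following Python sketch shows that reading. It is not the driver's actual code: `PciDevice`, `pick_irq`, and `HOT_PLUG_IRQS` are hypothetical names, and treating the three rules as an ordered list with ties going to the lowest interrupt number is an assumption.

```python
from collections import Counter
from dataclasses import dataclass

# Interrupts that the text says are assigned to Active PCI devices.
HOT_PLUG_IRQS = (9, 10, 11)


@dataclass(frozen=True)
class PciDevice:
    # Hypothetical record of an already configured device.
    vendor_id: int
    device_id: int
    irq: int


def pick_irq(vendor_id, device_id, installed):
    """Choose an interrupt for an adapter added to a slot empty at boot."""
    # Rule 1: reuse the interrupt of a device with the same vendorID and
    # deviceID. This is also the only way IRQ 15 can be handed out.
    for dev in installed:
        if dev.vendor_id == vendor_id and dev.device_id == device_id:
            return dev.irq
    usage = Counter(d.irq for d in installed if d.irq in HOT_PLUG_IRQS)
    # Rule 2: take an interrupt among 9, 10, and 11 that nothing uses yet.
    for irq in HOT_PLUG_IRQS:
        if usage[irq] == 0:
            return irq
    # Rule 3: otherwise take the least used one (lowest number on a tie,
    # which is an assumption of this sketch).
    return min(HOT_PLUG_IRQS, key=lambda irq: usage[irq])
```

Under this reading, an adapter whose vendorID and deviceID match a device already on IRQ 15 also receives IRQ 15, which is consistent with the IRQ 15 rule above.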
2.2.4 NCMCON issues
If NCMCON.NLM fails to report the correct speed at which any of the buses is operating, exit NCMCON and reload it.
If NCMCON.NLM fails to show any slots with "Yes" in the Hot-Plug field, the IBMXSBD.NLM module is not loaded. To fix the problem, exit NCMCON, load IBMXSBD at the system console, and reload NCMCON.
If you receive a message about Slot Status not being available with a query to wait another 10 seconds, NCM.NLM or one of its support modules, IOCONFIG.NLM or NEB.NLM, is not loaded. To fix this problem, exit NCMCON, load the appropriate support modules, load NCM, and then load NCMCON. It is not necessary for NCM or NCMCON to be loaded for the IBMXSBD driver to load properly.
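The NCMCON recovery steps described above depend on which modules are currently loaded. The following Python sketch turns a list of loaded modules into the steps to perform. The function `recovery_steps` and the `REQUIRED` tuple are hypothetical. The relative order of NEB, IOCONFIG, and IBMXSBD is an assumption, because the text only states that the support modules come before NCM and that NCM comes before NCMCON.

```python
# Modules NCMCON relies on, in the order this sketch loads them.
# Assumption: the text does not give the relative order of the first three.
REQUIRED = ("NEB", "IOCONFIG", "IBMXSBD", "NCM")


def recovery_steps(loaded_modules):
    """List the steps needed before NCMCON can show slot status."""
    loaded = {name.upper() for name in loaded_modules}
    missing = [name for name in REQUIRED if name not in loaded]
    if not missing:
        return []  # nothing to fix; exiting and reloading NCMCON is optional
    steps = []
    if "NCMCON" in loaded:
        steps.append("Exit NCMCON")
    steps.extend(f"LOAD {name}" for name in missing)
    steps.append("LOAD NCMCON")
    return steps
```

For example, when only IBMXSBD is missing and NCMCON is running, the result is to exit NCMCON, load IBMXSBD, and load NCMCON again, which matches the fix for slots that show no "Yes" in the Hot-Plug field.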
If a fully supported ODI v3.31 LAN device driver cannot be unloaded, verify that ODINEB.NLM is loaded. It can be loaded at the system console after NEB.NLM is loaded.
All SCSI controller drivers must be unloaded from the system console. Use the following command to remove a single adapter and its volumes from the system:
REMOVE STORAGE ADAPTER Ax
where x is the adapter number as determined by the scan order of the PCI buses in the system. All integrated devices will be assigned first, followed by adapters in PCI Bus 0, PCI Bus 1, and so forth.
When using failover pairs in conjunction with Active PCI, you should not set up failover pairs using the Netfinity Ethernet Adapter 2 and the Netfinity Fault Tolerant Ethernet Adapter as two independent pairs. Failover pairs should consist of one adapter type or the other, but not both simultaneously.
3.0 Disclaimer
THIS DOCUMENT IS PROVIDED "AS IS" WITHOUT WARRANTY OF ANY KIND. IBM DISCLAIMS ALL WARRANTIES, WHETHER EXPRESS OR IMPLIED, INCLUDING WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF FITNESS FOR A PARTICULAR PURPOSE AND MERCHANTABILITY WITH RESPECT TO THE INFORMATION IN THIS DOCUMENT. BY FURNISHING THIS DOCUMENT, IBM GRANTS NO LICENSES TO ANY PATENTS OR COPYRIGHTS.
Note to U.S. Government Users -- Documentation related to restricted rights -- Use, duplication or disclosure is subject to restrictions set forth in GSA ADP Schedule Contract with IBM Corporation.
Document Location
Worldwide
Document Information
Modified date:
28 January 2019
UID
ibm1MIGR-42672