The troubleshooter's guide for managing storage using IBM Systems Director

This article documents troubleshooting methods and best practices for managing the storage component with IBM® Systems Director, and describes what to verify in each entity involved (such as IBM Systems Director and the LSI or NetApp Eagle provider) to overcome certain issues. In short, it is intended as a one-stop reference for debugging storage management issues more efficiently.

Shruthi Sanchana Venkateswaran (shruvenk@in.ibm.com), Software Engineer, IBM China

Shruthi works as a Software Engineer for workload-optimized systems. She handles management of the storage component and has around three years of experience in handling various issues. She has also worked with teams such as IBM Systems Director and LSI/NetApp Eagle Provider to understand the management process, and has come up with useful workarounds, and sometimes fixes, for prevalent issues.



22 April 2013

Also available in Chinese

Preface

Managing most storage devices using IBM® Systems Director is not a straightforward process. Unlike many other components, storage requires additional management software, such as a Storage Management Initiative Specification (SMI-S) provider, which is obtained from a third-party vendor. In an environment that involves more than one entity (IBM Systems Director, the SMI-S provider, the IBM AIX® operating system, and so on), certain challenges are likely, and you need to be aware of certain details about each entity to keep the system functioning smoothly. Detailed information on how to manage storage already exists; what has been missing is a set of best practices to follow while managing storage.

This article applies to IBM AIX systems running IBM Systems Director version 6.2.x or later, with LSI SMI-S provider version 10.10.55.x and NetApp SMI-S provider version 10.27.x.

IBM Systems Director and the SMI-S provider can either exist on different systems, as shown in Figure 1, or on the same system. Hence, take care to run the SMI-S provider commands on the system on which the SMI-S provider is installed, and to run the IBM Systems Director commands on the system on which IBM Systems Director is installed.

Figure 1. SMI-S provider connection illustration

Assumptions and terminologies

  • For easier understanding, a command to be run on the system on which IBM Systems Director is installed is described as run "on the IBM Systems Director system", and a command to be run on the system on which the SMI-S provider is installed is described as run "on the provider system".
  • All the commands mentioned in the article are run from the command-line interface (CLI). You need to be logged in as the "root" user on the respective systems to be able to run these commands.
  • The LSI SMI-S provider has been renamed to NetApp SMI-S provider. So, the value for <PROVIDER_PATH> used with SMI-S provider commands throughout the article is either /opt/lsi/pegasus for the LSI SMI-S provider or /opt/netapp/pegasus for the NetApp SMI-S provider.
  • It is assumed that the reader already knows how to install the SMI-S provider and IBM Systems Director, how to add storage to the SMI-S provider, and how to update the SMI-S provider.

Troubleshooting checklist

This checklist is useful when the IBM Systems Director managed endpoint is not displaying the expected values in its attributes, mainly CommunicationState and AccessState. The ideal values for these attributes are 2 and Unlocked respectively.

Note: Follow the steps below in the order given to get the expected results. These steps are documented based on past experience only and might not yield successful results at all times.

Verify status of cimserver

Symptom in IBM Systems Director: CommunicationState is 1, which means 'Not Available'.

The status of the cimserver of the SMI-S provider can be verified based on various parameters. It is mandatory that all of these return positive results to confirm that the cimserver is completely up and running. On the provider system:

  1. Run: <PROVIDER_PATH>/bin/cimcli ns

    When active, the above command returns the list of name spaces.

    When inactive, it returns:
    Pegasus Exception: Cannot connect to local CIM server.
    Connection failed. Trying to connect to

  2. Run: netstat -an | grep 35988

    When active, the above command returns: tcp 0 0 *.35988 *.* LISTEN

    When inactive, it returns nothing.

  3. Run the following commands:

    a) cat /tmp/cimserver_start_generic.conf

    b) ps -ef | grep cimserve

    When active, the process ID shown in the file in step a appears as a running [cimserve] process in the output of step b.

    When inactive, the output of step a will be:
    cat: 0652-050 Cannot open /tmp/cimserver_start_generic.conf

    Note: The output of step b is expected to show either one or two [cimserve] processes. If two [cimserve] processes are running, the one whose process ID matches the value shown in step a is the SMI-S provider cimserver, and the other is the AIX cimserver.

Recover status of cimserver

  • If the cimserver is not running at all, which is indicated by negative results for all three steps above, start the cimserver using <PROVIDER_PATH>/bin/cimserver.
  • If the cimserver is hung and not responding to queries, which is indicated by negative results for some steps and positive results for others, kill the existing cimserver process using the process ID obtained in step 3 above (kill -9 <Process ID>), and then start the cimserver using <PROVIDER_PATH>/bin/cimserver.
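
The three checks and the recovery rule above can be combined into one script. The following is a minimal sketch under stated assumptions, not a tested tool: the function names classify_cimserver and check_cimserver are hypothetical, PROVIDER_PATH is assumed to be set to the provider directory, and /tmp/cimserver_start_generic.conf is assumed to contain only the process ID.

```shell
#!/bin/sh
# Sketch only: the function names are hypothetical, not part of the provider.
# PROVIDER_PATH is assumed to be /opt/lsi/pegasus or /opt/netapp/pegasus.

# classify_cimserver NS PORT PID
# Each argument is the exit status of one check (0 = positive result).
classify_cimserver() {
    if [ "$1" -eq 0 ] && [ "$2" -eq 0 ] && [ "$3" -eq 0 ]; then
        echo running    # all three checks are positive
    elif [ "$1" -ne 0 ] && [ "$2" -ne 0 ] && [ "$3" -ne 0 ]; then
        echo stopped    # all negative: start the cimserver
    else
        echo hung       # mixed results: kill the process, then restart
    fi
}

# Runs the three checks from the checklist and prints the classification.
check_cimserver() {
    # Check 1: can cimcli list the namespaces?
    "${PROVIDER_PATH}/bin/cimcli" ns >/dev/null 2>&1
    ns=$?
    # Check 2: is something listening on port 35988?
    netstat -an | grep 35988 | grep LISTEN >/dev/null 2>&1
    port=$?
    # Check 3: is the process recorded in the start file alive?
    # Assumption: the file holds nothing but the process ID.
    pid=$(cat /tmp/cimserver_start_generic.conf 2>/dev/null)
    [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
    alive=$?
    classify_cimserver "$ns" "$port" "$alive"
}

# Usage on the provider system:
#   PROVIDER_PATH=/opt/netapp/pegasus check_cimserver
```

Keeping the classification separate from the checks makes the rule easy to review: only an all-positive result counts as running, and any mixed result is treated as a hung cimserver.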

Verify password of cimuser

Symptom in IBM Systems Director: AccessState of storage managed endpoint is Locked.

The password of the cimuser must match the password of the system on which the SMI-S provider is installed; otherwise, IBM Systems Director sets the AccessState of the storage managed endpoint to Locked.

To verify whether the passwords of cimuser and the provider system are the same, on the provider system, run:

cimcli ns -u root -p <password of the provider system> -l 127.0.0.1:35988

  • When the password of cimuser is the same as the provider system password, the above command returns the list of namespaces.
  • When the password of cimuser is different from the provider system password, the above command returns:
    cimcli Pegasus Exception: HTTP Error (401 Unauthorized).. Cmd = ns Object =
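
As an illustration, the outcome of the command above can be interpreted by a small filter. This is a sketch only: cimuser_check is a hypothetical helper name, and any output that matches neither of the two error messages quoted in this article is assumed to be the namespace list.

```shell
#!/bin/sh
# Sketch only: cimuser_check is a hypothetical helper name.
# It reads the cimcli output on standard input and reports which of the
# cases described in this article the output matches.
cimuser_check() {
    out=$(cat)
    case "$out" in
        *"401 Unauthorized"*) echo "password mismatch" ;;
        *"Cannot connect"*)   echo "cimserver not reachable" ;;
        "")                   echo "no output" ;;
        # Assumption: anything else is the list of namespaces.
        *)                    echo "namespaces returned" ;;
    esac
}

# Usage on the provider system (replace the placeholder with the password):
#   cimcli ns -u root -p <password> -l 127.0.0.1:35988 2>&1 | cimuser_check
```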

Fix password of cimuser

  • To fix the password of cimuser, run:
    <PROVIDER_PATH>/bin/cimuser -m -u username [ -w old password ] [ -n new password ]

    new password: make sure that this password is the same as the password of the provider system.

  • If the old password of cimuser is not known, then remove the existing cimuser:

    <PROVIDER_PATH>/bin/cimuser -r -u root

    Then add the user again using the password of the provider system:

    <PROVIDER_PATH>/bin/cimuser -a -u root -w <password>

Verify status of storage controllers addition

Symptom in IBM Systems Director: Storage managed endpoint is missing from IBM Systems Director.

A storage subsystem has two controllers, both of which must be added to the SMI-S provider installed on a server before IBM Systems Director can discover the storage and list the storage managed endpoints. To verify, run the following command on the provider system:
cimcli ei CIM_ComputerSystem -n root/lsiarray13 | grep OtherIdentifying

Refer to Listing 1. If the controllers are not listed, then add the storage controllers to the SMI-S provider.

Listing 1. Sample output indicating controllers addition
# cimcli ei CIM_ComputerSystem -n root/lsiarray13 | grep OtherIdentifying
  OtherIdentifyingInfo = {"30333936392030333832332051104947"};
  OtherIdentifyingInfo = {"172.23.4.211", "0000:0000:0000:0000:0000:0000:0000:0000",
  OtherIdentifyingInfo = {"172.23.4.212", "0000:0000:0000:0000:0000:0000:0000:0000",

The above output indicates that the storage box with controller A IP 172.23.4.211 and controller B IP 172.23.4.212 has been added to the provider.
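
The controller addresses can also be pulled out of this output automatically. The following is a minimal sketch, assuming the output has the same shape as Listing 1; list_controller_ips is a hypothetical helper name, not a provider command.

```shell
#!/bin/sh
# Sketch only: list_controller_ips is a hypothetical helper name.
# Reads "cimcli ei CIM_ComputerSystem" output on standard input and prints
# the first value of each OtherIdentifyingInfo line that looks like an
# IPv4 address. Lines whose first value is not an address are skipped.
list_controller_ips() {
    sed -n 's/.*OtherIdentifyingInfo = {"\([0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\)".*/\1/p'
}

# Usage on the provider system:
#   cimcli ei CIM_ComputerSystem -n root/lsiarray13 | list_controller_ips
# Two addresses are expected for each storage subsystem that was added.
```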

Restore IBM Systems Director managed endpoint to ideal values

After all the steps in the troubleshooting checklist show positive results, the following steps help in restoring CommunicationState and AccessState in IBM Systems Director to ideal values. On the IBM Systems Director system:

  1. Run the following command: smcli lssys -t StorageSubsystem -A AccessState,CommunicationState

    AccessState should be Unlocked, and CommunicationState should be 2. If they are not, proceed to step 2.

  2. Run the lite query:
    a. Run: smcli lssys -i <IP Address> -oT, where IP Address is the IP address of the provider system.
    b. Obtain the OID shown in the output of step a for the OperatingSystem resource type. Example: stg2j5p01, OperatingSystem, 0x1569
    c. Convert the OID to decimal (for example, 0x1569 is 5481) and run the following command:
      smcli querysystem <Decimal OID> 1

    Repeat step 1. If the result of step 1 is still negative, proceed to step 3.

  3. Run discovery and accesssys:
    a. smcli discover -i <IP address>
    b. smcli accesssys -i <IP address> -u <username> -p <password>
    In a and b, IP address denotes the IP address of the provider system.

    Repeat step 1. If the result of step 1 is still negative, report the problem to the IBM Systems Director and SMI-S provider teams. Refer to the Reporting a SMI-S provider problem section for more details.
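
The OID conversion needed for the lite query can be done in the shell itself. This is a small sketch; oid_to_decimal is a hypothetical helper name, and it relies only on the standard behavior of printf, which reads an argument with a 0x prefix as a hexadecimal constant.

```shell
#!/bin/sh
# Sketch only: oid_to_decimal is a hypothetical helper name.
# printf reads an argument with a 0x prefix as a hexadecimal constant.
oid_to_decimal() {
    printf '%d\n' "$1"
}

oid_to_decimal 0x1569    # prints 5481

# Usage on the IBM Systems Director system:
#   smcli querysystem "$(oid_to_decimal 0x1569)" 1
```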


Best practices

While managing storage with IBM Systems Director, it is critical that the following best practices are followed in order to ensure smooth functioning.

Ensure cimserver is in active state, after a system reboot

When the system on which the SMI-S provider is installed is rebooted, the cimserver must become active again; otherwise, IBM Systems Director might not be able to communicate with the storage, and the storage managed endpoints go into an offline state. Perform the following tasks to make sure that the cimserver is started by default after a reboot. On the provider system:

  • Create a cimserver startup script with the contents mentioned in Listing 2 and place it in a secure location, which is not prone to user intervention, and preferably name it cimstart_scriptlet.
  • Run the following command:
    mkitab "cimserver::once:<location of cimstart_scriptlet>"
    Replace the location tag above with the actual location of cimstart_scriptlet.
  • Open the /etc/inittab file and verify that the entry added in the above step is present.
Listing 2. Contents of cimstart_scriptlet
#!/usr/bin/ksh
# Point the environment at the SMI-S provider installation
export PEGASUS_HOME=<PROVIDER_PATH>
export PEGASUS_ROOT=<PROVIDER_PATH>
export LD_LIBRARY_PATH=<PROVIDER_PATH>/lib:$LD_LIBRARY_PATH
export LIBPATH=<PROVIDER_PATH>/lib:$LIBPATH
# Start a fresh log with a timestamp, then append the cimserver output
date > /tmp/cim_server.log 2>&1
<PROVIDER_PATH>/bin/cimserver >> /tmp/cim_server.log 2>&1
exit 0
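
Checking for the inittab entry can be scripted instead of opening the file by hand. The following is a minimal sketch; has_inittab_entry is a hypothetical helper name, and it only checks that a line with the given identifier exists, not that the command path is correct.

```shell
#!/bin/sh
# Sketch only: has_inittab_entry is a hypothetical helper name.
# has_inittab_entry ID FILE
# Returns 0 when FILE contains an entry whose identifier field is ID.
has_inittab_entry() {
    grep "^$1:" "$2" >/dev/null 2>&1
}

# Usage on the provider system:
#   has_inittab_entry cimserver /etc/inittab && echo "entry present"
```

On AIX, the lsitab cimserver command should display the same entry.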

Environment variables to be set

When an SMI-S provider is installed in an AIX environment (where an AIX cimserver coexists with the cimserver of the SMI-S provider), it is critical to perform the following on the provider system so that the two cimservers do not clash with each other:

  • Add the entries mentioned in Listing 3 to the /etc/environment file.
  • Run ". /etc/environment" (note the space between the dot and the file name).
  • Verify that the paths have been set by running the env | grep pegasus command and checking that all the entries in Listing 3 are present. If an entry is missing, export it by running the following command, one entry at a time:

    export <entry name which is not listed>

  • Add the entries in Listing 4 to the /.profile file, if not already present.
Listing 3. Environment variable entries
PATH=<PROVIDER_PATH>/bin:/usr/bin:/etc:/usr/sbin:.... (followed by all existing values)
LD_LIBRARY_PATH=<PROVIDER_PATH>/lib
PEGASUS_HOME=<PROVIDER_PATH>
PEGASUS_ROOT=<PROVIDER_PATH>
LIBPATH=<PROVIDER_PATH>/lib
Listing 4. profile file entries
export PEGASUS_HOME=<PROVIDER_PATH>
export PEGASUS_ROOT=<PROVIDER_PATH>
export PATH=<PROVIDER_PATH>/bin:$PATH
export LIBPATH=<PROVIDER_PATH>/lib:$LIBPATH
export LD_LIBRARY_PATH=<PROVIDER_PATH>/lib:$LD_LIBRARY_PATH
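
To find out which of these variables still need to be exported, a check by name can be used. This is a sketch only; missing_vars is a hypothetical helper name. Unlike env | grep pegasus, it tests each variable by name rather than by value.

```shell
#!/bin/sh
# Sketch only: missing_vars is a hypothetical helper name.
# Prints the name of every listed variable that is unset or empty.
missing_vars() {
    for name in "$@"; do
        # Indirect lookup of the variable called $name
        eval "value=\${$name:-}"
        [ -n "$value" ] || echo "$name"
    done
}

# Usage on the provider system, after running ". /etc/environment":
#   missing_vars PEGASUS_HOME PEGASUS_ROOT LIBPATH LD_LIBRARY_PATH
# No output means that all four variables are set.
```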

Enable authentication and add a user to the provider

By default, the provider comes with authentication disabled. It is highly recommended to enable authentication for security purposes and to add a user.

On the provider system:

  • Run the following command:
    <PROVIDER_PATH>/bin/cimconfig -s enableAuthentication=true
  • Add cimuser using the following command:
    <PROVIDER_PATH>/bin/cimuser -a -u root -w <password>

    <password> - password of the system on which provider is installed.

Updating SMI-S provider

IBM Systems Director queries all the managed endpoints at regular intervals and updates their attributes accordingly. While the provider is being updated, the cimserver is likely to be down for some time. If IBM Systems Director queries the storage during this time, communication between the storage and IBM Systems Director fails, which results in incorrect values in the storage managed endpoints. In addition, the credentials of the provider must be preserved during the upgrade; otherwise, IBM Systems Director might show the AccessState of the storage as Locked. To avoid these problems, it is a good practice to perform the following steps.

  • On the IBM Systems Director system: Stop IBM Systems Director by running the smstop command.
  • On the IBM Systems Director system: Verify the status of IBM Systems Director by running the smstatus -r command and wait until the status changes to Inactive. It might take a couple of minutes.
  • On the provider system: Perform the upgrade of the SMI-S Eagle provider.
  • On the provider system: Verify the status of the cimserver (refer to the Troubleshooting checklist section) and make sure it is active.
  • On the provider system: Verify that the cimuser exists by running the <PROVIDER_PATH>/bin/cimuser -l command. It should return: root. If it returns: No users found for listing., then add the cimuser again.
  • On the IBM Systems Director system: Start IBM Systems Director by running the smstart command.
  • On the IBM Systems Director system: Verify the status of IBM Systems Director by running the smstatus -r command and wait until the status changes to Active. It might take a few minutes.
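
The waiting steps above can be scripted with a simple polling loop. The following is a sketch under stated assumptions: wait_for_status is a hypothetical helper name, and the usage lines assume that running smstatus without options prints the current status once, as a single word such as Active or Inactive.

```shell
#!/bin/sh
# Sketch only: wait_for_status is a hypothetical helper name.
# wait_for_status EXPECTED TRIES DELAY COMMAND [ARGS...]
# Runs COMMAND until its output contains EXPECTED as a whole word, at most
# TRIES times, sleeping DELAY seconds between attempts.
# Returns 0 on success and 1 when the attempts are used up.
wait_for_status() {
    expected=$1
    tries=$2
    delay=$3
    shift 3
    while [ "$tries" -gt 0 ]; do
        # -w keeps "Active" from matching inside "Inactive"
        if "$@" 2>/dev/null | grep -iw "$expected" >/dev/null; then
            return 0
        fi
        tries=$((tries - 1))
        if [ "$tries" -gt 0 ]; then
            sleep "$delay"
        fi
    done
    return 1
}

# Usage on the IBM Systems Director system (assumed smstatus behavior):
#   smstop  && wait_for_status Inactive 30 10 smstatus
#   smstart && wait_for_status Active   60 10 smstatus
```

The whole-word match matters here because the two status names overlap, and the attempt limit keeps the script from waiting forever if the server never reaches the expected state.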

Changing the password of a system on which SMI-S provider is installed

After a user changes the password of the server, the new password must be reflected in the IBM Systems Director database. To ensure that the database is updated, perform the following steps.

  • Change the password of the system to a preferred new password.
  • On the provider system, run the following command:
    <PROVIDER_PATH>/bin/cimuser -m -u username -w <old password> -n <new password>
  • On the IBM Systems Director system, run the following command:
    smcli lscred -r http://<IPaddr>:35988

    <IPaddr> refers to the IP address of the system on which SMI-S provider is installed.

  • Note down Target Identity ID for User Principal : root from the command output of the above step.
  • On the IBM Systems Director system, run the following command:
    smcli chcred -t <Target Identity ID> -c PASSWORD -P <newpassword>

    <Target Identity ID> refers to the target identity captured in the above step.

    <newpassword> refers to the new password of the system that is currently set.

  • On the IBM Systems Director system, stop IBM Systems Director by running the smstop command.
  • On the IBM Systems Director system, verify the status of IBM Systems Director by running the smstatus -r command and wait until the status changes to Inactive. It might take a few minutes.
  • On the IBM Systems Director system, wait for one or two minutes and then start IBM Systems Director by running the smstart command.
  • On the IBM Systems Director system, verify the status of IBM Systems Director by running the smstatus -r command and wait until the status changes to Active. It might take a few minutes.

Reporting a SMI-S provider problem

Collect all the data mentioned in this section and attach it while reporting the issue to: xdl-eaglesmissupport@netapp.com

Collecting provider logs

This section provides details regarding enabling the logs and the log file details. On the provider system:

Edit the file: <PROVIDER_PATH>/providers/array/providerTraceLog.properties as per details below.

  1. Look for the following section.

    Uncomment the line below (remove the #) to turn on provider debug tracing.

## Valid trace levels are, from most to least logging:
## VERBOSE
## DEBUG
#   INFO
## WARN
## ERROR
## OFF
  2. Uncomment VERBOSE by removing the two pound symbols (##). The section then looks as follows.

    Uncomment the line below (remove the #) to turn on provider debug tracing.

## Valid trace levels are, from most to least logging:
   VERBOSE
## DEBUG
## INFO
## WARN
## ERROR
## OFF
  3. Look for the following section.
[LSI2]
#LEVEL=VERBOSE
## Uncomment the line below (remove the #) to turn on provider device event logging.
## Valid trace levels are, from most to least logging:
## ALL
## CRITICAL
## NONE
  4. Uncomment LEVEL and ALL. The section then looks as follows.
[LSI2]
LEVEL=VERBOSE
## Uncomment the line below (remove the #) to turn on provider device event logging.
## Valid trace levels are, from most to least logging:
   ALL
## CRITICAL
## NONE
  5. Collect the log file from <PROVIDER_PATH>/providers/array/SbmaTraceLog.txt

Collecting cimserver logs

  1. Run the following commands to enable logging and restart the cimserver:
<PROVIDER_PATH>/bin/cimconfig -s traceLevel=4 -p
<PROVIDER_PATH>/bin/cimconfig -s traceComponents=ALL -p
<PROVIDER_PATH>/bin/cimconfig -s logLevel=TRACE -p
<PROVIDER_PATH>/bin/cimconfig -s traceFilePath=/tmp/cimserver.trc -p
<PROVIDER_PATH>/bin/cimserver -s
<PROVIDER_PATH>/bin/cimserver
  2. If the above commands do not work, make the changes manually. Open the planned configuration file:
vi <PROVIDER_PATH>/cimserver_planned.conf
Append the following entries to the file:
traceLevel=4
traceComponents=ALL
logLevel=TRACE
traceFilePath=/tmp/cimserver.trc

Save and exit using ':wq' in the vi editor.

Restart the cimserver:
<PROVIDER_PATH>/bin/cimserver -s
<PROVIDER_PATH>/bin/cimserver
  3. Collect the log file from: /tmp/cimserver.trc

Other debug information, command outputs, and logs to be collected

  • You can fetch the status of cimserver using any of the following commands:
netstat -an | grep 35988
ps -ef | grep cimserve
cimcli ns
cat /tmp/cimserver_start_generic.conf
  • You can fetch the process ID from the output of the cat /tmp/cimserver_start_generic.conf command and run the procmap <Process id> command.
  • The files present in <PROVIDER_PATH>/logs include:
PegasusError.log
PegasusLog.lock
PegasusStandard.log
  • You can fetch the install/uninstall logs from the following files:
/tmp/LSIarray2_install.log
/tmp/LSIarray2_uninstall.log
  • Command output:
<PROVIDER_PATH>/bin/cimcli -n interop ei LSISSI_RegisteredProfile -niq      
<PROVIDER_PATH>/bin/cimcli -n root/LsiArray13 ei LSISSI_StorageSystem
<PROVIDER_PATH>/bin/cimcli ei CIM_ComputerSystem -n root/lsiarray13 | grep OtherIdentifying
which cimcli
ldd `which cimcli`
  • To find the version of the Eagle provider installed, run one of the following commands:

    LSI Eagle Provider: lslpp -l | grep LSI

    NetApp Eagle Provider: lslpp -l | grep NetApp
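
The files named in this section can be gathered into one directory before they are attached to the problem report. The following is a minimal sketch; collect_logs is a hypothetical helper name, and the destination directory in the usage lines is an arbitrary choice.

```shell
#!/bin/sh
# Sketch only: collect_logs is a hypothetical helper name.
# collect_logs DEST_DIR FILE...
# Copies every listed file that exists into DEST_DIR and reports the ones
# that are missing, so that an incomplete collection is noticed.
collect_logs() {
    dest=$1
    shift
    mkdir -p "$dest" || return 1
    for f in "$@"; do
        if [ -f "$f" ]; then
            cp "$f" "$dest/"
        else
            echo "missing: $f" >&2
        fi
    done
}

# Usage on the provider system (PROVIDER_PATH set to the provider directory):
#   collect_logs /tmp/smis_support \
#       "$PROVIDER_PATH/providers/array/SbmaTraceLog.txt" \
#       /tmp/cimserver.trc \
#       "$PROVIDER_PATH/logs/PegasusError.log" \
#       "$PROVIDER_PATH/logs/PegasusStandard.log" \
#       /tmp/LSIarray2_install.log /tmp/LSIarray2_uninstall.log
#   (cd /tmp && tar -cf smis_support.tar smis_support)
```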
