3 replies Latest Post - 2013-09-30T17:17:02Z by AndersLorensen
Ahmar
37 Posts

LUN Lotus_DB_corp being removed from the server side automatically

2013-09-27T14:47:44Z

Dear Team

We have a strange issue: the LUN disappears from the server side even though it appears fine in Storage Manager.

We have to restart blade no. 13 to get the LUN back on the OS side. This has been happening for the last two weeks.

  Logical Drive name:             Lotus_DB_corp

      Logical Drive status:       Optimal

      Capacity:                   1,024.000 GB
      Logical Drive ID:           60:08:0e:50:00:24:1e:ce:00:00:01:89:4e:83:59:6a
      Subsystem ID (SSID):        0
      Associated array:           AYTB_Dominos
      RAID level:                 5

      LUN:                        0
      Accessible By:              Host Blade_13_Corporate

kindly check the logs and advice

Thanks

Ahmar

Attachments

Updated on 2013-09-27T14:51:50Z by Ahmar
  • AndersLorensen
    156 Posts

    Re: LUN Lotus_DB_corp being removed from the server side Automatically

    2013-09-30T16:12:10Z  in response to Ahmar

    Help yourself and read your own post again. Remember, you are writing to other IBM customers who help people in their spare time.

     

    For example, "kindly check the logs and advice" is not going to help much when you didn't even include the logs.

    You didn't even write what storage system you have, what operating system the affected server runs, how the problem shows up on the OS side, and so on.

     

    You basically made it impossible for us to help you.

     

    --

    Anders

    • Ahmar
      37 Posts

      Re: LUN Lotus_DB_corp being removed from the server side Automatically

      2013-09-30T16:41:46Z  in response to AndersLorensen

      Dear Anderson,

      Thanks for the reply, but I found your advice quite strange. I attached the storage log file 17-09-2013.zip (1.7 MB); please check my previous post again.

      The SAS switch zone view is attached as files 22 and 23.

      Nonetheless, we have storage subsystem AYTB-DS3524 serving five Windows hosts and five Linux hosts.

      The issue is with the Windows host that uses the logical drive Lotus_DB_corp.

      I am attaching the AMM logs as well.

      The two SAS switches have different firmware levels: the one with issues is SAS switch 2, with firmware version 3.70, while the first SAS switch has firmware 3.0.

      We have a WWN ID conflict on one of the SAS switches, where the ID is the same for some four SAS cards, for example blades 8, 11, 12 and 14.

      The ID shown is 500000008000003, which is more than 16 characters.
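A SAS address (WWN) is a 64-bit value, normally written as 16 hexadecimal digits, so both symptoms described here (an ID of the wrong length and the same ID on several blades) can be checked mechanically once the addresses are copied out of the SAS module's zone view. The following is only a minimal Python sketch: the function names and the idea of a blade-to-address mapping are hypothetical and not part of any IBM tool. Note that the ID as typed in this post is 15 characters long, so the length check reports it as too short rather than too long.

```python
import re
from collections import defaultdict


def normalize(wwn: str) -> str:
    """Strip common separators and lowercase the address."""
    return wwn.replace(":", "").replace("-", "").strip().lower()


def check_sas_address(wwn: str) -> list[str]:
    """Return a list of problems with a SAS address string (empty if none)."""
    digits = normalize(wwn)
    problems = []
    if not re.fullmatch(r"[0-9a-f]*", digits):
        problems.append("contains non-hex characters")
    if len(digits) != 16:
        problems.append(f"has {len(digits)} hex digits, expected 16")
    return problems


def find_duplicates(wwn_by_blade: dict[str, str]) -> dict[str, list[str]]:
    """Map each address seen on more than one blade to the blades that report it."""
    seen = defaultdict(list)
    for blade, wwn in wwn_by_blade.items():
        seen[normalize(wwn)].append(blade)
    return {wwn: sorted(blades) for wwn, blades in seen.items() if len(blades) > 1}


if __name__ == "__main__":
    # The ID exactly as typed in the post, plus one host port WWN from the storage config.
    for candidate in ("500000008000003", "5005076b08b2c346"):
        print(candidate, check_sas_address(candidate) or "looks well-formed")
```

Feeding `find_duplicates` a mapping of blade names to the addresses each SAS module reports would show at a glance whether the conflict appears on both modules or only on the one in bay 4.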

      IO_MOD (SAS Conn Mod) in I/O Module slot: 03  (Working fine)
      Product Name: SAS Connectivity Module, 14-1X Internal SAS ports, 4-4X Extern
      Firmware data:
         --------------
         Type            : Boot ROM
         Build ID        : BRSASM
         File Name       : VitesseSDK
         Release Date    : 06/26/2007
         Release Level   : 0305
         --------------
         Type            : Main Application 1
         Build ID        : BRSASM
         File Name       : IBMSASSM
         Release Date    : 10/24/2007
         Release Level   : 0353
      ===================================================================
      IO_MOD (SAS Conn Mod) in I/O Module slot: 04  (Issues)
      Product Name: SAS Connectivity Module, 14-1X Internal SAS ports, 4-4X Extern
      Firmware data:
         --------------
         Type            : Boot ROM
         Build ID        : BRSASM
         File Name       : VitesseSDK
         Release Date    : 12/21/2007
         Release Level   : 0308
         --------------
         Type            : Main Application 1
         Build ID        : BRSASM
         File Name       : IBMSASSM
         Release Date    : 07/01/2011
         Release Level   : 0371

      My assumption is that the SAS switch in bay 4 is corrupt, and that is why it shows the wrong WWN ID.

      Kindly advise.

      Let me know if you require more info.

      regards

      • AndersLorensen
        156 Posts

        Re: LUN Lotus_DB_corp being removed from the server side Automatically

        2013-09-30T17:17:02Z  in response to Ahmar

        Sorry, I didn't see the log attached to the original post. My apologies.

         

        I have no experience with SAS switches in a BladeCenter, so I'm not sure I can help you with that.

         

        But I took a look at your configuration file, and found some strange stuff there.

        show "Creating Host Group Email_Group.";
        create hostGroup userLabel="Email_Group";
        
        show "Creating Host Blade_02_Cluster with Host Type Index 2 on Host Group Email_Group.";
        // This Host Type Index corresponds to Type Windows Server 2003/Server 2008 Non-Clustered
        create host userLabel="Blade_02_Cluster" hostType=2 hostGroup="Email_Group";
        
        show "Creating Host Blade_13_Corporate with Host Type Index 2 on Host Group Email_Group.";
        // This Host Type Index corresponds to Type Windows Server 2003/Server 2008 Non-Clustered
        create host userLabel="Blade_13_Corporate" hostType=2 hostGroup="Email_Group";
        

         

        You've basically made a host group with two hosts that are both set to a non-clustered host type. Host groups are used for clusters, where multiple servers need access to the same LUNs, so this seems very strange and wrong. Maybe the person who made the configuration thought of host groups as "folders"?
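The same check can be run over the whole saved configuration script rather than by eye. The sketch below is a rough Python illustration, not an IBM utility: the regular expression only covers `create host` lines shaped like the ones quoted above, the function name is made up, and the set of host type indexes that count as non-clustered is passed in by the caller, because the only mapping known from this thread is the script's own comment that index 2 is "Windows Server 2003/Server 2008 Non-Clustered".

```python
import re
from collections import defaultdict

# Matches lines shaped like:
#   create host userLabel="Blade_13_Corporate" hostType=2 hostGroup="Email_Group";
# The hostGroup part is optional so standalone hosts are parsed (and then skipped).
CREATE_HOST = re.compile(
    r'create host userLabel="(?P<host>[^"]+)"\s+hostType=(?P<type>\d+)'
    r'(?:\s+hostGroup="(?P<group>[^"]+)")?\s*;'
)


def suspicious_host_groups(script: str, non_clustered_types: set[int]) -> dict[str, list[str]]:
    """Return host groups holding two or more hosts that all use a non-clustered host type."""
    members = defaultdict(list)
    for match in CREATE_HOST.finditer(script):
        group = match.group("group")
        if group is None:
            continue  # host is not in any host group
        members[group].append((match.group("host"), int(match.group("type"))))

    flagged = {}
    for group, hosts in members.items():
        if len(hosts) > 1 and all(host_type in non_clustered_types for _, host_type in hosts):
            flagged[group] = [name for name, _ in hosts]
    return flagged


if __name__ == "__main__":
    sample = '''
    create hostGroup userLabel="Email_Group";
    create host userLabel="Blade_02_Cluster" hostType=2 hostGroup="Email_Group";
    create host userLabel="Blade_13_Corporate" hostType=2 hostGroup="Email_Group";
    '''
    # Index 2 is treated as non-clustered, per the comment in the quoted script.
    print(suspicious_host_groups(sample, non_clustered_types={2}))
```

A group flagged this way is either a real cluster with the wrong host type or a set of standalone hosts that should not share a host group; the script cannot tell which, so the result is only a list of places to look.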

         

        show "Creating Host Port SAS_Corporate_Blade01_HBA1 on Host Blade_13_Corporate with WWN 5005076b08b2c346 and with interfaceType SAS.";
        create hostPort host="Blade_13_Corporate" userLabel="SAS_Corporate_Blade01_HBA1" identifier="5005076b08b2c346" interfaceType=SAS;
        
        show "Creating Host Port Corporate_Hba02 on Host Blade_13_Corporate with WWN 5005076b08b2c347 and with interfaceType SAS.";
        create hostPort host="Blade_13_Corporate" userLabel="Corporate_Hba02" identifier="5005076b08b2c347" interfaceType=SAS;
        

        The user label on HBA1 is strange (SAS_Corporate_Blade01_HBA1 on host Blade_13_Corporate), but otherwise this looks good, provided those two WWNs are the right ones.

         

        If I were you, I'd fix the way the host and host group are defined, so that it is set up as a standalone host (unless it is a clustered system, of course, in which case the host type is wrong).

        Then I'd check the multipath driver on the Windows host to make sure it has all its paths when everything is working, and check again when the problem occurs to see whether some or all paths are gone. Even with a single SAS switch offline, there should still be connectivity.

         

        What multipath driver are you using? IBM's RDAC? Windows MPIO?

         

        --

        Anders