IBM Support

DB2 pureScale: How to manually clean up dangling mount records in the HA registry and dangling mount resources?

Technical Blog Post



Body

A couple of months ago we walked through a practical example of adding a storage group in pureScale and what to expect at the DB2, TSA and GPFS levels:
/support/pages/node/1139998


That is the ideal-world scenario (the good case). However, a user might see dangling mount records in the HA registry and dangling mount resources after the "drop stogroup" command.
This article is about how to clean up such a mess.


1. Create the GPFS filesystems

$>db2cluster -cfs -create -filesystem gpfs_temp11 -disk /dev/hdisk113,/dev/hdisk114,/dev/hdisk115,/dev/hdisk116,/dev/hdisk117,/dev/hdisk118,/dev/hdisk119,/dev/hdisk120,/dev/hdisk121,/dev/hdisk122 -mount /db2/DUIT/temp11

$>db2cluster -cfs -create -filesystem gpfs_temp12 -disk /dev/hdisk123,/dev/hdisk124,/dev/hdisk125,/dev/hdisk126,/dev/hdisk127,/dev/hdisk128,/dev/hdisk129,/dev/hdisk130,/dev/hdisk131,/dev/hdisk132 -mount /db2/DUIT/temp12


2. Create the storage group as the DB2 instance user
   $>db2 "create stogroup <storage group name> on '/db2/DUIT/temp11', '/db2/DUIT/temp12'"


3. Create a system temporary tablespace
   $>db2 "create system temporary tablespace <tbs name> managed by automatic storage using stogroup <storage group name>"


4. Drop the system temporary tablespace
   $>db2 "drop tablespace <tbs name>"


4.5 Remove the files and delete the filesystem
    $>rm -rf /db2/DUIT/temp11/*
    $>rm -rf /db2/DUIT/temp12/*
    $>db2cluster -cfs -delete -filesystem gpfs_temp11
    => Note that the storage group has not been dropped yet at this point.


5. Drop the storage group
   $>db2 "drop stogroup <storage group name>"


   At this stage, dangling mount records remain in the HA registry (db2hareg -dump) and dangling mount resources remain as well (mmlsnsd).

      
> Checking the storage groups with the db2pd command shows that the corresponding entries have been deleted from each member (member 0 and member 1), as expected.
                                                                
$>db2pd -d DUIT -storagepaths                                          
Database Member 0 -- Database DUIT -- Active -- Up 4 days 20:06:04 -- Date 2016-12-14-17.39.16.519793

Storage Group Configuration:                                           
Address            SGID  Default  DataTag    Name                      
0x0A00030026227B80 0     Yes      0          IBMSTOGROUP               
0x0A00030026227E00 1     No       0          SGDUITDB                  
0x0A00030026275460 2     No       0          SGDUITHS                  
0x0A00030026299460 3     No       0          SGDUITTP                  
0x0A000300262C1460 4     No       0          SGDUITTL                   

Storage Group Statistics:                                              
Address            SGID  State      Numpaths  NumDropPen               
0x0A00030026227B80 0     0x00000000 1         0                        
0x0A00030026227E00 1     0x00000000 4         0                        
0x0A00030026275460 2     0x00000000 2         0                        
0x0A00030026299460 3     0x00000000 4         0                        
0x0A000300262C1460 4     0x00000000 1         0                         

Storage Group Paths:                                                   
Address            SGID  PathID    PathState    PathName               
0x0A0003002624D000 0     0         InUse        /db2/DUIT/datas        
0x0A0003002626F000 1     1024      InUse        /db2/DUIT/data1        
0x0A00030026271000 1     1025      InUse        /db2/DUIT/data2        
0x0A00030026273000 1     1026      InUse        /db2/DUIT/data3        
0x0A00030026275000 1     1027      InUse        /db2/DUIT/data4        
0x0A00030026297000 2     2048      InUse        /db2/DUIT/datahst1     
0x0A00030026299000 2     2049      InUse        /db2/DUIT/datahst2     
0x0A000300262BB000 3     3072      InUse        /db2/DUIT/temp1        
0x0A000300262BD000 3     3073      InUse        /db2/DUIT/temp2        
0x0A000300262BF000 3     3074      InUse        /db2/DUIT/temp3        
0x0A000300262C1000 3     3075      InUse        /db2/DUIT/temp4        
0x0A000300262E3000 4     4096      InUse        /db2/DUIT/tool          

Database Member 1 -- Database DUIT -- Active -- Up 4 days 20:33:37 -- Date 2016-12-14-17.39.20.571062

Storage Group Configuration:                                           
Address            SGID  Default  DataTag    Name                      
0x0A00030026227B80 0     Yes      0          IBMSTOGROUP               
0x0A00030026227E00 1     No       0          SGDUITDB                  
0x0A00030026275460 2     No       0          SGDUITHS                  
0x0A00030026299460 3     No       0          SGDUITTP                  
0x0A000300262C1460 4     No       0          SGDUITTL                   

Storage Group Statistics:                                              
Address            SGID  State      Numpaths  NumDropPen               
0x0A00030026227B80 0     0x00000000 1         0                        
0x0A00030026227E00 1     0x00000000 4         0                        
0x0A00030026275460 2     0x00000000 2         0                        
0x0A00030026299460 3     0x00000000 4         0                        
0x0A000300262C1460 4     0x00000000 1         0                         

Storage Group Paths:                                                   
Address            SGID  PathID    PathState    PathName               
0x0A000307DD2B3000 0     0         InUse        /db2/DUIT/datas        
0x0A000307DD239000 1     1024      InUse        /db2/DUIT/data1        
0x0A000307DD6F5000 1     1025      InUse        /db2/DUIT/data2        
0x0A000307DD42E000 1     1026      InUse        /db2/DUIT/data3        
0x0A000307DD6F8000 1     1027      InUse        /db2/DUIT/data4        
0x0A00030026297000 2     2048      InUse        /db2/DUIT/datahst1     
0x0A00030026299000 2     2049      InUse        /db2/DUIT/datahst2     
0x0A000307DCACA000 3     3072      InUse        /db2/DUIT/temp1        
0x0A000307DC9B6000 3     3073      InUse        /db2/DUIT/temp2        
0x0A000307DC006000 3     3074      InUse        /db2/DUIT/temp3        
0x0A000307DE0C7000 3     3075      InUse        /db2/DUIT/temp4        
0x0A000307DD37E000 4     4096      InUse        /db2/DUIT/tool          

      

However, db2hareg -dump still shows those entries (/db2/DUIT/temp11, /db2/DUIT/temp12):

<db2hareg -dump>                                   
B01000000000000,IN,100,2,1,1                                            
B01000000000000,DN,ilancer21,,ilancer21-en1,ilancer21-en2               
B01000000000000,NL,128,ilancer21,0,ilancer21-en1,ilancer21-en2,-,CF     
B01000000000000,NL,0,ilancer21,0,ilancer21-en1,ilancer21-en2,-,MEMBER   
B01000000000000,MO,/db2/DUIT/inst, ,0,4,0                               
B01000000000000,RU,32872,32736                                          
B01000000000000,NL,129,ilancer22,0,ilancer22-en1,ilancer22-en2,-,CF     
B01000000000000,NL,1,ilancer22,0,ilancer22-en1,ilancer22-en2,-,MEMBER   
B01000000000000,DN,ilancer22,,ilancer22-en1,ilancer22-en2               
B01000000000000,MO,/db2/DUIT/diag, ,0,1,0                               
B01000000000000,DB,DUIT,1                                               
B01000000000000,MO,/db2/DUIT/temp12,DUIT,0,1,0
B01000000000000,MO,/db2/DUIT/temp11,DUIT,0,1,0
B01000000000000,MO,/db2/DUIT/tool,DUIT,0,2,0                            
B01000000000000,MO,/db2/DUIT/temp4,DUIT,0,2,0                           
B01000000000000,MO,/db2/DUIT/temp3,DUIT,0,2,0                           
B01000000000000,MO,/db2/DUIT/temp2,DUIT,0,2,0                           
B01000000000000,MO,/db2/DUIT/temp1,DUIT,0,2,0                           
B01000000000000,MO,/db2/DUIT/datahst2,DUIT,0,2,0                        
B01000000000000,MO,/db2/DUIT/datahst1,DUIT,0,2,0                        
B01000000000000,MO,/db2/DUIT/data4,DUIT,0,2,0                           
B01000000000000,MO,/db2/DUIT/data3,DUIT,0,2,0                           
B01000000000000,MO,/db2/DUIT/data2,DUIT,0,2,0                           
B01000000000000,MO,/db2/DUIT/data1,DUIT,0,2,0                           
B01000000000000,MO,/db2/DUIT/datas,DUIT,0,10,0                          
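
To spot such leftovers without reading the dump by eye, the mount (MO) records can be compared with the storage paths that db2pd reports. The following is only a minimal sketch, not an IBM tool: the function names are hypothetical, the field positions are inferred from the dump shown above, and reading the second-to-last field as the usecount is an assumption.

```python
def mount_records(dump_text):
    """Parse the MO (mount) lines of `db2hareg -dump` output.

    Assumed layout, inferred from the dump above:
    <id>,MO,<path>,<database>,<n>,<count>,<n>
    """
    records = []
    for line in dump_text.splitlines():
        fields = [f.strip() for f in line.split(",")]
        if len(fields) == 7 and fields[1] == "MO":
            records.append({
                "path": fields[2],
                "database": fields[3],    # empty for instance-level mounts
                "count": int(fields[5]),  # ASSUMPTION: this is the usecount
            })
    return records


def dangling_mounts(dump_text, database, storage_paths):
    """Return mount paths registered for `database` that db2pd no longer lists."""
    known = set(storage_paths)
    return [r["path"] for r in mount_records(dump_text)
            if r["database"] == database and r["path"] not in known]


# Excerpt of the dump above, and two of the storage paths reported by
# `db2pd -d DUIT -storagepaths`.
dump = """\
B01000000000000,MO,/db2/DUIT/diag, ,0,1,0
B01000000000000,DB,DUIT,1
B01000000000000,MO,/db2/DUIT/temp12,DUIT,0,1,0
B01000000000000,MO,/db2/DUIT/temp11,DUIT,0,1,0
B01000000000000,MO,/db2/DUIT/temp1,DUIT,0,2,0
B01000000000000,MO,/db2/DUIT/datas,DUIT,0,10,0
"""
paths = ["/db2/DUIT/datas", "/db2/DUIT/temp1"]
print(dangling_mounts(dump, "DUIT", paths))
```

Paths are compared as whole strings, so /db2/DUIT/temp1 is not confused with /db2/DUIT/temp11.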

They also show up in the mmlsnsd output:

                                                                        
$>mmlsnsd                                                               
FILE SYSTEM NAME                       MOUNT POINT                      
---------------------------------      -------------------------        
db2fs1                                 /db2/DUIT/inst                   
gpfs_arch                              /db2/DUIT/arch                   
gpfs_data1                             /db2/DUIT/data1                  
gpfs_data2                             /db2/DUIT/data2                  
gpfs_data3                             /db2/DUIT/data3                  
gpfs_data4                             /db2/DUIT/data4                  
gpfs_datahst1                          /db2/DUIT/datahst1               
gpfs_datahst2                          /db2/DUIT/datahst2               
gpfs_datas                             /db2/DUIT/datas                  
gpfs_diag                              /db2/DUIT/diag                   
gpfs_log_act                           /db2/DUIT/log_act                
gpfs_log_mir                           /db2/DUIT/log_mir                
gpfs_temp1                             /db2/DUIT/temp1                  
gpfs_temp11                            /db2/DUIT/temp11                 
gpfs_temp12                            /db2/DUIT/temp12                 
gpfs_temp2                             /db2/DUIT/temp2                  
gpfs_temp3                             /db2/DUIT/temp3                  
gpfs_temp4                             /db2/DUIT/temp4                  
gpfs_tool                              /db2/DUIT/tool                   
gpfs_work                              /db2/DUIT/work                   
                                                                        
                                                                        
                                                            
We tried to take the TSA resource group for temp11 offline:
                                                                        
#>chrg -o offline db2mnt-db2_DUIT_temp11-rg                             
root@ilancer22:/tmp # lssam | grep -i temp11                            
Pending offline IBM.ResourceGroup:db2mnt-db2_DUIT_temp11-rg Nominal=Offline                                                         
        '- Online IBM.Application:db2mnt-db2_DUIT_temp11-rs             
                |- Online IBM.Application:db2mnt-db2_DUIT_temp11-rs:ilancer21                     
                '- Online IBM.Application:db2mnt-db2_DUIT_temp11-rs:ilancer22                     
Online IBM.Equivalency:db2mnt-db2_DUIT_temp11-rg_group-equ              
                                                                        
=> The resource group stays in the "Pending offline" state.
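
When many resource groups are involved, it can help to filter the lssam output for groups whose observed state differs from their nominal state. The sketch below is an illustration only: the function name and the regular expression are hypothetical, and the line format is assumed from the lssam output shown above.

```python
import re

# Matches lines such as
#   "Pending offline IBM.ResourceGroup:<name> Nominal=Offline"
# and tolerates the tree-drawing prefix ("|-", "'-") of nested entries.
LINE = re.compile(
    r"^[\s|'-]*(?P<state>[A-Za-z ]+?)\s+"
    r"(?P<cls>IBM\.\w+):(?P<name>\S+)"
    r"(?:\s+Nominal=(?P<nominal>\w+))?\s*$"
)


def unsettled_groups(lssam_text):
    """Return (name, observed, nominal) for entries not yet in their nominal state."""
    result = []
    for line in lssam_text.splitlines():
        m = LINE.match(line)
        # Only lines that carry a Nominal= value (resource groups) are compared.
        if m and m["nominal"] and m["state"].lower() != m["nominal"].lower():
            result.append((m["name"], m["state"], m["nominal"]))
    return result


lssam_output = """\
Pending offline IBM.ResourceGroup:db2mnt-db2_DUIT_temp11-rg Nominal=Offline
        '- Online IBM.Application:db2mnt-db2_DUIT_temp11-rs
                |- Online IBM.Application:db2mnt-db2_DUIT_temp11-rs:ilancer21
                '- Online IBM.Application:db2mnt-db2_DUIT_temp11-rs:ilancer22
Online IBM.Equivalency:db2mnt-db2_DUIT_temp11-rg_group-equ
"""
print(unsettled_groups(lssam_output))
```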

 

 

By design, when the "drop stogroup" command is issued, DB2 attempts to delete the mount resources as follows:
1. First, decrement the usecount for the mount records.
2. If the new usecount == 0 for a mount record, delete the mount resource from the TSA resource model and remove the mount record from the HA registry.
 

In the case above, the initial usecount for both mount resources was 2 (found from DB2 traces).
As a result, the code decremented the usecount for each mount record (making it 1) and exited without taking any further action.

It was later found that, because of the user's incorrect usage of DB2 commands, the resources may not have been properly cleaned up during the first invocation of the "drop stogroup" command.
As a result, the subsequent attempt to create the storage group incremented the usecount of the existing mount records to 2.
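
The usecount behaviour described above is ordinary reference counting, and a toy model makes the failure mode easier to see. This is a sketch for illustration, not DB2 code: the class and method names are hypothetical, and `force_delete_record` merely stands in for a manual deletion of the mount record, under the assumption that such a deletion leaves the TSA mount resource in place.

```python
class MountRegistry:
    """Toy model of HA-registry mount records; not DB2 code."""

    def __init__(self):
        self.usecount = {}  # mount path -> usecount in the HA registry
        self.tsa = set()    # mount paths that have a TSA mount resource

    def create_ref(self, path):
        """A storage group starts using `path` (create stogroup)."""
        self.usecount[path] = self.usecount.get(path, 0) + 1
        self.tsa.add(path)

    def drop_ref(self, path):
        """A storage group stops using `path` (drop stogroup)."""
        self.usecount[path] -= 1
        if self.usecount[path] == 0:
            # Last user gone: remove the TSA resource and the registry record.
            del self.usecount[path]
            self.tsa.discard(path)

    def force_delete_record(self, path):
        """Manual cleanup: removes the record only, not the TSA resource."""
        self.usecount.pop(path, None)


reg = MountRegistry()
path = "/db2/DUIT/temp11"

reg.create_ref(path)           # first create; its record was never cleaned up
reg.create_ref(path)           # second create finds the old record: usecount 2
reg.drop_ref(path)             # drop stogroup: 2 -> 1, nothing is removed
print(reg.usecount, reg.tsa)   # record and resource are both left dangling

reg.force_delete_record(path)  # manual deletion of the mount record
reg.create_ref(path)           # fresh record with usecount 1
reg.drop_ref(path)             # 1 -> 0: record and TSA resource are removed
print(reg.usecount, reg.tsa)
```

In this model a single drop can only ever undo a single create, which is why a stale record with an inflated usecount survives the "drop stogroup" command.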
 

The next task is to clean up this mess:
how do we delete the mount records from the HA registry and delete the mount resources?

 

    1. Delete the mount records from the HA registry
    [db2inst1@ilancer21:/unify/IBM/db2inst1] db2hareg -del Mount path=/db2/DUIT/temp11,databasename=DUIT
    [db2inst1@ilancer21:/unify/IBM/db2inst1] db2hareg -del Mount path=/db2/DUIT/temp12,databasename=DUIT
 

    2. Check the HA registry

    [db2inst1@ilancer21:/unify/IBM/db2inst1] db2hareg -dump; date
    B01000000000000,IN,100,2,1,1
    B01000000000000,DN,ilancer21,,ilancer21-en1,ilancer21-en2
    B01000000000000,NL,128,ilancer21,0,ilancer21-en1,ilancer21-en2,-,CF
    B01000000000000,NL,0,ilancer21,0,ilancer21-en1,ilancer21-en2,-,MEMBER
    B01000000000000,MO,/db2/DUIT/inst, ,0,4,0
    B01000000000000,RU,32872,32736
    B01000000000000,NL,129,ilancer22,0,ilancer22-en1,ilancer22-en2,-,CF
    B01000000000000,NL,1,ilancer22,0,ilancer22-en1,ilancer22-en2,-,MEMBER
    B01000000000000,DN,ilancer22,,ilancer22-en1,ilancer22-en2
    B01000000000000,MO,/db2/DUIT/diag, ,0,1,0
    B01000000000000,DB,DUIT,1
    B01000000000000,MO,/db2/DUIT/tool,DUIT,0,2,0
    B01000000000000,MO,/db2/DUIT/temp4,DUIT,0,2,0
    B01000000000000,MO,/db2/DUIT/temp3,DUIT,0,2,0
    B01000000000000,MO,/db2/DUIT/temp2,DUIT,0,2,0
    B01000000000000,MO,/db2/DUIT/temp1,DUIT,0,2,0
    B01000000000000,MO,/db2/DUIT/datahst2,DUIT,0,2,0
    B01000000000000,MO,/db2/DUIT/datahst1,DUIT,0,2,0
    B01000000000000,MO,/db2/DUIT/data4,DUIT,0,2,0
    B01000000000000,MO,/db2/DUIT/data3,DUIT,0,2,0
    B01000000000000,MO,/db2/DUIT/data2,DUIT,0,2,0
    B01000000000000,MO,/db2/DUIT/data1,DUIT,0,2,0
    B01000000000000,MO,/db2/DUIT/datas,DUIT,0,10,0

    => Both /db2/DUIT/temp11 and /db2/DUIT/temp12 have been deleted

    3. Set the management scope

    root@ilancer21:/usr/lpp/mmfs/bin # export CT_MANAGEMENT_SCOPE=2


    4. Lock the resource group
    root@ilancer21:/usr/lpp/mmfs/bin # rgreq -o lock db2mnt-db2_DUIT_temp11-rg
    Completed applying request to resource group "db2mnt-db2_DUIT_temp11-rg".


    5. Check the relationships
    root@ilancer21:/usr/lpp/mmfs/bin # lsrel | grep -i temp11
    db2_db2inst1_1-rs_DependsOn_db2mnt-db2_DUIT_temp11-rs-rel    IBM.Application:db2_db2inst1_1-rs              db2_db2inst1_1-rg
    db2_db2inst1_0-rs_DependsOn_db2mnt-db2_DUIT_temp11-rs-rel    IBM.Application:db2_db2inst1_0-rs              db2_db2inst1_0-rg


    6. Remove the relationships (the problem occurred here)
    t1n1[root]:/>rmrel db2_db2inst1_0-rs_DependsOn_db2mnt-db2_DUIT_temp11-rs-rel
    (rmrsrc-api) 2621-014 Command not allowed - one or more related resource groups are online.
    rmrel: 2622-009 An unexpected RMC error occurred.The RMC return code was 1.
    rmrel: 2622-229 None of the specified Relationships were found or could not be removed.

    => db2mnt-db2_DUIT_temp11-rs is related to db2_db2_0-rs,
        and db2_db2_0-rs is the DB2 process (db2sysc).
        Deleting db2_db2_0-rs_DependsOn_db2mnt-temp11-rs-rel therefore requires DB2 to be down.
        In a test, the relationship could indeed be removed after db2stop, but we needed to work online, so we had to look for another way.
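
Before attempting rmrel, it is useful to know exactly which resources depend on the mount resource. The lsrel output from step 5 can be parsed as in the following sketch. The function name is hypothetical, and the sketch assumes that relationship names follow the `<source>_DependsOn_<target>-rel` pattern seen in the output above.

```python
def dependents_of(lsrel_text, target):
    """Return (relationship, source, resource_group) rows that depend on `target`.

    ASSUMPTION: relationship names look like <source>_DependsOn_<target>-rel,
    as in the lsrel output above.
    """
    rows = []
    for line in lsrel_text.splitlines():
        cols = line.split()
        if len(cols) != 3 or "_DependsOn_" not in cols[0]:
            continue
        rel_name, source, group = cols
        depends_on = rel_name.split("_DependsOn_", 1)[1]
        if depends_on.endswith("-rel"):
            depends_on = depends_on[:-len("-rel")]
        if depends_on == target:  # exact match: temp1 must not match temp11
            rows.append((rel_name, source, group))
    return rows


lsrel_output = """\
db2_db2inst1_1-rs_DependsOn_db2mnt-db2_DUIT_temp11-rs-rel    IBM.Application:db2_db2inst1_1-rs              db2_db2inst1_1-rg
db2_db2inst1_0-rs_DependsOn_db2mnt-db2_DUIT_temp11-rs-rel    IBM.Application:db2_db2inst1_0-rs              db2_db2inst1_0-rg
"""
for rel, source, group in dependents_of(lsrel_output, "db2mnt-db2_DUIT_temp11-rs"):
    print(source, "in", group, "via", rel)
```

Both rows point at the DB2 member resources, which is consistent with the rmrel failure: those resource groups are online.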

 

So we tried using DB2 commands to create the storage group again and then drop it. It worked this time because the mount records had been deleted from the HA registry.
Creating and dropping the storage group made new entries in the HA registry with the proper usecount values and yielded the expected outcome.


    1. Create the storage group again after deleting the HA registry records
    [db2inst1@ilancer21:/unify/IBM/db2inst1] db2 "create stogroup test_group on '/db2/DUIT/temp11','/db2/DUIT/temp12'"
    DB20000I  The SQL command completed successfully.

    => The storage group must use both "/db2/DUIT/temp11" and "/db2/DUIT/temp12" so that both filesystems can be deleted later.


    2. Drop the storage group
    [db2inst1@ilancer21:/unify/IBM/db2inst1] db2 "drop stogroup test_group"
    DB20000I  The SQL command completed successfully.


    3. Check DB2, TSA and the HA registry
    <DB2 Check>
    [db2inst1@ilancer21:/unify/IBM/db2inst1] db2pd -d DUIT -storagepaths
    Storage Group Configuration:
    Address            SGID  Default  DataTag    Name
    0x0A0003002622DBA0 0     Yes      0          IBMSTOGROUP
    0x0A0003002622DE20 1     No       0          SGDUITDB
    0x0A0003002627B460 2     No       0          SGDUITHS
    0x0A0003002629F460 3     No       0          SGDUITTP
    0x0A000300262C7460 4     No       0          SGDUITTL

    Storage Group Statistics:
    Address            SGID  State      Numpaths  NumDropPen
    0x0A0003002622DBA0 0     0x00000000 1         0
    0x0A0003002622DE20 1     0x00000000 4         0
    0x0A0003002627B460 2     0x00000000 2         0
    0x0A0003002629F460 3     0x00000000 4         0
    0x0A000300262C7460 4     0x00000000 1         0
    0x0A000307E281A000 5     0x00000000 2         0

    Storage Group Paths:
    Address            SGID  PathID    PathState    PathName
    0x0A00030026253000 0     0         InUse        /db2/DUIT/datas
    0x0A00030026275000 1     1024      InUse        /db2/DUIT/data1
    0x0A00030026277000 1     1025      InUse        /db2/DUIT/data2
    0x0A00030026279000 1     1026      InUse        /db2/DUIT/data3
    0x0A0003002627B000 1     1027      InUse        /db2/DUIT/data4
    0x0A0003002629D000 2     2048      InUse        /db2/DUIT/datahst1
    0x0A0003002629F000 2     2049      InUse        /db2/DUIT/datahst2
    0x0A000300262C1000 3     3072      InUse        /db2/DUIT/temp1
    0x0A000300262C3000 3     3073      InUse        /db2/DUIT/temp2
    0x0A000300262C5000 3     3074      InUse        /db2/DUIT/temp3
    0x0A000300262C7000 3     3075      InUse        /db2/DUIT/temp4
    0x0A000300262E9000 4     4096      InUse        /db2/DUIT/tool
    ilancer21: db2pd -d DUIT -storagepaths ... completed ok

 

    <TSA Check>
    [db2inst1@ilancer21:/unify/IBM/db2inst1] lssam -nocolor | grep -i temp11
    [db2inst1@ilancer21:/unify/IBM/db2inst1] lssam -nocolor | grep -i temp12

    => TSA no longer shows temp11 and temp12

 

    <HA registry>
    [db2inst1@ilancer21:/unify/IBM/db2inst1] db2hareg -dump
    B01000000000000,IN,100,2,1,1
    B01000000000000,DN,ilancer21,,ilancer21-en1,ilancer21-en2
    B01000000000000,NL,128,ilancer21,0,ilancer21-en1,ilancer21-en2,-,CF
    B01000000000000,NL,0,ilancer21,0,ilancer21-en1,ilancer21-en2,-,MEMBER
    B01000000000000,MO,/db2/DUIT/inst, ,0,4,0
    B01000000000000,RU,32872,32736
    B01000000000000,NL,129,ilancer22,0,ilancer22-en1,ilancer22-en2,-,CF
    B01000000000000,NL,1,ilancer22,0,ilancer22-en1,ilancer22-en2,-,MEMBER
    B01000000000000,DN,ilancer22,,ilancer22-en1,ilancer22-en2
    B01000000000000,MO,/db2/DUIT/diag, ,0,1,0
    B01000000000000,DB,DUIT,1
    B01000000000000,MO,/db2/DUIT/tool,DUIT,0,2,0
    B01000000000000,MO,/db2/DUIT/temp4,DUIT,0,2,0
    B01000000000000,MO,/db2/DUIT/temp3,DUIT,0,2,0
    B01000000000000,MO,/db2/DUIT/temp2,DUIT,0,2,0
    B01000000000000,MO,/db2/DUIT/temp1,DUIT,0,2,0
    B01000000000000,MO,/db2/DUIT/datahst2,DUIT,0,2,0
    B01000000000000,MO,/db2/DUIT/datahst1,DUIT,0,2,0
    B01000000000000,MO,/db2/DUIT/data4,DUIT,0,2,0
    B01000000000000,MO,/db2/DUIT/data3,DUIT,0,2,0
    B01000000000000,MO,/db2/DUIT/data2,DUIT,0,2,0
    B01000000000000,MO,/db2/DUIT/data1,DUIT,0,2,0
    B01000000000000,MO,/db2/DUIT/datas,DUIT,0,10,0

    => The HA registry no longer shows temp11 and temp12
 

    4. Delete files from both /db2/DUIT/temp11 and /db2/DUIT/temp12
    root@ilancer21:/db2/DUIT/temp11 # rm -fr /db2/DUIT/temp11/.*
    root@ilancer21:/db2/DUIT/temp11 # rm -fr /db2/DUIT/temp11/*

    root@ilancer21:/db2/DUIT/temp12 # rm -fr /db2/DUIT/temp12/.*
    root@ilancer21:/db2/DUIT/temp12 # rm -fr /db2/DUIT/temp12/*
 

    => All files have been deleted, except .snapshots, which was not removed.


    5. Delete the GPFS filesystems (as the root user, from the DB2 <install path>/bin directory)
    root@ilancer21:/unify/IBM/db2/V11.1_SB_36064/bin # ./db2cluster -cfs -delete -filesystem gpfs_temp11
    File system 'gpfs_temp11' has been successfully deleted.

    root@ilancer21:/unify/IBM/db2/V11.1_SB_36064/bin # ./db2cluster -cfs -delete -filesystem gpfs_temp12
    File system 'gpfs_temp12' has been successfully deleted.

    => Both of them have been successfully deleted
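
To confirm that the cleanup removed only the intended records, the db2hareg -dump output captured before and after the procedure can be compared line by line. The following is a minimal sketch under that assumption; the function name is hypothetical, and the sample text is an excerpt of the dumps shown earlier.

```python
def removed_records(before, after):
    """Return records present in the first dump but missing from the second."""
    def clean(text):
        # Ignore blank lines and the trailing padding seen in the dumps.
        return [ln.strip() for ln in text.splitlines() if ln.strip()]

    remaining = set(clean(after))
    return [ln for ln in clean(before) if ln not in remaining]


before = """\
B01000000000000,DB,DUIT,1
B01000000000000,MO,/db2/DUIT/temp12,DUIT,0,1,0
B01000000000000,MO,/db2/DUIT/temp11,DUIT,0,1,0
B01000000000000,MO,/db2/DUIT/tool,DUIT,0,2,0
"""
after = """\
B01000000000000,DB,DUIT,1
B01000000000000,MO,/db2/DUIT/tool,DUIT,0,2,0
"""
for record in removed_records(before, after):
    print(record)
```

Because whole lines are compared, a record whose counters changed would also be reported, which is usually worth noticing in this kind of check.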

                     

Thanks,
Shashank Kharche

IBM DB2 LUW Lab

 


UID

ibm13286641