SystemAdmin

Pinned topic mmdeldisk doesn't finish

2012-11-27T07:42:38Z
Hi!
When I delete a disk from a GPFS file system, the operation starts at normal speed,
but after a while it becomes slower and slower; at the current rate it will need about 23 days to finish, and
I still have to delete 7 other disks!

Here are my environment and some command output.

GPFS-Cluster: 8 IBM X3650 nodes (each has 24GB RAM and 4 cores), OS: SLES11 SP1
GPFS-Version: 3.3.0.27 x86_64
Original file system version: 10.00 (3.2.0.0)
Current file system version : 11.05 (3.3.0.2)

Output of "mmlsmgr":

file system      manager node
---------------- ------------------
gpfs_c           10.136.2.1 (gpfs-io01)
gpfs_u           10.136.2.5 (gpfs-io05)
gpfs_p           10.136.2.8 (gpfs-io08)


Output of "mmlsdisk gpfs_p":

disk         driver   sector failure holds    holds                            storage
name         type       size   group metadata data  status        availability pool
------------ -------- ------ ------- -------- ----- ------------- ------------ -------------
nsd_p_d17    nsd         512    1000 Yes      Yes   ready         up           system
nsd_p_d18    nsd         512    1000 Yes      Yes   ready         up           system
nsd_p_d1     nsd         512    1000 Yes      Yes   ready         up           system
nsd_p_d2     nsd         512    1000 Yes      Yes   ready         up           system
nsd_p_d3     nsd         512    1000 Yes      Yes   ready         up           system
nsd_p_d4     nsd         512    1000 Yes      Yes   ready         up           system
nsd_p_d5     nsd         512    1000 Yes      Yes   ready         up           system
nsd_p_d6     nsd         512    1000 Yes      Yes   ready         up           system
nsd_p_d7     nsd         512    1000 Yes      Yes   ready         up           system
nsd_p_d8     nsd         512    1000 Yes      Yes   ready         up           system
nsd_p_d9     nsd         512    1001 Yes      Yes   ready         up           system
nsd_p_d10    nsd         512    1001 Yes      Yes   ready         up           system
nsd_p_d11    nsd         512    1001 Yes      Yes   ready         up           system
nsd_p_d12    nsd         512    1001 Yes      Yes   ready         up           system
nsd_p_d13    nsd         512    1001 Yes      Yes   ready         up           system
nsd_p_d14    nsd         512    1001 Yes      Yes   ready         up           system
nsd_p_d15    nsd         512    1001 Yes      Yes   ready         up           system
nsd_p_d16    nsd         512    1001 Yes      Yes   ready         up           system
nsd_p_d19    nsd         512    1001 Yes      Yes   ready         up           system
nsd_p_d20    nsd         512    1001 Yes      Yes   ready         up           system
nsd_p_md01   nsd         512    1010 Yes      No    suspended     up           system
nsd_p_md02   nsd         512    1010 Yes      No    suspended     up           system
nsd_p_md03   nsd         512    1010 Yes      No    suspended     up           system
nsd_p_md04   nsd         512    1010 Yes      No    suspended     up           system
nsd_p_d21    nsd         512    1000 No       Yes   ready         up           pool_level_30
nsd_p_d22    nsd         512    1000 No       Yes   ready         up           pool_level_30
nsd_p_d23    nsd         512    1001 No       Yes   ready         up           pool_level_30
nsd_p_md05   nsd         512    1010 Yes      No    suspended     up           system
nsd_p_md06   nsd         512    1010 Yes      No    suspended     up           system
nsd_p_md07   nsd         512    1010 Yes      No    suspended     up           system
nsd_p_md08   nsd         512    1010 Yes      No    suspended     up           system
Attention: Due to an earlier configuration change the file system may contain data that is at risk of being lost.


I want to delete one disk from the file system with
mmdeldisk gpfs_p "nsd_p_md08". Here is the output:


Verifying file system configuration information ...
mmdeldisk: File system gpfs_p has some disks that are in a non-ready state.
Deleting disks ...
Scanning file system metadata, phase 1 ...
   8 % complete on Mon Nov 26 11:52:07 2012
  12 % complete on Mon Nov 26 11:52:12 2012
  16 % complete on Mon Nov 26 11:52:16 2012
  29 % complete on Mon Nov 26 11:52:21 2012
  41 % complete on Mon Nov 26 11:52:30 2012
  45 % complete on Mon Nov 26 11:52:34 2012
  50 % complete on Mon Nov 26 11:52:39 2012
  54 % complete on Mon Nov 26 11:52:44 2012
  58 % complete on Mon Nov 26 11:52:48 2012
  62 % complete on Mon Nov 26 11:52:53 2012
  66 % complete on Mon Nov 26 11:52:57 2012
  79 % complete on Mon Nov 26 11:53:06 2012
  83 % complete on Mon Nov 26 11:53:10 2012
 100 % complete on Mon Nov 26 11:53:12 2012
Scan completed successfully.
Scanning file system metadata, phase 2 ...
Scanning file system metadata for pool_level_30 storage pool
Scan completed successfully.
Scanning file system metadata, phase 3 ...
Scan completed successfully.
Scanning file system metadata, phase 4 ...
Scan completed successfully.
Scanning user file metadata ...
 0.87 % complete on Mon Nov 26 11:53:38 2012  (    873505 inodes     467241 MB)
 2.29 % complete on Mon Nov 26 11:53:58 2012  (   2951201 inodes    1237112 MB)
 3.56 % complete on Mon Nov 26 11:54:19 2012  (   5245582 inodes    1922316 MB)
 4.78 % complete on Mon Nov 26 11:54:39 2012  (   7751048 inodes    2575343 MB)
 5.48 % complete on Mon Nov 26 11:55:00 2012  (  10122853 inodes    2954712 MB)
 6.28 % complete on Mon Nov 26 11:55:20 2012  (  13055921 inodes    3388646 MB)
 7.04 % complete on Mon Nov 26 11:55:41 2012  (  15716274 inodes    3798918 MB)
 7.66 % complete on Mon Nov 26 11:56:02 2012  (  18731764 inodes    4133525 MB)
 8.27 % complete on Mon Nov 26 11:56:22 2012  (  21742082 inodes    4457678 MB)
 8.72 % complete on Mon Nov 26 11:56:43 2012  (  24855787 inodes    4703419 MB)
 9.35 % complete on Mon Nov 26 11:57:03 2012  (  28242896 inodes    5042937 MB)
<some lines deleted>
36.69 % complete on Mon Nov 26 12:09:55 2012  ( 172228608 inodes   19786186 MB)
37.12 % complete on Mon Nov 26 12:10:16 2012  ( 182190080 inodes   20016688 MB)
37.45 % complete on Mon Nov 26 12:10:36 2012  ( 192675840 inodes   20196905 MB)
37.77 % complete on Mon Nov 26 12:10:59 2012  ( 201259013 inodes   20368048 MB)
39.08 % complete on Mon Nov 26 12:11:19 2012  ( 205258752 inodes   21073729 MB)
39.90 % complete on Mon Nov 26 12:11:40 2012  ( 209977344 inodes   21518658 MB)
40.64 % complete on Mon Nov 26 12:12:01 2012  ( 215744512 inodes   21914665 MB)
41.27 % complete on Mon Nov 26 12:12:22 2012  ( 221773824 inodes   22253621 MB)
41.77 % complete on Mon Nov 26 12:12:42 2012  ( 228589568 inodes   22524934 MB)
42.22 % complete on Mon Nov 26 12:13:03 2012  ( 236716032 inodes   22767522 MB)
42.64 % complete on Mon Nov 26 12:13:23 2012  ( 245891072 inodes   22995468 MB)
42.98 % complete on Mon Nov 26 12:13:44 2012  ( 255590400 inodes   23176160 MB)
43.25 % complete on Mon Nov 26 12:14:04 2012  ( 265289728 inodes   23323475 MB)
44.42 % complete on Mon Nov 26 12:14:28 2012  ( 270010990 inodes   23953776 MB)
45.76 % complete on Mon Nov 26 12:14:48 2012  ( 273549325 inodes   24675873 MB)
46.77 % complete on Mon Nov 26 12:15:10 2012  ( 278659072 inodes   25223888 MB)
47.50 % complete on Mon Nov 26 12:15:31 2012  ( 284688384 inodes   25618504 MB)
48.06 % complete on Mon Nov 26 12:15:52 2012  ( 290717696 inodes   25916418 MB)
48.62 % complete on Mon Nov 26 12:16:12 2012  ( 297795584 inodes   26219737 MB)
49.14 % complete on Mon Nov 26 12:16:32 2012  ( 305659904 inodes   26502287 MB)
49.56 % complete on Mon Nov 26 12:16:54 2012  ( 316145664 inodes   26726823 MB)
49.84 % complete on Mon Nov 26 12:17:14 2012  ( 327680000 inodes   26878410 MB)
50.46 % complete on Mon Nov 26 12:17:34 2012  ( 335876359 inodes   27212899 MB)
51.70 % complete on Mon Nov 26 12:17:54 2012  ( 339286467 inodes   27883528 MB)
52.88 % complete on Mon Nov 26 12:18:14 2012  ( 343628112 inodes   28517539 MB)
53.68 % complete on Mon Nov 26 12:18:35 2012  ( 347865088 inodes   28946924 MB)
54.39 % complete on Mon Nov 26 12:18:56 2012  ( 353894400 inodes   29330031 MB)
54.91 % complete on Mon Nov 26 12:19:16 2012  ( 360185856 inodes   29611367 MB)
55.44 % complete on Mon Nov 26 12:19:37 2012  ( 368050176 inodes   29896460 MB)
55.83 % complete on Mon Nov 26 12:19:58 2012  ( 376700928 inodes   30110591 MB)
56.15 % complete on Mon Nov 26 12:20:20 2012  ( 385613824 inodes   30279615 MB)
56.46 % complete on Mon Nov 26 12:20:40 2012  ( 396886016 inodes   30449842 MB)
57.53 % complete on Mon Nov 26 12:21:01 2012  ( 403808821 inodes   31023934 MB)
58.53 % complete on Mon Nov 26 12:21:21 2012  ( 407371776 inodes   31566977 MB)
59.46 % complete on Mon Nov 26 12:21:43 2012  ( 412614656 inodes   32066110 MB)
60.48 % complete on Mon Nov 26 12:22:03 2012  ( 419037184 inodes   32616077 MB)
61.15 % complete on Mon Nov 26 12:22:23 2012  ( 844103680 inodes   32977298 MB)
61.83 % complete on Mon Nov 26 12:22:43 2012  ( 938882868 inodes   33346437 MB)
62.07 % complete on Mon Nov 26 12:23:03 2012  (1062120305 inodes   33472218 MB)
62.31 % complete on Mon Nov 26 12:23:23 2012  (1195459316 inodes   33601272 MB)
62.95 % complete on Mon Nov 26 12:23:43 2012  (1262568180 inodes   33946479 MB)
63.14 % complete on Mon Nov 26 12:24:03 2012  (1296002837 inodes   34052130 MB)
63.38 % complete on Mon Nov 26 12:24:24 2012  (1330605845 inodes   34177305 MB)
63.52 % complete on Mon Nov 26 12:24:44 2012  (1356507258 inodes   34257286 MB)
63.70 % complete on Mon Nov 26 12:25:04 2012  (1397139578 inodes   34353440 MB)
63.83 % complete on Mon Nov 26 12:25:24 2012  (1445374074 inodes   34420218 MB)
63.92 % complete on Mon Nov 26 12:25:45 2012  (1463735337 inodes   34471096 MB)
63.93 % complete on Mon Nov 26 12:26:05 2012  (1465301233 inodes   34478331 MB)  (anything is fine until here, I think)
63.95 % complete on Mon Nov 26 12:26:25 2012  (1465301233 inodes   34478331 MB)
63.97 % complete on Mon Nov 26 12:26:55 2012  (1465301233 inodes   34478331 MB)
63.99 % complete on Mon Nov 26 12:27:35 2012  (1465301233 inodes   34478331 MB)
<some lines deleted>
65.71 % complete on Mon Nov 26 23:48:25 2012  (1465301233 inodes   34478331 MB)
65.73 % complete on Tue Nov 27 00:03:35 2012  (1465301233 inodes   34478331 MB)
65.75 % complete on Tue Nov 27 00:18:55 2012  (1465301233 inodes   34478331 MB)
65.77 % complete on Tue Nov 27 00:34:25 2012  (1465301233 inodes   34478331 MB)
65.79 % complete on Tue Nov 27 00:50:05 2012  (1465301233 inodes   34478331 MB)
65.81 % complete on Tue Nov 27 01:05:56 2012  (1465301233 inodes   34478331 MB)
65.83 % complete on Tue Nov 27 01:21:56 2012  (1465301233 inodes   34478331 MB)
65.85 % complete on Tue Nov 27 01:38:06 2012  (1465301233 inodes   34478331 MB)
65.87 % complete on Tue Nov 27 01:54:26 2012  (1465301233 inodes   34478331 MB)
65.89 % complete on Tue Nov 27 02:10:56 2012  (1465301233 inodes   34478331 MB)
65.91 % complete on Tue Nov 27 02:27:36 2012  (1465301233 inodes   34478331 MB)
65.93 % complete on Tue Nov 27 02:44:26 2012  (1465301233 inodes   34478331 MB)
65.95 % complete on Tue Nov 27 03:01:26 2012  (1465301233 inodes   34478331 MB)
65.97 % complete on Tue Nov 27 03:18:36 2012  (1465301233 inodes   34478331 MB)
65.99 % complete on Tue Nov 27 03:35:56 2012  (1465301233 inodes   34478331 MB)
66.01 % complete on Tue Nov 27 03:53:26 2012  (1465301233 inodes   34478331 MB)
66.03 % complete on Tue Nov 27 04:11:06 2012  (1465301233 inodes   34478331 MB)
66.05 % complete on Tue Nov 27 04:28:56 2012  (1465301233 inodes   34478331 MB)
66.07 % complete on Tue Nov 27 04:46:56 2012  (1465301233 inodes   34478331 MB)
66.09 % complete on Tue Nov 27 05:05:06 2012  (1465301233 inodes   34478331 MB)
66.11 % complete on Tue Nov 27 05:23:26 2012  (1465301233 inodes   34478331 MB)
66.13 % complete on Tue Nov 27 05:41:56 2012  (1465301233 inodes   34478331 MB)
66.15 % complete on Tue Nov 27 06:00:36 2012  (1465301233 inodes   34478331 MB)
66.17 % complete on Tue Nov 27 06:19:26 2012  (1465301233 inodes   34478331 MB)
66.19 % complete on Tue Nov 27 06:38:26 2012  (1465301233 inodes   34478331 MB)
66.21 % complete on Tue Nov 27 06:57:36 2012  (1465301233 inodes   34478331 MB)
66.23 % complete on Tue Nov 27 07:16:56 2012  (1465301233 inodes   34478331 MB)
66.25 % complete on Tue Nov 27 07:36:26 2012  (1465301233 inodes   34478331 MB)
66.27 % complete on Tue Nov 27 07:56:06 2012  (1465301233 inodes   34478331 MB)
...


On all 8 machines I see many "parallel tsrestripefs worker" waiters, but no real disk I/O.
The output of "mmlsnode -N waiters -L" is in the attached file.
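For reference, this is roughly how the waiters can be watched while the command runs (a sketch; mmlsnode is the same command as above, the loop is just convenience):

# print the current waiters from every node every 30 seconds
while true; do
    date
    mmlsnode -N waiters -L
    sleep 30
done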

Does anyone have an idea what is going wrong?

Greetings Dieter
Updated on 2012-11-30T11:06:03Z by SystemAdmin
  • HajoEhlers

    Re: mmdeldisk doesn't finish

    2012-11-27T23:16:22Z
    Just what I notice:

    I must admit that I do not know the meaning of:

    > Scanning user file metadata ...

    I would guess that it is some kind of linked list (metadata) for the files.

    In that case I am surprised by the number of inodes used by your files:
    1,400 MILLION inodes for a total of 34 TB of scanned data seems like a lot to me.
    As far as I remember, on our GPFS this value was about 10% higher than the total number of used inodes (from the df output). That would put your average file size at roughly 24 KB (34,478,331 MB / 1,465,301,233 inodes), yet in contrast the 'Scanning file system metadata, phase 1 ... phase 3' passes were quite fast.

    Also, you are deleting all of the disks in failure group 1010 (they are all suspended). I do not know whether mmdeldisk would move the metadata onto other
    disks if the replication settings are not correct. In that case, and if nsd_p_md01-07 can hold the data of nsd_p_md08, I would reactivate nsd_p_md01-07.
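    Reactivating them would be something along these lines (an untested sketch; please check the mmchdisk man page for your level):

    # clear the "suspended" state so the disks can take allocations again
    mmchdisk gpfs_p resume -d "nsd_p_md01;nsd_p_md02;nsd_p_md03;nsd_p_md04;nsd_p_md05;nsd_p_md06;nsd_p_md07"
    # they should then show "ready" instead of "suspended"
    mmlsdisk gpfs_p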

    Maybe I am on the wrong track, but I would like to see the mmlsfs and mmdf output for that file system.

    cheers
    Hajo
  • YuanZhengcai

    Re: mmdeldisk doesn't finish

    2012-11-28T12:33:54Z
    It looks like mmdeldisk hangs after it has processed 1465301233 inodes. From the attached file, threads on most nodes are waiting on an InodeCacheObj mutex; on node gpfs-io04, for example, all threads are waiting on the same mutex 0xFFFFC90016BE1220. The waiting times are short, so it seems that someone does broadcast the mutex but the waiters still cannot acquire it.

    Would it be possible to collect a couple of minutes of trace after the strange progress is noticed?
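    Something like the following should capture a short trace on all nodes (a sketch; the exact mmtracectl options depend on the GPFS level, see the Problem Determination Guide):

    # start tracing cluster-wide, let it run a few minutes while mmdeldisk is stuck, then stop it
    mmtracectl --start -N all
    sleep 300
    mmtracectl --stop -N all
    # the trace files are normally written under /tmp/mmfs on each node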

    63.70 % complete on Mon Nov 26 12:25:04 2012 (1397139578 inodes 34353440 MB)
    63.83 % complete on Mon Nov 26 12:25:24 2012 (1445374074 inodes 34420218 MB)
    63.92 % complete on Mon Nov 26 12:25:45 2012 (1463735337 inodes 34471096 MB)
    63.93 % complete on Mon Nov 26 12:26:05 2012 (1465301233 inodes 34478331 MB) (anything is fine until here, I think)
    63.95 % complete on Mon Nov 26 12:26:25 2012 (1465301233 inodes 34478331 MB)
    63.97 % complete on Mon Nov 26 12:26:55 2012 (1465301233 inodes 34478331 MB)
  • SystemAdmin

    Re: mmdeldisk doesn't finish

    2012-11-28T14:28:00Z
    Hi Hajo,

    reactivating some of the suspended disks does the trick.
    I reactivated nsd_p_md01-04 and started the mmdeldisk command for the remaining four NSDs.
    That way I could delete the first four disks. Maybe I can repeat this for the next two disks.
    But what to do with the last one?! Let's see what happens :-)
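    In commands, that was roughly (a sketch; disk names as listed above, exact syntax per the mmchdisk/mmdeldisk man pages):

    # take the four reactivated disks out of the "suspended" state
    mmchdisk gpfs_p resume -d "nsd_p_md01;nsd_p_md02;nsd_p_md03;nsd_p_md04"
    # then delete the remaining four NSDs in one run
    mmdeldisk gpfs_p "nsd_p_md05;nsd_p_md06;nsd_p_md07;nsd_p_md08"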

    FYI, here is the output of mmlsfs and mmdf:

    
    gpfs-io08:~ # mmlsfs gpfs_p
    flag value            description
    ---- ---------------- -----------------------------------------------------
     -f  16384            Minimum fragment size in bytes
     -i  512              Inode size in bytes
     -I  16384            Indirect block size in bytes
     -m  1                Default number of metadata replicas
     -M  2                Maximum number of metadata replicas
     -r  1                Default number of data replicas
     -R  2                Maximum number of data replicas
     -j  scatter          Block allocation type
     -D  nfs4             File locking semantics in effect
     -k  posix            ACL semantics in effect
     -a  -1               Estimated average file size
     -n  16               Estimated number of nodes that will mount file system
     -B  524288           Block size
     -Q  fileset          Quotas enforced
         none             Default quotas enabled
     -F  419430400        Maximum number of inodes
     -V  11.05 (3.3.0.2)  Current file system version
         10.00 (3.2.0.0)  Original file system version
     -u  Yes              Support for large LUNs?
     -z  No               Is DMAPI enabled?
     -L  8388608          Logfile size
     -E  Yes              Exact mtime mount option
     -S  No               Suppress atime mount option
     -K  whenpossible     Strict replica allocation option
     -P  system;pool_level_30  Disk storage pools in file system
     -d  nsd_p_d17;nsd_p_d18;nsd_p_d1;nsd_p_d2;nsd_p_d3;nsd_p_d4;nsd_p_d5;nsd_p_d6;nsd_p_d7;nsd_p_d8;nsd_p_d9;nsd_p_d10;nsd_p_d11;nsd_p_d12;nsd_p_d13;nsd_p_d14;nsd_p_d15;
     -d  nsd_p_d16;nsd_p_d19;nsd_p_d20;nsd_p_md01;nsd_p_md02;nsd_p_md03;nsd_p_md04;nsd_p_d21;nsd_p_d22;nsd_p_d23;nsd_p_md05;nsd_p_md06;nsd_p_md07;nsd_p_md08  Disks in file system
     -A  yes              Automatic mount option
     -o  none             Additional mount options
     -T  /gpfs/p          Default mount point
    


    
    gpfs-io08:~ # mmdf gpfs_p
    disk                disk size  failure holds    holds              free KB             free KB
    name                    in KB    group metadata data        in full blocks        in fragments
    --------------- ------------- -------- -------- ----- -------------------- -------------------
    Disks in storage pool: system (Maximum disk size allowed is 71 TB)
    nsd_p_d1           2928197632     1000 Yes      Yes      1454419456 ( 50%)      42577280 ( 1%)
    nsd_p_d2           2928197632     1000 Yes      Yes      1455008256 ( 50%)      42401392 ( 1%)
    nsd_p_d3           2928197632     1000 Yes      Yes      1454391808 ( 50%)      42474976 ( 1%)
    nsd_p_d4           2928197632     1000 Yes      Yes      1454894080 ( 50%)      42343392 ( 1%)
    nsd_p_d5           2928199856     1000 Yes      Yes      1454788608 ( 50%)      42371184 ( 1%)
    nsd_p_d6           2928199856     1000 Yes      Yes      1454813184 ( 50%)      42350640 ( 1%)
    nsd_p_d7           2928199856     1000 Yes      Yes      1454603776 ( 50%)      42488784 ( 1%)
    nsd_p_d8           2928199856     1000 Yes      Yes      1454888960 ( 50%)      42367920 ( 1%)
    nsd_p_d17          2928197632     1000 Yes      Yes      1455085568 ( 50%)      41791552 ( 1%)
    nsd_p_d18          2928197632     1000 Yes      Yes      1455905280 ( 50%)      41774176 ( 1%)
    nsd_p_d13          2928199856     1001 Yes      Yes      1454873600 ( 50%)      42709424 ( 1%)
    nsd_p_d14          2928199856     1001 Yes      Yes      1454778368 ( 50%)      42344544 ( 1%)
    nsd_p_d15          2928199856     1001 Yes      Yes      1454587904 ( 50%)      42423984 ( 1%)
    nsd_p_d16          2928199856     1001 Yes      Yes      1455053312 ( 50%)      42457888 ( 1%)
    nsd_p_d19          2928197632     1001 Yes      Yes      1456759296 ( 50%)      42774192 ( 1%)
    nsd_p_d20          2928197632     1001 Yes      Yes      1456851456 ( 50%)      42743488 ( 1%)
    nsd_p_d9           2928199856     1001 Yes      Yes      1454508544 ( 50%)      42505328 ( 1%)
    nsd_p_d10          2928199856     1001 Yes      Yes      1454662656 ( 50%)      42603616 ( 1%)
    nsd_p_d11          2928199856     1001 Yes      Yes      1454833152 ( 50%)      42482192 ( 1%)
    nsd_p_d12          2928199856     1001 Yes      Yes      1454873088 ( 50%)      42503456 ( 1%)
    nsd_p_md04          439023616     1010 Yes      No        438985216 (100%)          8352 ( 0%) *
    nsd_p_md03          439023616     1010 Yes      No        438986240 (100%)          7264 ( 0%) *
    nsd_p_md05          439023616     1010 Yes      No        439011840 (100%)          7104 ( 0%) *
    nsd_p_md06          439023616     1010 Yes      No        439013376 (100%)          5584 ( 0%) *
    nsd_p_md07          439023616     1010 Yes      No        439013376 (100%)          5584 ( 0%) *
    nsd_p_md08          439023616     1010 Yes      No        439014912 (100%)          4048 ( 0%) *
    nsd_p_md02          439023616     1010 Yes      No        438984704 (100%)          8336 ( 0%) *
    nsd_p_md01          439023616     1010 Yes      No        439000576 (100%)          9312 ( 0%) *
                    -------------                         -------------------- -------------------
    (pool total)      60320073792                           30856537088 ( 51%)     848522672 ( 1%)

    Disks in storage pool: pool_level_30 (Maximum disk size allowed is 142 TB)
    nsd_p_d22         15623913472     1000 No       Yes      6762683392 ( 43%)      79840144 ( 1%)
    nsd_p_d21         15623913472     1000 No       Yes      6764915712 ( 43%)      80654416 ( 1%)
    nsd_p_d23         15623913472     1001 No       Yes      6746755072 ( 43%)      78184768 ( 1%)
                    -------------                         -------------------- -------------------
    (pool total)      46871740416                           20274354176 ( 43%)     238679328 ( 1%)

                    =============                         ==================== ===================
    (data)           105435719744                           49374934528 ( 47%)    1087168736 ( 1%)
    (metadata)        60320073792                           30856537088 ( 51%)     848522672 ( 1%)
                    =============                         ==================== ===================
    (total)          107191814208                           51130891264 ( 48%)    1087202000 ( 1%)

    Inode Information
    -----------------
    Number of used inodes:        72713046
    Number of free inodes:       346717354
    Number of allocated inodes:  419430400
    Maximum number of inodes:    419430400
    


    cheers
    Dieter
  • HajoEhlers

    Re: mmdeldisk doesn't finish

    2012-11-28T16:10:47Z
    
    -m  1                Default number of metadata replicas
    -M  2                Maximum number of metadata replicas
    -r  1                Default number of data replicas
    -R  2                Maximum number of data replicas
    

    Your replication settings for data and metadata are 1.
    Your active failure groups are 1000 and 1001.
    -> No replicated data or metadata, even though you have two failure groups.

    You have quotas enforced on the GPFS file system.
    -> So on some of the disks you have the files
    
    fileset.quota group.quota user.quota
    


    All disks in FG 1010 seem to be empty but cannot be removed from the file system:
    
    ... nsd_p_md03          439023616     1010 Yes      No        438986240 (100%) 7264 ( 0%) *
    


    I would suspect that the quota files cannot be moved out of FG 1010.
    To solve this, I would change the failure group from 1001 to 1010 for all disks belonging to FG 1001 and then try to delete the last metadataOnly disk from FG 1010.
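    If I remember the mmchdisk descriptor right (DiskName:::DiskUsage:FailureGroup), that would be roughly the following (untested sketch, please verify against the man page):

    # move the FG-1001 disks into FG 1010, one descriptor per disk, semicolon separated
    mmchdisk gpfs_p change -d "nsd_p_d9:::dataAndMetadata:1010;nsd_p_d10:::dataAndMetadata:1010"
    # ...same for the remaining FG-1001 disks, then retry the delete
    mmdeldisk gpfs_p "nsd_p_md08"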

    If the above does not work, you will have to find out which data is still left on the last nsd_p_mdX disk - or you could do that step even before.

    Or, if you can unmount the GPFS file system on all nodes, you could disable quotas, delete the quota files (make a backup first), delete the last nsd_p_mdX disk and re-enable quotas (see "Restoring quota files" in the GPFS Administration and Programming Reference).
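    The quota route would be roughly this (only a sketch; read the "Restoring quota files" section first):

    mmumount gpfs_p -a              # unmount on all nodes
    mmchfs gpfs_p -Q no             # disable quota enforcement
    # mount once, back up and then delete user.quota / group.quota / fileset.quota from the file system root
    mmdeldisk gpfs_p "nsd_p_md08"   # retry the last metadataOnly disk
    mmchfs gpfs_p -Q yes            # re-enable quotas
    mmcheckquota gpfs_p             # recreate/verify the quota information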

    Just my thoughts.
    Hajo
  • SystemAdmin

    Re: mmdeldisk doesn't finish

    2012-11-30T11:06:03Z
    Hi,

    changing the failure group was the solution.
    I changed the FG for nsd_p_md01 and 02 to 1000,
    and to 1001 for nsd_p_md03 and 04.
    Now the "mmdeldisk" command does its work successfully.
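    In case someone finds this thread later, the failure group change looks roughly like this (a sketch; descriptor fields as I understand them, verify with the mmchdisk man page):

    # spread the remaining metadataOnly NSDs over the two active failure groups
    mmchdisk gpfs_p change -d "nsd_p_md01:::metadataOnly:1000;nsd_p_md02:::metadataOnly:1000"
    mmchdisk gpfs_p change -d "nsd_p_md03:::metadataOnly:1001;nsd_p_md04:::metadataOnly:1001"
    # after that the delete went through
    mmdeldisk gpfs_p "nsd_p_md01;nsd_p_md02;nsd_p_md03;nsd_p_md04"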

    So many thanks for your help.

    Dieter