Topic
  • 5 replies
  • Latest Post - ‏2012-11-09T12:00:58Z by wladyslaw17
wladyslaw17
12 Posts

IndblockBad appearing out of the blue

‏2012-11-08T17:27:21Z |
All,

I've been trying, with no luck, to understand exactly what fsck does. Questions go first, explanation follows.

1) What does IndblockBad translate to, i.e. what exactly is "bad"? The block the inode resides in? The inode data? My metadata are replicated, so which copy would it be? The list of blocks it points to? A checksum? Of what? I can't figure out what triggers this message.
2) Some corrupted files are detected by mmfsck and some are not, and I'm trying to figure out why some slip through. What is the detection logic here, i.e. which checks exactly does mmfsck perform?
3) How do I verify what gets corrupted: the list of blocks the inode points to, or the content of those blocks? Is there a way to capture something like a full snapshot of the inodes, so I can compare against it when I discover a corrupted file later?

Longer story - my scenario is that fsck reports issues like the one below:

InodeProblemList: 1 entries
iNum      snapId  status  keep  delete  noScan  new  error
--------  ------  ------  ----  ------  ------  ---  ------------------------
22226371  0       1       0     0       0       1    0x10000000 IndblockBad

It all begins when a user reports a garbled file (the backup copy confirms it). mmfsck is run on the mounted filesystem and finds no errors related to the reported file, but it does find others, like the one above, and they point to other corrupted files. So I'm left with scrambled files that mmfsck does not report, scrambled files that GPFS somehow discovers, and new files becoming corrupted at a rate of 1-2 a week.

In case it helps: my metadata are replicated (i.e. the original plus one replica), data are not. Corruption is block-based, i.e. a GPFS block is either entirely fine or every single byte in it differs from the backup copy. Corrupted blocks are spread over tens of LUNs. Files can be read with no errors, but their content differs from the backup copy. GPFS is 3.4.0-15, OS is Linux. The files that get corrupted are not recent ones; the majority of them are more than 4 months old.
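The block-based comparison against the backup copy described above could be sketched roughly as follows. This is only an illustration: `differing_blocks` and `demo` are made-up names, and the block size has to be supplied by the caller (the demo uses a tiny 4-byte value; the real GPFS block size of the filesystem is not stated in this thread).

```python
import os
import tempfile


def differing_blocks(path_a, path_b, block_size):
    """Return the indices of the blocks whose bytes differ between two files."""
    bad = []
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        index = 0
        while True:
            a = fa.read(block_size)
            b = fb.read(block_size)
            if not a and not b:
                break  # both files exhausted
            if a != b:
                bad.append(index)
            index += 1
    return bad


def demo():
    """Compare two tiny temporary files that differ only in their second block."""
    block_size = 4  # demo value only, NOT a real GPFS block size
    with tempfile.TemporaryDirectory() as tmp:
        backup = os.path.join(tmp, "backup")
        live = os.path.join(tmp, "live")
        with open(backup, "wb") as f:
            f.write(b"AAAABBBBCCCC")
        with open(live, "wb") as f:
            f.write(b"AAAAxxxxCCCC")
        return differing_blocks(backup, live, block_size)


if __name__ == "__main__":
    print(demo())
```

Run against a live file and its restored backup copy, with the filesystem block size as `block_size`, this would list which blocks of the file are affected, but it says nothing about why they differ.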

Any pointers are welcomed warmly. I'm happy to post any data requested.
Updated on 2012-11-09T12:00:58Z by wladyslaw17
  • dlmcnabb
    1012 Posts

    Re: IndblockBad appearing out of the blue

    ‏2012-11-08T17:38:56Z  
    That err means that an indirect block for inode 22226371 is corrupted. Online mmfsck cannot fix this problem.

    Find out the filename for this inode by running
    
    tsfindinode -i 22226371 $mountpoint
    


    You can then delete and restore the file from backup.

    However, if there is one file corrupted there may be other things gone wrong, so you should run offline mmfsck to assess the damage:
    
    mmfsck $fsname -v -n > mmfsck.$fsname.out 2>&1
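Once the mmfsck output has been saved to a file, the inode numbers to feed into tsfindinode can be pulled out of the InodeProblemList section. The sketch below assumes the rows look like the single sample row quoted earlier in this thread (nine whitespace-separated columns, with the error name as one token); `problem_inodes` and the regular expression are illustrative, not part of GPFS.

```python
import re

# Columns, per the sample in this thread:
# iNum snapId status keep delete noScan new error-code error-name
ROW = re.compile(
    r"^\s*(\d+)\s+(\d+)\s+\d+\s+\d+\s+\d+\s+\d+\s+\d+\s+"
    r"(0x[0-9A-Fa-f]+)\s+(\S+)\s*$"
)


def problem_inodes(text):
    """Return (inode number, snapshot id, error name) for each problem row."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            rows.append((int(m.group(1)), int(m.group(2)), m.group(4)))
    return rows


SAMPLE = """InodeProblemList: 1 entries
iNum snapId status keep delete noScan new error
22226371 0 1 0 0 0 1 0x10000000 IndblockBad
"""

if __name__ == "__main__":
    for inode, snap_id, error in problem_inodes(SAMPLE):
        print(inode, snap_id, error)
```

If real output contains rows in a different layout, the pattern simply will not match them, so the result should be checked against the raw file.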
    
  • wladyslaw17
    12 Posts

    Re: IndblockBad appearing out of the blue

    ‏2012-11-08T18:05:53Z  
    Dlmcnabb,

    Thank you very much for your answer.

    To clarify: an indirect block is no different from one on any other fs, right? I.e. it is a block attached to the inode that stores a list of all the blocks used by this file? That would mean that the data themselves are probably fine, but the block locations got mixed up somehow? And how does fsck figure out that they got mixed up?

    Every scrambled file I encounter gets deleted and restored from backup, that's not a problem. The idea was to wipe them all and forget about the issue, but new cases like this one keep appearing. I can't do downtime and the plan was to locate and delete all corrupted files.

    One more question: my metadata are replicated. Does that mean that both the original and the replica are scrambled?
  • wladyslaw17
    12 Posts

    Re: IndblockBad appearing out of the blue

    ‏2012-11-08T18:37:38Z  
    Well, forget about that. I just found a legitimate scenario in which I make things worse by fixing that. The key to understanding what's going on is the "ind" in IndblockBad: as was pointed out, it means indirect, not inode.

    Many thanks for clarifying the name for me.
  • dlmcnabb
    1012 Posts

    Re: IndblockBad appearing out of the blue

    ‏2012-11-08T20:11:00Z  
    When GPFS finds one replica invalid, it reads the other one and tries to use that one. So if the second replica does not generate an FSSTRUCT error, then you are fine. If you can actually modify the metadata, the new version will overwrite both copies, fixing the bad one.

    If you run the fsstruct scripts (AIX and Linux versions in /usr/lpp/mmfs/samples/debugtools) to decode the sense data, the line will show the two disk addresses for the indblock (repda). If only one of the disk addresses gets an fsstruct log record, then the other one is good.
  • wladyslaw17
    12 Posts

    Re: IndblockBad appearing out of the blue

    ‏2012-11-09T12:00:58Z  
    Thank you for pointing me at the fsstruct scripts; they saved me a lot of debugging time. For my case from yesterday, both replicas are invalid:

    11/08@16:01:46 nsd FSSTRUCT gpfsdisk 108 FSErrValidate type=indBlock da=00000003:00000000090700C0(3:151453888) sectors=0040 repda=nVal=2 00000003:00000000090700C0(3:151453888) 00000001:0000000009079000(1:151490560) data=(len=00008000) F3BD5631 00000001 00000000 00000000 08014942 4D4F626A 00007C00 0101020C 00000000 011B096A 5D48A070 01C86ADF 0620866B 5143F996 B82DA320 02000000 E0288800 00000000 00000000 00000000 00000000 00020000 00000000 00000000 00000000 02
    11/08@16:01:47 nsd FSSTRUCT gpfsdisk 108 FSErrValidate type=indBlock da=00000001:0000000009079000(1:151490560) sectors=0040 repda=nVal=2 00000003:00000000090700C0(3:151453888) 00000001:0000000009079000(1:151490560) data=(len=00008000) F3BD5631 00000001 00000000 00000000 08014942 4D4F626A 00007C00 0101020C 00000000 011B096A 5D48A070 018F335F AE44BE7C D7D31C4F E411E801 06000000 207A8001 00000000 00000000 00000000 00000000 00020000 00000000 00000000 00000000 02
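The rule given earlier in the thread (if only one of the repda addresses gets an fsstruct log record, the other replica is good) can be checked mechanically over decoded records. The sketch below assumes decoded lines in exactly the format shown above; `replica_status` and the regular expressions are illustrative helpers, not part of the fsstruct scripts, and the data dumps in the sample lines are cut short.

```python
import re

# Disk address as printed in the decoded records, e.g.
# 00000003:00000000090700C0(3:151453888)
ADDR = r"[0-9A-F]+:[0-9A-F]+\(\d+:\d+\)"
DA = re.compile(r"\bda=(" + ADDR + ")")
REPDA = re.compile(r"repda=nVal=(\d+)((?:\s+" + ADDR + ")+)")


def replica_status(lines):
    """Map each replica set to {address: True if a record names that address}."""
    status = {}
    for line in lines:
        if "FSErrValidate" not in line:
            continue
        da = DA.search(line)
        rep = REPDA.search(line)
        if not da or not rep:
            continue
        replicas = tuple(re.findall(ADDR, rep.group(2)))
        flags = status.setdefault(replicas, {addr: False for addr in replicas})
        if da.group(1) in flags:
            flags[da.group(1)] = True
    return status


# The two records from this post, with the data dump truncated.
LOG = [
    "11/08@16:01:46 nsd FSSTRUCT gpfsdisk 108 FSErrValidate type=indBlock "
    "da=00000003:00000000090700C0(3:151453888) sectors=0040 repda=nVal=2 "
    "00000003:00000000090700C0(3:151453888) "
    "00000001:0000000009079000(1:151490560) data=(len=00008000) F3BD5631",
    "11/08@16:01:47 nsd FSSTRUCT gpfsdisk 108 FSErrValidate type=indBlock "
    "da=00000001:0000000009079000(1:151490560) sectors=0040 repda=nVal=2 "
    "00000003:00000000090700C0(3:151453888) "
    "00000001:0000000009079000(1:151490560) data=(len=00008000) F3BD5631",
]

if __name__ == "__main__":
    for replicas, flags in replica_status(LOG).items():
        verdict = "all replicas bad" if all(flags.values()) else "a good replica remains"
        print(verdict, flags)
```

For the two records above every address in the replica set is flagged, which matches the conclusion that both replicas are invalid.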

    And this is the ultimate proof that some files point to blocks that were freed earlier:

    11/09@04:53:03 nsd FSSTRUCT gpfs 107 FSErrDeallocBlock errno=00000010 diskNum=0000000A sector=0000000272FB9000 nSubblocks=00000020 region=00000B30 segment=0000001D subBlock=00008840

    All that follows this point are my assumptions with no hard proof, so please correct me if I'm wrong anywhere.

    Online mmfsck does only limited checks on the list of blocks allocated to inodes (allocation keeps changing while it runs, so that's understandable), and files with blocks allocated twice (i.e. to two different inodes) slip through. I suspect that what it flags are inodes pointing at freed blocks. When I delete a corrupted file I free some blocks, which may still be pointed at by another inode (which is also corrupted, but not reported in such a scenario). So, when deleting a corrupted file, I must make sure that its blocks are not referenced anywhere else and, if they are, all the files referencing them must be deleted as well. I guess offline fsck does that for me, but bringing the fs down is currently out of the question.
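The "delete every file that shares a block" reasoning above amounts to following shared blocks transitively. The sketch below only illustrates that logic: it assumes a map from inode number to the set of block addresses it references is already available, and nothing in this thread shows a GPFS command that produces such a map. The inode numbers and block labels are made up.

```python
from collections import defaultdict


def files_to_delete(block_map, start_inode):
    """Return all inodes linked to start_inode through shared blocks."""
    # Invert the map: block address -> inodes that reference it.
    owners = defaultdict(set)
    for inode, blocks in block_map.items():
        for block in blocks:
            owners[block].add(inode)

    doomed = set()
    queue = [start_inode]
    while queue:
        inode = queue.pop()
        if inode in doomed:
            continue
        doomed.add(inode)
        # Any other inode sharing one of these blocks must go as well.
        for block in block_map.get(inode, ()):
            queue.extend(owners[block] - doomed)
    return doomed


# Hypothetical data: inode 1 shares block "b" with inode 2,
# inode 2 shares block "c" with inode 3, inode 4 is unrelated.
EXAMPLE = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"c"}, 4: {"d"}}

if __name__ == "__main__":
    print(sorted(files_to_delete(EXAMPLE, 1)))
```

In the made-up example, deleting inode 1 drags in inodes 2 and 3, while inode 4 is untouched.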

    And my questions:
    1) Is there a way to examine a block and tell whether it is free, allocated, or allocated to inodes X/Y/Z? The last option is probably too much to ask for, but free/allocated should be possible, as this is what GPFS logs as the FSErrDeallocBlock error.
    2) How do I delete a file without freeing the blocks it points to? That would stop me from freeing blocks that are allocated twice, and I can catch the lost blocks afterwards.
    3) Is offline mmfsck supposed to catch and fix such errors at all? Technically it's possible, but is it implemented?