Topic
  • 8 replies
  • Latest Post - ‏2018-09-27T14:04:44Z by sxiao
Mikhail Ushanov
Mikhail Ushanov
2 Posts

Pinned topic Invalid SEEK_HOLE/SEEK_DATA behavior

‏2017-12-08T13:20:05Z | bug error linux qemu-img gpfs4.1 gpfs4.2 lseek

Hello to all,

I've found abnormal behavior in the GPFS implementation of the SEEK_HOLE/SEEK_DATA options of the lseek syscall. When a system tool such as 'tar', 'cp', or 'qemu-img' uses 'lseek' to detect holes in a sparse file, it is told the file ends after the first hole.

This problem can be reproduced on a Linux host with kernel >= 3.1. I'm using CentOS 7.2.1511 with kernel 3.10.0-327.59.1.el7.x86_64 and GPFS v4.1 with the latest fixes (4.1.1.18).

Steps to reproduce:

  • Create a sparse file on the GPFS mount:
    • truncate test-sparse.img --size 1G
  • Write 1M of random data after a 2M hole at the start of the file (bypassing the host page cache with 'direct' write semantics):
    • dd if=/dev/urandom of=test-sparse.img bs=1M count=1 oflag=seek_bytes,direct seek=2M conv=notrunc
  • Verify that the allocated size of the file is 1M:
    • ll -sh
    • 1.0M -rw-r--r-- 1 root root 1.0G Dec  8 15:59 test-sparse.img
  • Create a tar archive from the sparse file. 'tar' >= 1.29 is required for hole-detection support; I built the latest 'tar' from source:
    • /home/centos/tar-1.29/out/usr/local/bin/tar --sparse --hole-detection=seek -cf test-sparse.tar test-sparse.img
  • Check the allocated size of the archive, which is 0:
    • ll -sh test-sparse.tar
    • 0 -rw-r--r-- 1 root root 10K Dec  8 16:01 test-sparse.tar
  • Extract the archive and observe that the extracted sparse file is empty:
    • mkdir out && tar -xf test-sparse.tar -C out
    • ll -sh out/test-sparse.img
    • 0 -rw-r--r-- 1 root root 1.0G Dec  8 15:59 out/test-sparse.img
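
The steps above can also be checked without 'tar'. The following is a minimal Python sketch, not part of the original report, that builds a similar sparse file and lists the data regions that lseek reports through SEEK_DATA/SEEK_HOLE. The function names `make_sparse` and `data_extents` are hypothetical, and the sketch uses fsync instead of the direct I/O used by 'dd' above.

```python
import errno
import os


def make_sparse(path, size, data_offset, data_len):
    """Create a file of `size` bytes with one written region (assumed layout)."""
    with open(path, "wb") as f:
        f.truncate(size)                 # like: truncate --size
        f.seek(data_offset)
        f.write(os.urandom(data_len))    # like: dd if=/dev/urandom ... seek=...
        f.flush()
        os.fsync(f.fileno())             # stand-in for oflag=direct


def data_extents(path):
    """Return [(start, end), ...] for the data regions that lseek reports."""
    extents = []
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        offset = 0
        while offset < size:
            try:
                start = os.lseek(fd, offset, os.SEEK_DATA)
            except OSError as exc:
                if exc.errno == errno.ENXIO:
                    break                # no data between offset and end of file
                raise
            end = os.lseek(fd, start, os.SEEK_HOLE)  # end of file counts as a hole
            if end <= start:
                break                    # defensive guard against a non-advancing result
            extents.append((start, end))
            offset = end
    finally:
        os.close(fd)
    return extents
```

On a file system that handles these options correctly, a file created with `make_sparse(path, 1 << 30, 2 << 20, 1 << 20)` should yield at least one extent covering the range from 2M to 3M. An empty list for such a file would match the behavior described in this topic.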

Expected result:

  • The archived sparse file must be identical to the original sparse file.

Real result:

  • The archived sparse file is corrupted.

This problem also affects the 'convert' operation in the latest 'qemu-img' tool, which uses SEEK_HOLE/SEEK_DATA when converting raw sparse images.

Thread in the GNU bug tracker about the same problem: https://debbugs.gnu.org/cgi/bugreport.cgi?bug=27666

Can anyone check this behavior and confirm whether it is a bug?

Updated on 2017-12-08T13:34:08Z by Mikhail Ushanov
  • chr78
    chr78
    146 Posts

    Re: Invalid SEEK_HOLE/SEEK_DATA behavior

    ‏2017-12-08T13:43:57Z  

    see http://www.spectrumscale.org/pipermail/gpfsug-discuss/2017-November/004215.html

  • sxiao
    sxiao
    61 Posts

    Re: Invalid SEEK_HOLE/SEEK_DATA behavior

    ‏2017-12-08T18:01:58Z  
    • chr78
    • ‏2017-12-08T13:43:57Z

    It looks like that particular change did not make it into the 4.1.1 release yet for some reason. It may be in the next 4.1.1 PTF. You will need to contact IBM service to obtain an efix if you need the fix sooner. You can reference IV96475 when contacting IBM service.

  • Mikhail Ushanov
    Mikhail Ushanov
    2 Posts

    Re: Invalid SEEK_HOLE/SEEK_DATA behavior

    ‏2017-12-08T19:41:46Z  
    • sxiao
    • ‏2017-12-08T18:01:58Z

    Hi Steve,

    Thanks for the quick reply! But I think my case is slightly different from the case in the discussion you mentioned. As I see in the 'tar' source code, the 'sparse_scan_file_wholesparse' function, which checks whether the file is entirely sparse, is called before 'switch (hole_detection)'. In my example, I use the '--hole-detection=seek' parameter in the 'tar' command and get a corrupted archive. But if I change the '--hole-detection' parameter to 'raw', the archive is correct, and after extraction the output file is identical to the original file:

    • /home/centos/tar-1.29/out/usr/local/bin/tar --sparse --hole-detection=raw -cf out/test-sparse.tar test-sparse.img
    • ll -sh out/test-sparse.tar
    • 1.0M -rw-r--r-- 1 root root 1.1M Dec  8 22:12 out/test-sparse.tar

    So it looks like the problem is in 'sparse_scan_file_seek', which uses SEEK_HOLE and SEEK_DATA to find holes in a sparse file, and those options of the lseek system call do not work properly on GPFS.
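
For comparison, a 'raw'-style scan does not depend on lseek at all. The sketch below is only an illustration of that idea in Python, under the assumption that such a scan reads the file block by block and treats all-zero blocks as holes. It is not the 'tar' implementation, and the name `raw_data_extents` and the 512-byte default block size are hypothetical choices.

```python
def raw_data_extents(path, block_size=512):
    """Find non-zero regions by reading every block, without lseek hints."""
    extents = []
    zero = bytes(block_size)
    with open(path, "rb") as f:
        offset = 0
        start = None                     # start of the current data run, if any
        while True:
            block = f.read(block_size)
            if not block:
                break                    # end of file
            is_zero = block == zero[:len(block)]
            if not is_zero and start is None:
                start = offset           # a data run begins here
            elif is_zero and start is not None:
                extents.append((start, offset))
                start = None             # the data run ended at this block
            offset += len(block)
        if start is not None:
            extents.append((start, offset))
    return extents
```

Because every byte is read, this approach is slower on large sparse files, but its result does not depend on what the file system reports for SEEK_DATA.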

    Look at the 'strace -f -v' output of the 'tar' command with '--hole-detection=seek':

    openat(AT_FDCWD, "test-sparse.img", O_RDONLY|O_NOCTTY|O_NONBLOCK|O_NOFOLLOW|O_CLOEXEC) = 4
    fstat(4, {st_dev=makedev(0, 37), st_ino=40712, st_mode=S_IFREG|0644, st_nlink=1, st_uid=0, st_gid=0, st_blksize=1048576, st_blocks=2048, st_size=1073741824, st_atime=2017/12/08-22:12:51, st_mtime=2017/12/08-22:02:08, st_ctime=2017/12/08-22:02:08}) = 0
    lseek(4, 0, SEEK_DATA)                  = -1 ENXIO (No such device or address)

    The 'st_blocks' value seems to be correct, but 'lseek' with SEEK_DATA fails with ENXIO, meaning there is no data before the end of the file. That is incorrect, because the next data block starts 2M from the beginning of the file. It looks like incorrect behavior of 'lseek'.

    Could you look closer at this issue and check the lseek implementation in GPFS?
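
The contradiction in the strace output can be checked directly. In that output st_blocks=2048, and with 512-byte units that is 2048 × 512 = 1048576 bytes (1M) allocated, yet SEEK_DATA from offset 0 finds nothing. The following Python sketch, with the hypothetical name `seek_data_contradicts_stat`, flags that combination. It is only a heuristic: a file can have allocated blocks without data (for example preallocated space), so a True result is a hint rather than proof.

```python
import errno
import os


def seek_data_contradicts_stat(path):
    """True if blocks are allocated but SEEK_DATA from offset 0 finds no data."""
    fd = os.open(path, os.O_RDONLY)
    try:
        st = os.fstat(fd)
        allocated = st.st_blocks * 512   # st_blocks counts 512-byte units on Linux
        try:
            os.lseek(fd, 0, os.SEEK_DATA)
        except OSError as exc:
            if exc.errno == errno.ENXIO:
                return allocated > 0     # "no data" although blocks are allocated
            raise
        return False                     # SEEK_DATA found some data
    finally:
        os.close(fd)
```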

  • sxiao
    sxiao
    61 Posts

    Re: Invalid SEEK_HOLE/SEEK_DATA behavior

    ‏2017-12-09T05:02:39Z  

    Mikhail,

    Sorry, you are correct. I did not read your problem description carefully enough.

    I did not have the setup to perform the test using tar, but I was able to confirm that the GPFS implementation of SEEK_DATA support for lseek is not working correctly in this case. Please contact IBM service to have them open a PMR for this issue. I will make sure a defect gets opened for it. You should be notified through the PMR when a fix becomes available.

  • chr78
    chr78
    146 Posts

    Re: Invalid SEEK_HOLE/SEEK_DATA behavior

    ‏2017-12-12T17:23:31Z  
    • sxiao
    • ‏2017-12-09T05:02:39Z

    Steve, could you please post the defect # or APAR once available? Thanks!

  • sxiao
    sxiao
    61 Posts

    Re: Invalid SEEK_HOLE/SEEK_DATA behavior

    ‏2017-12-12T19:49:10Z  
    • chr78
    • ‏2017-12-12T17:23:31Z

    There is no APAR assigned yet. The defect number is 1040925.

  • ThomasImmel
    ThomasImmel
    1 Post

    Re: Invalid SEEK_HOLE/SEEK_DATA behavior

    ‏2018-09-27T12:56:22Z  

    Is there any update on the defect?

  • sxiao
    sxiao
    61 Posts

    Re: Invalid SEEK_HOLE/SEEK_DATA behavior

    ‏2018-09-27T14:04:44Z  

    Here is the link to the flash alert that was issued for this problem:

    https://www.ibm.com/support/docview.wss?uid=ssg1S1012054

     

    4.1.1.19 or later has the fix.