• 2 replies
  • Latest Post - 2013-08-06T11:02:26Z by ufa
29 Posts

Pinned topic GPFS and subblock I/O

2013-07-25T16:34:31Z

We have been considering reimplementing our GPFS storage system (built from x3650s SAS-attached to DCS3700s) to see if we can gain better performance by changing the file system block size, using larger inodes, etc.  To plan, we have been testing I/O performance to a DCS3700 at the block-device level, so we know that while accessing data on an 8+P+Q LUN performs well when accessing full stripes on stripe boundaries, accessing smaller units of storage is, well, dismal.

We know it is recommended that the stripe size match the file system block size, but then there is this notion of the "subblock".  Will GPFS ever directly access the raw block device using subblock-sized I/Os?  Or are subblocks only an allocation unit within the file system, and GPFS only does full block sized I/O to the block device?

I know I can use several stripes of the LUN to make up a file system block without losing much performance, but nowhere near 32!
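To make the geometry concrete, here is a small sketch of how a full-stripe write on an 8+P+Q LUN relates to the GPFS block size. The segment (per-disk chunk) size is a hypothetical example value, not something stated in this thread:

```python
# Sketch: mapping an 8+P+Q (RAID6) stripe onto a GPFS block size.
# SEGMENT_SIZE is an assumed controller setting, for illustration only.
DATA_DISKS = 8
SEGMENT_SIZE = 256 * 1024                  # 256 KiB written to each data disk

stripe_width = DATA_DISKS * SEGMENT_SIZE   # size of one full-stripe write
print(f"full stripe = {stripe_width // 1024} KiB")

# A GPFS block size equal to one full stripe keeps full-block I/O aligned
# to stripe boundaries; a larger block simply spans several full stripes.
for gpfs_block in (stripe_width, 4 * stripe_width):
    stripes_per_block = gpfs_block // stripe_width
    print(f"block {gpfs_block // 1024} KiB -> {stripes_per_block} full stripe(s)")
```

With these example numbers a 2 MiB GPFS block matches one full stripe exactly, so sequential full-block I/O never triggers a partial-stripe write.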


  • dlmcnabb
    1012 Posts

    Re: GPFS and subblock I/O


    Sequential access of a file does full-block I/O with prefetching and write-behind; that is where you get the performance benefit of wide striping, provided the stripe width of the RAID matches the GPFS block size.

    If you do random reads or writes to existing blocks of a file, GPFS will do sector-aligned (512 B) I/O to read or write only the necessary data. There are also heuristics that read and cache more sectors of a block if there appear to be lots of hits within a short period of time in the same block.
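    The sector alignment described above can be sketched as a simple rounding of a byte range out to 512 B boundaries (the function name is mine, not a GPFS interface):

```python
SECTOR = 512

def sector_aligned_extent(offset, length):
    """Round a byte range out to 512 B sector boundaries, as a sketch of
    the sector-aligned random I/O described above (illustrative only)."""
    start = (offset // SECTOR) * SECTOR
    end = ((offset + length + SECTOR - 1) // SECTOR) * SECTOR
    return start, end - start

# A 100-byte write at offset 1000 touches two sectors: 512..1536.
print(sector_aligned_extent(1000, 100))   # (512, 1024)
```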

    Subblocks are a unit of allocation (1/32 of a full block), not a unit of I/O.

    Some disks only do 4 KiB I/Os, so controllers may internally round sector-aligned requests up to 4 KiB (or larger) aligned I/Os and cache them in NVRAM.
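    Since subblocks are 1/32 of the full block (as stated above), the allocation granularity follows directly from the block size; the block sizes below are just example values:

```python
# Subblock size as a function of GPFS block size (1/32 ratio from the
# reply above; the block sizes themselves are illustrative examples).
SUBBLOCKS_PER_BLOCK = 32

for block in (256 * 1024, 1024 * 1024, 4 * 1024 * 1024):
    subblock = block // SUBBLOCKS_PER_BLOCK
    print(f"block {block // 1024:>5} KiB -> subblock {subblock // 1024:>3} KiB")
```

So a larger block size improves full-stripe sequential I/O but also raises the minimum allocation unit for small files.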

  • ufa
    169 Posts

    Re: GPFS and subblock I/O


    When using RAID5 or RAID6 and doing writes smaller than the full stripe width, the disk controller must read parts of the stripe back in order to recompute the parity (either the old data and parity sectors being overwritten, or the remaining sectors of the stripe), and that comes at a price. This is probably what you are seeing when going to smaller I/Os. You'll need to find the optimum, which also depends on your I/O patterns. There's no one size fits all ...
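    A rough way to see the cost is to count disk operations. The sketch below assumes the classic read-modify-write scheme on an 8+P+Q array (read old data and both parities, write new data and both parities); real controllers vary:

```python
# Rough small-write penalty model for RAID6 read-modify-write
# (assumed 8 data disks + P + Q; illustrative, not controller-specific).
DATA_DISKS = 8
PARITIES = 2   # P and Q

def disk_ops_for_write(segments_written):
    """Disk ops for a partial-stripe write: read old data + old parities,
    then write new data + new parities."""
    reads = segments_written + PARITIES
    writes = segments_written + PARITIES
    return reads + writes

# Writing 1 segment costs 6 disk ops; a full-stripe write needs only
# 10 writes (parity computed in memory, nothing read back).
print("1-segment write:", disk_ops_for_write(1), "disk ops")
print("full-stripe write:", DATA_DISKS + PARITIES, "disk ops for 8 segments of data")
```

This is why full-stripe-aligned block sizes look so much better than small random writes in the block-device tests described at the top of the thread.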