IO damn slow in GPFS
10 replies. Latest post 2013-12-05T21:59:24Z by yuri

3CKG_Ramalingam_Ayyamperumal
6 Posts

2013-12-03T16:16:52Z

Hi GPFS Gurus,

I am facing a very basic IO slowness bottleneck in my GPFS setup. I am new to GPFS and installed and configured it with Google's help.

Below is an IO comparison between a local filesystem and GPFS. I have increased the pagepool size to 1000M. At the moment I am not even running any application on this filesystem, but it is going to be hit with random IO and its performance really matters.

 

LOCAL FILESYSTEM

================

bash-3.2# time sh -c "dd if=/dev/zero of=ddfile bs=8k count=1000000 && sync"
1000000+0 records in.
1000000+0 records out.
 
real    0m38.349s
user    0m1.409s
sys     0m22.405s
 
 
GPFS FILESYSTEM

================

bash-3.2# time sh -c "dd if=/dev/zero of=ddfile bs=8k count=1000000 && sync"
1000000+0 records in.
1000000+0 records out.
 
real    1m44.338s
user    0m1.245s
sys     0m9.200s

 

GPFS cluster information
========================
  GPFS cluster name:         PROD_EMS.FT
  GPFS cluster id:           771243877078702085
  GPFS UID domain:           PROD_EMS.FT
  Remote shell command:      /usr/bin/ssh
  Remote file copy command:  /usr/bin/scp
 
GPFS cluster configuration servers:
-----------------------------------
  Primary server:    EMS_NODE_1
  Secondary server:  EMS_NODE_2
 
 Node  Daemon node name            IP address       Admin node name             Designation
-----------------------------------------------------------------------------------------------
   1   EMS_NODE_1                  10.X.X.1     EMS_NODE_1                  quorum-manager
   2   EMS_NODE_2                  10.X.Y.1    EMS_NODE_2                  quorum-manager
   3   EMS_NODE_3                  10.X.X.2      EMS_NODE_3                  quorum-manager
 
File system attributes
========================
flag                value                    description
------------------- ------------------------ -----------------------------------
 -f                 16384                    Minimum fragment size in bytes
 -i                 512                      Inode size in bytes
 -I                 16384                    Indirect block size in bytes
 -m                 1                        Default number of metadata replicas
 -M                 2                        Maximum number of metadata replicas
 -r                 1                        Default number of data replicas
 -R                 2                        Maximum number of data replicas
 -j                 cluster                  Block allocation type
 -D                 nfs4                     File locking semantics in effect
 -k                 all                      ACL semantics in effect
 -n                 32                       Estimated number of nodes that will mount file system
 -B                 524288                   Block size
 -Q                 user;group;fileset       Quotas enforced
                    none                     Default quotas enabled
 --filesetdf        no                       Fileset df enabled?
 -V                 12.10 (3.4.0.7)          File system version
 --create-time      Fri Oct 19 18:17:33 2012 File system creation time
 -u                 yes                      Support for large LUNs?
 -z                 no                       Is DMAPI enabled?
 -L                 4194304                  Logfile size
 -E                 yes                      Exact mtime mount option
 -S                 no                       Suppress atime mount option
 -K                 whenpossible             Strict replica allocation option
 --fastea           yes                      Fast external attributes enabled?
 --inode-limit      133120                   Maximum number of inodes
 -P                 system                   Disk storage pools in file system
 -d                 GBSDISK_1;GBSDISK_2      Disks in file system
 -A                 yes                      Automatic mount option
 -o                 none                     Additional mount options
 -T                 /data/tibco              Default mount point
 --mount-priority   0                        Mount priority
 

Any guidance will be much appreciated

Thanks

Ram

 
  • dlmcnabb
    1012 Posts

    Re: IO damn slow in GPFS

    2013-12-03T16:28:05Z in response to 3CKG_Ramalingam_Ayyamperumal

    What code release and maintenance level have you installed?

         "mmfsadm dump version | grep Buil"

    What have you set for configuration?

         mmlsconfig

    • 3CKG_Ramalingam_Ayyamperumal
      6 Posts

      Re: IO damn slow in GPFS

      2013-12-03T16:33:41Z in response to dlmcnabb
      bash-3.2# mmfsadm dump version | grep Buil
      Build branch "3.4.0.11 ".
      Built on Jan 27 2012 at 12:05:31 by .
      bash-3.2# mmlsconfig
      Configuration data for cluster PROD_EMS.FT:
      -------------------------------------------
      myNodeConfigNumber 2
      clusterName PROD_EMS.FT
      clusterId 771243877078702085
      autoload no
      minReleaseLevel 3.4.0.7
      dmapiFileHandleSize 32
      minQuorumNodes 1
      pagepool 1000M
      maxMBpS 3200
      adminMode central
       
      File systems in cluster PROD_EMS.FT:
      ------------------------------------
      /dev/emsfs
      bash-3.2#
       
      • dlmcnabb
        1012 Posts

        Re: IO damn slow in GPFS

        2013-12-03T16:57:11Z in response to 3CKG_Ramalingam_Ayyamperumal

        3.4.0.11 is extremely old, with many performance changes since then. You should do a rolling upgrade to 3.4.0.25, and since this is Linux, don't forget to rebuild the portability layer.
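
        For reference, a rough per-node sketch of a rolling upgrade (the GPFS administrative commands are standard ones; the actual package install step depends on the OS, and the staging path below is only a placeholder):

        mmshutdown -N <node>     # stop GPFS on this node only; the rest of the cluster stays up
        # install the 3.4.0.25 update images, e.g. staged under /tmp/gpfs_update
        # on Linux, also rebuild the GPFS portability layer in /usr/lpp/mmfs/src before restarting
        mmstartup -N <node>      # restart GPFS on this node
        mmgetstate -a            # confirm the node is active again before moving to the next one
        # once all nodes run the new level (optional and not reversible):
        # mmchconfig release=LATEST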

        • 3CKG_Ramalingam_Ayyamperumal
          6 Posts

          Re: IO damn slow in GPFS

          2013-12-03T17:08:00Z in response to dlmcnabb

          All servers are on AIX, and it is serving just a shared filesystem across two machines.

        • 3CKG_Ramalingam_Ayyamperumal
          6 Posts

          Re: IO damn slow in GPFS

          2013-12-03T17:49:28Z in response to dlmcnabb

          Sorry, the performance got even worse:

          bash-3.2# time sh -c "dd if=/dev/zero of=ddfile bs=8k count=1000000 && sync"
          1000000+0 records in.
          1000000+0 records out.
           
          real    2m4.787s
          user    0m1.257s
          sys     0m9.301s
          bash-3.2# cat /dev/null > ddfile
          bash-3.2# pwd
          /data/tibco/ram
          bash-3.2# time sh -c "dd if=/dev/zero of=ddfile bs=8k count=1000000 && sync"
          1000000+0 records in.
          1000000+0 records out.
           
          real    2m10.597s
          user    0m1.227s
          sys     0m9.110s
           
    • 3CKG_Ramalingam_Ayyamperumal
      6 Posts

      Re: IO damn slow in GPFS

      2013-12-03T16:58:09Z in response to dlmcnabb

      The storage behind this setup is SVC --> DS8800. The local filesystem test above was run on an SVC disk attached directly, which is where that IO benefit came from. As it is going to run a message bus application, I believe the workload will be random IO. Although I see a lot of recommendations for random IO, I am not very confident about my setup.

      I have attached two disks from the same pool on a single storage system.

      bash-3.2# mmlsnsd
       
       File system   Disk name    NSD servers
      ---------------------------------------------------------------------------
       emsfs         GBSDISK_1    (directly attached)
       emsfs         GBSDISK_2    (directly attached)
       (free disk)   NECDISK_1    (directly attached)
       
  • HajoEhlers
    251 Posts

    Re: IO damn slow in GPFS

    2013-12-04T17:50:12Z in response to 3CKG_Ramalingam_Ayyamperumal

    1) Did you notice that your combined sys+user time compared to real time is about 1:10?

    real    1m44.338s
    user    0m1.245s
    sys     0m9.200s

     

    2) And that the average streaming speed is around 80 MB/s? (8 KB x 1,000,000 records is about 8.2 GB, written in roughly 104 seconds.)

     

    From this I assume that you might be using only a single LUN, or that the storage controller is overloaded, or something similar.

    So you should use "nmon" to get a quick view of your performance during the test, and also look at "mmdiag --iohist | sort -n -k6,6" to see what GPFS reports for IO times.

     

    BTW: Also check the disk load with, for example:

    $ iostat -D -d hdisk1 1 10

     

    Happy troubleshooting

    Hajo

     

     

    P.S. A "dd if=/dev/zero of=ddfile bs=8k count=1000000 && sync" does not create random IO.

     

     

    Updated on 2013-12-04T17:53:46Z by HajoEhlers
  • yuri
    202 Posts

    Re: IO damn slow in GPFS

    2013-12-04T23:37:12Z in response to 3CKG_Ramalingam_Ayyamperumal

    If you want to test random IO performance, "dd if=/dev/zero of=ddfile bs=8k count=1000000 && sync" isn't it. This isn't random, and it isn't small-record IO (8k writes will be coalesced in the buffer cache). With a local fs you'll have the entire RAM available as buffer cache, while with GPFS it'll be limited to the configured pagepool space. You're probably not interested in the caching efficiency though, but rather want to know the steady-state performance level.
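
    For completeness, the GPFS cache size is the pagepool setting shown in mmlsconfig above; a sketch of raising it (4G is an arbitrary value, to be sized against the node's memory):

    mmlsconfig pagepool          # currently 1000M in this cluster
    mmchconfig pagepool=4G       # generally takes effect at the next daemon restart; some levels accept -i to apply it immediately

    That only changes how much can be cached, not the steady-state rate that matters here.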

    To exercise actual random 8k record IO, you need a more advanced tool than 'dd'. gpfsperf.c (available under /usr/lpp/mmfs/samples) can do this, but there are many other tools. Regardless of what tool you use, be sure to use a dataset size that exceeds the RAM/disk controller cache size by at least an order of magnitude, to compensate for caching. Note that it's critical to know how your app does its writes, in particular whether it uses O_SYNC or O_DIRECT -- those have very different performance profiles from regular buffered writes.
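
    As an illustration, a random 8k run could look roughly like the following (a sketch: /gpfs/testfile and the byte counts are placeholders, and the exact option spellings are best checked against the usage text gpfsperf prints when run without arguments):

    # first create a test file much larger than the 1000M pagepool and the controller cache
    ./gpfsperf create seq /gpfs/testfile -r 256k -n 16384000000
    # then read it back with small random records; add gpfsperf's direct I/O option if the app uses O_DIRECT
    ./gpfsperf read rand /gpfs/testfile -r 8k -n 2048000000

    The oversized create step is what keeps the random reads from being served out of cache.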

    yuri

    • 3CKG_Ramalingam_Ayyamperumal
      6 Posts

      Re: IO damn slow in GPFS

      2013-12-05T19:28:52Z in response to yuri

      Thanks Yuri & Hajo,

      I used gpfsperf and received the following output. This environment will be used for an Informatica GRID architecture.

      bash-3.2# /usr/lpp/mmfs/samples/perf/gpfsperf write seq /gpfs/gogo -r 256k -n 1024000000
      /usr/lpp/mmfs/samples/perf/gpfsperf write seq /gpfs/gogo
        recSize 256K nBytes 999936K fileSize 999936K
        nProcesses 1 nThreadsPerProcess 1
        file cache flushed before test
        not using data shipping
        not using direct I/O
        offsets accessed will cycle through the same file segment
        not using shared memory buffer
        not releasing byte-range token after open
        no fsync at end of test
          Data rate was 75339.10 Kbytes/sec, thread utilization 1.000
      bash-3.2# /usr/lpp/mmfs/samples/perf/gpfsperf write seq /gpfs/gogo -r 256k -n 1024000000
      /usr/lpp/mmfs/samples/perf/gpfsperf write seq /gpfs/gogo
        recSize 256K nBytes 999936K fileSize 999936K
        nProcesses 1 nThreadsPerProcess 1
        file cache flushed before test
        not using data shipping
        not using direct I/O
        offsets accessed will cycle through the same file segment
        not using shared memory buffer
        not releasing byte-range token after open
        no fsync at end of test
          Data rate was 68167.37 Kbytes/sec, thread utilization 1.000
       

      Sample output during such a run

      ========================

      Disks:        % tm_act     Kbps      tps    Kb_read   Kb_wrtn
      hdisk0          28.1     4958.8      19.4          0     10240
       
      tty:      tin         tout    avg-cpu: % user % sys % idle % iowait physc % entc
                0.0        147.7                4.0  52.5   36.9      6.5   0.4  417.1
       
      Disks:        % tm_act     Kbps      tps    Kb_read   Kb_wrtn
      hdisk0          35.5     9740.2      38.0          0     19456
       
      tty:      tin         tout    avg-cpu: % user % sys % idle % iowait physc % entc
                0.0        147.5                3.5  54.1   35.9      6.5   0.5  463.9
       
      Disks:        % tm_act     Kbps      tps    Kb_read   Kb_wrtn
      hdisk0          53.0     12288.0      48.0          0     24576
       

      Create seq

      bash-3.2# ./gpfsperf create seq /gpfs/gogo -r 256k -n 1024000000
      ./gpfsperf create seq /gpfs/gogo
        recSize 256K nBytes 999936K fileSize 1000000K
        nProcesses 1 nThreadsPerProcess 1
        file cache flushed before test
        not using data shipping
        not using direct I/O
        offsets accessed will cycle through the same file segment
        not using shared memory buffer
        not releasing byte-range token after open
        no fsync at end of test
          Data rate was 91784.40 Kbytes/sec, thread utilization 0.999
       

      Read seq

      bash-3.2# /usr/lpp/mmfs/samples/perf/gpfsperf read seq /gpfs/gogo -r 256k -n 1024000000
      /usr/lpp/mmfs/samples/perf/gpfsperf read seq /gpfs/gogo
        recSize 256K nBytes 999936K fileSize 999936K
        nProcesses 1 nThreadsPerProcess 1
        file cache flushed before test
        not using data shipping
        not using direct I/O
        offsets accessed will cycle through the same file segment
        not using shared memory buffer
        not releasing byte-range token after open
          Data rate was 75419.54 Kbytes/sec, thread utilization 1.000
       
      System configuration: lcpu=4 drives=5 ent=0.10 paths=20 vdisks=0
       
      tty:      tin         tout    avg-cpu: % user % sys % idle % iowait physc % entc
                0.0         34.0                4.9  56.9   26.7     11.4   0.5  457.2
       
      Disks:        % tm_act     Kbps      tps    Kb_read   Kb_wrtn
      hdisk0          96.4     12784.0      49.9      25600         0
       
      tty:      tin         tout    avg-cpu: % user % sys % idle % iowait physc % entc
                0.0        148.0                5.0  56.5   28.2     10.2   0.5  487.5
       
      Disks:        % tm_act     Kbps      tps    Kb_read   Kb_wrtn
      hdisk0          93.0     12544.0      49.0      25088         0
       

       

      • yuri
        202 Posts

        Re: IO damn slow in GPFS

        2013-12-05T21:59:24Z in response to 3CKG_Ramalingam_Ayyamperumal

        I thought you were interested in small random IO performance?  In that case you want something like "gpfsperf read rand -r 8k", not "read seq -r 256k".  For small random IO workloads, the metric that's usually considered to be the most important is not bandwidth, but rather IOPS (reads or writes per sec).
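
        As an illustration, reusing the file from the runs above (the -n value is a placeholder; the IOPS arithmetic simply divides the reported data rate by the record size):

        bash-3.2# /usr/lpp/mmfs/samples/perf/gpfsperf read rand /gpfs/gogo -r 8k -n 1024000000
        # if this reports, say, "Data rate was 12000.00 Kbytes/sec", that is 12000 / 8 = 1500 read IOPS at 8k records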

        In any event, once you eliminate caching artefacts, the random IO rate is ultimately controlled by the disk seek time (unless your dataset is so small that it fits comfortably in some caching layer).  There are only so many ways to improve on this.  Basically, you need more physical disks, and/or faster disks (SSD or higher-RPM conventional drives).

        yuri