8 replies Latest Post - ‏2012-12-20T23:42:47Z by mduff
mduff
30 Posts

Pinned topic Reads from replicated metadata from one failure group

2012-11-08T00:06:12Z
Hello,

We are currently seeing reads from metadataOnly NSDs use only one failure group. This is a local file system; no remote clusters are involved.

This can be seen while running a find command or a GPFS LIST policy: iostat shows reads going to only one failure group.

Shouldn't reads of replicated metadata be using both failure groups to provide faster access? The nodes running the find or the GPFS policy have direct connections to all drives in both failure groups.

Setting readReplicaPolicy (to either default or local) made no difference, which is expected, as I believe it only applies when using remote clusters.

Thank you
Updated on 2012-12-20T23:42:47Z by mduff
  • dlmcnabb
    994 Posts

    Re: Reads from replicated metadata from one failure group

    ‏2012-11-08T00:13:30Z  in response to mduff
    It should, unless there are suspended LUNs in the other failure group (FG).

    Or possibly you created the file system with only one FG, then later changed it with mmchfs $fsname -m 2 and ran mmrestripefs -R to replicate the existing metadata. GPFS does not rebalance the metadata.
    • mduff
      30 Posts

      Re: Reads from replicated metadata from one failure group

      ‏2012-11-28T00:18:00Z  in response to dlmcnabb
      Cheers for that, Dan.

      We lost an entire metadata failure group, and then ran the restripe after it was recovered.

      Should we run "mmrestripefs -b" to rebalance the metadata?

      Thank you
      • dlmcnabb
        994 Posts

        Re: Reads from replicated metadata from one failure group

        ‏2012-11-28T06:58:39Z  in response to mduff
        Unfortunately, rebalance does not work on metadata.
        • mduff
          30 Posts

          Re: Reads from replicated metadata from one failure group

          ‏2012-11-28T16:11:22Z  in response to dlmcnabb
          I'm not sure of the terminology, but if we call the metadata copies primary and secondary, and we know that all of the primary copies are in one failure group (unbalanced), what are the criteria for using the secondary copy to improve read performance?
          • pce
            57 Posts

            Re: Reads from replicated metadata from one failure group

            ‏2012-12-20T14:47:06Z  in response to mduff
            I'd like to confirm the settings. Please run "mmfsadm dump config | grep readReplica" on the nodes of interest to see if they are all set to the same readReplicaPolicy.

            readReplicaPolicy is not a function of remote clusters. It specifies whether to preferentially use local (SAN) access. By default, the code reads the first replica. When readReplicaPolicy is set to "local" and the first replica cannot be accessed locally on the node, the code checks whether the second replica can be, and uses that instead. Since both replicas are locally accessible in your case, I don't see how it can be used in general to select replicas.
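
            The selection rule described in this post can be sketched as a small toy model. This is only an illustration that assumes the behaviour is exactly as stated above; it is not GPFS code, and the function name choose_replica and its yes/no arguments are hypothetical.

```bash
#!/usr/bin/env bash
# Toy model of the read-replica choice described in the post above.
# NOT GPFS code: choose_replica and its arguments are hypothetical.
#
# Arguments:
#   $1  policy           "default" or "local"
#   $2  first_is_local   "yes" if replica 1 is locally (SAN) accessible from this node
#   $3  second_is_local  "yes" if replica 2 is locally (SAN) accessible from this node
# Prints 1 or 2: the replica this model would read.
choose_replica() {
    local policy=$1 first_is_local=$2 second_is_local=$3

    # With "local", switch to replica 2 only when replica 1 is not
    # locally accessible but replica 2 is.
    if [ "$policy" = "local" ] && [ "$first_is_local" != "yes" ] && [ "$second_is_local" = "yes" ]; then
        echo 2
    else
        # Otherwise read the first replica, as with "default".
        echo 1
    fi
}

choose_replica default yes yes   # prints 1
choose_replica local   yes yes   # prints 1 (both replicas local, so nothing changes)
choose_replica local   no  yes   # prints 2
```

            Under this model, a node with direct access to both failure groups reads replica 1 with either setting, which would be consistent with the setting making no difference in this thread.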
            • mduff
              30 Posts

              Re: Reads from replicated metadata from one failure group

              ‏2012-12-20T15:42:55Z  in response to pce
              Thank you for this, pce.

              Dan actually has an informative post about this, which I have already summarized, and we have tested both settings.

              Here is what I sent:


              There is an undocumented configuration parameter that controls where
              data is read from.

              The readReplicaPolicy parameter can be set to either local or default.
              The only difference is that the 'local' setting specifies that the reads
              will be from local disk or an NSD server that is on the same subnet.  This
              is as opposed to 'default', which will choose the first available replica.

              GPFS can only differentiate between:
              1) Direct attach               vs   NSD attach
              2) NSD attach on local subnet  vs   NSD attach not on local subnet

              Would you be able to test reads going across all four NSD servers, with
              and without the readReplicaPolicy set to local?

              You can find the current setting of readReplicaPolicy with mmfsadm dump
              config:

              # mmfsadm dump config | grep -i readReplicaPolicy
                readReplicaPolicy default

              Change the value with:

              # mmchconfig readReplicaPolicy=local
              mmchconfig: Command successfully completed
              mmchconfig: Propagating the cluster configuration data to all
               affected nodes.  This is an asynchronous process.

              Here are the current settings:

              :/# /usr/lpp/mmfs/bin/mmfsadm dump config | grep readReplica
                 readReplicaPolicy default
              []# /usr/lpp/mmfs/bin/mmfsadm dump config | grep readReplica
                 readReplicaPolicy default
              /# /usr/lpp/mmfs/bin/mmfsadm dump config | grep readReplica
                 readReplicaPolicy default
              /# /usr/lpp/mmfs/bin/mmfsadm dump config | grep readReplica
                 readReplicaPolicy default
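
              Comparing the value across nodes by eye is easy to get wrong, so the check can be scripted. The following is a minimal sketch that only parses text in the layout shown above; policy_values and all_nodes_match are hypothetical helper names, not GPFS commands, and the sample input is copied from the four outputs in this post.

```bash
#!/usr/bin/env bash
# Extract the readReplicaPolicy value(s) from `mmfsadm dump config` output
# supplied on stdin. Assumes the "   readReplicaPolicy default" layout shown above.
policy_values() {
    awk '$1 == "readReplicaPolicy" { print $2 }'
}

# Hypothetical helper: succeeds (exit 0) only if every value read from stdin
# equals the expected one. Empty input also succeeds, so check for that separately.
all_nodes_match() {
    local expected=$1 value
    while read -r value; do
        [ "$value" = "$expected" ] || return 1
    done
    return 0
}

# Sample input: the four per-node results quoted in this post.
sample='   readReplicaPolicy default
   readReplicaPolicy default
   readReplicaPolicy default
   readReplicaPolicy default'

if printf '%s\n' "$sample" | policy_values | all_nodes_match local; then
    echo "all nodes report local"
else
    echo "at least one node does not report local"
fi
```

              On a live node, the output of /usr/lpp/mmfs/bin/mmfsadm dump config could be piped into policy_values in place of the sample text.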
               
              • pce
                57 Posts

                Re: Reads from replicated metadata from one failure group

                ‏2012-12-20T15:59:56Z  in response to mduff
                The policy did not change to 'local'; it is still at 'default'.

                Use "mmchconfig readReplicaPolicy=local -i" to get an immediate effect, then try your experiments.
                • mduff
                  30 Posts

                  Re: Reads from replicated metadata from one failure group

                  ‏2012-12-20T23:42:47Z  in response to pce
                  We have tried both settings and there is no difference.