4 replies Latest Post - ‏2013-06-20T22:37:16Z by megalosaurus
megalosaurus

Problem with mmchfs -V

‏2013-06-17T14:49:57Z |

I have a large cluster (374 nodes) that I have just upgraded from 3.3 to 3.4.0-21. I have run mmchconfig release=latest, and the cluster now reports "minReleaseLevel 3.4.0.7". My filesystems are at varying version levels: 9.03, 10.01, and 11.05. I tried running "mmchfs fs01 -V full" on one of the 9.03 filesystems, and it came back with "File system version 12.10 is not supported on 1 nodes in the cluster." I have no clue which node this is or why it is not supported. Is this really telling me that exactly one node doesn't support this version, or did it simply stop at the first such node it found? I have gone through the entire cluster and verified that every node is running version 3.4.0-21. Is there some configuration setting that can cause this?

Upgrading to 3.5 is not an option at this time as I still have several 32-bit nodes.

  • dlmcnabb

    Re: Problem with mmchfs -V

    ‏2013-06-17T15:18:07Z  in response to megalosaurus

    On the Cluster Manager node (use mmlsmgr to find it), get the output from "mmfsadm dump tscomm". The connection table in that output has a version column. Make sure all the nodes are running the upgraded version. The new code may have been installed without the daemon having been restarted.
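A minimal sketch of how the version column could be scanned once the dump has been saved to a file. The thread does not show the layout of the connection table, so the column number and the expected version value are left as arguments to fill in from your own output. `flag_old_versions` is a hypothetical helper, not a GPFS command, and `/tmp/tscomm.out` is only an example path.

```shell
# Hypothetical helper (not a GPFS command): print every row of a saved
# "mmfsadm dump tscomm" connection table whose version column differs
# from the value reported by nodes known to be upgraded.
#   $1 = 1-based number of the version column (check your own output)
#   $2 = version value shown by the upgraded nodes
# Reads the saved table on stdin. Header rows and other non-table lines
# with enough fields are printed too, so expect to skip those by eye.
flag_old_versions() {
  awk -v col="$1" -v want="$2" 'NF >= col && $col != want { print }'
}

# Example use on the cluster manager node (commands from the reply above):
#   mmlsmgr                                  # identify the manager nodes
#   mmfsadm dump tscomm > /tmp/tscomm.out    # save the dump
#   flag_old_versions <column> <version> < /tmp/tscomm.out
```

Any row that comes back is a candidate for a node that was not restarted after the upgrade, or for a stale entry left behind by a node that is no longer reachable.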

    • megalosaurus

      Re: Problem with mmchfs -V

      ‏2013-06-17T17:07:24Z  in response to dlmcnabb

      This gets even more interesting. Over the last couple of weeks, I've removed several old nodes from the cluster. These still show up in the connection table. One of these nodes had been dead with a hardware problem since before I began the 3.4 upgrade. It is still in the connection table with a version number of 1113. I presume this is the one that is causing my problem. Should I be worried about the deleted nodes that are still in the table? And what can I do about the one that is blocking the mmchfs command?

      thanks

      • dlmcnabb

        Re: Problem with mmchfs -V

        ‏2013-06-19T05:44:26Z  in response to megalosaurus

        Entries in the connection table are never removed. If there is no socket for the old node, then its entry does not matter. If a node is not connected to the Cluster Manager node, then it cannot be part of the cluster.

        Possibly the mmsdrfs file has that node marked as having an old version of the code. If mmchconfig release=LATEST cannot contact that node, it will not update the version recorded in mmsdrfs, and that may prevent mmchfs -V from running.

        • megalosaurus

          Re: Problem with mmchfs -V

          ‏2013-06-20T22:37:16Z  in response to dlmcnabb

          I found that, for any filesystem, mmchfs appears to check the filesystem manager node for that filesystem, and if any entry in that node's connection table shows an old version, it won't let you run the mmchfs. I restarted GPFS on the filesystem manager nodes (I also threw in a reboot while I was at it, but I don't think that was needed), and then all the broken connections went away on those nodes. Once all the filesystems were being managed by nodes that didn't have that bad connection entry, I was able to run mmchfs on them.
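The workaround described above can be sketched as a dry run. `plan_fs_upgrade` is a hypothetical helper that only prints commands for review and executes nothing. It assumes that "restarted GPFS" means mmshutdown followed by mmstartup on each filesystem manager node, which the post does not spell out. The node and filesystem names are placeholders.

```shell
# Hypothetical dry-run helper: print, without executing, the restart
# commands for each filesystem manager node named after the first
# argument, followed by the mmchfs call for the filesystem in $1.
# Assumption: a GPFS restart is done with mmshutdown and mmstartup.
plan_fs_upgrade() {
  fs="$1"
  shift
  for node in "$@"; do
    echo "mmshutdown -N $node"
    echo "mmstartup -N $node"
  done
  echo "mmchfs $fs -V full"
}

# Example: review the plan for fs01 when nodeA is its manager
# (mmlsmgr shows which node currently manages each filesystem):
#   plan_fs_upgrade fs01 nodeA
```

Printing the commands first leaves room to check with mmlsmgr that the filesystem has actually moved to, or come back on, a manager whose connection table no longer shows the old entry before mmchfs is run.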