3 replies Latest Post - ‏2011-08-17T17:34:23Z by Casey_B
SystemAdmin

cldare: Unable to execute a command remotely...

‏2011-07-21T05:42:54Z |
Hi All,

I'm setting up a two-node AIX HACMP 5.5 cluster and get the following error message when running an "Extended Verification and Synchronization":


cldare: Unable to execute a command remotely on node CLUSTER_B, check clcomd.log file for more information.
Any configuration changes must be propagated to the node before the node will be allowed to join the cluster.


The AIX environment that I'm working on doesn't allow RSH (or any of the r daemons) to run on its nodes. I've picked through the /usr/es/sbin/cluster/utilities/cldare script, which calls another script, /usr/es/sbin/cluster/utilities/cl_rsh. So I'm assuming that this process is failing because RSH is not allowed between the hosts.

Am I correct to assume this? Also, with cldare not being able to execute commands on the remote host, what configuration changes need to be propagated? As far as I can tell, the cluster seems to be working fine.

Thanks,
-Kristian
Updated on 2011-08-17T17:34:23Z by Casey_B
  • Holgervk

    Re: cldare: Unable to execute a command remotely...

    ‏2011-08-03T21:58:40Z  in response to SystemAdmin
    The name rsh/cl_rsh is misleading.
    The cl_rsh command does not use rsh; it uses the clcomd daemons (port 6191).
    If you block this port on a host-based firewall, you should open it. Other parts of PowerHA also use clcomd (file collections, etc.), so I would not expect PowerHA to run fine with those ports blocked.

    If you don't use a host-based firewall, you have another problem.
    Your PowerHA nodes use the same service IP, so they are on the same subnet and there is no network firewall between them.
    If this is the case, post again with some details, e.g. the output of tcpdump port clcomd while running cl_rsh other_node id or similar.
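Before capturing traffic, a quick first check is whether anything is listening on port 6191 on each node. The following is only a minimal sketch: port_listening is a hypothetical helper name (not part of PowerHA), and it simply filters netstat -an output that is piped into it.

```shell
# Hypothetical helper, not a PowerHA utility.
# Reads `netstat -an` (or `netstat -Aan`) output on stdin and succeeds
# if some TCP socket is in LISTEN state on the given local port.
# Accepts both the AIX/BSD "addr.port" and the Linux "addr:port" styles.
port_listening() {
    awk -v port="$1" '
        $NF == "LISTEN" {
            for (i = 1; i <= NF; i++)
                if ($i ~ ("[.:]" port "$")) found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}
```

It could be used on each node as: netstat -an | port_listening 6191 && echo "something is listening on 6191". If nothing is listening, the problem is on the daemon side rather than in a firewall.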
    • SystemAdmin

      Re: cldare: Unable to execute a command remotely...

      ‏2011-08-05T05:59:49Z  in response to Holgervk
      Hi Holgervk,

      Firstly, thanks for your reply.

      So this is the scenario: cluster services are currently offline on both cluster nodes.
      
      root@clusterA# lssrc -ls clstrmgrES
      Current state: ST_INIT
      sccsid = "@(#)36    1.135.5.2 src/43haes/usr/sbin/cluster/hacmprd/main.C, hacmp.pe, 53haes_r550, 0934B_hacmp550 8/8/09 14:48:23"
      


      At this stage nothing is listening on port 6191, which, going by your post, is why cldare is failing.

      
      root@clusterA# cldare -t
      cldare: Unable to execute a command remotely on node AUAEUAP289WBCX2, check clcomd.log file for more information.
      Any configuration changes must be propagated to the node before the node will be allowed to join the cluster.
      


      Now once I start cluster services on both nodes, clcomd is listening on 6191.

      
      root@clusterA# netstat -Aan | grep 6191
      f100060002052398 tcp4       0      0  *.6191             *.*                LISTEN
      root@clusterA# rmsock f100060002052398 tcpcb
      The socket 0x2052008 is being held by proccess 368850 (clcomd).
      


      Now when I run cldare -t, I don't get the error.
      So is this working as designed? The error is a little off-putting when you run a verification and synchronisation with the cluster offline.

      Thanks,
      Kristian
      • Casey_B

        Re: cldare: Unable to execute a command remotely...

        ‏2011-08-17T17:34:23Z  in response to SystemAdmin
        clcomdES is not the same as clstrmgrES.

        As was mentioned, clcomd is the daemon that services cl_rsh commands, and it can be running while the cluster is down.

        It seems to me that clcomdES is usually started when an AIX node boots, but I am not sure (look in /etc/inittab).

        Also, look at the following links to similar information:
        https://www-304.ibm.com/support/docview.wss?uid=isg1IZ47409
        http://publib.boulder.ibm.com/infocenter/aix/v6r1/index.jsp?topic=/com.ibm.aix.hacmp.trgd/ha_trgd_clcomdes_clstmgres.htm
        http://publib.boulder.ibm.com/infocenter/aix/v6r1/index.jsp?topic=/com.ibm.aix.hacmp.admngd/ha_admin_disable_enable_cc_daemon.htm

        You can start the clcomd daemon as described in the last link; you will then be able to run synchronization and verification while the rest of the cluster is down.
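As a rough sketch of that step, assuming the subsystem is registered with the AIX System Resource Controller under the name clcomdES (the name used in this thread), the status can be read with lssrc and the daemon started with startsrc. The function names subsys_status and ensure_clcomd are hypothetical helpers for illustration, not PowerHA commands.

```shell
# Hypothetical helper: reads `lssrc -s <subsystem>` output on stdin and
# prints the Status column (e.g. "active" or "inoperative") for the
# named subsystem. The header line is skipped because its first field
# is "Subsystem", not the subsystem name.
subsys_status() {
    awk -v name="$1" '$1 == name { print $NF }'
}

# Hypothetical helper: starts clcomdES through the SRC only if it is
# not already active.
# Assumption: the subsystem name is clcomdES, as on HACMP 5.x.
ensure_clcomd() {
    if [ "$(lssrc -s clcomdES | subsys_status clcomdES)" != "active" ]; then
        startsrc -s clcomdES
    fi
}
```

Running ensure_clcomd as root on each node before a verification should leave the daemon active. To see whether it is started at boot, grep clcomd /etc/inittab shows any matching entry.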

        Just as a side note, I don't remember whether cldare is an end-user command.
        The general rule is that if there isn't a man page for a command, you should not be running it.

        (Without a man page, the command is internal, and its syntax and behaviour may change without notice.)

        Hope this helps,
        Casey