6 replies Latest Post - ‏2008-02-18T07:11:23Z by SystemAdmin
SystemAdmin

Pinned topic Loadleveler multi cluster command error

‏2007-11-27T14:52:19Z |
Hi,

I keep getting this error and I can find no way of solving it:

11/27 16:36:10 TI-15 (MUSTER) ScheddRemoteCmdInbound: This machine lcc.xxx.com is not configured to receive remote command from cluster gridserver
11/27 16:36:10 TI-14 (MUSTER) RemoteCmdOutbound: Received ack 2 indicating an error from the remote inbound Schedd.
11/27 16:36:10 TI-14 (MUSTER) virtual void RemoteCmdOutboundTransaction::do_command(): RemoteCmd: 2512-258 Remote Schedd LoadL_schedd is not configured as an inbound_hosts for this cluster.

The multicluster is set up as follows:
lcc: type = cluster
Local = True
inbound_schedd_port = 9605
secure_schedd_port = 9607
multicluster_security = NOT_SET
ssl_cipher_list = ALL:eNULL:!aNULL
inbound_hosts = xxx.xxx.xxx(lcc) xxx.xxx.xxx(gridserver)
outbound_hosts = xxx.xxx.xxx(lcc) xxx.xxx.xxx(gridserver)

gridserver: type = cluster
Local = False
inbound_schedd_port = 9605
secure_schedd_port = 9607
multicluster_security = NOT_SET
ssl_cipher_list = ALL:eNULL:!aNULL
inbound_hosts = xxx.xxx.xxx(lcc) xxx.xxx.xxx(gridserver)
outbound_hosts = xxx.xxx.xxx(lcc) xxx.xxx.xxx(gridserver)

How do I enable remote commands? Am I missing something?

LoadLeveler 3.4
OpenSuse 10.2
Intel i386

This is for research I am doing - any help will be most appreciated!

Message was edited by: chrispparker
Updated on 2008-02-18T07:11:23Z by SystemAdmin
  • SystemAdmin

    Re: Loadleveler multi cluster command error

    ‏2008-02-04T20:02:27Z  in response to SystemAdmin
    Have you received any help on this issue?

    Message was edited by: raval@us.ibm.com
    Updated on 2008-02-04T20:02:27Z by SystemAdmin
    • SystemAdmin

      Re: Loadleveler multi cluster command error

      ‏2008-02-05T06:25:59Z  in response to SystemAdmin
      Unfortunately, I have yet to get help with this particular feature. It is a really frustrating problem. Any help would be appreciated.
  • SystemAdmin

    Re: Loadleveler multi cluster command error

    ‏2008-02-05T14:17:46Z  in response to SystemAdmin
    Can you clarify the multicluster setup you have?

    The cluster stanzas you posted tell me that you have two clusters named lcc and gridserver. The inbound_hosts and outbound_hosts statements are unclear.

    For each cluster stanza, the inbound_hosts and outbound_hosts statements identify the machines (by hostname) within that cluster which you choose to serve as gateway machines for the multi-cluster. The machines are listed, each optionally followed by cluster names in parentheses. If cluster names are provided, the preceding hostname serves only those clusters.
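    To make the syntax concrete, here is a minimal Python sketch that splits an inbound_hosts or outbound_hosts value into hostnames and the clusters they are restricted to. It is only an illustration and not part of LoadLeveler: parse_hosts is a hypothetical helper, and it assumes one cluster name per pair of parentheses, as in the examples in this thread.

```python
import re

# One entry looks like "ed.cs.uct.ac.za" or "dizzy.lcc.uct.ac.za(lcc)".
# Assumption: at most one cluster name inside each pair of parentheses.
ENTRY = re.compile(r"([^\s()]+)\s*(?:\(([^()]*)\))?")


def parse_hosts(value):
    """Map each hostname to the clusters it is limited to.

    An empty list means no cluster was named, so the host serves
    every cluster in the multi-cluster.
    """
    hosts = {}
    for match in ENTRY.finditer(value):
        host, cluster = match.group(1), match.group(2)
        served = hosts.setdefault(host, [])
        if cluster and cluster.strip():
            served.append(cluster.strip())
    return hosts


if __name__ == "__main__":
    print(parse_hosts("ed.cs.uct.ac.za(gridserver) dizzy.lcc.uct.ac.za(lcc)"))
    print(parse_hosts("ed.cs.uct.ac.za"))
```

    Running a value through something like this makes it easier to see which machine is being offered as a gateway and for which cluster.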

    You've specified the same name "xxx.xxx.xxx" in every one of your inbound_hosts and outbound_hosts statements. It appears that you may be using xxx.xxx.xxx to hide the actual names, and I cannot tell if your configuration is correct or not.

    In order for multi-cluster operations to work, the cluster stanzas must be correct in admin files for both clusters.

    The error indicates that there was an attempt to use lcc.xxx.com to serve a request from the gridserver cluster.

    This would indicate to me that in the admin file for the gridserver cluster, the cluster stanza for the lcc cluster contained statements like inbound_hosts = lcc.xxx.com (gridserver).

    However, it also indicates that, in the admin file for the lcc cluster, lcc.xxx.com did not appear in an inbound_hosts statement of the lcc cluster stanza as a server for the gridserver cluster.

    Without seeing the actual hostnames, it is impossible to tell if there is a problem with the configuration.
    • SystemAdmin

      Re: Loadleveler multi cluster command error

      ‏2008-02-05T14:56:35Z  in response to SystemAdmin
      Here is the cluster setup:

      $$$$$$$$$$$$$$$$$$$$$$
      Gridserver admin file
      $$$$$$$$$$$$$$$$$$$$$$

      gridserver: type=cluster
      local=true
      outbound_hosts = ed.cs.uct.ac.za(gridserver) dizzy.lcc.uct.ac.za(lcc)
      inbound_hosts = ed.cs.uct.ac.za(gridserver) dizzy.lcc.uct.ac.za(lcc)

      lcc: type=cluster
      local=false
      inbound_hosts = ed.cs.uct.ac.za(gridserver) dizzy.lcc.uct.ac.za(lcc)
      outbound_hosts = ed.cs.uct.ac.za(gridserver) dizzy.lcc.uct.ac.za(lcc)

      #########################################################
      $$$$$$$$$$$$$$$$$$$$$$
      LCC admin file
      $$$$$$$$$$$$$$$$$$$$$$

      gridserver: type=cluster
      local=false
      outbound_hosts = dizzy.lcc.uct.ac.za(lcc) ed.cs.uct.ac.za(gridserver)
      inbound_hosts = dizzy.lcc.uct.ac.za(lcc) ed.cs.uct.ac.za(gridserver)
      inbound_schedd_port = 9605

      lcc: type=cluster
      local=true
      inbound_hosts = ed.cs.uct.ac.za(gridserver) dizzy.lcc.uct.ac.za(lcc)
      outbound_hosts = ed.cs.uct.ac.za(gridserver) dizzy.lcc.uct.ac.za(lcc)
      inbound_schedd_port = 9605

      I used an example from the IBM website, but I might be missing something. Thanks for the help so far.
      • SystemAdmin

        Re: Loadleveler multi cluster command error

        ‏2008-02-06T12:09:31Z  in response to SystemAdmin
        The hostnames specified in the inbound_hosts and outbound_hosts statements must be hostnames for machines within the cluster specified as the cluster stanza name.

        So, in your case, for the gridserver cluster stanza you need to specify a host from within the gridserver cluster to serve as an inbound host for the lcc cluster. Also, you need to specify a host from within the gridserver cluster to serve as an outbound host for the lcc cluster.

        Similarly, for the lcc cluster stanza you need to specify a host from within the lcc cluster to serve as an inbound host for the gridserver cluster. Also, you need to specify a host from within the lcc cluster to serve as an outbound host for the gridserver cluster.

        In many cases, the same host is specified as both an inbound and outbound host.
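        The rule above can be checked mechanically. The following Python sketch is only an illustration, not a LoadLeveler tool: misplaced_hosts is a hypothetical helper, and the members table is an assumption that ed.cs.uct.ac.za belongs to gridserver and dizzy.lcc.uct.ac.za belongs to lcc. It reports every host that is listed in a stanza for a cluster it does not belong to, using the stanzas from the gridserver admin file posted earlier in this thread.

```python
import re

# One entry looks like "host" or "host(cluster)"; only the host is needed here.
ENTRY = re.compile(r"([^\s()]+)\s*(?:\([^()]*\))?")


def misplaced_hosts(stanzas, members):
    """Return (cluster, keyword, host) for each host listed in a cluster
    stanza although it is not a member of that cluster."""
    problems = []
    for cluster, keywords in stanzas.items():
        allowed = members.get(cluster, set())
        for keyword, value in keywords.items():
            for match in ENTRY.finditer(value):
                host = match.group(1)
                if host not in allowed:
                    problems.append((cluster, keyword, host))
    return problems


# Assumed cluster membership (not confirmed anywhere in the thread).
members = {
    "gridserver": {"ed.cs.uct.ac.za"},
    "lcc": {"dizzy.lcc.uct.ac.za"},
}

# Stanzas as originally posted for the gridserver admin file.
original = {
    "gridserver": {
        "outbound_hosts": "ed.cs.uct.ac.za(gridserver) dizzy.lcc.uct.ac.za(lcc)",
        "inbound_hosts": "ed.cs.uct.ac.za(gridserver) dizzy.lcc.uct.ac.za(lcc)",
    },
    "lcc": {
        "inbound_hosts": "ed.cs.uct.ac.za(gridserver) dizzy.lcc.uct.ac.za(lcc)",
        "outbound_hosts": "ed.cs.uct.ac.za(gridserver) dizzy.lcc.uct.ac.za(lcc)",
    },
}

if __name__ == "__main__":
    for cluster, keyword, host in misplaced_hosts(original, members):
        print(f"{cluster}: {keyword} lists {host}, which is not in {cluster}")
```

        Under that membership assumption, the posted stanzas produce four reports: dizzy.lcc.uct.ac.za appears twice in the gridserver stanza and ed.cs.uct.ac.za appears twice in the lcc stanza.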

        If I assume that ed.cs.uct.ac.za is a host in the gridserver cluster, and dizzy.lcc.uct.ac.za is a host in the lcc cluster, then I suggest setting your cluster stanzas as follows:

        $$$$$$$$$$$$$$$$$$$$$$
        Gridserver admin file
        $$$$$$$$$$$$$$$$$$$$$$

        gridserver: type=cluster
        local=true
        outbound_hosts = ed.cs.uct.ac.za
        inbound_hosts = ed.cs.uct.ac.za

        lcc: type=cluster
        local=false
        inbound_hosts = dizzy.lcc.uct.ac.za
        outbound_hosts = dizzy.lcc.uct.ac.za

        #########################################################

        $$$$$$$$$$$$$$$$$$$$$$
        LCC admin file
        $$$$$$$$$$$$$$$$$$$$$$

        gridserver: type=cluster
        local=false
        outbound_hosts = ed.cs.uct.ac.za
        inbound_hosts = ed.cs.uct.ac.za
        inbound_schedd_port = 9605

        lcc: type=cluster
        local=true
        inbound_hosts = dizzy.lcc.uct.ac.za
        outbound_hosts = dizzy.lcc.uct.ac.za
        inbound_schedd_port = 9605

        This specifies that ed.cs.uct.ac.za will serve as the gridserver inbound and outbound host for any cluster defined in the multi-cluster. Since lcc is the only other cluster, this host will serve lcc.

        Similarly, this specifies that dizzy.lcc.uct.ac.za will serve as the lcc inbound and outbound host for any cluster defined in the multi-cluster. Since gridserver is the only other cluster, this host will serve gridserver.

        In a more complicated multi-cluster configuration, you may want to specify the clusters within parentheses to limit which clusters the inbound or outbound host will serve.
        • SystemAdmin

          Re: Loadleveler multi cluster command error

          ‏2008-02-18T07:11:23Z  in response to SystemAdmin
          Sorry I have taken so long to reply! I have made the changes, but if I try to run a status command on the local machine, I get this error:

          christopher@ed:/home/loadl/log> /opt/ibmll/LoadL/full/bin/llstatus -X ed.cs.uct.ac.za
          =================== Cluster ed.cs.uct.ac.za ====================================

          llstatus: 2512-304 An error occurred while receiving data from the LoadL_negotiator daemon in cluster ed.cs.uct.ac.za.
          2539-861 Cannot contact the local outbound Schedd. remote cluster = ed.cs.uct.ac.za.

          OR
          christopher@ed:/home/loadl/log> /opt/ibmll/LoadL/full/bin/llstatus -X gridserver
          =================== Cluster gridserver ====================================

          llstatus: 2512-301 An error occurred while receiving data from the LoadL_negotiator daemon on host ed.cs.uct.ac.za.
          RemoteCmd: 2539-832 The user mapper program /opt/ibmll/LoadL/full/samples/llcluster/cluster_user_mapper failed with rc = -1.

          If I telnet into the ports, I get a working telnet session. Do you have any idea what could be wrong?
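          For anyone repeating the telnet check, it can be scripted. This is a minimal Python sketch using only the standard library; port_open is a hypothetical helper name. Note that a successful connection only shows that the TCP port is reachable. It says nothing about whether the Schedd accepts remote commands or whether the user mapper program works.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the name and connects; the context
        # manager closes the socket again straight away.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

          For example, port_open("ed.cs.uct.ac.za", 9605) and port_open("dizzy.lcc.uct.ac.za", 9605) would test the inbound_schedd_port value used in this thread from whichever machine the script runs on.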