SystemAdmin
SystemAdmin
46 Posts

LoadLeveler multi cluster command error

‏2007-11-27T14:52:19Z |
Hi,

I keep getting this error and I can find no way of solving it:

11/27 16:36:10 TI-15 (MUSTER) ScheddRemoteCmdInbound: This machine lcc.xxx.com is not configured to receive remote command from cluster gridserver
11/27 16:36:10 TI-14 (MUSTER) RemoteCmdOutbound: Received ack 2 indicating an error from the remote inbound Schedd.
11/27 16:36:10 TI-14 (MUSTER) virtual void RemoteCmdOutboundTransaction::do_command(): RemoteCmd: 2512-258 Remote Schedd LoadL_schedd is not configured as an inbound_hosts for this cluster.

The multicluster is set up as follows:
lcc: type = cluster
Local = True
inbound_schedd_port = 9605
secure_schedd_port = 9607
multicluster_security = NOT_SET
ssl_cipher_list = ALL:eNULL:!aNULL
inbound_hosts = xxx.xxx.xxx(lcc) xxx.xxx.xxx(gridserver)
outbound_hosts = xxx.xxx.xxx(lcc) xxx.xxx.xxx(gridserver)

gridserver: type = cluster
Local = False
inbound_schedd_port = 9605
secure_schedd_port = 9607
multicluster_security = NOT_SET
ssl_cipher_list = ALL:eNULL:!aNULL
inbound_hosts = xxx.xxx.xxx(lcc) xxx.xxx.xxx(gridserver)
outbound_hosts = xxx.xxx.xxx(lcc) xxx.xxx.xxx(gridserver)

How do I enable remote commands? Am I missing something?

LoadLeveler 3.4
openSUSE 10.2
Intel i386

This is for research I am doing - any help will be most appreciated!

Message was edited by: chrispparker
  • SystemAdmin
    SystemAdmin
    46 Posts

    Re: Loadleveler multi cluster command error

    ‏2008-02-04T20:02:27Z  
    Have you received any help on this issue?

    Message was edited by: raval@us.ibm.com
  • SystemAdmin
    SystemAdmin
    46 Posts

    Re: Loadleveler multi cluster command error

    ‏2008-02-05T06:25:59Z  
    Unfortunately, I have yet to get help with this particular feature. It is a really frustrating problem. Any help would be appreciated.
  • SystemAdmin
    SystemAdmin
    46 Posts

    Re: Loadleveler multi cluster command error

    ‏2008-02-05T14:17:46Z  
    Can you clarify the multicluster setup you have?

    The cluster stanzas you posted tell me that you have two clusters named lcc and gridserver. The inbound_hosts and outbound_hosts statements are unclear.

    For each cluster stanza, the inbound_hosts and outbound_hosts statements identify the machines (by hostname) within that cluster that you choose to serve as gateway machines for the multi-cluster. The machines are listed, each optionally followed by cluster names in parentheses. If cluster names are provided, the preceding hostname serves only those clusters.
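
    To make the entry format concrete, here is a minimal Python sketch that splits an inbound_hosts or outbound_hosts value into hostnames and their optional cluster lists. The helper names parse_hosts and hosts_serving are hypothetical and are not part of LoadLeveler. Treating several cluster names inside one pair of parentheses as separated by whitespace or commas is an assumption on my part.

```python
import re

# Hypothetical helpers for illustration only; not part of LoadLeveler.
# One entry is a hostname optionally followed by "(cluster ...)".
ENTRY = re.compile(r"(?P<host>[^\s()]+)\s*(?:\((?P<clusters>[^)]*)\))?")


def parse_hosts(value):
    """Return a list of (hostname, set_of_cluster_names) pairs.

    An empty set means the entry carried no cluster list.
    """
    entries = []
    for match in ENTRY.finditer(value):
        clusters = match.group("clusters")
        # Assumption: multiple names are separated by whitespace or commas.
        names = set(clusters.replace(",", " ").split()) if clusters else set()
        entries.append((match.group("host"), names))
    return entries


def hosts_serving(value, remote_cluster):
    """Hostnames in `value` that may serve `remote_cluster`.

    A host with no cluster list serves every cluster; a host with a
    cluster list serves only the clusters named in it.
    """
    return [host for host, names in parse_hosts(value)
            if not names or remote_cluster in names]
```

    For example, hosts_serving("lcc.xxx.com (gridserver)", "gridserver") returns the one host, while asking about any other cluster name returns an empty list.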

    You've specified the same name "xxx.xxx.xxx" in every one of your inbound_hosts and outbound_hosts statements. It appears that you may be using xxx.xxx.xxx to hide the actual names, and I cannot tell if your configuration is correct or not.

    In order for multi-cluster operations to work, the cluster stanzas must be correct in admin files for both clusters.

    The error indicates that there was an attempt to use lcc.xxx.com to serve a request from the gridserver cluster.

    This would indicate to me that in the admin file for the gridserver cluster, the cluster stanza for the lcc cluster contained statements like inbound_hosts = lcc.xxx.com (gridserver).

    However, it also indicates that in the admin file for the lcc cluster, lcc.xxx.com did not appear in the inbound_hosts statement of the lcc cluster stanza as a server for the gridserver cluster.

    Without seeing the actual hostnames, it is impossible to tell if there is a problem with the configuration.
  • SystemAdmin
    SystemAdmin
    46 Posts

    Re: Loadleveler multi cluster command error

    ‏2008-02-05T14:56:35Z  
    Here is the cluster setup:

    $$$$$$$$$$$$$$$$$$$$$$
    Gridserver admin file
    $$$$$$$$$$$$$$$$$$$$$$

    gridserver: type=cluster
    local=true
    outbound_hosts = ed.cs.uct.ac.za(gridserver) dizzy.lcc.uct.ac.za(lcc)
    inbound_hosts = ed.cs.uct.ac.za(gridserver) dizzy.lcc.uct.ac.za(lcc)

    lcc: type=cluster
    local=false
    inbound_hosts = ed.cs.uct.ac.za(gridserver) dizzy.lcc.uct.ac.za(lcc)
    outbound_hosts = ed.cs.uct.ac.za(gridserver) dizzy.lcc.uct.ac.za(lcc)

    #########################################################
    $$$$$$$$$$$$$$$$$$$$$$
    LCC admin file
    $$$$$$$$$$$$$$$$$$$$$$

    gridserver: type=cluster
    local=false
    outbound_hosts = dizzy.lcc.uct.ac.za(lcc) ed.cs.uct.ac.za(gridserver)
    inbound_hosts = dizzy.lcc.uct.ac.za(lcc) ed.cs.uct.ac.za(gridserver)
    inbound_schedd_port = 9605

    lcc: type=cluster
    local=true
    inbound_hosts = ed.cs.uct.ac.za(gridserver) dizzy.lcc.uct.ac.za(lcc)
    outbound_hosts = ed.cs.uct.ac.za(gridserver) dizzy.lcc.uct.ac.za(lcc)
    inbound_schedd_port = 9605

    I based this on an example from the IBM website, but I might be missing something. Thanks for the help so far.
  • SystemAdmin
    SystemAdmin
    46 Posts

    Re: Loadleveler multi cluster command error

    ‏2008-02-06T12:09:31Z  
    The hostnames specified in the inbound_hosts and outbound_hosts statements must be hostnames of machines within the cluster named by the cluster stanza.

    So, in your case, for the gridserver cluster stanza you need to specify a host from within the gridserver cluster to serve as an inbound host for the lcc cluster. Also, you need to specify a host from within the gridserver cluster to serve as an outbound host for the lcc cluster.

    Similarly, for the lcc cluster stanza you need to specify a host from within the lcc cluster to serve as an inbound host for the gridserver cluster. Also, you need to specify a host from within the lcc cluster to serve as an outbound host for the gridserver cluster.

    In many cases, the same host is specified as both an inbound and outbound host.

    If I assume that ed.cs.uct.ac.za is a host in the gridserver cluster, and dizzy.lcc.uct.ac.za is a host in the lcc cluster, then I suggest setting your cluster stanzas as follows:

    $$$$$$$$$$$$$$$$$$$$$$
    Gridserver admin file
    $$$$$$$$$$$$$$$$$$$$$$

    gridserver: type=cluster
    local=true
    outbound_hosts = ed.cs.uct.ac.za
    inbound_hosts = ed.cs.uct.ac.za

    lcc: type=cluster
    local=false
    inbound_hosts = dizzy.lcc.uct.ac.za
    outbound_hosts = dizzy.lcc.uct.ac.za

    #########################################################

    $$$$$$$$$$$$$$$$$$$$$$
    LCC admin file
    $$$$$$$$$$$$$$$$$$$$$$

    gridserver: type=cluster
    local=false
    outbound_hosts = ed.cs.uct.ac.za
    inbound_hosts = ed.cs.uct.ac.za
    inbound_schedd_port = 9605

    lcc: type=cluster
    local=true
    inbound_hosts = dizzy.lcc.uct.ac.za
    outbound_hosts = dizzy.lcc.uct.ac.za
    inbound_schedd_port = 9605

    This specifies that ed.cs.uct.ac.za will serve as the gridserver inbound and outbound host for any cluster defined in the multi-cluster. Since lcc is the only other cluster, this host will serve lcc.

    Similarly, this specifies that dizzy.lcc.uct.ac.za will serve as the lcc inbound and outbound host for any cluster defined in the multi-cluster. Since gridserver is the only other cluster, this host will serve gridserver.

    In a more complicated multi-cluster configuration, you may want to specify the clusters within parentheses to limit which clusters the inbound or outbound host will serve.
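
    The membership rule can also be checked mechanically. The following is a rough Python sketch, not a LoadLeveler tool: the function foreign_hosts and the members table are hypothetical, and the membership shown rests on the same assumption as above (ed.cs.uct.ac.za belongs to gridserver, dizzy.lcc.uct.ac.za belongs to lcc). It reports every gateway host that is listed in a stanza for a cluster it does not belong to.

```python
def foreign_hosts(stanzas, members):
    """Report gateway hosts that are not members of their stanza's cluster.

    stanzas: {cluster_name: {"inbound_hosts": [...], "outbound_hosts": [...]}}
    members: {cluster_name: set of hostnames in that cluster} (assumed known)
    Returns a list of (cluster, keyword, hostname) tuples.
    """
    problems = []
    for cluster, keywords in stanzas.items():
        for keyword in ("inbound_hosts", "outbound_hosts"):
            for host in keywords.get(keyword, []):
                if host not in members.get(cluster, set()):
                    problems.append((cluster, keyword, host))
    return problems


# Assumed cluster membership (hypothetical; adjust to the real clusters).
members = {
    "gridserver": {"ed.cs.uct.ac.za"},
    "lcc": {"dizzy.lcc.uct.ac.za"},
}

# Stanzas as originally posted: both hosts listed under both clusters.
both = ["ed.cs.uct.ac.za", "dizzy.lcc.uct.ac.za"]
original = {
    "gridserver": {"inbound_hosts": both, "outbound_hosts": both},
    "lcc": {"inbound_hosts": both, "outbound_hosts": both},
}

# Stanzas as suggested: each cluster lists only its own host.
suggested = {
    "gridserver": {"inbound_hosts": ["ed.cs.uct.ac.za"],
                   "outbound_hosts": ["ed.cs.uct.ac.za"]},
    "lcc": {"inbound_hosts": ["dizzy.lcc.uct.ac.za"],
            "outbound_hosts": ["dizzy.lcc.uct.ac.za"]},
}

for cluster, keyword, host in foreign_hosts(original, members):
    print(f"{cluster}: {keyword} lists {host}, which is not in {cluster}")
```

    Run against the originally posted stanzas, the check flags four entries (dizzy under gridserver and ed under lcc, for both keywords). Against the suggested stanzas it reports nothing.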
  • SystemAdmin
    SystemAdmin
    46 Posts

    Re: Loadleveler multi cluster command error

    ‏2008-02-18T07:11:23Z  
    Sorry I have taken so long to reply! I have made the changes, but if I try to run a status command on the local machine, I get this error:

    christopher@ed:/home/loadl/log> /opt/ibmll/LoadL/full/bin/llstatus -X ed.cs.uct.ac.za
    =================== Cluster ed.cs.uct.ac.za ====================================

    llstatus: 2512-304 An error occurred while receiving data from the LoadL_negotiator daemon in cluster ed.cs.uct.ac.za.
    2539-861 Cannot contact the local outbound Schedd. remote cluster = ed.cs.uct.ac.za.

    OR
    christopher@ed:/home/loadl/log> /opt/ibmll/LoadL/full/bin/llstatus -X gridserver
    =================== Cluster gridserver ====================================

    llstatus: 2512-301 An error occurred while receiving data from the LoadL_negotiator daemon on host ed.cs.uct.ac.za.
    RemoteCmd: 2539-832 The user mapper program /opt/ibmll/LoadL/full/samples/llcluster/cluster_user_mapper failed with rc = -1.

    If I telnet to the ports, I get a working telnet session. Do you have any idea what could be wrong?