Topic
  • 9 replies
  • Latest Post - ‏2013-02-15T16:40:13Z by SystemAdmin
SystemAdmin
1245 Posts

Pinned topic FAILED daemon startup: hc

‏2013-02-13T17:05:00Z |
Hi,

I'm trying to run an SPL application on 2 hosts (master: str00210mob02, slave: str00210mob03),
but the system fails with the following log:

Thanks in advance,

Kind Regards.

CDISC0059I The system is starting the instance123@michele.mannino instance.
CDISC0078I The system is starting the runtime services on 2 hosts.
CDISC0056I The system is starting the distributed name service on the str00210mob02 host. The distributed name service has 1 partitions and 1 replications.
CDISC0057I The system is setting the NameServiceUrl property of the instance to DN:str00210mob02:38222, which is the URL of the distributed name service that is running.
CDISC0061I The system is starting in parallel the runtime services of 1 management hosts.
CDISC0060I The system is starting in parallel the runtime services of 1 application hosts.
str00210mob03: ###########################################
str00210mob03: Successfully started daemons:
str00210mob03: FAILED daemon startup: hc
str00210mob03: ###########################################
str00210mob03 failed
Error: CDISC5173E The system could not process the following number of hosts: 1. See the previous error messages.
CDISC0062I The system is cleaning up after an instance did not start.
CDISC0063I The system is stopping the runtime services of the instance123@michele.mannino instance.
CDISC0026I The system is stopping the runtime services of the instance immediately.
CDISC0027I The system is stopping the runtime services of the instance. Any failures to stop the services will be ignored.
CDISC0068I The system is stopping in parallel the runtime services of 2 hosts.
CDISC0054I The system is stopping in parallel the distributed name services of the following 1 hosts:
str00210mob02
CDISC0055I The system is resetting the NameServiceUrl property of the instance because the distributed name service is not running.
Error: CDISC5181E The instance did not start. The system shut down and cleaned up the instance services.
1
The instance123 instance cannot be started.
  • SystemAdmin
    1245 Posts

    Re: FAILED daemon startup: hc

    ‏2013-02-14T16:03:50Z  
    Hi Michele

    Please review the "Name resolution requirements" at: http://pic.dhe.ibm.com/infocenter/streams/v3r0/index.jsp?topic=%2Fcom.ibm.swg.im.infosphere.streams.install-admin.doc%2Fdoc%2Fibminfospherestreams-install-prerequisites-host-requirements.html

    In particular run:
    • streamtool checkhost -i instance123 -v 2
    • nslookup (using both the name and the IP address of each host)

    If this does not identify the problem, attach the boot log, which you can collect with: streamtool getlog -i instance123

    Regards,
    John
  • SystemAdmin
    1245 Posts

    Re: FAILED daemon startup: hc

    ‏2013-02-14T16:44:46Z  
    streamtool checkhost -i streams -v 2

    Date: Thu Feb 14 17:41:39 CET 2013
    Host: str00210mob03
    Instance: streams@michele.mannino
    2 Hosts to check: str00210mob03,str00210mob02
    Reference host: str00210mob03

    =============================================================
    Phase 1 - per-host public key ssh connectivity test...
    =============================================================

    Checking host 1 of 2: str00210mob03... host OK
    Checking host 2 of 2: str00210mob02... host OK

    Phase 1 - public key ssh connectivity test summary:
    2 OK hosts.
    0 problem hosts:

    =============================================================
    Phase 2 - per-host dependency checking...
    =============================================================

    Checking host 1 of 2: str00210mob03... host OK
    Checking host 2 of 2: str00210mob02... host OK

    Phase 2 - per host dependency checking summary:
    2 OK hosts.
    0 problem hosts:
    0 problem categories:
    =============================================================
    Detailed host results
    Verbosity level: 2
    =============================================================
    <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
    Host: str00210mob03 Status: PASSED

    Hostname: str00210mob03.deis.unibo.it
    IP Address of Hostname: 137.204.57.33

    Check: Public Key SSH Status: PASSED
    Check: Install Owner Streams Runtime Data Access Status: PASSED
    Check: Compatible OS Architecture Set Status: PASSED
    Check: Compatible Streams Runtime Version Set Status: PASSED
    Check: Same Install Owner Set Status: PASSED
    Check: Recovery Mode Database Configuration Status: SKIPPED
    Details: install owner's DB configuration file not found

    Section: Dependency Checker Checks
    Check: Compatible OS Architecture Status: PASSED
    Check: Compatible SELinux Configuration Status: PASSED
    Check: Compatible Java Configuration Status: PASSED
    Check: Compatible Language Encoding Status: PASSED
    Check: Compatible Network Configuration Status: PASSED
    Check: Required Installed Packages Status: PASSED

    Section: Instance Specific Checks
    Check: Instance Owner Streams Runtime Data Access Status: PASSED
    Check: Instance Configuration Access Status: PASSED
    Check: Recovery Mode Database Access Status: SKIPPED
    Details: recovery mode is not enabled for instance 'streams@michele.mannino'
    Check: Same Instance Owner Set Status: PASSED
    Check: Instance LogPath Access Checking Status: PASSED
    >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
    <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
    Host: str00210mob02 Status: PASSED

    Hostname: str00210mob02.deis.unibo.it
    IP Address of Hostname: 137.204.57.32

    Check: Public Key SSH Status: PASSED
    Check: Install Owner Streams Runtime Data Access Status: PASSED
    Check: Compatible OS Architecture Set Status: PASSED
    Check: Compatible Streams Runtime Version Set Status: PASSED
    Check: Same Install Owner Set Status: PASSED
    Check: Recovery Mode Database Configuration Status: SKIPPED
    Details: install owner's DB configuration file not found

    Section: Dependency Checker Checks
    Check: Compatible OS Architecture Status: PASSED
    Check: Compatible SELinux Configuration Status: PASSED
    Check: Compatible Java Configuration Status: PASSED
    Check: Compatible Language Encoding Status: PASSED
    Check: Compatible Network Configuration Status: PASSED
    Check: Required Installed Packages Status: PASSED

    Section: Instance Specific Checks
    Check: Instance Owner Streams Runtime Data Access Status: PASSED
    Check: Instance Configuration Access Status: PASSED
    Check: Recovery Mode Database Access Status: SKIPPED
    Details: recovery mode is not enabled for instance 'streams@michele.mannino'
    Check: Same Instance Owner Set Status: PASSED
    Check: Instance LogPath Access Checking Status: PASSED
    >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
    =============================================================
    Overall Summary
    =============================================================

    2 hosts checked.
    2 OK hosts.
    0 problem hosts:

    nslookup str00210mob02.deis.unibo.it
    Server: 137.204.58.1
    Address: 137.204.58.1#53

    ** server can't find str00210mob02.deis.unibo.it: NXDOMAIN



    nslookup str00210mob03.deis.unibo.it
    Server: 137.204.58.1
    Address: 137.204.58.1#53

    ** server can't find str00210mob03.deis.unibo.it: NXDOMAIN
  • SystemAdmin
    1245 Posts

    Re: FAILED daemon startup: hc

    ‏2013-02-14T20:55:56Z  
    I've also attached the log, collected with streamtool getlog -i <instance>

    // ON MASTER str00210mob03

    michele.mannino@str00210mob03 Desktop$ cat /etc/hosts
    127.0.0.1 localhost.localdomain localhost
    137.204.57.33 str00210mob03.deis.unibo.it str00210mob03
    137.204.57.32 str00210mob02.deis.unibo.it str00210mob02
    ::1 localhost6.localdomain6 localhost6

    // ON SLAVE str00210mob02

    michele.mannino@str00210mob02 ~$ cat /etc/hosts
    127.0.0.1 localhost.localdomain localhost
    137.204.57.33 str00210mob03.deis.unibo.it str00210mob03
    137.204.57.32 str00210mob02.deis.unibo.it str00210mob02
    ::1 localhost6.localdomain6 localhost6
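A note on the results above: nslookup queries the DNS server directly and does not read /etc/hosts, so it can return NXDOMAIN even when the system resolver (which typically consults /etc/hosts first) resolves both names correctly. The following is a minimal sketch, not part of the Streams tooling, of a forward and reverse lookup through the system resolver. The function name `check_name` is made up for illustration.

```python
import socket


def check_name(name):
    """Forward- and reverse-resolve a host name via the system resolver.

    Unlike nslookup, this normally honours /etc/hosts (per nsswitch.conf).
    """
    result = {"name": name, "addresses": [], "reverse": {}}
    try:
        # Forward lookup: name -> list of IPv4 addresses.
        _, _, addresses = socket.gethostbyname_ex(name)
    except OSError:
        return result  # forward lookup failed, nothing to reverse-check
    result["addresses"] = addresses
    for address in addresses:
        try:
            # Reverse lookup: address -> primary host name.
            result["reverse"][address] = socket.gethostbyaddr(address)[0]
        except OSError:
            result["reverse"][address] = None
    return result
```

Running `check_name("str00210mob02.deis.unibo.it")` and `check_name("str00210mob03.deis.unibo.it")` on each host would show whether both machines map the names and addresses the same way in both directions.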
  • SystemAdmin
    1245 Posts

    Re: FAILED daemon startup: hc

    ‏2013-02-15T09:40:07Z  
    In the slave host (str00210mob02.deis.unibo.it):

    15 Feb 2013 10:01:56.088 22367 INFO :::NAM MDN_NameService.cpp:getServerURI:470 - DN_NameService::getServerURI: corbaloc:iiop:str00210mob03.deis.unibo.it:36369/dname

    15 Feb 2013 10:01:56.089 22367 INFO :::Core.Corba MDistilleryCorbaSinglePOAConnector.h:connectByURI:108 - Failed to get a service from 'uri:corbaloc:iiop:str00210mob03.deis.unibo.it:36369/dnamecorbaloc:iiop:str00210mob03.deis.unibo.it:36369/dname
    '. Retrying...

    15 Feb 2013 10:01:56.894 22367 INFO :::Core.Corba MDistilleryCorbaSinglePOAConnector.h:connectByURI:108 -
    Failed to get a service from 'uri:corbaloc:iiop:str00210mob03.deis.unibo.it:36369/dnamecorbaloc:iiop:str00210mob03.deis.unibo.it:36369/dname

    I tried to ping the master (str00210mob03.deis.unibo.it):

    michele.mannino@str00210mob02 ~$ ping -p 36369 str00210mob03
    PATTERN: 0x363609
    PING str00210mob03.deis.unibo.it (137.204.57.33) 56(84) bytes of data.
    64 bytes from str00210mob03.deis.unibo.it (137.204.57.33): icmp_seq=1 ttl=64 time=0.551 ms
    64 bytes from str00210mob03.deis.unibo.it (137.204.57.33): icmp_seq=2 ttl=64 time=0.396 ms ...

    What's wrong?

    Please help me...
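A note on the ping test above: the `-p` option of ping sets the padding pattern of the ICMP payload (hence the PATTERN line in the output); it does not select a TCP port. A successful ping therefore shows only that the host answers ICMP, not that the name service port is reachable. The following is a minimal sketch of a TCP reachability check. The function name `tcp_port_open` is made up for illustration, and the port is the one from the log above, which may differ on each instance start.

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

For example, `tcp_port_open("str00210mob03.deis.unibo.it", 36369)` run from the other host while the instance is starting would return False if a firewall drops or rejects the connection.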
  • SystemAdmin
    1245 Posts

    Re: FAILED daemon startup: hc

    ‏2013-02-15T14:54:26Z  
    This appears to be a name resolution problem since nslookup failed. I'm asking my colleagues for help with the next step.

    Regards,
    John
  • DennyHatz
    102 Posts

    Re: FAILED daemon startup: hc

    ‏2013-02-15T15:37:59Z  
    I looked through your boot log and found the following:
    #########################################################################################
    WARNING WARNING WARNING !! ULIMIT CHECK on Host str00210mob03.deis.unibo.it
    ulimit max user processes (-u) setting of 1024 is LOW
    See InfoSphere Streams Information Center for ulimit recommendations
    #########################################################################################
    14 Feb 2013 17:36:52 .. Host str00210mob03.deis.unibo.it ulimit settings.
    core file size (blocks, -c) 0
    data seg size (kbytes, -d) unlimited
    scheduling priority (-e) 0
    file size (blocks, -f) unlimited
    pending signals (-i) 29191
    max locked memory (kbytes, -l) 64
    max memory size (kbytes, -m) unlimited
    open files (-n) 1024
    pipe size (512 bytes, -p) 8
    POSIX message queues (bytes, -q) 819200
    real-time priority (-r) 0
    stack size (kbytes, -s) 10240
    cpu time (seconds, -t) unlimited
    max user processes (-u) 1024
    virtual memory (kbytes, -v) unlimited
    file locks (-x) unlimited

    Try setting max user processes (-u) to the same value as pending signals (29191) on both hosts.
    For details, search the InfoSphere Streams Information Center and the Admin and Install guide for ulimit recommendations.
    From the Admin Guide:

    To change the max user processes value, edit the /etc/security/limits.d/90-nproc.conf
    file as shown in the following example:
    * soft nproc 29191
    You must restart your system for the changes to take effect.
    To verify updated settings, enter the following command:
    ulimit -a
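As a complement to `ulimit -a`, the limit can also be checked from a script on each host. The following is a minimal sketch, assuming Linux and the recommendation from this thread (max user processes at least equal to the pending signals value). The function names are made up for illustration and are not part of Streams.

```python
import resource


def nproc_limit_is_low(soft_limit, recommended):
    """Return True when the soft limit is finite and below the recommendation."""
    if soft_limit == resource.RLIM_INFINITY:
        return False  # "unlimited" can never be too low
    return soft_limit < recommended


def current_nproc_limits():
    """Return (soft, hard) max user processes for the current process."""
    return resource.getrlimit(resource.RLIMIT_NPROC)


if __name__ == "__main__":
    soft, hard = current_nproc_limits()
    # 29191 is the pending signals value reported in the boot log above.
    print("soft:", soft, "hard:", hard, "low:", nproc_limit_is_low(soft, 29191))
```

The script has to run as the instance owner in a fresh login session on each host, because limits are applied per user at login.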
  • SystemAdmin
    1245 Posts

    Re: FAILED daemon startup: hc

    ‏2013-02-15T16:02:06Z  
    I've done this.

    ulimit -a

    core file size (blocks, -c) 0
    data seg size (kbytes, -d) unlimited
    scheduling priority (-e) 0
    file size (blocks, -f) unlimited
    pending signals (-i) 29191
    max locked memory (kbytes, -l) 64
    max memory size (kbytes, -m) unlimited
    open files (-n) 4096
    pipe size (512 bytes, -p) 8
    POSIX message queues (bytes, -q) 819200
    real-time priority (-r) 0
    stack size (kbytes, -s) 10240
    cpu time (seconds, -t) unlimited
    max user processes (-u) 29191
    virtual memory (kbytes, -v) unlimited
    file locks (-x) unlimited
  • DennyHatz
    102 Posts

    Re: FAILED daemon startup: hc

    ‏2013-02-15T16:11:10Z  
    I assume you fixed the ulimit on both hosts, right?

    We were a little confused when you posted to a very old thread:
    Hello,

    I've resolved it using the iptables command, flushing all INPUT rules, like this:

    sudo iptables -F INPUT

    Thanks in advance...

    Has the problem been resolved?
  • SystemAdmin
    1245 Posts

    Re: FAILED daemon startup: hc

    ‏2013-02-15T16:40:13Z  
    I used iptables to solve my problem, because I suppose the packets were being rejected by
    the firewall rules...