9 replies Latest Post - ‏2013-02-15T16:40:13Z by SystemAdmin
SystemAdmin
1245 Posts

Pinned topic FAILED daemon startup: hc

‏2013-02-13T17:05:00Z |
Hi,

I'm trying to run an SPL application on 2 hosts (master: str00210mob02, slave: str00210mob03),
but the instance fails to start with the following log:

Thanks in advance,

Kind Regards.

CDISC0059I The system is starting the instance123@michele.mannino instance.
CDISC0078I The system is starting the runtime services on 2 hosts.
CDISC0056I The system is starting the distributed name service on the str00210mob02 host. The distributed name service has 1 partitions and 1 replications.
CDISC0057I The system is setting the NameServiceUrl property of the instance to DN:str00210mob02:38222, which is the URL of the distributed name service that is running.
CDISC0061I The system is starting in parallel the runtime services of 1 management hosts.
CDISC0060I The system is starting in parallel the runtime services of 1 application hosts.
str00210mob03: ###########################################
str00210mob03: Successfully started daemons:
str00210mob03: FAILED daemon startup: hc
str00210mob03: ###########################################
str00210mob03 failed
Error: CDISC5173E The system could not process the following number of hosts: 1. See the previous error messages.
CDISC0062I The system is cleaning up after an instance did not start.
CDISC0063I The system is stopping the runtime services of the instance123@michele.mannino instance.
CDISC0026I The system is stopping the runtime services of the instance immediately.
CDISC0027I The system is stopping the runtime services of the instance. Any failures to stop the services will be ignored.
CDISC0068I The system is stopping in parallel the runtime services of 2 hosts.
CDISC0054I The system is stopping in parallel the distributed name services of the following 1 hosts:
str00210mob02
CDISC0055I The system is resetting the NameServiceUrl property of the instance because the distributed name service is not running.
Error: CDISC5181E The instance did not start. The system shut down and cleaned up the instance services.
1
The instance123 instance cannot be started.
  • SystemAdmin
    1245 Posts

    Re: FAILED daemon startup: hc

    ‏2013-02-14T16:03:50Z  in response to SystemAdmin
    Hi Michele

    Please review the "Name resolution requirements" at: http://pic.dhe.ibm.com/infocenter/streams/v3r0/index.jsp?topic=%2Fcom.ibm.swg.im.infosphere.streams.install-admin.doc%2Fdoc%2Fibminfospherestreams-install-prerequisites-host-requirements.html

    In particular run:
    • streamtool checkhost -i instance123 -v 2
    • nslookup (using both the name and ip address for both hosts)

    If this does not identify the problem, attach the boot log, which you can collect with: streamtool getlog -i instance123

    Regards,
    John
    • SystemAdmin
      1245 Posts

      Re: FAILED daemon startup: hc

      ‏2013-02-14T16:44:46Z  in response to SystemAdmin
      streamtool checkhost -i streams -v 2

      Date: Thu Feb 14 17:41:39 CET 2013
      Host: str00210mob03
      Instance: streams@michele.mannino
      2 Hosts to check: str00210mob03,str00210mob02
      Reference host: str00210mob03

      =============================================================
      Phase 1 - per-host public key ssh connectivity test...
      =============================================================

      Checking host 1 of 2: str00210mob03... host OK
      Checking host 2 of 2: str00210mob02... host OK

      Phase 1 - public key ssh connectivity test summary:
      2 OK hosts.
      0 problem hosts:

      =============================================================
      Phase 2 - per-host dependency checking...
      =============================================================

      Checking host 1 of 2: str00210mob03... host OK
      Checking host 2 of 2: str00210mob02... host OK

      Phase 2 - per host dependency checking summary:
      2 OK hosts.
      0 problem hosts:
      0 problem categories:
      =============================================================
      Detailed host results
      Verbosity level: 2
      =============================================================
      <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
      Host: str00210mob03 Status: PASSED

      Hostname: str00210mob03.deis.unibo.it
      IP Address of Hostname: 137.204.57.33

      Check: Public Key SSH Status: PASSED
      Check: Install Owner Streams Runtime Data Access Status: PASSED
      Check: Compatible OS Architecture Set Status: PASSED
      Check: Compatible Streams Runtime Version Set Status: PASSED
      Check: Same Install Owner Set Status: PASSED
      Check: Recovery Mode Database Configuration Status: SKIPPED
      Details: install owner's DB configuration file not found

      Section: Dependency Checker Checks
      Check: Compatible OS Architecture Status: PASSED
      Check: Compatible SELinux Configuration Status: PASSED
      Check: Compatible Java Configuration Status: PASSED
      Check: Compatible Language Encoding Status: PASSED
      Check: Compatible Network Configuration Status: PASSED
      Check: Required Installed Packages Status: PASSED

      Section: Instance Specific Checks
      Check: Instance Owner Streams Runtime Data Access Status: PASSED
      Check: Instance Configuration Access Status: PASSED
      Check: Recovery Mode Database Access Status: SKIPPED
      Details: recovery mode is not enabled for instance 'streams@michele.mannino'
      Check: Same Instance Owner Set Status: PASSED
      Check: Instance LogPath Access Checking Status: PASSED
      >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
      <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
      Host: str00210mob02 Status: PASSED

      Hostname: str00210mob02.deis.unibo.it
      IP Address of Hostname: 137.204.57.32

      Check: Public Key SSH Status: PASSED
      Check: Install Owner Streams Runtime Data Access Status: PASSED
      Check: Compatible OS Architecture Set Status: PASSED
      Check: Compatible Streams Runtime Version Set Status: PASSED
      Check: Same Install Owner Set Status: PASSED
      Check: Recovery Mode Database Configuration Status: SKIPPED
      Details: install owner's DB configuration file not found

      Section: Dependency Checker Checks
      Check: Compatible OS Architecture Status: PASSED
      Check: Compatible SELinux Configuration Status: PASSED
      Check: Compatible Java Configuration Status: PASSED
      Check: Compatible Language Encoding Status: PASSED
      Check: Compatible Network Configuration Status: PASSED
      Check: Required Installed Packages Status: PASSED

      Section: Instance Specific Checks
      Check: Instance Owner Streams Runtime Data Access Status: PASSED
      Check: Instance Configuration Access Status: PASSED
      Check: Recovery Mode Database Access Status: SKIPPED
      Details: recovery mode is not enabled for instance 'streams@michele.mannino'
      Check: Same Instance Owner Set Status: PASSED
      Check: Instance LogPath Access Checking Status: PASSED
      >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
      =============================================================
      Overall Summary
      =============================================================

      2 hosts checked.
      2 OK hosts.
      0 problem hosts:

      nslookup str00210mob02.deis.unibo.it
      Server: 137.204.58.1
      Address: 137.204.58.1#53

      ** server can't find str00210mob02.deis.unibo.it: NXDOMAIN



      nslookup str00210mob03.deis.unibo.it
      Server: 137.204.58.1
      Address: 137.204.58.1#53

      ** server can't find str00210mob03.deis.unibo.it: NXDOMAIN
    • SystemAdmin
      1245 Posts

      Re: FAILED daemon startup: hc

      ‏2013-02-14T20:55:56Z  in response to SystemAdmin
      I've also attached the log collected with streamtool getlog -i <instance>

      // ON MASTER str00210mob03

      michele.mannino@str00210mob03 Desktop$ cat /etc/hosts
      127.0.0.1 localhost.localdomain localhost
      137.204.57.33 str00210mob03.deis.unibo.it str00210mob03
      137.204.57.32 str00210mob02.deis.unibo.it str00210mob02
      ::1 localhost6.localdomain6 localhost6

      // ON SLAVE str00210mob02

      michele.mannino@str00210mob02 ~$ cat /etc/hosts
      127.0.0.1 localhost.localdomain localhost
      137.204.57.33 str00210mob03.deis.unibo.it str00210mob03
      137.204.57.32 str00210mob02.deis.unibo.it str00210mob02
      ::1 localhost6.localdomain6 localhost6
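A note on reading the nslookup results together with these /etc/hosts entries: nslookup sends its query straight to the DNS server (137.204.58.1 here) and does not read /etc/hosts, so NXDOMAIN from nslookup does not by itself mean the two hosts cannot resolve each other. Most programs go through the system resolver instead, which usually consults /etc/hosts first (the order depends on /etc/nsswitch.conf). The following is a minimal sketch, not something taken from the Streams documentation, that uses getent to resolve a name the way the system resolver does and then checks that the returned address maps back to the same name. The function name check_host_resolution is made up for this example.

```shell
# Resolve NAME through the system resolver (NSS), then look the returned
# address up again and check that the result still lists NAME.
check_host_resolution() {
    name=$1
    # Forward lookup: keep the address from the first line getent prints.
    addr=$(getent hosts "$name" | awk 'NR == 1 { print $1 }')
    if [ -z "$addr" ]; then
        echo "$name: forward lookup failed"
        return 1
    fi
    # Reverse lookup: succeed only if NAME appears as an exact field.
    if getent hosts "$addr" |
        awk -v n="$name" '{ for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
                          END { exit !found }'; then
        echo "$name -> $addr -> $name: consistent"
    else
        echo "$name -> $addr: reverse lookup does not return $name"
        return 1
    fi
}

# Example, to be run on each host for both the short and the full names:
# check_host_resolution str00210mob02
# check_host_resolution str00210mob03.deis.unibo.it
```

If this reports consistent results on both hosts while nslookup still fails, the names resolve locally through /etc/hosts and only the DNS server lacks records for them.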
    • SystemAdmin
      1245 Posts

      Re: FAILED daemon startup: hc

      ‏2013-02-15T09:40:07Z  in response to SystemAdmin
      On the slave host (str00210mob02.deis.unibo.it), the log shows:

      15 Feb 2013 10:01:56.088 22367 INFO :::NAM MDN_NameService.cpp:getServerURI:470 - DN_NameService::getServerURI: corbaloc:iiop:str00210mob03.deis.unibo.it:36369/dname

      15 Feb 2013 10:01:56.089 22367 INFO :::Core.Corba MDistilleryCorbaSinglePOAConnector.h:connectByURI:108 - Failed to get a service from 'uri:corbaloc:iiop:str00210mob03.deis.unibo.it:36369/dnamecorbaloc:iiop:str00210mob03.deis.unibo.it:36369/dname
      '. Retrying...

      15 Feb 2013 10:01:56.894 22367 INFO :::Core.Corba MDistilleryCorbaSinglePOAConnector.h:connectByURI:108 -
      Failed to get a service from 'uri:corbaloc:iiop:str00210mob03.deis.unibo.it:36369/dnamecorbaloc:iiop:str00210mob03.deis.unibo.it:36369/dname

      I tried pinging the master (str00210mob03.deis.unibo.it):

      michele.mannino@str00210mob02 ~$ ping -p 36369 str00210mob03
      PATTERN: 0x363609
      PING str00210mob03.deis.unibo.it (137.204.57.33) 56(84) bytes of data.
      64 bytes from str00210mob03.deis.unibo.it (137.204.57.33): icmp_seq=1 ttl=64 time=0.551 ms
      64 bytes from str00210mob03.deis.unibo.it (137.204.57.33): icmp_seq=2 ttl=64 time=0.396 ms ...

      What's wrong?

      Please help me...
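One thing worth noting about the ping test above: the -p option of ping sets the fill pattern for the packet payload (hence the PATTERN line in the output), not a port. ICMP echo has no notion of ports, so a successful ping shows that the host is reachable but says nothing about whether TCP port 36369 accepts connections. Below is a rough sketch for testing the TCP connection itself, assuming bash and the coreutils timeout command are available. tcp_port_open is a hypothetical helper name, and the port should be taken from the current log, since it differs between the two logs in this thread.

```shell
# Return 0 if a TCP connection to HOST:PORT succeeds within 3 seconds.
# Relies on bash's /dev/tcp redirection, so it needs bash and timeout.
tcp_port_open() {
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

# Example, run on str00210mob02 while the instance is starting:
# if tcp_port_open str00210mob03.deis.unibo.it 36369; then
#     echo "port reachable"
# else
#     echo "port closed or filtered"
# fi
```

A host that answers ping but fails this check is either not listening on that port or has a firewall between the two machines dropping or rejecting the connection.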
  • SystemAdmin
    1245 Posts

    Re: FAILED daemon startup: hc

    ‏2013-02-15T14:54:26Z  in response to SystemAdmin
    This appears to be a name resolution problem since nslookup failed. I'm asking my colleagues for help with the next step.

    Regards,
    John
    • DennyHatz
      102 Posts

      Re: FAILED daemon startup: hc

      ‏2013-02-15T15:37:59Z  in response to SystemAdmin
      I looked through your boot log and found the following:
      #########################################################################################
      WARNING WARNING WARNING !! ULIMIT CHECK on Host str00210mob03.deis.unibo.it
      ulimit max user processes (-u) setting of 1024 is LOW
      See InfoSphere Streams Information Center for ulimit recommendations
      #########################################################################################
      14 Feb 2013 17:36:52 .. Host str00210mob03.deis.unibo.it ulimit settings.
      core file size (blocks, -c) 0
      data seg size (kbytes, -d) unlimited
      scheduling priority (-e) 0
      file size (blocks, -f) unlimited
      pending signals (-i) 29191
      max locked memory (kbytes, -l) 64
      max memory size (kbytes, -m) unlimited
      open files (-n) 1024
      pipe size (512 bytes, -p) 8
      POSIX message queues (bytes, -q) 819200
      real-time priority (-r) 0
      stack size (kbytes, -s) 10240
      cpu time (seconds, -t) unlimited
      max user processes (-u) 1024
      virtual memory (kbytes, -v) unlimited
      file locks (-x) unlimited

      Try setting the ulimit max user processes (-u) value to the same value as pending signals (29191) on both hosts.
      See the InfoSphere Streams Information Center and the Admin and Install guide, and search for ulimit recommendations.
      From the Admin Guide:

      To change the max user processes value, edit the /etc/security/limits.d/90-nproc.conf file as shown in the following example:
      * soft nproc 29191
      You must restart your system for the changes to take effect.
      To verify updated settings, enter the following command:
      ulimit -a
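Since the warning in the boot log concerns only the max user processes value, it can be quicker to check that single value than to read the full ulimit -a listing. The following is a small sketch under the assumption that the shell is bash (where ulimit -u prints the soft limit for user processes). nproc_limit_ok is a made-up helper name, and 29191 is simply the value suggested above.

```shell
# Succeed if LIMIT (as printed by `ulimit -u`) is "unlimited" or at least MIN.
nproc_limit_ok() {
    limit=$1
    min=$2
    [ "$limit" = "unlimited" ] || [ "$limit" -ge "$min" ]
}

# Example: compare the current shell's limit with the suggested value.
# if nproc_limit_ok "$(ulimit -u)" 29191; then
#     echo "max user processes is high enough"
# else
#     echo "max user processes is still too low"
# fi
```

The check has to pass on every host in the instance, so it would need to be run on both str00210mob02 and str00210mob03.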
      • SystemAdmin
        1245 Posts

        Re: FAILED daemon startup: hc

        ‏2013-02-15T16:02:06Z  in response to DennyHatz
        I've done this.

        ulimit -a

        core file size (blocks, -c) 0
        data seg size (kbytes, -d) unlimited
        scheduling priority (-e) 0
        file size (blocks, -f) unlimited
        pending signals (-i) 29191
        max locked memory (kbytes, -l) 64
        max memory size (kbytes, -m) unlimited
        open files (-n) 4096
        pipe size (512 bytes, -p) 8
        POSIX message queues (bytes, -q) 819200
        real-time priority (-r) 0
        stack size (kbytes, -s) 10240
        cpu time (seconds, -t) unlimited
        max user processes (-u) 29191
        virtual memory (kbytes, -v) unlimited
        file locks (-x) unlimited
        • DennyHatz
          102 Posts

          Re: FAILED daemon startup: hc

          ‏2013-02-15T16:11:10Z  in response to SystemAdmin
          I assume you fixed the ulimit on both hosts, right?

          We were a little confused when you posted to a very old thread:
          Hello,

          I've resolved it by using the iptables command and flushing all rules, like this:

          sudo iptables -F INPUT

          Thanks in advance...

          Has the problem been resolved?
          • SystemAdmin
            1245 Posts

            Re: FAILED daemon startup: hc

            ‏2013-02-15T16:40:13Z  in response to DennyHatz
            I used iptables to solve my problem, because I suppose the packets were being rejected by
            its rules...
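For readers who land on this thread with the same symptom: iptables -F INPUT deletes every rule in the INPUT chain, which fixes the connection but also removes any other filtering the host had. A narrower alternative, offered here only as a sketch and not as a documented Streams recommendation, is to accept traffic from the other hosts of the instance and leave the remaining rules in place. The helper below only prints the commands so they can be reviewed before being run as root. print_peer_accept_rules is a hypothetical name, and the address comes from the /etc/hosts listing earlier in the thread.

```shell
# Print (not run) one iptables command per peer address. Each command would
# insert an ACCEPT rule at the top of the INPUT chain for that source address.
print_peer_accept_rules() {
    for peer in "$@"; do
        echo "iptables -I INPUT -s $peer -j ACCEPT"
    done
}

# On str00210mob03, the peer is str00210mob02 (137.204.57.32):
print_peer_accept_rules 137.204.57.32
```

Rules added this way, like the flush itself, live only in memory and are lost on reboot unless they are saved with the distribution's own mechanism.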