Topic
  • 13 replies
  • Latest Post - ‏2013-09-06T21:36:27Z by Jason T
sgsiebers
13 Posts

Pinned topic Unable to start instance against ZooKeeper

‏2013-08-16T21:02:05Z |

I'm having trouble getting my instance to run against ZooKeeper.  I've set things up and validated the ZooKeeper connection using the provided utilities, but when I start the instance, the SWS fails to start fully because of an error writing data to ZooKeeper:

CDISC5316E The name service that uses ZooKeeper failed because of the following error: Unable to create directory /ZK@streams Exception: com.ibm.distillery.utils.exc.ZooKeeperException: Unable to create directory/com.ibm.streams/instances/ZK@streams -- CAUSE EXCEPTION org.apache.zookeeper.KeeperException$NoAuthException: KeeperErrorCode = NoAuth for /com.ibm.streams/instances/ZK@streams -- END CAUSE EXCEPTION ,  location=,  backtrace=,  exceptionCode= DistilleryExceptionCode object - Message id = NoMessageId,  number of substitution text=0.
 

When I look at that node in zookeeper-client I see there is an ACL set on the ZNode (which I didn't set):

[zk: localhost:2181(CONNECTED) 1] getAcl /com.ibm.streams
'digest,'DEV@streams:xkjM+cmZrMHn/lpRMQC+YyCJEPM=
: cdrwa

 

Simply listing the contents of that znode crashes the zookeeper-client:

[zk: localhost:2181(CONNECTED) 4] ls /com.ibm.streams
Exception in thread "main" org.apache.zookeeper.KeeperException$NoAuthException: KeeperErrorCode = NoAuth for /com.ibm.streams
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:113)
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
        at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1448)
        at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1476)
        at org.apache.zookeeper.ZooKeeperMain.processZKCmd(ZooKeeperMain.java:717)
        at org.apache.zookeeper.ZooKeeperMain.processCmd(ZooKeeperMain.java:593)
        at org.apache.zookeeper.ZooKeeperMain.executeLine(ZooKeeperMain.java:365)
        at org.apache.zookeeper.ZooKeeperMain.run(ZooKeeperMain.java:323)
        at org.apache.zookeeper.ZooKeeperMain.main(ZooKeeperMain.java:282)

 

If it helps, my setup uses a 3-node ZooKeeper cluster that is separate from the node where my Streams management services are running.

Any ideas what's going on here or how to resolve?  Thanks!
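
The `getAcl` output above shows a `digest` ACL, which ZooKeeper only honors for sessions that have authenticated with matching credentials; that is consistent with the `NoAuth` error. A minimal sketch, assuming the two-line output format shown in this thread, of turning `getAcl` output into structured entries so a digest-protected znode can be told apart from a world-readable one. The function names are hypothetical helpers, not part of ZooKeeper or Streams.

```python
import re
from typing import Dict, List


def parse_getacl(output: str) -> List[Dict[str, str]]:
    """Parse zookeeper-client `getAcl` output into scheme/id/perms entries."""
    entries: List[Dict[str, str]] = []
    current = None
    for raw in output.splitlines():
        line = raw.strip()
        # Identity line, e.g. 'digest,'DEV@streams:<hash>  or  'world,'anyone
        match = re.match(r"^'([^,]+),'(.*)$", line)
        if match:
            current = {"scheme": match.group(1), "id": match.group(2), "perms": ""}
            entries.append(current)
        # Permission line, e.g. ": cdrwa", belongs to the preceding identity
        elif line.startswith(":") and current is not None:
            current["perms"] = line[1:].strip()
    return entries


def open_to_everyone(entries: List[Dict[str, str]]) -> bool:
    """True if any entry grants access to the world:anyone identity."""
    return any(e["scheme"] == "world" and e["id"] == "anyone" for e in entries)
```

Prompt lines such as `[zk: localhost:2181(CONNECTED) 1]` are ignored because they match neither pattern.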

 

  • jmasarik
    6 Posts

    Re: Unable to start instance against ZooKeeper

    ‏2013-08-19T13:38:38Z  

    Hi Scott,

    I will be looking into this problem for you.

    Regards,

    John

  • Jason T
    31 Posts

    Re: Unable to start instance against ZooKeeper

    ‏2013-08-21T23:06:13Z  

    Hello Scott,

    Could you please turn on tracing (streamtool setproperty InfrastructureTraceLevel=trace), start the instance, and post the logs?  It would also help if you could walk through the sequence of events leading up to this failure.  Please be as specific as possible.

    Thank You,
    Jason

  • sgsiebers
    13 Posts

    Re: Unable to start instance against ZooKeeper

    ‏2013-08-22T17:13:43Z  

    Hi Jason...thanks for the help.  Please let me know if there's any other info that I can provide.

    ~Scott

    Setup:

    1. 3-node ZooKeeper cluster, with an empty filesystem (except for one empty znode at /zookeeper/quota).
    2. Created a Streams instance named "ZK", configured it for the ZooKeeper cluster, and validated the connection test successfully.

    Steps Taken:

    1. "streamtool setproperty InfrastructureTraceLevel=trace -i ZK" //Succeeded
    2. Start the instance and capture output using: "streamtool startinstance -i ZK | tee zk_startup.log" and see that startup fails...see attached output log from the startinstance command
    3. See attached logs captured using: "streamtool getlogs -i ZK"
    4. Examine ZK filesystem using zookeeper-client
    5. The following EMPTY znode path now exists in my zookeeper cluster: /com.ibm.streams/instances
    6. Using getAcl, I see that the /com.ibm.streams/instances znode (and all parents) have the following permissions: 'world,'anyone: cdrwa
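
The streamtool steps above can be sketched as a small script that assembles the same three commands for a given instance. This is only an illustration: the subcommands and the `-i` option are the ones quoted in this thread, the property spelling `InfrastructureTraceLevel` is an assumption (the thread spells it two different ways), and `diagnostic_commands` is a hypothetical helper name.

```python
from typing import List


def diagnostic_commands(instance: str, trace_level: str = "trace") -> List[List[str]]:
    """Build the streamtool command lines used in the steps above."""
    return [
        # Step 1: raise the infrastructure trace level (property name assumed)
        ["streamtool", "setproperty", f"InfrastructureTraceLevel={trace_level}", "-i", instance],
        # Step 2: start the instance (the thread pipes this through tee)
        ["streamtool", "startinstance", "-i", instance],
        # Step 3: collect the logs for posting
        ["streamtool", "getlogs", "-i", instance],
    ]


if __name__ == "__main__":
    # Print the commands rather than running them, so they can be reviewed first.
    for command in diagnostic_commands("ZK"):
        print(" ".join(command))
```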


  • sgsiebers
    13 Posts

    Re: Unable to start instance against ZooKeeper

    ‏2013-08-22T17:19:21Z  

    To add some context: I first tried this with a Streams instance called "DEV" that was also configured for job recovery, and that's what exhibited the first failure I mentioned above.  I wondered whether recovery somehow had an impact, so I shut down that instance, cleared the ZooKeeper filesystem, and started fresh with the "ZK" Streams instance used above, which now seems to exhibit a slightly different problem.  Originally the /com.ibm.streams znode had a digest-based ACL, but now when it gets recreated the ACL permissions of /com.ibm.streams/instances seem open, yet I still get a NoAuth error.

    Thanks again for the help!

    ~Scott

  • Jason T
    31 Posts

    Re: Unable to start instance against ZooKeeper

    ‏2013-08-23T16:41:25Z  

    Hello Scott,

    We looked through the trace files and found the following warning in the ZK@streams.boot.log file.

    #########################################################################################
    #####  WARNING WARNING WARNING !! ULIMIT CHECK on Host cepdevmgmt1.northamerica.cerner.net
    #####  ulimit max user processes (-u) setting of 1024 is LOW
    #####  See InfoSphere Streams Information Center for ulimit recommendations
    #########################################################################################

    Development has found that ulimit failures can cause unpredictable behavior, so to be sure, you should increase this value.  Please refer to the Streams Information Center for the recommended ulimit settings, apply them as they relate to your system, and run your test case again.  The link to the Information Center discussing ulimit can be found here.

    Thank You,
    Jason
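
A minimal sketch, assuming a Linux host, of checking the "max user processes" limit (`ulimit -u`) that the warning above refers to. The thread only states that 1024 is low; the 4096 threshold below is a hypothetical placeholder, not the value recommended by the Information Center.

```python
import resource

# Hypothetical threshold for illustration only; use the documented recommendation.
ASSUMED_MIN_NPROC = 4096


def nproc_is_low(soft_limit: int, minimum: int = ASSUMED_MIN_NPROC) -> bool:
    """Return True if the soft max-user-processes limit is below the minimum."""
    if soft_limit == resource.RLIM_INFINITY:
        return False  # "unlimited" can never be too low
    return soft_limit < minimum


if __name__ == "__main__":
    # RLIMIT_NPROC is the limit that `ulimit -u` reports for the current user.
    soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)
    print(f"max user processes: soft={soft}, hard={hard}, low={nproc_is_low(soft)}")
```

Running it as the user that owns the instance, on each host in the cluster, shows whether the limit that applies to the Streams processes was actually raised.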

  • sgsiebers
    13 Posts

    Re: Unable to start instance against ZooKeeper

    ‏2013-08-23T18:34:30Z  

    Jason,


    Thanks for looking into this.  I raised the max user processes setting on all nodes in the Streams cluster and still get the NoAuth error (the ulimit warning is no longer in the logs).  Please see the newly attached error logs.

    ~Scott


  • DennyHatz
    102 Posts

    Re: Unable to start instance against ZooKeeper

    ‏2013-08-23T18:59:07Z  

    Scott

    I am working on this problem with Jason.  Can you check the ACLs on /com.ibm.streams and its child znodes again, now that you have changed the ulimit?

    Thank you

    Denny

  • sgsiebers
    13 Posts

    Re: Unable to start instance against ZooKeeper

    ‏2013-08-23T20:35:15Z  

    Still wide open:

    [zk: localhost:2181(CONNECTED) 0] getAcl /com.ibm.streams
    'world,'anyone
    : cdrwa
    [zk: localhost:2181(CONNECTED) 1] getAcl /com.ibm.streams/instances
    'world,'anyone
    : cdrwa
    [zk: localhost:2181(CONNECTED) 2]
     

  • DennyHatz
    102 Posts

    Re: Unable to start instance against ZooKeeper

    ‏2013-08-26T20:50:33Z  

    We are trying to recreate your failure but have not been able to do so yet.

    One additional question: What version of ZooKeeper are you using?  Are you using the one we ship with Streams?

  • sgsiebers
    13 Posts

    Re: Unable to start instance against ZooKeeper

    ‏2013-08-26T22:00:04Z  

    I wasn't aware that Streams shipped with a specific ZooKeeper version.  We are trying to run against ZooKeeper 3.4.3 (provided by Cloudera CDH 4.1.1), which is the standard version we use across our enterprise for our other Big Data infrastructure.

    When I was testing the Streams 3.1 beta, I did have some success using ZooKeeper 3.4.5 from CDH 4.2, but there were a lot of other differences in that setup, so it's hard to say whether the ZooKeeper version difference is what's breaking things.  Thanks for looking into this; please let me know what other information I can provide.
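
One way to confirm which ZooKeeper version each server in the ensemble is actually running is the `srvr` four-letter command, which ZooKeeper answers on its client port. The sketch below assumes the reply contains a line beginning with `Zookeeper version:`; the helper names are hypothetical, and the sample version string in the usage note is illustrative rather than captured from this cluster.

```python
import re
import socket
from typing import Optional


def parse_zk_version(srvr_output: str) -> Optional[str]:
    """Extract the dotted version number from a `srvr` reply, if present."""
    match = re.search(r"Zookeeper version:\s*([0-9]+(?:\.[0-9]+)+)", srvr_output)
    return match.group(1) if match else None


def query_zk_version(host: str, port: int = 2181, timeout: float = 5.0) -> Optional[str]:
    """Send `srvr` to a ZooKeeper server and return its reported version."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"srvr")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection after replying
                break
            chunks.append(data)
    return parse_zk_version(b"".join(chunks).decode("utf-8", "replace"))
```

For example, `parse_zk_version("Zookeeper version: 3.4.3-cdh4.1.1--1, built on ...")` yields `"3.4.3"`; calling `query_zk_version` against each of the three servers would show whether they all agree.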

  • Jason T
    31 Posts

    Re: Unable to start instance against ZooKeeper

    ‏2013-08-28T21:46:35Z  

    Hi Scott,

    Do you have the ability to log PMRs?  If so, please log a PMR to pursue this, because the information we are going to request will expose security details of your installation.  If not, please post an email address so I can contact you.

    Thank You,

    Jason

  • sgsiebers
    13 Posts

    Re: Unable to start instance against ZooKeeper

    ‏2013-08-28T22:17:35Z  

    Jason,


    Thanks for continuing to look into this.  Unfortunately our IBM rep hasn't set me up to log PMRs yet.  I'll make sure they get that set up soon, but in the meantime you can reach me at scott (dot) siebers (at) cerner (dot) com.  Thanks!

    ~Scott

  • Jason T
    31 Posts

    Re: Unable to start instance against ZooKeeper

    ‏2013-09-06T21:36:27Z  

    Hello,

    Through further investigation, we found that the process that installed ZooKeeper had assigned the same node ID to two of the nodes.  To correct this, the ZooKeeper data directories were removed and the cluster was recreated with the proper node IDs.  This resolved the issue.

    Jason
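
Each ZooKeeper server reads its node ID from a `myid` file in its data directory, so the misconfiguration described above can be detected by comparing those files across the ensemble. The thread does not say how the duplicate was found; the following is a minimal sketch, assuming the `myid` contents have already been collected per host. The host names in the usage note and the helper name are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def find_duplicate_ids(myid_by_host: Dict[str, str]) -> Dict[int, List[str]]:
    """Map each node ID used by more than one host to the hosts sharing it."""
    hosts_by_id = defaultdict(list)
    for host, content in myid_by_host.items():
        # A myid file holds a single integer, usually followed by a newline
        hosts_by_id[int(content.strip())].append(host)
    return {
        node_id: sorted(hosts)
        for node_id, hosts in hosts_by_id.items()
        if len(hosts) > 1
    }
```

With three hypothetical hosts where `zk2` and `zk3` both contain `2`, the function returns `{2: ["zk2", "zk3"]}`; an empty result means every server has a unique ID.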