  • Latest Post - ‏2013-09-24T09:10:58Z by r2d
r2d
19 Posts

HDFSFileSink operator. (Streams to BI data flow)

2013-09-21T16:43:09Z

I am trying to send data from Streams to BI using the HDFSFileSink operator. Streams and BI are installed on two different servers. I followed the steps given in the IBM forum:

- All BI services are running on the BI server.

- Copied the hadoop-conf and IHC folders from the BI server to the Streams server and set HADOOP_HOME to the IHC folder on the Streams server.

- Ran the operator.

- I am able to ping the servers.
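Since the configuration is copied from the BI server, it may help to confirm which NameNode address the copied hadoop-conf actually points at. The following is only a minimal sketch: it assumes the address is stored in core-site.xml under fs.default.name (or fs.defaultFS on newer Hadoop releases), and the sample XML is illustrative, built from the host name and port that appear in the logs, not the real file.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

def namenode_endpoint(core_site_xml):
    """Return (host, port) of the default file system from a core-site.xml
    string, or None if neither property name is present."""
    root = ET.fromstring(core_site_xml)
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        # Assumption: Hadoop 1.x uses fs.default.name, later releases fs.defaultFS.
        if name in ("fs.default.name", "fs.defaultFS"):
            uri = urlparse((prop.findtext("value") or "").strip())
            return uri.hostname, uri.port
    return None

# Illustrative content only; the real value in hadoop-conf may differ.
sample = """<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://devbigbin:9000</value>
  </property>
</configuration>"""

print(namenode_endpoint(sample))
```

If the host or port printed for the real file does not match what the NameNode is listening on, the operator would be connecting to the wrong endpoint.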

 

I am getting the following errors in the log files:

21 Sep 2013 23:35:58.026 [20049] ERROR #splapptrc,J[15],P[47],samplesink,spl_operator M[PEImpl.cpp:instantiateOperators:463]  - CDISR5030E: An exception occurred during the execution of the samplesink operator. The exception is: Could not connect to HDFS
21 Sep 2013 23:35:58.027 [20049] ERROR #splapptrc,J[15],P[47],samplesink,spl_pe M[PEImpl.cpp:process:675]  - CDISR5079E: An exception occurred during the processing of the processing element. The error is: Could not connect to HDFS.
21 Sep 2013 23:35:58.027 [20049] ERROR #splapptrc,J[15],P[47],samplesink,spl_operator M[PEImpl.cpp:process:696]  - CDISR5053E: Runtime failures occurred in the following operators: samplesink.
21 Sep 2013 23:35:58.180 [20008] ERROR spl_metric M[PEMetricsImpl.cpp:dumpMetricAtExit:107]  - The Number of tuples processed (port 0) metric of the processing element has a value of 0 at exit.
21 Sep 2013 23:35:58.180 [20008] ERROR spl_metric M[PEMetricsImpl.cpp:dumpMetricAtExit:107]  - The Number of bytes processed (port 0) metric of the processing element has a value of 0 at exit.
21 Sep 2013 23:35:58.181 [20008] ERROR spl_metric M[PEMetricsImpl.cpp:dumpMetricAtExit:107]  - The Number of window punctuations processed (port 0) metric of the processing element has a value of 0 at exit.
21 Sep 2013 23:35:58.181 [20008] ERROR spl_metric M[PEMetricsImpl.cpp:dumpMetricAtExit:107]  - The Number of final punctuations processed (port 0) metric of the processing element has a value of 0 at exit.
 
21 Sep 2013 23:35:47.467 [20048] ERROR :::NAM.LookupEntry M[DN_NameService.cpp:lookupObject:690]  - got NameService::not_found
21 Sep 2013 23:35:52.674 [20048] ERROR :::Core.Transport.Tcp M[TCPConnection.cpp:connectToServerUnlocked:429]  - Connection attempt failed for '46.46' retrying (1)
21 Sep 2013 23:35:52.679 [20048] ERROR :::Core.Transport.Tcp M[TCPConnection.cpp:connectToServerUnlocked:334]  - Connection successfully established for '46.46' (1)
21 Sep 2013 23:35:52.680 [20048] ERROR :::NAM.LookupEntry M[DN_NameService.cpp:lookupObject:690]  - got NameService::not_found
21 Sep 2013 23:35:57.888 [20048] ERROR :::Core.Transport.Tcp M[TCPConnection.cpp:connectToServerUnlocked:429]  - Connection attempt failed for '47.47' retrying (1)
21 Sep 2013 23:35:57.888 [20048] ERROR :::Core.Transport.Tcp M[TCPConnection.cpp:connectToServerUnlocked:429]  - Connection attempt failed for '47.47' retrying (1)
21 Sep 2013 23:35:57.889 [20048] ERROR :::NAM.LookupEntry M[DN_NameService.cpp:lookupObject:690]  - got NameService::not_found
 
 
21 Sep 2013 23:35:58.029 [20049] ERROR :::PEC.StartPE M[PECServer.cpp:runPE:313]  P[47] - PE 47 uncaught Distillery Exception: 'virtual void SPL::PEImpl::process()' [./src/SPL/Runtime/ProcessingElement/PEImpl.cpp:697] with msg: Runtime failures occurred in the following operators: samplesink.
[1379781358024228] [bi@streams] [192.168.1.173] [#splapplog,J[15],P[47],samplesink,HdfsCommon] Could not access HDFS file system on host 192.168.1.171 on  port 9,000.
[1379781358026502] [bi@streams] [192.168.1.173] [#splapplog,J[15],P[47],samplesink,spl_pe] CDISR5033E: An exception occurred during the execution of the samplesink operator. Processing element number 47 is terminating.
[1379781358029689] [bi@streams] [192.168.1.173] [:::PEC.StartPE] CDISR4308E The processing element ID 47 was shut down because of an unexpected error. The error is: Runtime failures occurred in the following operators: samplesink..
[1379781358090700] [bi@streams] [192.168.1.173] [:::PEC.MonitorPeHealth,J[15],P[47]] CDISR4000E The processing element ID 47 of the BINamespace::BIintegration job with job ID 15 was submitted by the streams user, and then was shut down unexpectedly. The error is: Distillery Exception: 'virtual void SPL::PEImpl::process()' [./src/SPL/Runtime/ProcessingElement/PEImpl.cpp:697] with msg: Runtime failures occurred in the following operators: samplesink..
13/09/21 23:35:49 INFO ipc.Client: Retrying connect to server: devbigbin/192.168.1.171:9000. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
13/09/21 23:35:50 INFO ipc.Client: Retrying connect to server: devbigbin/192.168.1.171:9000. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
13/09/21 23:35:51 INFO ipc.Client: Retrying connect to server: devbigbin/192.168.1.171:9000. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
13/09/21 23:35:52 INFO ipc.Client: Retrying connect to server: devbigbin/192.168.1.171:9000. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
13/09/21 23:35:53 INFO ipc.Client: Retrying connect to server: devbigbin/192.168.1.171:9000. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
13/09/21 23:35:54 INFO ipc.Client: Retrying connect to server: devbigbin/192.168.1.171:9000. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
13/09/21 23:35:55 INFO ipc.Client: Retrying connect to server: devbigbin/192.168.1.171:9000. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
13/09/21 23:35:56 INFO ipc.Client: Retrying connect to server: devbigbin/192.168.1.171:9000. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
13/09/21 23:35:57 INFO ipc.Client: Retrying connect to server: devbigbin/192.168.1.171:9000. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
13/09/21 23:35:58 INFO ipc.Client: Retrying connect to server: devbigbin/192.168.1.171:9000. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
13/09/21 23:35:58 ERROR security.UserGroupInformation: PriviledgedActionException as:biadmin cause:java.net.ConnectException: Call to devbigbin/192.168.1.171:9000 failed on connection exception: java.net.ConnectException: Connection refused
Exception in thread "main" java.net.ConnectException: Call to devbigbin/192.168.1.171:9000 failed on connection exception: java.net.ConnectException: Connection refused
at org.apache.hadoop.ipc.Client.wrapException(Client.java:1136)
at org.apache.hadoop.ipc.Client.call(Client.java:1112)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)
at com.sun.proxy.$Proxy1.getProtocolVersion(Unknown Source)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:411)
at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:135)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:276)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:241)
at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:100)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1414)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1432)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
at org.apache.hadoop.fs.FileSystem$1.run(FileSystem.java:117)
at org.apache.hadoop.fs.FileSystem$1.run(FileSystem.java:115)
at java.security.AccessController.doPrivileged(AccessController.java:310)
at javax.security.auth.Subject.doAs(Subject.java:573)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:115)
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:614)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:511)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:481)
at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:453)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:579)
at org.apache.hadoop.ipc.Client$Connection.access$2100(Client.java:202)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1243)
at org.apache.hadoop.ipc.Client.call(Client.java:1087)
... 17 more
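
The retry lines above all name the same target, so a quick way to see which endpoint the Hadoop client is really dialing is to pull it out of the log. This is a rough sketch, not part of Streams or Hadoop: the regular expression is written against the ipc.Client line format shown above, and the function name is made up for illustration.

```python
import re

# Matches the "name/ip:port" target and the attempt counter in
# "Retrying connect to server" lines like the ones in the log above.
RETRY_RE = re.compile(
    r"Retrying connect to server: (\S+?)/(\d+\.\d+\.\d+\.\d+):(\d+)\. "
    r"Already tried (\d+) time"
)

def retry_targets(log_text):
    """Map (hostname, ip, port) -> number of connection attempts logged."""
    targets = {}
    for m in RETRY_RE.finditer(log_text):
        key = (m.group(1), m.group(2), int(m.group(3)))
        # "Already tried N" is zero-based, so N + 1 attempts have been made.
        targets[key] = max(targets.get(key, 0), int(m.group(4)) + 1)
    return targets

log = (
    "13/09/21 23:35:49 INFO ipc.Client: Retrying connect to server: "
    "devbigbin/192.168.1.171:9000. Already tried 0 time(s); retry policy is X\n"
    "13/09/21 23:35:50 INFO ipc.Client: Retrying connect to server: "
    "devbigbin/192.168.1.171:9000. Already tried 1 time(s); retry policy is X\n"
)
print(retry_targets(log))
```

A single target with every attempt refused suggests nothing is accepting connections at that host and port, rather than an intermittent network problem.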
 
  • Stan
    76 Posts

    Re: HDFSFileSink operator. (Streams to BI data flow)

    ‏2013-09-23T16:02:03Z  

    Call to devbigbin/192.168.1.171:9000 failed on connection exception: java.net.ConnectException: Connection refused

    Exception in thread "main" java.net.ConnectException: Call to devbigbin/192.168.1.171:9000 failed on connection exception: java.net.ConnectException: Connection refused

    Verify that the host and port are correct for your HDFS file system. If they are, it looks like the HDFS port may be blocked. See if you can list the Hadoop file system from your Streams host:

    hadoop fs -ls /
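
A successful ping only shows that the host is reachable, not that anything is accepting connections on the HDFS port. As a complement to the command above, here is a minimal sketch of a plain TCP check that can be run from the Streams host. The function name and timeout are arbitrary choices, and the address in the comment is simply the one from the error message.

```python
import socket

def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False

# Example call, using the address from the error message:
#   port_open("192.168.1.171", 9000)
```

If this returns False while ping works, either the port is blocked or the NameNode is not listening on that host and port.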

     

     

  • r2d
    19 Posts

    Re: HDFSFileSink operator. (Streams to BI data flow)

    ‏2013-09-24T02:55:11Z  

    I verified hadoop fs -ls / and it lists the files correctly.

    But when I run the ./start.sh hadoop command under /opt/ibm/biginsights/bin, I get this error:

    [INFO] @25.74.42.235 - 2013-09-24 09:50:38,910 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: org.apache.hadoop.ipc.RemoteException:org.apache.hadoop.hdfs.server.protocol.DisallowedDatanodeException:org.apache.hadoop.hdfs.server.protocol.DisallowedDatanodeException: Datanode denied communication with namenode: 192.168.1.171:50010

    [INFO] @25.74.42.235 - SHUTDOWN_MSG: Shutting down DataNode at devbigbin/192.168.1.171
    [INFO] @25.74.42.235 - ************************************************************/
    [INFO] @25.74.42.235 - [ERROR] datanode failed to start
     

     

    I checked the include file specified by the dfs.hosts property, and it seems to be correct.
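
A DisallowedDatanodeException generally means the NameNode did not find the DataNode in the dfs.hosts include list (or found it in the exclude list). The sketch below only checks whether a given name or IP appears as an entry in the include file. It assumes whitespace-separated entries and does a plain string comparison, whereas the NameNode matches on the address it resolves for the DataNode, so treat a match here as a hint rather than proof. The file contents shown are invented for illustration.

```python
def in_include_file(include_text, *candidates):
    """Return True if any candidate host name or IP is listed as an entry.

    Assumption: entries are separated by whitespace or newlines.
    """
    entries = set()
    for line in include_text.splitlines():
        entries.update(line.split())
    return any(c in entries for c in candidates)

# Hypothetical include file contents, not taken from the real cluster.
include_text = "devbigbin\n192.168.1.172\n"

print(in_include_file(include_text, "devbigbin", "192.168.1.171"))
print(in_include_file(include_text, "192.168.1.171"))
```

Checking both the host name and the IP is useful because the file may list one form while the DataNode registers with the other.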

  • r2d
    19 Posts

    Re: HDFSFileSink operator. (Streams to BI data flow)

    ‏2013-09-24T09:10:58Z  

    Working fine now. The problem was with the host and port. Thanks.