Topic
18 replies Latest Post - ‏2013-05-15T20:50:46Z by Stan
Frank_Blau
28 Posts

Pinned topic Error Putting Data on BigInsights

‏2013-05-07T06:37:56Z |

I have 2 vm's, one with BigInsights, one with Streams.

They can see and ping each other. Streams is running, Hadoop is running.

HADOOP_HOME (on both machines, I don't know if this is right) is set to /opt/ibm/biginsights/IHC

A very simple composite:

use com.ibm.streams.bigdata.hdfs::HDFSFileSink ;

composite Main
{
    graph
        (stream<rstring someData> Beacon_1_out0) as Beacon_1 = Beacon()
        {
            param
                iterations : 10u ;
            output
                Beacon_1_out0 : someData = "here is some data" ;
        }

        () as HDFSFileSink_1 = HDFSFileSink(Beacon_1_out0)
        {
            param
                NamenodeHost : "192.168.116.128" ;
                NamenodePort : 9000u ;
                file : "foo%FILENUM.txt" ;
                format : txt ;
                HDFSUser : "biadmin" ;
        }
}
-----------------------

It won't build... I get the following errors:

Description Resource Path Location Type
 'hdfsFS' does not name a type HadoopTest Line 173, column 1 SPL Build
 variable or field 'flushToDisk' declared void HadoopTest Line 35, column 1 SPL Build
 'int SPL::_Operator::HDFSFileSink_1$OP::BufferWrapper::flushToDisk' is not a static member of 'class SPL::_Operator::HDFSFileSink_1$OP::BufferWrapper' HadoopTest Line 35, column 1 SPL Build
 'hdfsFS' was not declared in this scope HadoopTest Line 35, column 1 SPL Build
 expected primary-expression before 'const' HadoopTest Line 35, column 1 SPL Build
 initializer expression list treated as compound expression HadoopTest Line 35, column 1 SPL Build
 expected ',' or ';' before '{' token HadoopTest Line 36, column 1 SPL Build
 hdfs.h: No such file or directory HadoopTest Line 4, column 18 SPL Build
 'hdfsFS' has not been declared HadoopTest Line 104, column 1 SPL Build
 'hdfsFS' does not name a type HadoopTest Line 144, column 1 SPL Build
 
  • Stan
    76 Posts

    Re: Error Putting Data on BigInsights

    ‏2013-05-08T15:52:38Z  in response to Frank_Blau

    I set up your simple composite in Streams Studio as an SPL project and, after adjusting the NamenodeHost and HDFSUser, was able to successfully build and launch it.  Can you supply additional information on your build environment?  There appears to be more to this than the simple composite.

    I note that the build errors refer to HadoopTest.  Please check your build environment and remove references to HadoopTest or, if HadoopTest is a requirement, please describe how it relates to the build.

    Also,  if  'ls $HADOOP_HOME/lib' lists the hadoop jars (hadoop*.jar) then the variable is set correctly.

    • Frank_Blau
      28 Posts

      Re: Error Putting Data on BigInsights

      ‏2013-05-08T16:29:11Z  in response to Stan

      thanks for your help!

      What information do you want about the build environment?

      HadoopTest is the name of the project that this is built under.

      [Edited] If I run  ls $HADOOP_HOME on that (on the BigInsights server), I see:

      -rw-r--r--  1 biadmin biadmin    6828 Nov  2  2012 hadoop-ant-1.0.0.jar
      -rw-r--r--  1 biadmin biadmin 3765874 Nov  2  2012 hadoop-core-1.0.0.jar
      lrwxrwxrwx  1 biadmin biadmin      21 Nov  2  2012 hadoop-core.jar -> hadoop-core-1.0.0.jar
      lrwxrwxrwx  1 biadmin biadmin      25 Nov  2  2012 hadoop-example.jar -> hadoop-examples-1.0.0.jar
      -rw-r--r--  1 biadmin biadmin  142457 Nov  2  2012 hadoop-examples-1.0.0.jar
      -rw-r--r--  1 biadmin biadmin 2569445 Nov  2  2012 hadoop-test-1.0.0.jar
      -rw-r--r--  1 biadmin biadmin  287761 Nov  2  2012 hadoop-tools-1.0.0.jar

       

      When I run ls $HADOOP_HOME/lib I see:

      hadoop-fairscheduler-1.0.0.jar  

      Should there be more?

      • Stan
        76 Posts

        Re: Error Putting Data on BigInsights

        ‏2013-05-08T19:01:34Z  in response to Frank_Blau

        Embarrassing (to me): I erred in my previous post - you are correct that the jars are in HADOOP_HOME, not in HADOOP_HOME/lib.   This doesn't seem to be the problem, however.
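
        A quick way to validate the variable, then, is to list the core jar directly (a diagnostic sketch; the file name is the one from your listing above):

            ls $HADOOP_HOME/hadoop-core*.jar    # on BigInsights this should show hadoop-core-1.0.0.jar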

        It seemed odd that your project name is identified in the error messages.  I thought you might be building from the command line, possibly using a makefile.  I now see you are using Streams Studio, but I think there is a rather long source file somewhere in your project (note that line 173 is referenced).   As indicated below, I forced some build errors and see that the messages you posted are probably from the Problems tab (I generally view the Console window, so I didn't pick up on the format you presented). 

        In the following error you posted, I am not seeing your SPL file listed under the Resource column - that information seems to be missing - the text appears to contain just the column headers:

         Description, Path, Location, Type

        'hdfsFS' does not name a type |  HadoopTest  |  Line 173, column 1 |  SPL Build

        We need to find out which source file is generating the error at line 173 (and the others referenced).  Checking the Console messages may give a clearer view of what is happening (see my example messages below).

        You can track down problems in the Project Explorer view by following the RED x's.  Expand every section marked with a RED x until you get to the lowest level and open the file.

        If you can't locate the file in Project Explorer you might want to start over from scratch and create a new, clean project that included only the bigdata toolkit and your SPL code :

        File -> New -> Project -> SPL Project   - You will need to include 'bigdata' toolkit .  Give it a new name and reenter the SPL code :  New -> SPL Source File 

        NOTE: I forced a build failure by removing the 'bigdata' toolkit from the project and still did not get project-name references with line numbers - Main.spl was listed as expected and the line numbers were within the number of lines in Main.spl.  My Console window headers and errors looked like this:

        ---- SPL Build for project ForumProbHDFSFileSink started ---- May 8, 2013 11:05:45 AM PDT

        Building main composite: Main using build configuration: Distributed

        /home/streamsadmin/InfoSphereStreams/bin/sc -M Main --output-directory=output/Main/Distributed --data-directory=data --no-toolkit-indexing --no-mixed-mode-preprocessing

        Main.spl:1:4: CDISP0760W WARNING: Namespace 'com.ibm.streams.bigdata.hdfs' undefined in 'use' directive.
        Main.spl:14:24: CDISP0053E ERROR: Unknown identifier 'HDFSFileSink'.
        Main.spl:20:10: CDISP0053E ERROR: Unknown identifier 'txt'.

        CDISP0092E ERROR: Cannot continue compilation due to previous errors.

        ---- SPL Build for project ForumProbHDFSFileSink completed in 0.271 seconds ----

         

  • Frank_Blau
    28 Posts

    Re: Error Putting Data on BigInsights

    ‏2013-05-08T20:27:34Z  in response to Frank_Blau

    Stan, I think I found something... if I look in the toolkits directory, I see this (attached).

    I also discovered that when I create a new project, the bigdata toolkit no longer shows up in my list of possible toolkits.

    Is there some way to just reinstall that component?

     

    Attachments

  • Frank_Blau
    28 Posts

    Re: Error Putting Data on BigInsights

    ‏2013-05-08T21:20:40Z  in response to Frank_Blau

    I removed and reinstalled the big data toolkit.

    This is now what I get on build:

     

     
    ---- SPL Build for project HadoopTest started ---- May 8, 2013 2:19:41 PM PDT
     
    Building main composite: Main using build configuration: Standalone
     
    /opt/ibm/InfoSphereStreams/bin/sc -M Main --output-directory=output/Main/Standalone --data-directory=data -T -t /home/streamsadmin/workspace/com.ibm.streams.bigdata --no-toolkit-indexing --no-mixed-mode-preprocessing 
     
    --- Environment Variables ---
        COLORTERM=gnome-terminal
        CVS_RSH=ssh
        DBUS_SESSION_BUS_ADDRESS=unix:abstract=/tmp/dbus-SYARdBwujN,guid=cce1b8b76e5a2d3880d78b00518abbb4
        DESKTOP_SESSION=default
        DESKTOP_STARTUP_ID=
        DISPLAY=:0.0
        GDMSESSION=default
        GDM_XSERVER_LOCATION=local
        GNOME_DESKTOP_SESSION_ID=Default
        GNOME_KEYRING_SOCKET=/tmp/keyring-JdJEiv/socket
        GTK_RC_FILES=/etc/gtk/gtkrc:/home/streamsadmin/.gtkrc-1.2-gnome2
        G_BROKEN_FILENAMES=1
        HADOOP_HOME=/opt/ibm/biginsights/IHC
        HISTSIZE=1000
        HOME=/home/streamsadmin
        HOSTNAME=mybox.localdomain
        IBM_JAVA_COMMAND_LINE=/opt/ibm/java-x86_64-60/bin/java -Dosgi.requiredJavaVersion=1.5 -Dhelp.lucene.tokenizer=standard -XX:MaxPermSize=256m -Xms40m -Xmx512m -jar /home/streamsadmin/Desktop/eclipse//plugins/org.eclipse.equinox.launcher_1.3.0.v20120522-1813.jar -os linux -ws gtk -arch x86_64 -showsplash /home/streamsadmin/Desktop/eclipse//plugins/org.eclipse.platform_4.2.2.v201302041200/splash.bmp -launcher /home/streamsadmin/Desktop/eclipse/eclipse -name Eclipse --launcher.library /home/streamsadmin/Desktop/eclipse//plugins/org.eclipse.equinox.launcher.gtk.linux.x86_64_1.1.200.v20120913-144807/eclipse_1502.so -startup /home/streamsadmin/Desktop/eclipse//plugins/org.eclipse.equinox.launcher_1.3.0.v20120522-1813.jar --launcher.overrideVmargs -exitdata 5f000e -product org.eclipse.epp.package.jee.product -vm /opt/ibm/java-x86_64-60/bin/java -vmargs -Dosgi.requiredJavaVersion=1.5 -Dhelp.lucene.tokenizer=standard -XX:MaxPermSize=256m -Xms40m -Xmx512m -jar /home/streamsadmin/Desktop/eclipse//plugins/org.eclipse.equinox.launcher_1.3.0.v20120522-1813.jar
        INPUTRC=/etc/inputrc
        JAVA_HOME=/usr/java/jre1.7.0_07
        LANG=en_US.UTF-8
        LESSOPEN=|/usr/bin/lesspipe.sh %s
        LOGNAME=streamsadmin
        LS_COLORS=no=00:fi=00:di=00;34:ln=00;36:pi=40;33:so=00;35:bd=40;33;01:cd=40;33;01:or=01;05;37;41:mi=01;05;37;41:ex=00;32:*.cmd=00;32:*.exe=00;32:*.com=00;32:*.btm=00;32:*.bat=00;32:*.sh=00;32:*.csh=00;32:*.tar=00;31:*.tgz=00;31:*.arj=00;31:*.taz=00;31:*.lzh=00;31:*.zip=00;31:*.z=00;31:*.Z=00;31:*.gz=00;31:*.bz2=00;31:*.bz=00;31:*.tz=00;31:*.rpm=00;31:*.cpio=00;31:*.jpg=00;35:*.gif=00;35:*.bmp=00;35:*.xbm=00;35:*.xpm=00;35:*.png=00;35:*.tif=00;35:
        MAIL=/var/spool/mail/streamsadmin
        MOZILLA_FIVE_HOME=/usr/lib64/xulrunner-1.9.2
        ODBCINI=/usr/local/etc/odbc.ini
        OLDPWD=/home/streamsadmin
        PATH=/opt/ibm/InfoSphereStreams/bin:/usr/kerberos/bin:/usr/local/bin:/usr/bin:/bin:/usr/X11R6/bin:/home/streamsadmin/bin
        PWD=/home/streamsadmin/Desktop/eclipse
        SESSION_MANAGER=local/mybox.localdomain:/tmp/.ICE-unix/4436
        SHELL=/bin/bash
        SHLVL=2
        SSH_AGENT_PID=4472
        SSH_ASKPASS=/usr/libexec/openssh/gnome-ssh-askpass
        SSH_AUTH_SOCK=/tmp/ssh-eIOGKM4436/agent.4436
        STREAMS_ADAPTERS_ODBC_INCPATH=/usr/local/include
        STREAMS_ADAPTERS_ODBC_LIBPATH=/usr/local/lib
        STREAMS_ADAPTERS_ODBC_MYSQL=1
        STREAMS_DEFAULT_IID=streams
        STREAMS_INSTALL=/opt/ibm/InfoSphereStreams
        TERM=xterm
        USER=streamsadmin
        USERNAME=streamsadmin
        WINDOWID=16777297
        XAUTHORITY=/tmp/.gdmCX8UWW
        XMODIFIERS=@im=none
        _=./eclipse
     
    Creating types...
    Creating functions...
    Creating operators...
    Creating PEs...
    Creating standalone app...
    Creating application model...
    Building binaries...
     [CXX-operator] HDFSFileSink_1
    src/operator/HDFSFileSink_1.cpp:7:29: error: BigdataResource.h: No such file or directory
    In file included from src/operator/HDFSFileSink_1.cpp:13:
    src/operator/./HDFSFileSink_1.h:4:18: error: hdfs.h: No such file or directory
    src/operator/./HDFSFileSink_1.h:102: error: 'hdfsFS' has not been declared
    src/operator/./HDFSFileSink_1.h:142: error: 'hdfsFS' does not name a type
    src/operator/./HDFSFileSink_1.h:171: error: 'hdfsFS' does not name a type
    src/operator/HDFSFileSink_1.cpp:35: error: variable or field 'flushToDisk' declared void
    src/operator/HDFSFileSink_1.cpp:35: error: 'int SPL::_Operator::HDFSFileSink_1$OP::BufferWrapper::flushToDisk' is not a static member of 'class SPL::_Operator::HDFSFileSink_1$OP::BufferWrapper'
    src/operator/HDFSFileSink_1.cpp:35: error: 'hdfsFS' was not declared in this scope
    src/operator/HDFSFileSink_1.cpp:35: error: expected primary-expression before 'const'
    src/operator/HDFSFileSink_1.cpp:35: error: initializer expression list treated as compound expression
    src/operator/HDFSFileSink_1.cpp:36: error: expected ',' or ';' before '{' token
    make: *** [build/operator/HDFSFileSink_1.o] Error 1
    CDISP0141E ERROR: Compilation of the generated code has failed.
     
    ---- SPL Build for project HadoopTest completed in 3.737 seconds ----
     
    • Frank_Blau
      28 Posts

      Re: Error Putting Data on BigInsights

      ‏2013-05-08T22:13:07Z  in response to Frank_Blau

      I was able to reinstall the big data toolkit (for some reason I had a project by the same name, and that was overriding the one in the toolkit directory).

      But I still cannot get my simple composite to build. Here is the console output:

      -------------------

       
      ---- SPL Build for project werwer started ---- May 8, 2013 3:11:32 PM PDT
       
      Building main composite: Main using build configuration: Distributed
       
      /opt/ibm/InfoSphereStreams/bin/sc -M Main --output-directory=output/Main/Distributed --data-directory=data -t /opt/ibm/InfoSphereStreams/toolkits/com.ibm.streams.bigdata --no-toolkit-indexing --no-mixed-mode-preprocessing 
       
      --- Environment Variables ---
          COLORTERM=gnome-terminal
          CVS_RSH=ssh
          DBUS_SESSION_BUS_ADDRESS=unix:abstract=/tmp/dbus-SYARdBwujN,guid=cce1b8b76e5a2d3880d78b00518abbb4
          DESKTOP_SESSION=default
          DESKTOP_STARTUP_ID=
          DISPLAY=:0.0
          GDMSESSION=default
          GDM_XSERVER_LOCATION=local
          GNOME_DESKTOP_SESSION_ID=Default
          GNOME_KEYRING_SOCKET=/tmp/keyring-JdJEiv/socket
          GTK_RC_FILES=/etc/gtk/gtkrc:/home/streamsadmin/.gtkrc-1.2-gnome2
          G_BROKEN_FILENAMES=1
          HADOOP_HOME=/opt/ibm/biginsights/IHC
          HISTSIZE=1000
          HOME=/home/streamsadmin
          HOSTNAME=mybox.localdomain
          IBM_JAVA_COMMAND_LINE=/opt/ibm/java-x86_64-60/bin/java -Dosgi.requiredJavaVersion=1.5 -Dhelp.lucene.tokenizer=standard -XX:MaxPermSize=256m -Xms40m -Xmx512m -jar /home/streamsadmin/Desktop/eclipse//plugins/org.eclipse.equinox.launcher_1.3.0.v20120522-1813.jar -os linux -ws gtk -arch x86_64 -showsplash /home/streamsadmin/Desktop/eclipse//plugins/org.eclipse.platform_4.2.2.v201302041200/splash.bmp -launcher /home/streamsadmin/Desktop/eclipse/eclipse -name Eclipse --launcher.library /home/streamsadmin/Desktop/eclipse//plugins/org.eclipse.equinox.launcher.gtk.linux.x86_64_1.1.200.v20120913-144807/eclipse_1502.so -startup /home/streamsadmin/Desktop/eclipse//plugins/org.eclipse.equinox.launcher_1.3.0.v20120522-1813.jar --launcher.overrideVmargs -exitdata 64000e -product org.eclipse.epp.package.jee.product -vm /opt/ibm/java-x86_64-60/bin/java -vmargs -Dosgi.requiredJavaVersion=1.5 -Dhelp.lucene.tokenizer=standard -XX:MaxPermSize=256m -Xms40m -Xmx512m -jar /home/streamsadmin/Desktop/eclipse//plugins/org.eclipse.equinox.launcher_1.3.0.v20120522-1813.jar
          INPUTRC=/etc/inputrc
          JAVA_HOME=/usr/java/jre1.7.0_07
          LANG=en_US.UTF-8
          LESSOPEN=|/usr/bin/lesspipe.sh %s
          LOGNAME=streamsadmin
          LS_COLORS=no=00:fi=00:di=00;34:ln=00;36:pi=40;33:so=00;35:bd=40;33;01:cd=40;33;01:or=01;05;37;41:mi=01;05;37;41:ex=00;32:*.cmd=00;32:*.exe=00;32:*.com=00;32:*.btm=00;32:*.bat=00;32:*.sh=00;32:*.csh=00;32:*.tar=00;31:*.tgz=00;31:*.arj=00;31:*.taz=00;31:*.lzh=00;31:*.zip=00;31:*.z=00;31:*.Z=00;31:*.gz=00;31:*.bz2=00;31:*.bz=00;31:*.tz=00;31:*.rpm=00;31:*.cpio=00;31:*.jpg=00;35:*.gif=00;35:*.bmp=00;35:*.xbm=00;35:*.xpm=00;35:*.png=00;35:*.tif=00;35:
          MAIL=/var/spool/mail/streamsadmin
          MOZILLA_FIVE_HOME=/usr/lib64/xulrunner-1.9.2
          ODBCINI=/usr/local/etc/odbc.ini
          OLDPWD=/home/streamsadmin
          PATH=/opt/ibm/InfoSphereStreams/bin:/usr/kerberos/bin:/usr/local/bin:/usr/bin:/bin:/usr/X11R6/bin:/home/streamsadmin/bin
          PWD=/home/streamsadmin/Desktop/eclipse
          SESSION_MANAGER=local/mybox.localdomain:/tmp/.ICE-unix/4436
          SHELL=/bin/bash
          SHLVL=2
          SSH_AGENT_PID=4472
          SSH_ASKPASS=/usr/libexec/openssh/gnome-ssh-askpass
          SSH_AUTH_SOCK=/tmp/ssh-eIOGKM4436/agent.4436
          STREAMS_ADAPTERS_ODBC_INCPATH=/usr/local/include
          STREAMS_ADAPTERS_ODBC_LIBPATH=/usr/local/lib
          STREAMS_ADAPTERS_ODBC_MYSQL=1
          STREAMS_DEFAULT_IID=streams
          STREAMS_INSTALL=/opt/ibm/InfoSphereStreams
          TERM=xterm
          USER=streamsadmin
          USERNAME=streamsadmin
          WINDOWID=16777297
          XAUTHORITY=/tmp/.gdmCX8UWW
          XMODIFIERS=@im=none
          _=./eclipse
       
      Checking constraints...
      Creating types...
      Creating functions...
      Creating operators...
      Creating PEs...
      Creating application model...
      Building binaries...
       [CXX-operator] HDFSFileSink_1
      In file included from src/operator/HDFSFileSink_1.cpp:13:
      src/operator/./HDFSFileSink_1.h:4:18: error: hdfs.h: No such file or directory
      src/operator/./HDFSFileSink_1.h:104: error: 'hdfsFS' has not been declared
      src/operator/./HDFSFileSink_1.h:144: error: 'hdfsFS' does not name a type
      src/operator/./HDFSFileSink_1.h:173: error: 'hdfsFS' does not name a type
      src/operator/HDFSFileSink_1.cpp:35: error: variable or field 'flushToDisk' declared void
      src/operator/HDFSFileSink_1.cpp:35: error: 'int SPL::_Operator::HDFSFileSink_1$OP::BufferWrapper::flushToDisk' is not a static member of 'class SPL::_Operator::HDFSFileSink_1$OP::BufferWrapper'
      src/operator/HDFSFileSink_1.cpp:35: error: 'hdfsFS' was not declared in this scope
      src/operator/HDFSFileSink_1.cpp:35: error: expected primary-expression before 'const'
      src/operator/HDFSFileSink_1.cpp:35: error: initializer expression list treated as compound expression
      src/operator/HDFSFileSink_1.cpp:36: error: expected ',' or ';' before '{' token
      make: *** [build/operator/HDFSFileSink_1.o] Error 1
      CDISP0141E ERROR: Compilation of the generated code has failed.
       
      ---- SPL Build for project werwer completed in 3.923 seconds ----
       
      • Stan
        76 Posts

        Re: Error Putting Data on BigInsights

        ‏2013-05-09T14:48:49Z  in response to Frank_Blau

        The only time I have seen errors listing 'hdfs.h' is when an incompatible version of Hadoop was being used.  Check the setting of the HDFSFile property.  Otherwise it may be that your copy of the toolkit is corrupt - it is odd that it disappeared from Eclipse.  Did you reinstall from the Streams installer package? 

        I assume you are using Hadoop 1.0+, so the default is correct.  However, some of the samples set this value explicitly to 'false', so they will not work in a Hadoop 1.0+ environment.

        For HDFSFile, check the setting of:

        useVersionOneApi
        This parameter value has a boolean data type. If the parameter value is true, the operator uses Hadoop version 1.0.0. If the parameter value is false, the operator uses Hadoop version 0.20.2. If you do not specify this parameter, it defaults to the value true.
        If this parameter value does not match the product that is installed in the location specified by the HADOOP_HOME environment variable, a compile-time error occurs.
        Important: On IBM® Power Systems™, set this parameter value to false to use the supported version of Hadoop.
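
        Applied to the composite posted at the top of this thread, setting the parameter explicitly would look like this (a sketch only - the host, port, and user are the values you posted; use false instead if the installed Hadoop is 0.20.2):

            () as HDFSFileSink_1 = HDFSFileSink(Beacon_1_out0)
            {
                param
                    NamenodeHost : "192.168.116.128" ;
                    NamenodePort : 9000u ;
                    file : "foo%FILENUM.txt" ;
                    format : txt ;
                    HDFSUser : "biadmin" ;
                    useVersionOneApi : true ;
            }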

         

        • Frank_Blau
          28 Posts

          Re: Error Putting Data on BigInsights

          ‏2013-05-10T16:39:09Z  in response to Stan

          Boo.

          true or false values, I still get the error.

          One question... at compile time, how is HADOOP_HOME used? Does the Hadoop server (a different VM in this case) have to be running at compile time?

          What is the purpose of HADOOP_HOME on the Streams Server? I ask because I see an error that says:

          ls: /opt/ibm/biginsights/IHC: No such file or directory
          ls: /opt/ibm/biginsights/IHC/lib: No such file or directory

          Those files/directories are not on the Streams server; they are on the BigInsights server.

          Frank

           

          • Stan
            76 Posts

            Re: Error Putting Data on BigInsights

            ‏2013-05-14T14:30:50Z  in response to Frank_Blau

            I found out the following about Hadoop dependencies for the Big Data Toolkit. 

            The current recommendation is to create a shared file system for the BI install on the Streams cluster machines.  Set HADOOP_HOME to the <BI_INSTALL>/IHC directory.

            It is also possible to copy a subset of the BI install (hadoop-conf and IHC) to the Streams cluster and make the files available (shared files) to the machines of the cluster.  The file-copy process is planned to be documented in the next release (note: plans could change).  It has undergone basic testing.  Please let us know if you use this process and encounter any issues, or if it works well.  Currently this is unsupported, but it seems pretty straightforward.

            a. Copy the <BI install>/IHC folder to the Streams cluster/host and place it under a directory on the cluster which Streams can access, for example "/home/Streams/BI_Install_Folder".

            b. Copy the <BI install>/hadoop-conf folder to the Streams host and place it under a directory on the cluster which Streams can access, for example "/home/Streams/BI_Install_Folder".

            c. On the Streams cluster, before submitting a job, set the environment variable HADOOP_HOME=/home/Streams/BI_Install_Folder/IHC.

            d. On the Streams cluster, before submitting a job, set the environment variable JAVA_HOME to the location where Java is installed.
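
            In shell terms, steps c and d amount to the following (a sketch; the JAVA_HOME path is an example, not taken from your system):

                export HADOOP_HOME=/home/Streams/BI_Install_Folder/IHC
                export JAVA_HOME=/opt/ibm/java-x86_64-60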

             

            • Frank_Blau
              28 Posts

              Re: Error Putting Data on BigInsights

              ‏2013-05-14T18:39:40Z  in response to Stan

              No dice:

              On Build:

               

              Building binaries...
               [CXX-operator] HDFSFileSink_1
              /opt/ibm/InfoSphereStreams/include/SPL/Runtime/ProcessingElement/ProcessingElement.h:27: error: conflicting declaration 'typedef struct JavaVM_ JavaVM'
              /usr/lib/gcc/x86_64-redhat-linux/4.1.2/include/jni.h:96: error: 'JavaVM' has a previous declaration as 'typedef class _Jv_JavaVM JavaVM'
              /mnt/hgfs/vmshare/IHC/src/c++/libhdfs/hdfs.h: In member function 'void* SPL::_Operator::HDFSFileSink_1$OP::getHdfsPtr()':
              /mnt/hgfs/vmshare/IHC/src/c++/libhdfs/hdfs.h:106: error: too many arguments to function 'void* hdfsConnectAsUser(const char*, tPort, const char*)'
              src/operator/HDFSFileSink_1.cpp:193: error: at this point in file
              make: *** [build/operator/HDFSFileSink_1.o] Error 1
              CDISP0141E ERROR: Compilation of the generated code has failed.
               
              • Stan
                76 Posts

                Re: Error Putting Data on BigInsights

                ‏2013-05-14T21:21:49Z  in response to Frank_Blau

                Looks like you encountered the hdfs.h version problem I mentioned earlier.  Hadoop's hdfsConnectAsUser previously took 5 arguments (v0.20, I think) and now takes 3 (v1.0).  Specify the HDFSFileSink parameter:

                            useVersionOneApi : true

                I suspect this will also resolve the conflicting typedef class/struct for _Jv_JavaVM JavaVM.

                Hadoop reference:

                http://svn.apache.org/viewvc/hadoop/common/branches/branch-0.20-security-patches/src/c%2B%2B/libhdfs/hdfs.h?r1=1077228&r2=1077227&pathrev=1077228

                HDFS-1000   changes:
                -hdfsFS hdfsConnectAsUser(const char* host, tPort port, const char *user , const char **groups, int groups_size )
                +hdfsFS hdfsConnectAsUser(const char* host, tPort port, const char *user)

                 

                • Frank_Blau
                  28 Posts

                  Re: Error Putting Data on BigInsights

                  ‏2013-05-14T21:30:29Z  in response to Stan

                  Well... it's different:

                   

                  with true
                   
                  Building binaries...
                   [CXX-operator] HDFSFileSink_1
                  /opt/ibm/InfoSphereStreams/include/SPL/Runtime/ProcessingElement/ProcessingElement.h:27: error: conflicting declaration 'typedef struct JavaVM_ JavaVM'
                  /usr/lib/gcc/x86_64-redhat-linux/4.1.2/include/jni.h:96: error: 'JavaVM' has a previous declaration as 'typedef class _Jv_JavaVM JavaVM'
                  make: *** [build/operator/HDFSFileSink_1.o] Error 1
                  CDISP0141E ERROR: Compilation of the generated code has failed.
                   
                  ---- SPL Build for project Hadoop_Test completed in 4.879 seconds --
                   
                  --------------------
                  with false
                   
                  Building binaries...
                   [CXX-operator] HDFSFileSink_1
                  /opt/ibm/InfoSphereStreams/include/SPL/Runtime/ProcessingElement/ProcessingElement.h:27: error: conflicting declaration 'typedef struct JavaVM_ JavaVM'
                  /usr/lib/gcc/x86_64-redhat-linux/4.1.2/include/jni.h:96: error: 'JavaVM' has a previous declaration as 'typedef class _Jv_JavaVM JavaVM'
                  /mnt/hgfs/vmshare/IHC/src/c++/libhdfs/hdfs.h: In member function 'void* SPL::_Operator::HDFSFileSink_1$OP::getHdfsPtr()':
                  /mnt/hgfs/vmshare/IHC/src/c++/libhdfs/hdfs.h:106: error: too many arguments to function 'void* hdfsConnectAsUser(const char*, tPort, const char*)'
                  src/operator/HDFSFileSink_1.cpp:193: error: at this point in file
                  make: *** [build/operator/HDFSFileSink_1.o] Error 1
                  CDISP0141E ERROR: Compilation of the generated code has failed.
                   
                  ---- SPL Build for project Hadoop_Test completed in 3.988 seconds ----
                   
                  • Stan
                    76 Posts

                    Re: Error Putting Data on BigInsights

                    ‏2013-05-14T22:13:32Z  in response to Frank_Blau

                    A 50% improvement!! Hopefully we are down to the last layer of this problem. It looks like a problem between C++ and Java.

                    Please run the dependency checker from the environment (account/logon) in which you are running Streams Studio. It should check Java and C++:

                    cd product-installation-directory/bin
                    ./dependency_checker.sh

                    reference: http://pic.dhe.ibm.com/infocenter/streams/v3r0/topic/com.ibm.swg.im.infosphere.streams.install-admin.doc/doc/ibminfospherestreams-install-prerequisites-dependency-checker.html

                    Minimum requirements based on your OS can be found here:

                    http://pic.dhe.ibm.com/infocenter/streams/v3r0/topic/com.ibm.swg.im.infosphere.streams.install-admin.doc/doc/ibminfospherestreams-install-prerequisites-rpm-tables.html

                    I assume the directory name means gcc (C++) is at 4.1.2, which is supported for Linux 5, but specific levels are required:

                    /usr/lib/gcc/x86_64-redhat-linux/4.1.2

                    • Frank_Blau
                      28 Posts

                      Re: Error Putting Data on BigInsights

                      ‏2013-05-14T22:19:22Z  in response to Stan

                      Here is the output... no errors reported:

                       

                      [streamsadmin@mybox bin]$ ./dependency_checker.sh 
                       
                      IBM InfoSphere Streams for Non-Production Environment 3.0.0.0 Dependency Checker
                      Date:  Tue May 14 15:17:30 PDT 2013
                       
                      === System Information ===
                      * Hostname:  mybox.localdomain
                      * IP address:  192.168.0.112
                      * Operating system:  Red Hat Enterprise Linux Server release 5.8 (Tikanga)
                      * System architecture:  x86_64
                      * Security-Enhanced Linux setting:  Permissive
                      * Java vendor:  IBM Corporation
                      * Java version:  1.6.0
                      * Java VM version:  2.4
                      * Java runtime version:  pxa6460sr11-20120806_01 (SR11)
                      * Java full version:  JRE 1.6.0 IBM J9 2.4 Linux amd64-64 jvmxa6460sr11-20120801_118201 (JIT enabled, AOT enabled)
                      J9VM - 20120801_118201
                      JIT  - r9_20120608_24176ifx1
                      GC   - 20120516_AA
                      * Java IBM system encoding:  UTF-8
                      * Encoding:  UTF-8
                       
                      === System Configuration Check ===
                      * Status:  PASS - Check:  Hostname and IP address check
                      * Status:  PASS - Check:  Operating system version check
                      * Status:  PASS - Check:  Architecture check
                      * Status:  PASS - Check:  Java check
                      * Status:  PASS - Check:  Encoding check
                       
                      === Software Dependency Package Check ===
                      * Status:  CORRECT VERSION - Package:  perl-XML-Simple, System Version:  2.14-4.fc6
                      * Status:  CORRECT VERSION - Package:  gcc-c++, System Version:  4.1.2-52.el5_8.1
                      * Status:  CORRECT VERSION - Package:  curl-devel, System Version:  7.15.5-15.el5
                      * Status:  CORRECT VERSION - Package:  ibm-java-x86_64-sdk, System Version:  6.0-11.0
                       
                      === Summary of Errors and Warnings ===
                       
                      CDISI0003I The dependency checker evaluated the system and did not find errors or warnings.
                       
                      • Stan
                        76 Posts

                        Re: Error Putting Data on BigInsights

                        ‏2013-05-15T13:24:39Z  in response to Frank_Blau

                        Please ensure that JAVA_HOME is set to the location of a JDK (full development install) and not a JRE (Java runtime environment).  I've been told this can cause the typedef error you received.
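
                        A quick check (a diagnostic sketch): a JDK ships the compiler and the JNI headers, while a JRE does not, so both of these should exist:

                            ls $JAVA_HOME/bin/javac $JAVA_HOME/include/jni.h

                        Note that the environment dump you posted earlier shows JAVA_HOME=/usr/java/jre1.7.0_07, which is a JRE.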

                        • Frank_Blau
                          28 Posts

                          Re: Error Putting Data on BigInsights

                          ‏2013-05-15T17:21:48Z  in response to Stan

                          That did it!

                          I can now successfully compile it!

                          Now on to the connection issues! :)

                          • Frank_Blau
                            28 Posts

                            Re: Error Putting Data on BigInsights

                            ‏2013-05-15T18:58:37Z  in response to Frank_Blau

                            OK, it is sending data to HDFS now... but there is one small glitch.

                            It only writes the data to HDFS after I cancel the job... Is there a way to get it to write the data to HDFS at every tuple?

                             

                            Never mind... figured out the buffer. ;)

                            • Stan
                              76 Posts

                              Re: Error Putting Data on BigInsights

                              ‏2013-05-15T20:50:46Z  in response to Frank_Blau

                               I assume you either set the buffer to a small value or are inserting window punctuation into the stream?

                              About [HDFS]FileSink

                               The buffer is only written to disk when adding the next tuple would exceed the size of the buffer (as specified in the bufferSize parameter), when the operator receives punctuation (including final punctuation), or when the operator shuts down.
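
                               For example, forcing frequent flushes with a small buffer would look like this (a sketch; the bufferSize value is illustrative, and the other parameters are the ones from your composite):

                                   () as HDFSFileSink_1 = HDFSFileSink(Beacon_1_out0)
                                   {
                                       param
                                           NamenodeHost : "192.168.116.128" ;
                                           NamenodePort : 9000u ;
                                           file : "foo%FILENUM.txt" ;
                                           format : txt ;
                                           HDFSUser : "biadmin" ;
                                           bufferSize : 1024u ;
                                   }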