victor42
TCPSource and packet parsing

2012-04-04T19:57:09Z
Hi All,
I have a couple of questions about parsing data in an IP packet fetched with TCPSource.
The packet format looks like this:
uint32 timestamp,
char12 tag,        // 12-byte string
uint32 packetSize, // number of bytes in the received packet
char[] payload     // variable-size payload

1. How can I read char12? If I "pack" char12 into an "rstring", how would the system know to read only 12 bytes?
2. I am only interested in fetching timestamp, tag, and packetSize, so if I declare
stream<uint32 timestamp, rstring tag, uint32 size> fetch = TCPSource()
will it skip the rest of the packet and wait for the next packet to arrive?

Thanks,
-V
  • mendell
    ‏2012-04-04T21:04:29Z  in response to victor42
    Unfortunately, there isn't an easy answer to this one. TCPSource can read the data, but there is no way to make it parse the format that you want. Supported formats are:
    • bin: internal SPL binary encoding (will read TCPSink format: bin output)
    • csv: comma separated values (in text)
    • txt: format like an SPL tuple: { a = 5, b = "hi"}
    • line: read one line of text into an rstring
    • block: read blockSize bytes into a blob

    The only one that works for you is block. You can read each block into a blob:
    
    stream<blob data> Input = TCPSource() {
        param
            format    : block;
            blockSize : 4096;
            // address, port, etc.
    }
    


    After that you have a stream of blocks coming out of the operator. You will have to write a primitive operator (in C++ or Java) that will take the incoming data and convert that into SPL tuples:
    
    stream<uint32 ts, rstring[12] tag, uint32 size> Data = MyOperator(Input) 
    { 
    }
    


    This operator will have to pull apart the data in the blob, setting the various fields in a tuple, and then will call submit (otuple, 0); to send the tuple downstream. The only interesting part is that the blocks in the blob necessarily contain complete packets, so you will have to be able to save your state, and continue when the next blob arrives.

    You can write a particular operator that knows the attribute types of your output stream, and does something like:
    
    
    // Do the hard work to grab the values from the blob ...
    // Now create the tuple and send it downstream
    OPort0Tuple otuple(tsValue, tagValue, sizeValue);
    submit(otuple, 0);
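
The "save your state and continue when the next blob arrives" part could be sketched in plain Java as below. This is only an illustration, not Streams operator code: PacketReassembler and its members are hypothetical names, and it assumes three things the thread does not state: integers are big-endian (network order), the tag is ASCII padded with NULs or spaces, and packetSize counts the whole packet including the 20-byte header. Because a block may end in the middle of a packet, the sketch keeps a partially filled header and a count of payload bytes still to discard between calls.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: extracts packet headers from arbitrary-sized blocks. */
class PacketReassembler {
    static final int HEADER_SIZE = 20; // 4 (timestamp) + 12 (tag) + 4 (packetSize)

    /** Header fields; uint32 values are widened to long. */
    static final class Header {
        final long timestamp;
        final String tag;
        final long packetSize;

        Header(long timestamp, String tag, long packetSize) {
            this.timestamp = timestamp;
            this.tag = tag;
            this.packetSize = packetSize;
        }
    }

    // State carried from one block to the next.
    private final byte[] header = new byte[HEADER_SIZE];
    private int headerFilled = 0;   // header bytes collected so far
    private long payloadToSkip = 0; // payload bytes still to discard

    /** Feeds one block (e.g. the bytes of one blob); returns the headers it completed. */
    List<Header> feed(byte[] block) {
        List<Header> out = new ArrayList<>();
        int pos = 0;
        while (pos < block.length) {
            if (payloadToSkip > 0) {
                // Still inside a payload: discard as much as this block holds.
                int n = (int) Math.min(payloadToSkip, block.length - pos);
                pos += n;
                payloadToSkip -= n;
                continue;
            }
            // Collect header bytes, possibly across several blocks.
            int n = Math.min(HEADER_SIZE - headerFilled, block.length - pos);
            System.arraycopy(block, pos, header, headerFilled, n);
            pos += n;
            headerFilled += n;
            if (headerFilled == HEADER_SIZE) {
                Header h = parse(header);
                out.add(h);
                // ASSUMPTION: packetSize counts the whole packet, header included.
                payloadToSkip = Math.max(0, h.packetSize - HEADER_SIZE);
                headerFilled = 0;
            }
        }
        return out;
    }

    /** Decodes one complete 20-byte header. */
    static Header parse(byte[] raw) {
        // ASSUMPTION: big-endian (network order) integers.
        ByteBuffer buf = ByteBuffer.wrap(raw).order(ByteOrder.BIG_ENDIAN);
        long timestamp = buf.getInt() & 0xFFFFFFFFL; // read as unsigned 32-bit
        byte[] tagBytes = new byte[12];
        buf.get(tagBytes);
        long packetSize = buf.getInt() & 0xFFFFFFFFL;
        // ASSUMPTION: ASCII tag; trim() strips NUL and space padding.
        String tag = new String(tagBytes, StandardCharsets.US_ASCII).trim();
        return new Header(timestamp, tag, packetSize);
    }
}
```

In an operator, each incoming blob's bytes would be passed to feed(), and each returned Header would be copied into an output tuple and submitted. If packetSize turns out to count only the payload, the subtraction of HEADER_SIZE should be dropped.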
    


    Mark
    • Jim Sharpe
      ‏2012-04-04T21:10:44Z  in response to mendell
      Although it's implied by the context, I think you meant to say "the blocks in the blob do not necessarily contain complete packets"
    • victor42
      ‏2012-04-04T21:57:09Z  in response to mendell
      Mark,
      Thanks very much for the post. Considering that I do not know the size of the packet, what should the size of the "blob" be?
      If the maximum length of an IP packet is 65535 bytes, can it be:
      stream<blob data> Input = TCPSource() {
          param
              format    : block;
              blockSize : 65535;
              // address, port, etc.
      }

      i.e., will TCPSource read one packet at a time?

      Also, how can I cast the "block" in native code?

      Thanks again,
      -V
      • mendell
        ‏2012-04-04T22:03:30Z  in response to victor42
        An SPL blob contains a bunch of raw bytes and a length. The TCPSource that I mentioned would read the TCP connection 4K bytes at a time. It would take several tuples containing the blobs to get all of a 64K packet.

        In C++, the tuple would contain an attribute of type SPL::blob. You can use its member functions to find the length and to access the data within the blob, which you can then use to create the output tuple.

        A (perhaps better) alternative would be to create your own source operator that reads from a TCP socket and generates tuples from that. That would allow you to do reads of the right size from the TCP connection, without having to combine blocks of data. The only tricky part in a source is remembering to check occasionally for PE shutdown, so that you can shut down in a timely manner.
    • victor42
      ‏2012-04-04T22:12:26Z  in response to mendell
      Wouldn't I be able to read 20 bytes of text into my native operator?

      stream<line data> Input = TCPSource() {
          param
              format    : block;
              blockSize : 4096;
              // address, port, etc.
      }
      • mendell
        ‏2012-04-04T22:30:30Z  in response to victor42
        It sounds to me as though your input is binary, not text. You want to read 20 bytes into your native operator, then use the packetSize information to tell you how much data to read and ignore, and then you can try reading the next packet.
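
A rough Java sketch of that read-20-bytes-then-skip loop, for the case where you read the connection yourself. HeaderStreamReader is a hypothetical name, and the sketch assumes big-endian integers, an ASCII tag padded with NULs or spaces, and a packetSize that includes the 20-byte header; adjust these if your protocol differs. It only covers the parsing, not the Streams operator around it or the shutdown check.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: pulls one packet header at a time from a byte stream. */
class HeaderStreamReader {
    static final int HEADER_SIZE = 20; // 4 (timestamp) + 12 (tag) + 4 (packetSize)

    /** Header fields; uint32 values are widened to long. */
    static final class Header {
        final long timestamp;
        final String tag;
        final long packetSize;

        Header(long timestamp, String tag, long packetSize) {
            this.timestamp = timestamp;
            this.tag = tag;
            this.packetSize = packetSize;
        }
    }

    private final DataInputStream in;
    private final byte[] scratch = new byte[4096]; // buffer for discarded payload

    HeaderStreamReader(InputStream source) {
        this.in = new DataInputStream(source);
    }

    /**
     * Reads the next 20-byte header, then discards that packet's payload.
     * Throws EOFException when the stream ends.
     */
    Header next() throws IOException {
        // DataInputStream.readInt() reads four bytes in big-endian order.
        long timestamp = in.readInt() & 0xFFFFFFFFL;
        byte[] tagBytes = new byte[12];
        in.readFully(tagBytes); // blocks until all 12 bytes have arrived
        long packetSize = in.readInt() & 0xFFFFFFFFL;

        // ASSUMPTION: packetSize counts the whole packet, header included.
        long remaining = Math.max(0, packetSize - HEADER_SIZE);
        while (remaining > 0) {
            int n = in.read(scratch, 0, (int) Math.min(scratch.length, remaining));
            if (n < 0) {
                throw new EOFException("stream ended inside a payload");
            }
            remaining -= n;
        }
        // ASSUMPTION: ASCII tag; trim() strips NUL and space padding.
        String tag = new String(tagBytes, StandardCharsets.US_ASCII).trim();
        return new Header(timestamp, tag, packetSize);
    }
}
```

With a real connection, the InputStream would come from something like new java.net.Socket(host, port).getInputStream(), where host and port are placeholders for your server.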

        Mark
        • victor42
          ‏2012-04-05T13:11:00Z  in response to mendell
          Can't I treat the text as a byte[] and pass it into the native code?
          • mendell
            ‏2012-04-05T19:32:51Z  in response to victor42
            No, because format : line reads up to a newline and then discards the newline. This won't work for you. You can treat an rstring as a set of bytes, but I don't think the contents of the rstring will be what you want.
        • victor42
          ‏2012-04-05T14:46:37Z  in response to mendell
          Hi Mark,
          How would I implement an alternative to TCPSource as a Java operator?
          I looked through the documentation and samples but didn't find specific information about getting data from a socket.
          It seems that I have to implement
          public void process(StreamingInput<Tuple> port, Tuple tuple) in my operator,
          but how can I "connect" that input to the actual logic of reading the socket?
          A "skeleton" to give me some idea would be greatly appreciated.
          Thanks,
          -V
          • mendell
            ‏2012-04-05T19:37:27Z  in response to victor42
            That prototype is not the right one for a source. If you want to write a Java Source operator, look at the samples in
            Java Samples

            Connecting to a TCP socket will have to be done using the Java interfaces to TCP sockets. That is out of the scope of SPL/Streams.
            • victor42
              ‏2012-04-06T17:04:12Z  in response to mendell
              Any idea how to create the operator descriptor, e.g. RandomBeacon.xml?
              The tutorial omits this part, but the sample won't compile without it.
              • hnasgaard
                ‏2012-04-06T17:53:47Z  in response to victor42
                Refer to the SPL Operator Model Reference. This should have all the information you need to complete the operator model.
                • victor42
                  ‏2012-04-06T18:24:27Z  in response to hnasgaard
                  When I try to define a "Beacon-like" operator that takes no parameters or input arguments (see attached), I get the following error:

                  ISP0180E Error while loading operator model, details: An operator with more than one input port cannot have a punctuation preserving output port without specifying the window punctuation port.
                  • hnasgaard
                    ‏2012-04-06T18:41:11Z  in response to victor42
                    You have windowPunctuationOutputMode set to Preserving, which says it will preserve the punctuations from the input ports, of which there are none. Note that in the Beacon operator, it is set to Generating, which says it will generate punctuations for some condition. Since you are creating a source operator, you can specify either Free or Generating.
                    • victor42
                      ‏2012-04-06T18:54:24Z  in response to hnasgaard
                      Sorry for the hasty post. That is, I changed the name of my operator to StatusSocketReader everywhere and fixed the problem with punctuation in the descriptor, and it seems that this class cannot be found.
                • victor42
                  ‏2012-04-06T18:47:48Z  in response to hnasgaard
                  Never mind,
                  I get CDISP0053E ERROR: Unknown identifier 'StatusSocketReader' if I try to invoke it from SPL:
                  composite MagicO {
                      graph
                          stream<rstring rawObservation> RawObservations = StatusSocketReader() {}
                  }
                  • hnasgaard
                    ‏2012-04-06T19:42:05Z  in response to victor42
                    Have you added a use statement for the namespace containing your operator?
                    • victor42
                      ‏2012-04-06T20:23:19Z  in response to hnasgaard
                      Not sure. I've been using the 'DirectoryLister' sample as a template and didn't see anything like it there.
                    • victor42
                      ‏2012-04-06T20:28:10Z  in response to hnasgaard
                      The 'use' statement doesn't seem to change anything. Even if I put the Java operator into the same namespace as the SPL code, I see the same error.
                    • victor42
                      ‏2012-04-06T20:49:45Z  in response to hnasgaard
                      Java namespace doesn't even appear in 'toolkit.xml'
                      • mendell
                        ‏2012-04-07T01:25:19Z  in response to victor42
                        Is the operator defined in the right directory? It has to be in a directory with the same name as the operator model XML file.

                        For example:
                        foo/foo.xml
            • victor42
              ‏2012-04-09T14:00:44Z  in response to mendell
              The samples are not helpful at all: there's no source for the actual implementation.
              Should my socket listener be implemented as a thread, e.g. a "daemon" thread?
              • SystemAdmin
                ‏2012-04-10T16:18:48Z  in response to victor42
                The javadoc for the Java operator samples here:

                http://publib.boulder.ibm.com/infocenter/streams/v2r0/index.jsp?topic=%2Fcom.ibm.swg.im.infosphere.streams.javadoc.samples.doc%2Fdoc%2Findex.html

                States:

                Samples are contained in
                $STREAMS_INSTALL/lib/com.ibm.streams.operator.samples.jar

                com.ibm.streams.operator.samples.jar contains the source for the samples. If you include it in the build path of an Eclipse Java project, you should be able to see the Java source code of the Java samples.

                Java operators are also in this sample toolkit:

                $STREAMS_INSTALL/samples/spl/feature/JavaOperators

                This sample toolkit contains a Java primitive sample 'DirectoryLister' in the namespace 'sample'.
                Its operator model file is thus in
                sample/DirectoryLister/DirectoryLister.xml
                relative to the root of the toolkit.