14 replies Latest Post - ‏2013-04-04T23:01:53Z by MrJose
MrJose

Pinned topic UDPSource

‏2013-03-20T15:18:12Z |
Hello,

I need to read data from a UDP source whose packets can be as large as 4 MB, with one arriving every 20 milliseconds.
Is it possible to achieve that with the UDPSource operator? Does UDPSource support IPv6, and what packet sizes does it support?
Each packet can also contain many records.

Regards
Paul
  • wbratton

    Re: UDPSource

    ‏2013-03-20T16:41:10Z  in response to MrJose
    I will look into this and get back to you.

    Thanks
    • wbratton

      Re: UDPSource

      ‏2013-03-20T17:12:17Z  in response to wbratton
      Looking at the "IBM InfoSphere Streams V3.0 Addressing volume, velocity, and variety" RedBook, no packet size or time limitation is specified for the UDPSource operator. You can download the RedBook from the following URL:

      http://www.redbooks.ibm.com/abstracts/sg248108.html

      The RedBook does provide the following information for a UDPSource operator:

      • Each tuple must fit into a single UDP packet and a single UDP packet must contain only a single tuple or punctuation.

      • The UDPSource operator does not have any input ports and is configurable with only a single output port.

      • The UDPSource operator does not accept any window configurations.

      Let me know if this helps.
  • MrJose

    Re: UDPSource

    ‏2013-03-20T17:42:39Z  in response to MrJose
    Hi

    UDP has a packet size limit: 64 KB over IPv4.
    I believe it is much larger over IPv6. I have to extract the tuples from each incoming packet.
    If we have to handle that volume of data, it seems to be impossible.
    If each packet has to fit into a single record, you can have only 50 records/sec, and that won't solve any big data problem.

    Jose
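
For scale, the arithmetic behind these numbers can be sketched as below. This is only a back-of-the-envelope calculation. It assumes "4 MB" means 4 MiB and uses the theoretical maximum IPv4 UDP payload of 65,507 bytes (65,535 minus a 20-byte IP header and an 8-byte UDP header); the practical limit on a given network may be lower. The variable names are illustrative.

```python
import math

BUFFER_BYTES = 4 * 1024 * 1024      # one 4 MB buffer (assumed to mean 4 MiB)
INTERVAL_S = 0.020                  # one buffer every 20 ms
MAX_UDP_PAYLOAD = 65535 - 20 - 8    # IPv4 maximum minus IP and UDP headers

buffers_per_sec = round(1 / INTERVAL_S)
datagrams_per_buffer = math.ceil(BUFFER_BYTES / MAX_UDP_PAYLOAD)
datagrams_per_sec = buffers_per_sec * datagrams_per_buffer
bytes_per_sec = buffers_per_sec * BUFFER_BYTES

print(buffers_per_sec, datagrams_per_buffer, datagrams_per_sec, bytes_per_sec)
```

Under these assumptions each buffer needs at least 65 maximum-size datagrams, so one tuple per packet would mean splitting and reassembling on the order of a few thousand datagrams per second.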
  • hnasgaard

    Re: UDPSource

    ‏2013-03-21T17:05:52Z  in response to MrJose
    Unfortunately, Streams does not support IPv6.
  • MrJose

    Re: UDPSource

    ‏2013-03-21T18:33:20Z  in response to MrJose
    Hi

    Is there any workaround?

    1) Using many UDP connections

    2) Receiving packets at the IPv4 maximum size (64 KB) and splitting them into records further down the line.
    Can we overcome the limitation that each packet must be a single tuple?

    Please advise.

    Regards
    Jose
    • hnasgaard

      Re: UDPSource

      ‏2013-03-21T19:09:48Z  in response to MrJose
      I have not tried what I am about to suggest but I think it should work:
      • I think you will have to break up your 4 MB buffer into individual packets and send them sequentially, followed by a punctuation.
      • On the receiving end the UDPSource should use bin mode with a single blob attribute.
      • Follow the UDPSource with a Custom that aggregates the received blobs until it receives a punctuation. When it gets the punctuation you will have to convert the buffer into a tuple. If your binary data was really CSV internally, then you could take the aggregated data from the Custom and send it on to a Parse operator, which would parse it into a tuple.
      The caveats here are:
      • Since it is UDP, a packet could get lost and you would not have sufficient data to make up a tuple. You could probably check for that in the Custom and drop the buffer if it is incomplete; otherwise the Parse will fail.
      • You would have to figure out the largest UDP packet size that would work.
      • I'm not sure what the real throughput would be on this.
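
The split-and-aggregate idea above can be sketched outside of Streams as follows. This is not SPL and not the Custom operator itself, only an illustration of the logic in Python. The chunk size, the empty datagram used as a stand-in for the punctuation, and the function names are all assumptions made for the example.

```python
from typing import Iterable, Iterator, Optional

CHUNK_BYTES = 1400   # hypothetical payload size; tune to what the network delivers
END_MARKER = b""     # hypothetical stand-in for the punctuation: an empty datagram


def split_buffer(buffer: bytes, chunk_bytes: int = CHUNK_BYTES) -> Iterator[bytes]:
    """Sender side: yield the buffer as datagram-sized chunks, then the end marker."""
    for start in range(0, len(buffer), chunk_bytes):
        yield buffer[start:start + chunk_bytes]
    yield END_MARKER


def reassemble(datagrams: Iterable[bytes],
               expected_bytes: int) -> Iterator[Optional[bytes]]:
    """Receiver side: collect chunks until the end marker arrives.

    Yields the full buffer, or None when the byte count is wrong
    (for example because a datagram was lost), so the caller can drop it.
    """
    parts = []
    for datagram in datagrams:
        if datagram != END_MARKER:
            parts.append(datagram)
            continue
        data = b"".join(parts)
        parts = []
        yield data if len(data) == expected_bytes else None


payload = bytes(range(256)) * 16             # 4096-byte sample buffer
sent = list(split_buffer(payload))
complete = list(reassemble(sent, len(payload)))
lossy = list(reassemble(sent[:1] + sent[2:], len(payload)))  # one datagram lost
```

The length check only detects loss. UDP can also reorder or duplicate datagrams, which this sketch does not handle; a sequence number in each chunk would be needed for that. If the end marker itself is lost, the next buffer is merged into the current one and both are dropped by the length check.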
      • MrJose

        Re: UDPSource

        ‏2013-03-25T16:49:24Z  in response to hnasgaard
        hnasgaard,

        By punctuation from an external source, did you mean any kind of indicator?
        I don't know how to generate a Streams punctuation from an outside system. Sorry for my ignorance.

        Regards
        Jose
        • hnasgaard

          Re: UDPSource

          ‏2013-03-25T17:31:13Z  in response to MrJose
          I was assuming there was a Streams program on the sending side that could break up the 4 MB buffer into bite-sized chunks and was sending them with a UDPSink, and so could send a punctuation after the final chunk.
  • MrJose

    Re: UDPSource

    ‏2013-04-01T15:56:29Z  in response to MrJose
    I'm sending data from one little-endian machine to another via a program and reading the data with a UDPSource.
    For some reason, the byte order is different when I read via the UDPSource (I have not tried any other method).

    Is there any setting to control this, or is Streams reading the data wrongly?
    Jose
  • MrJose

    Re: UDPSource

    ‏2013-04-02T21:04:43Z  in response to MrJose
    Hi

    I have two VMs running on server machines. One machine is sending data (uint8, float32, float32, int32).
    Both machines seem to be little endian. VM1 sends the data over a UDP connection (external application),
    so I don't see any architectural issue here (little vs. big endian). Yet the data read by the UDPSource seems to come in big-endian format.
    If I swap the bytes I get the correct values (except that the floats have some approximation issues). The client doesn't like byte swapping.

    If Streams reads the data as it arrives, like any normal C++ program, then that's fine.
    They suspect Streams is reading in a different byte order. I have to verify their claim by writing a sample program.

    I posted to confirm that Streams doesn't convert the byte order.
    I use the binary format to read the values.
    The term Network Byte Format seems to be associated with big endian. Does Streams always read in big-endian format?
    I will keep you posted. Thanks for the help.
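
A small sample program along these lines can reproduce the symptom without Streams. The sketch below uses Python's `struct` module; the record values are made up, and it assumes the four fields are packed back to back with no padding. It packs a (uint8, float32, float32, int32) record in little-endian order and then decodes the same bytes as big endian, the way a reader expecting network byte order would.

```python
import struct

record = (7, 1.5, 0.1, 123456)   # sample (uint8, float32, float32, int32) values

little = struct.pack("<Bffi", *record)   # what a little-endian sender emits as-is
big = struct.pack(">Bffi", *record)      # the same record in big-endian order

misread = struct.unpack(">Bffi", little)  # big-endian reader, little-endian bytes
correct = struct.unpack("<Bffi", little)  # reader that matches the sender

print(misread)
print(correct)
```

The single-byte uint8 field survives either way, while the multi-byte fields come out wrong when the byte orders differ. Even in the correct decoding, 0.1 does not come back exactly, because float32 cannot represent it exactly; that kind of approximation comes from the 32-bit float format rather than from byte swapping.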
  • MrJose

    Re: UDPSource

    ‏2013-04-04T19:59:26Z  in response to MrJose
    Streams always stores data, except for decimals, in NBF (big endian). So when an external system sends data in little-endian format, Streams always reads it wrongly, regardless of the endianness of the Streams machine.

    Can you please confirm whether my understanding is correct? Is this done to avoid issues when clustering Streams nodes across machines of different endianness?

    Thanks
    Jose
    • hnasgaard

      Re: UDPSource

      ‏2013-04-04T20:42:52Z  in response to MrJose
      Your understanding is correct. There should be no issue between Streams programs on machines of different endianness, but if you are reading with a TCP/UDP source in binary format from some non-Streams program, that program would have to format the data as Streams expects it.
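
On the sending side, formatting the data this way could look roughly like the sketch below, written in Python for brevity. It assumes the expected layout is simply the four fields (uint8, float32, float32, int32) in network byte order with no padding, which should be checked against the Streams documentation for the binary format. The function names are hypothetical.

```python
import socket
import struct

RECORD_FORMAT = "!Bffi"   # "!" = network (big-endian) order, no padding bytes


def encode_record(flag: int, x: float, y: float, count: int) -> bytes:
    """Pack one (uint8, float32, float32, int32) record in network byte order."""
    return struct.pack(RECORD_FORMAT, flag, x, y, count)


def send_record(sock: socket.socket, address: tuple, record: tuple) -> int:
    """Send one encoded record as a single UDP datagram; returns bytes sent."""
    return sock.sendto(encode_record(*record), address)
```

A sender would create a socket with `socket.socket(socket.AF_INET, socket.SOCK_DGRAM)` and call `send_record` with the host and port the UDPSource listens on. An equivalent C or C++ sender would apply `htonl` to each 32-bit field before writing it into the datagram.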
  • MrJose

    Re: UDPSource

    ‏2013-04-04T23:01:53Z  in response to MrJose
    Thank you hnasgaard !