Topic
  • 14 replies
  • Latest Post - ‏2013-04-04T23:01:53Z by MrJose
MrJose
56 Posts

Pinned topic UDPSource

‏2013-03-20T15:18:12Z |
Hello,

I need to receive data from a UDP source where each packet can be as large as 4 MB, with one packet arriving every 20 milliseconds.
Is it possible to achieve that with the UDPSource operator? Does UDPSource support IPv6, and what packet sizes can it handle?
Each packet can also contain many records.

Regards
Paul
  • wbratton
    76 Posts

    Re: UDPSource

    ‏2013-03-20T16:41:10Z  
    I will look into this and get back to you.

    Thanks
  • wbratton
    76 Posts

    Re: UDPSource

    ‏2013-03-20T17:12:17Z  
    Looking at the "IBM InfoSphere Streams V3.0 Addressing volume, velocity, and variety" RedBook, no packet size or time limitation is specified for the UDPSource operator. You can download the RedBook from the following URL:

    http://www.redbooks.ibm.com/abstracts/sg248108.html

    The RedBook does provide the following information for a UDPSource operator:

    • Each tuple must fit into a single UDP packet and a single UDP packet must contain only a single tuple or punctuation.

    • The UDPSource operator does not have any input ports and is configurable with only a single output port.

    • The UDPSource operator does not accept any window configurations.

    Let me know if this helps.
  • MrJose
    56 Posts

    Re: UDPSource

    ‏2013-03-20T17:42:39Z  
    Hi

    UDP has a packet size limit: about 64 KB over IPv4.
    I believe the limit is much higher with IPv6. I have to extract multiple tuples from each incoming packet.
    If we have to handle that volume of data, it seems impossible.
    If each packet holds only one record, you get only 50 records/sec, which won't solve any big data problem.

    Jose
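For a sense of scale, the numbers in this post can be worked out as below. This is only back-of-the-envelope arithmetic, assuming "4MB" means 4 MiB, one buffer every 20 ms, and the largest UDP payload that fits in an IPv4 packet without options (65,507 bytes); real networks usually need much smaller datagrams to avoid IP fragmentation.

```python
import math

# Largest UDP payload over IPv4: 16-bit total length minus IPv4 and UDP headers.
IPV4_MAX_UDP_PAYLOAD = 65535 - 20 - 8   # 65507 bytes
BUFFER_BYTES = 4 * 1024 * 1024          # assumption: "4MB" means 4 MiB
INTERVAL_SECONDS = 0.020                # one buffer every 20 ms

buffers_per_second = round(1 / INTERVAL_SECONDS)
datagrams_per_buffer = math.ceil(BUFFER_BYTES / IPV4_MAX_UDP_PAYLOAD)
datagrams_per_second = buffers_per_second * datagrams_per_buffer
bytes_per_second = BUFFER_BYTES * buffers_per_second

print(buffers_per_second)     # 50
print(datagrams_per_buffer)   # 65
print(datagrams_per_second)   # 3250
print(bytes_per_second)       # 209715200, roughly 1.68 Gbit/s
```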
  • hnasgaard
    200 Posts

    Re: UDPSource

    ‏2013-03-21T17:05:52Z  
    Unfortunately, Streams does not support IPv6.
  • MrJose
    56 Posts

    Re: UDPSource

    ‏2013-03-21T18:33:20Z  
    Hi

    Is there any workaround?

    1) Many UDP connections

    2) Receiving packets at the IPv4 maximum size (64 KB) and splitting them into records downstream.
    Can we overcome the limitation that each packet must be a single tuple?

    Please advise.

    Regards
    Jose
  • hnasgaard
    200 Posts

    Re: UDPSource

    ‏2013-03-21T19:09:48Z  
    I have not tried what I am about to suggest but I think it should work:
    • I think you will have to break up your 4 MB buffer into individual packets and send them sequentially, followed by a punctuation.
    • On the receiving end, the UDPSource should use bin mode with a single blob attribute.
    • Follow the UDPSource with a Custom that aggregates the received blobs until it receives a punctuation. When it gets the punctuation, you will have to convert the buffer into a tuple. If your binary data is really CSV internally, you could take the aggregated data from the Custom and send it on to a Parse operator, which would parse it into a tuple.
    The caveats here are:
    • Since it is UDP, a packet could get lost and you would not have sufficient data to make up a tuple. You could probably check for that in the Custom and drop the buffer if it is incomplete; otherwise the Parse will fail.
    • You would have to figure out the largest UDP packet size that would work.
    • I'm not sure what the real throughput would be on this.
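The split-and-reassemble idea above can be sketched outside of Streams as follows. This is a minimal illustration, not Streams code and not the Streams wire format: the 8-byte header (chunk index, total chunk count) and the names `split_buffer` and `reassemble` are hypothetical, and the chunk count stands in for the punctuation as the end-of-buffer signal. In the actual design, the reassembly would live in the Custom operator.

```python
import struct
from typing import List, Optional

# Hypothetical header: chunk index and total chunk count, both big-endian uint32.
HEADER = struct.Struct("!II")

def split_buffer(buffer: bytes, payload_size: int) -> List[bytes]:
    """Split a buffer into datagram payloads, each prefixed with (index, total)."""
    total = max(1, -(-len(buffer) // payload_size))  # ceiling division
    return [
        HEADER.pack(i, total) + buffer[i * payload_size:(i + 1) * payload_size]
        for i in range(total)
    ]

def reassemble(datagrams: List[bytes]) -> Optional[bytes]:
    """Rebuild the buffer, or return None if any chunk is missing."""
    chunks = {}
    total = None
    for datagram in datagrams:
        index, total = HEADER.unpack_from(datagram)
        chunks[index] = datagram[HEADER.size:]
    if total is None or len(chunks) != total:
        return None  # at least one datagram was lost: drop the whole buffer
    return b"".join(chunks[i] for i in range(total))

data = bytes(range(256)) * 1000           # 256,000 bytes of sample data
packets = split_buffer(data, 60000)       # 5 chunks
print(reassemble(packets) == data)        # True
print(reassemble(packets[:-1]) is None)   # True: a lost chunk drops the buffer
```

Carrying the index in each chunk also lets the receiver tolerate reordering, which UDP does not rule out.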
  • MrJose
    56 Posts

    Re: UDPSource

    ‏2013-03-25T16:49:24Z  
    hnasgaard,

    By punctuation from an external source, do you mean any end-of-data indicator?
    I don't know how to generate a Streams punctuation from an outside system. Sorry for my ignorance.

    Regards
    Jose
  • hnasgaard
    200 Posts

    Re: UDPSource

    ‏2013-03-25T17:31:13Z  
    I was assuming there was a Streams program on the sending side that could break up the 4 MB buffer into bite-sized chunks and was sending them with a UDPSink, and so could send a punctuation after the final chunk.
  • MrJose
    56 Posts

    Re: UDPSource

    ‏2013-04-01T15:56:29Z  
    I'm sending data from one little-endian machine to another via a program and reading the data via a UDPSource.
    For some reason, the byte order is different when I read via the UDPSource (I have not tried any other method).

    Is there any setting to control this, or is Streams reading the data wrongly?
    Jose
  • hnasgaard
    200 Posts

    Re: UDPSource

    ‏2013-04-02T11:45:27Z  
    Streams uses the same encoding "on the wire" regardless of the endian-ness of the systems involved. There are no settings to modify the ordering it uses. The format is described here:
    http://pic.dhe.ibm.com/infocenter/streams/v3r0/index.jsp?topic=%2Fcom.ibm.swg.im.infosphere.streams.spl-compiler-usage-reference.doc%2Fdoc%2FSPLbinaryandcharacterencoding.html
    Can you tell me which data type is arriving in the wrong byte order?
  • MrJose
    56 Posts

    Re: UDPSource

    ‏2013-04-02T21:04:43Z  
    Hi

    I have two VMs running on server machines. One machine sends data (uint8, float32, float32, int32).
    Both machines seem to be little-endian. VM1 sends the data over a UDP connection from an external application.
    So I don't see any architectural mismatch here (little vs. big endian), yet the data read by the UDPSource seems to come in big-endian format.
    If I swap the bytes I get the correct values (except that the floats show some approximation issues). The client doesn't like byte swapping.

    If Streams reads the bytes as they arrive, like any normal C++ program, then that is fine.
    They suspect Streams is reading in a different byte order. I have to verify their claim by writing a sample program.

    I posted to confirm that Streams doesn't convert the byte order.
    I use the binary format to read the values.
    The term Network Byte Format seems to be associated with big endian. Does Streams always read in big-endian format?
    I will keep you posted. Thanks for the help.
  • MrJose
    56 Posts

    Re: UDPSource

    ‏2013-04-04T19:59:26Z  
    Streams always stores data, except for decimal, in NBF (big endian). So when an external system sends data in little-endian format, Streams always reads it wrongly, regardless of the endianness of the machine Streams runs on.

    Can you please confirm whether my understanding is correct? Is this done to avoid issues when clustering Streams nodes across machines of different endianness?

    Thanks
    Jose
  • hnasgaard
    200 Posts

    Re: UDPSource

    ‏2013-04-04T20:42:52Z  
    Your understanding is correct. There should be no issue between Streams programs on different-endian machines, but if you are reading with a TCP/UDP source in binary format from some non-Streams program, that program has to format the data as Streams expects it.
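On the sending side, formatting the data "as Streams expects it" means at least writing the numeric fields in network byte order. A minimal sketch, assuming the record layout from this thread (uint8, float32, float32, int32) and assuming each field is sent as a plain fixed-size big-endian value; the exact encoding of every SPL type should be checked against the encoding reference linked earlier in the thread:

```python
import struct

# Record layout from the thread: uint8, float32, float32, int32.
NETWORK_ORDER = struct.Struct("!Bffi")  # "!" = network byte order (big-endian), no padding
LITTLE_ENDIAN = struct.Struct("<Bffi")  # what a little-endian sender emits from raw memory

record = (7, 1.5, -2.25, 1000)          # sample values, exactly representable as float32

big = NETWORK_ORDER.pack(*record)
little = LITTLE_ENDIAN.pack(*record)

print(big.hex())     # 073fc00000c0100000000003e8
print(little.hex())  # 070000c03f000010c0e8030000

# A big-endian reader recovers the record only from the big-endian bytes.
print(NETWORK_ORDER.unpack(big) == record)     # True
print(NETWORK_ORDER.unpack(little) == record)  # False
```

In a C or C++ sender, the equivalent step is converting each multi-byte field with `htonl` (or a byte swap of the raw float bits) before writing it to the socket.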
  • MrJose
    56 Posts

    Re: UDPSource

    ‏2013-04-04T23:01:53Z  
    Thank you hnasgaard !