3 replies Latest Post - 2013-02-21T23:39:56Z by SystemAdmin
SystemAdmin
1245 Posts

Streams 3, XML, and !CDATA ..

2013-02-13T18:12:12Z
Hello Experts,
Re: Streams 3, XML, and CDATA
Let me apologize in advance: I consider myself a pragmatist regarding
XML, so I might not use all of the proper terms below.
I have an XML document and an associated XSD. The document is well formed,
but it has something I am not used to working with: inside the document is
a CDATA section whose text is itself formatted as XML.

o Given what CDATA is, can I extract just that part of my XML document?

How do I do that? Do I reference whichever element contains (encapsulates)
the CDATA section?

o I've used XML in other environments, and I have seen the examples in the
Streams online information center.

Question: how am I supposed to feed a Streams XMLParse operator?

Should I read a file line by line, then just send each line downstream?
What if the input XML file contains no line feeds? Do I need to
worry about sizes here? What is the best practice?
Thanks in advance!
  • SystemAdmin
    1245 Posts

    Re: Streams 3, XML, and !CDATA ..

    2013-02-14T13:39:32Z  in response to SystemAdmin
    Hi Daniel,

    I'll look into this one for you and get the appropriate resource, if needed.

    Regards,
    John
  • hnasgaard
    200 Posts

    Re: Streams 3, XML, and !CDATA ..

    2013-02-14T18:34:14Z  in response to SystemAdmin
    Hi Daniel,

    I'll assume you have some XML that looks like the following:
    
    <root> <a><![CDATA[<message>Message text</message>]]></a> </root>
    


    XMLParse can be used to parse out the CDATA content. With the XMLParse operator you have some choices as to how to read the data, but the easiest is probably to read it a line at a time. It doesn't matter whether the XML has line feeds: the XMLParse operator can aggregate the incoming data and then parse it.

    The following code uses XMLParse twice: first to parse out the CDATA section, and second to parse the CDATA content, since you said it is also XML. I'm not sure whether you wanted that, but you can stop after the first parse if you wish. Here's the code:
    
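    use spl.XML::*;

    composite Main() {
      graph
        // Read in.xml one line at a time.
        stream<rstring s> Input = FileSource() {
          param
            file : "in.xml";
            format : line;
        }
        // First parse: emit the text of /root/a, i.e. the CDATA content.
        stream<rstring s> P1 = XMLParse(Input) {
          param
            trigger : "/root/a";
            parsing : permissive;
          output
            P1 : s = XPath("text()");
        }
        // Second parse: treat the CDATA content as XML and emit the text of /message.
        stream<rstring s> P2 = XMLParse(P1) {
          param
            trigger : "/message";
            parsing : permissive;
          output
            P2 : s = XPath("text()");
        }
        // Print each extracted message text.
        () as O = Custom(P2) {
          logic
            onTuple P2 : println(s);
        }
    }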
    


    If you compile and run this with the given data, the rstring in stream P1 will hold the XML content of the CDATA section (<message>Message text</message>), and the text inside that (Message text) will be printed by the Custom operator at the bottom.
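
    For anyone who wants to check the idea outside Streams, here is a minimal Python sketch. It is only an analogy, not how XMLParse is implemented. It assumes the sample document above, feeds it to the standard library's incremental parser in arbitrary chunks to show that line boundaries do not matter, and then parses the CDATA text a second time. The names chunks, inner_xml, and message_text are made up for illustration.

```python
import xml.etree.ElementTree as ET

# The sample document from this post, split at arbitrary points
# (deliberately not at line boundaries) to mimic chunked input.
chunks = [
    "<root> <a><![CDATA[<mess",
    "age>Message text</message>]]",
    "></a> </root>",
]

# First parse: the incremental parser aggregates the chunks itself.
parser = ET.XMLPullParser(events=("end",))
for chunk in chunks:
    parser.feed(chunk)
parser.close()

inner_xml = None
for _event, elem in parser.read_events():
    if elem.tag == "a":
        # CDATA is plain character data, so it shows up as the element's text.
        inner_xml = elem.text

# Second parse: the extracted text is itself XML, so parse it again.
message_text = ET.fromstring(inner_xml).text

print(inner_xml)
print(message_text)
```

    The point of the sketch is that a CDATA section is just the text of its enclosing element, so you address the element (here a) and then parse the resulting string again if you need what is inside it.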

    I hope this helps.
    • SystemAdmin
      1245 Posts

      Re: Streams 3, XML, and !CDATA ..

      2013-02-21T23:39:56Z  in response to hnasgaard
      Thanks, Howard!

      You rock.

      Attached are the items you helped me develop.
