Topic
  • 11 replies
  • Latest Post - ‏2012-12-06T05:52:13Z by MFXX_Srinivasa_Reddy
MFXX_Srinivasa_Reddy
34 Posts

Pinned topic XML Parsing failure due to International characters

‏2012-12-02T10:13:57Z |
Hi All,
I get an XML parsing failure when the input contains international characters, for example "<NickName>Säll Karlsson, Jenny</NickName>".
The XML declaration specifies UTF-8 encoding.

I followed the link below and made the required changes:
http://pic.dhe.ibm.com/infocenter/wsdatap/v5r0m0/index.jsp?topic=%2Fcom.ibm.dp.doc%2Fproblemdetermination137.htm

I am still facing the parsing issue.

Please help me sort out this issue.

Thanks in advance.
Srini
Updated on 2012-12-06T05:52:13Z at 2012-12-06T05:52:13Z by MFXX_Srinivasa_Reddy
  • HermannSW
    HermannSW
    4657 Posts

    Re: XML Parsing failure due to International characters

    ‏2012-12-02T19:27:30Z  
    Hi Srini,

    it is important that your document encoding matches what is stated in the xml declaration.
    Please attach a sample document (do not copy into posting, that will change things) here.
    Then we can see what is wrong with that document.
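
A quick local check for this kind of mismatch can be sketched in Python. This is not something used in this thread; the helper names `declared_encoding` and `decodes_as_declared` are hypothetical, and the sketch assumes an ASCII-compatible encoding so that the declaration can be read directly from the raw bytes. It only catches byte sequences that are invalid in the declared encoding; single-byte encodings such as ISO-8859-1 accept any byte, so a pass there proves little.

```python
import re

# Matches the encoding pseudo-attribute of an XML declaration at the start of the data.
# Assumption: the document is in an ASCII-compatible encoding, so this is readable as bytes.
_DECL = re.compile(rb'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']')


def declared_encoding(data: bytes) -> str:
    """Return the encoding named in the XML declaration, or UTF-8 if none is given."""
    match = _DECL.match(data)
    return match.group(1).decode("ascii") if match else "UTF-8"


def decodes_as_declared(data: bytes) -> bool:
    """True if the bytes can be decoded using the encoding the declaration claims."""
    try:
        data.decode(declared_encoding(data))
    except (UnicodeDecodeError, LookupError):
        return False
    return True


if __name__ == "__main__":
    text = '<?xml version="1.0" encoding="UTF-8"?><NickName>S\u00e4ll</NickName>'
    print(decodes_as_declared(text.encode("utf-8")))    # bytes match the declaration
    print(decodes_as_declared(text.encode("latin-1")))  # declared UTF-8, bytes are Latin-1
```

Run it against the raw bytes of the file (opened in binary mode), not against text pasted from an editor, since pasting may change the encoding.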

     
    Hermann<myXsltBlog/> <myXsltTweets/>
  • MFXX_Srinivasa_Reddy
    34 Posts

    Re: XML Parsing failure due to International characters

    ‏2012-12-03T05:07:42Z  
    Hi Hermann,

    Please find attached the XML file that is sent to DataPower and fails to parse.

    Thanks,
    Srini
  • MFXX_Srinivasa_Reddy
    34 Posts

    Re: XML Parsing failure due to International characters

    ‏2012-12-03T07:18:05Z  
    Hi Hermann,
    To narrow down the issue, I created a simple XML loopback firewall, set the input type to XML (the XML contains Swedish characters), and sent a request.
    I get the parse exception.
    If I set the input type to binary data, it works fine.

    I used the same input XML that is attached to my previous post.

    I used only the default XML manager.

    Firmware version: 3.8.1

    Please help me sort out this issue.

    Thanks in advance,
    Srini
  • HermannSW
    HermannSW
    4657 Posts

    Re: XML Parsing failure due to International characters

    ‏2012-12-03T14:33:10Z  
    Hi Srini,

    I do not have any problems processing your attached XML input file on a 3.8.1.20 box ...
    $ coproc2 identity.xsl IC0517Ilearrequest010.xml http://dp5-l3:2223 -s > out.xml
    $
    $ xpath++ "count(//*)" IC0517Ilearrequest010.xml 
    10028
    $ xpath++ "count(//*)" out.xml 
    10028
    $ 
    $ ssh dp5-l3
    (unknown)
    Unauthorized access prohibited.
    login: admin
    Password: ******
     
    Welcome to DataPower XI50 console configuration. 
    Copyright IBM Corporation 1999-2012 
     
    Version: XI50.3.8.1.20 build 211237 on 2012/03/24 14:32:23
    Serial number: 68A1781
     
    xi50# exit
    Goodbye.
    Connection to dp5-l3 closed.
    $
    


     
    Hermann <myXsltBlog/> <myXsltTweets/>
    Updated on 2014-03-25T02:45:07Z at 2014-03-25T02:45:07Z by iron-man
  • MFXX_Srinivasa_Reddy
    34 Posts

    Re: XML Parsing failure due to International characters

    ‏2012-12-03T15:48:59Z  
    Hi Hermann,

    Thanks for your reply.

    If I change the encoding in the request XML to UTF-16 or ISO-8859-1, it works fine.

    I'm trying to understand what the problem could be and where it is going wrong.

    Please see the log below and help me understand this.

    06:31:37 multistep debug 88321552 error 0x80c0004e mpgw (IC4102_MPG): Stylesheet URL to compile is 'local:///IC4102/IC4102_ErrorHandler.xsl'
    06:31:37 multistep warn 88321552 error 0x00340027 mpgw (IC4102_MPG): Multistep Probe enabled
    06:31:37 mpgw info 88321552 error 0x80e000b7 mpgw (IC4102_MPG): rule (IC4102_Policy_rule_3): selected via match 'MatchAllURLs' from processing policy 'IC4102_Policy' for code '0x00030001'
    06:31:37 mpgw debug 88321552 0x81000171 Matching (MatchAllURLs): Match: Received URL dpmq://IC0430_MQMgr/IC4102_FSH?RequestQueue=DATAPOWER.IC0517.REQUEST&ReplyQueue=DATAPOWER.IC0517.RESPONSE matches rule '*'
    06:31:37 mpgw error 88321552 error 0x00030001 mpgw (IC4102_MPG): Parse error
    06:31:37 multistep error 88321552 request 0x80c00008 mpgw (IC4102_MPG): rule (IC4102_Policy_rule_0): implied action Parse input as XML failed: illegal character 'l' at offset 122100 of dpmq://IC0430_MQMgr/IC4102_FSH?RequestQueue=DATAPOWER.IC0517.REQUEST&ReplyQueue=DATAPOWER.IC0517.RESPONSE
    06:31:37 xmlparse error 88321552 request 0x80e003aa mpgw (IC4102_MPG): illegal character 'l' at offset 122100 of dpmq://IC0430_MQMgr/IC4102_FSH?RequestQueue=DATAPOWER.IC0517.REQUEST&ReplyQueue=DATAPOWER.IC0517.RESPONSE
    06:31:37 xmlparse debug 88321552 request 0x80e003a6 mpgw (IC4102_MPG): Parsing document: 'dpmq://IC0430_MQMgr/IC4102_FSH?RequestQueue=DATAPOWER.IC0517.REQUEST&ReplyQueue=DATAPOWER.IC0517.RESPONSE'
    06:31:37 multistep debug 88321552 request 0x80c00004 mpgw (IC4102_MPG): Protocol layer did not supply content-type
    06:31:37 multistep warn 88321552 request 0x00340027 mpgw (IC4102_MPG): Multistep Probe enabled
    06:31:37 mpgw info 88321552 request 0x80e000b4 stylepolicy (IC4102_Policy): rule (IC4102_Policy_rule_0): selected via match 'Test_match_all' from processing policy 'IC4102_Policy'
    06:31:37 mpgw debug 88321552 0x81000171 Matching (Test_match_all): Match: Received URL dpmq://IC0430_MQMgr/IC4102_FSH?RequestQueue=DATAPOWER.IC0517.REQUEST&ReplyQueue=DATAPOWER.IC0517.RESPONSE matches rule '*'
    06:31:37 mq info 88321552 0x80e0032a source-mq (IC4102_FSH): Front side protocol handler received request of 464402 bytes
    06:31:37 mq debug 88321552 0x80e00554 source-mq (IC4102_FSH): Disable the message property parsing.

    Thanks in advance,

    Regards,
    Srini
  • MFXX_Srinivasa_Reddy
    34 Posts

    Re: XML Parsing failure due to International characters

    ‏2012-12-03T16:12:50Z  
    Hi Hermann,

    Sorry for the inconvenience: UTF-16 also gives the same parsing exception.

    Thanks,
    Srini
  • HermannSW
    HermannSW
    4657 Posts

    Re: XML Parsing failure due to International characters

    ‏2012-12-03T18:11:13Z  
    The file you attached has UTF-8 encoding.
    Where does UTF-16 come from?
    We need the file attached which is read from
    dpmq://IC0430_MQMgr/IC4102_FSH?RequestQueue=DATAPOWER.IC0517.REQUEST&ReplyQueue=DATAPOWER.IC0517.RESPONSE

    As the logged error states, that file has an invalid character at offset 122100 ...
    implied action Parse input as XML failed: illegal character 'l' at offset 122100 of
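
To look at what sits at a reported offset, a small Python sketch like the following can help. The helper names `first_invalid_utf8_offset` and `hex_context` are hypothetical, and the sample bytes are made up from the element in the first post, not taken from the real message. One plausible reading of the log, not confirmed in this thread, is that a Latin-1 encoded "ä" (byte 0xE4) followed by "l" was parsed as UTF-8: 0xE4 starts a three-byte UTF-8 sequence, so the following "l" is not a valid continuation byte.

```python
def first_invalid_utf8_offset(data: bytes):
    """Return the byte offset where UTF-8 decoding first fails, or None if it succeeds."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        return err.start
    return None


def hex_context(data: bytes, offset: int, width: int = 8) -> str:
    """Return the bytes from offset - width up to (not including) offset + width as hex."""
    start = max(0, offset - width)
    return " ".join(f"{b:02x}" for b in data[start:offset + width])


if __name__ == "__main__":
    # Made-up sample: the NickName element encoded as Latin-1 instead of UTF-8.
    sample = "<NickName>S\u00e4ll Karlsson, Jenny</NickName>".encode("latin-1")
    offset = first_invalid_utf8_offset(sample)
    print(offset)                          # 11, the position of byte 0xe4
    print(hex_context(sample, offset, 2))  # 3e 53 e4 6c
```

Python reports the start of the bad sequence, while a parser may report the byte after it, so the two offsets can differ by one. The offset in the log presumably counts bytes of the data the parser received, so the check should be run on those exact bytes.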

     
    Hermann<myXsltBlog/> <myXsltTweets/>
  • MFXX_Srinivasa_Reddy
    34 Posts

    Re: XML Parsing failure due to International characters

    ‏2012-12-04T05:14:11Z  
    Hi Hermann,
    Instead of UTF-8, we edited the encoding to UTF-16 and sent the message via MQ.
    I'm attaching the file that was read from DATAPOWER.IC0517.REQUEST.

    > that file has an invalid character at offset 122100 ...
    > implied action Parse input as XML failed: illegal character 'l' at offset 122100 of

    How can I check which position is causing the error?

    Thanks for your help.

    Regards,
    Srini
  • HermannSW
    HermannSW
    4657 Posts

    Re: XML Parsing failure due to International characters

    ‏2012-12-04T09:46:14Z  
    Hi Srini,

    the file you attached is a binary (MQ) file.
    It contains a binary prefix, then <mcd>, <jms>, <usr> and <SyncPersonnel> XML parts, separated by binary data.
    <SyncPersonnel> does have UTF-8 encoding in its XML declaration.

    > Instead of UTF-8 we have edited to UTF-16 and sent via MQ.
    >
    You should never ever "edit" a file's encoding.
    If you change the declared encoding of a UTF-8 encoded file to UTF-16, the declaration no longer matches the actual character encoding of the bytes.
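
The difference between relabeling and re-encoding can be sketched in Python with the standard library parser. This is an illustration only, not DataPower behavior: the function names are hypothetical, the sample document is built from the element in the first post, and the parser here is Python's `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET

# Illustrative document; the element comes from the first post in this thread.
UTF8_DOC = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    "<NickName>S\u00e4ll Karlsson, Jenny</NickName>"
).encode("utf-8")


def relabel_only(data: bytes) -> bytes:
    """Change the declaration but leave the bytes as UTF-8 (the mistake)."""
    return data.replace(b'encoding="UTF-8"', b'encoding="UTF-16"', 1)


def reencode(data: bytes) -> bytes:
    """Decode as UTF-8, update the declaration, then encode the text as UTF-16."""
    text = data.decode("utf-8")
    text = text.replace('encoding="UTF-8"', 'encoding="UTF-16"', 1)
    return text.encode("utf-16")  # Python's "utf-16" codec prepends a byte order mark


def parses(data: bytes) -> bool:
    """True if the standard library XML parser accepts the bytes."""
    try:
        ET.fromstring(data)
    except ET.ParseError:
        return False
    return True


if __name__ == "__main__":
    print(parses(UTF8_DOC))                # original document
    print(parses(relabel_only(UTF8_DOC)))  # declaration says UTF-16, bytes are UTF-8
    print(parses(reencode(UTF8_DOC)))      # declaration and bytes agree
```

Only the re-encoded variant changes the bytes themselves, which is why the declaration and the content stay consistent there.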

    If you need a UTF-16 encoded file for testing, then please generate it by a stylesheet like this:
    
    $ cat to-utf-16.xsl
    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:output encoding="UTF-16"/>

      <xsl:template match="/">
        <xsl:copy-of select="."/>
      </xsl:template>
    </xsl:stylesheet>
    $
    


    Here you can see that every character is affected by the UTF-8 to UTF-16 conversion:
    
    $ echo "<x>1</x>" | coproc2 to-utf-16.xsl - http://dp5-l3:2223 ; echo
    <?xml version="1.0" encoding="UTF-16"?>
    <x>1</x>
    $
    $ echo "<x>1</x>" | coproc2 to-utf-16.xsl - http://dp5-l3:2223 -s | od -tcx1
    0000000  \0   <  \0   ?  \0   x  \0   m  \0   l  \0      \0   v  \0   e
             00  3c  00  3f  00  78  00  6d  00  6c  00  20  00  76  00  65
    0000020  \0   r  \0   s  \0   i  \0   o  \0   n  \0   =  \0   "  \0   1
             00  72  00  73  00  69  00  6f  00  6e  00  3d  00  22  00  31
    0000040  \0   .  \0   0  \0   "  \0      \0   e  \0   n  \0   c  \0   o
             00  2e  00  30  00  22  00  20  00  65  00  6e  00  63  00  6f
    0000060  \0   d  \0   i  \0   n  \0   g  \0   =  \0   "  \0   U  \0   T
             00  64  00  69  00  6e  00  67  00  3d  00  22  00  55  00  54
    0000100  \0   F  \0   -  \0   1  \0   6  \0   "  \0   ?  \0   >  \0  \n
             00  46  00  2d  00  31  00  36  00  22  00  3f  00  3e  00  0a
    0000120  \0   <  \0   x  \0   >  \0   1  \0   <  \0   /  \0   x  \0   >
             00  3c  00  78  00  3e  00  31  00  3c  00  2f  00  78  00  3e
    0000140
    $
    $ echo "<x>1</x>" | od -tcx1
    0000000   <   x   >   1   <   /   x   >  \n
             3c  78  3e  31  3c  2f  78  3e  0a
    0000011
    $
    


    Hermann<myXsltBlog/> <myXsltTweets/>
  • MFXX_Srinivasa_Reddy
    34 Posts

    Re: XML Parsing failure due to International characters

    ‏2012-12-06T05:52:09Z  
    Hi Hermann,

    Thanks for the clarification. With this information I was able to fix the issue.

    Thanks,
    Srini
  • MFXX_Srinivasa_Reddy
    34 Posts

    Re: XML Parsing failure due to International characters

    ‏2012-12-06T05:52:13Z  
    Hi Hermann,

    Thanks for the clarification. With this information I was able to fix the issue.

    Thanks,
    Srini