Topic
  • 11 replies
  • Latest Post - ‏2014-12-31T01:15:34Z by HermannSW
Jaango
267 Posts

Pinned topic convert utf8 to ebcdic using xslt

‏2012-08-07T13:22:30Z |
Can we convert a normal UTF-8 string, say "abc", to ebcdic-de using only XSLT?

Maybe we have to set the encoding attribute on xsl:output, like below?
<xsl:output omit-xml-declaration="yes" encoding="ebcdic-de"/>
Updated on 2014-03-25T02:46:49Z at 2014-03-25T02:46:49Z by iron-man
  • HermannSW
    4877 Posts

    Re: convert utf8 to ebcdic using xslt

    ‏2012-08-07T14:31:20Z  
    Yes!

    
    $ echo "<x>abc</x>" | coproc2 to-ebcdic-de.xsl - http://dp5-l3:2223 -s | od -tcx1
    0000000 201 202 203
             81  82  83
    0000003
    $
    $ cat to-ebcdic-de.xsl
    <xsl:stylesheet version="1.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    >
      <xsl:output omit-xml-declaration="yes" encoding="ebcdic-de"/>

      <xsl:template match="/">
        <xsl:value-of select="."/>
      </xsl:template>
    </xsl:stylesheet>
    $
    


    Btw, I use "ebcdic-de" in my samples because I am German; there are many other EBCDIC code pages (ebcdic-us, ...).
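For readers without access to a DataPower box, the byte values in the od output above can be cross-checked locally. This is only a sketch in Python, not part of the DataPower setup; it assumes that Python's cp273 codec (IBM273, German EBCDIC) corresponds to the code page DataPower names "ebcdic-de".

```python
# Local cross-check of the od output, outside DataPower.
# Assumption: Python's "cp273" codec (IBM273, German EBCDIC) matches
# the code page that DataPower calls "ebcdic-de".
text = "abc"
ebcdic_de = text.encode("cp273")

# Hex view, comparable to the hex row printed by od -tcx1.
print(" ".join(f"{b:02x}" for b in ebcdic_de))

# Octal view, comparable to the escaped-character row printed by od -tcx1.
print(" ".join(f"{b:o}" for b in ebcdic_de))

# Decoding with the same codec gives back the original string.
print(ebcdic_de.decode("cp273"))
```

The hex row should read 81 82 83 and the octal row 201 202 203, the same values as in the od output.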

    You may want to explore under this link:
    http://demo.icu-project.org/icu-bin/convexp
    For conversion back from EBCDIC to something else you may want to see slides 12-14 from this WSTE webcast:
    http://www-01.ibm.com/support/docview.wss?uid=swg27022979

     
    Hermann<myXsltBlog/> <myXsltTweets/>
  • Jaango
    267 Posts

    Re: convert utf8 to ebcdic using xslt

    ‏2012-08-08T12:09:08Z  
    Perfect. Thanks, Hermann.
  • Jaango
    267 Posts

    Re: convert utf8 to ebcdic using xslt

    ‏2012-10-24T11:23:46Z  
    Hermann,
    Could you tell us which data encoding to use for Japanese? We tried IBM930, but the response comes back blank.

    Link: http://en.wikipedia.org/wiki/EBCDIC_930
    <File name="object" syntax="syn">
      <Syntax name="syn" encoding="IBM930"/>
      <Field name='message' type='String'/>
    </File>
    
    Updated on 2014-03-25T02:47:02Z at 2014-03-25T02:47:02Z by iron-man
  • HermannSW
    4877 Posts

    Re: convert utf8 to ebcdic using xslt

    ‏2012-10-24T19:29:46Z  
    Hi,

    as I said earlier this link is really helpful:
    http://demo.icu-project.org/icu-bin/convexp

    Searching for "930" there gives you this link:
    http://demo.icu-project.org/icu-bin/convexp?conv=ibm-930_P120-1999&s=ALL
    I tried this out with a simple stylesheet, and the really strange(?) arrangement of the lower case letters in the codepage
    http://demo.icu-project.org/icu-bin/convexp?conv=ibm-930_P120-1999&s=ALL#layout

    gets correctly replicated in the output:
    
    $ echo "<x>abcdefghijklmnopqrstuvwxyz</x>" | coproc2 to-ebcdic-930.xsl - http://dp9-l3:2223 -s | od -Ax -tx1
    000000 62 63 64 65 66 67 68 69 71 72 73 74 75 76 77 78
    000010 8b 9b ab b3 b4 b5 b6 b7 b8 b9
    00001a
    $
    $ cat to-ebcdic-930.xsl
    <xsl:stylesheet version="1.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    >
      <xsl:output omit-xml-declaration="yes" encoding="930"/>

      <xsl:template match="/">
        <xsl:value-of select="."/>
      </xsl:template>
    </xsl:stylesheet>
    $
    


     
    Hermann<myXsltBlog/> <myXsltTweets/>
  • Jaango
    267 Posts

    Re: convert utf8 to ebcdic using xslt

    ‏2012-10-25T11:23:47Z  
    Hermann, there was a problem with hex 'EC'.
    I converted the hex to binary and read the binary with an XSLT that refers to the 930 EBCDIC FFD. The output is blank. If we remove this byte from the hex stream, it works fine.
  • HermannSW
    4877 Posts

    Re: convert utf8 to ebcdic using xslt

    ‏2012-10-26T22:58:41Z  
    Hi Maneesh,

    the empty string you experience for 0xEC is the result of your input breaking the contract (of being "930" encoded)!
    Please see the 930 codepage:
    http://demo.icu-project.org/icu-bin/convexp?conv=ibm-930_P120-1999&s=ALL#layout

    All empty grey fields do NOT belong to the 930 codepage.

    As you know, you get parse errors if you pass non-XML to a normal stylesheet.
    Here you pass non-930 data into a dp:input-mapping for the 930 codepage.
    That results in an empty string (for good reason).

    I assume that you want to be able to process 930-contract-breaking input data.
    If so, you need to do the same as if you want to process Non-XML data in a normal stylesheet.
    You have to "repair" the input before processing.
    See "conversion-wrapper2.xsl" from slide 7 of this WSTE webcast on how to do that:
    http://www-01.ibm.com/support/docview.wss?uid=swg27019119

    Here is the "proof" that indeed the empty fields result in empty strings.
    (The session below is a bit tricky: "hexedit" allows you to edit binary files, "F2" stores the changes (1b4f51), and CTRL-C (03) leaves "hexedit".)


     
    Hermann<myXsltBlog/> <myXsltTweets/>
  • HermannSW
    4877 Posts

    Re: convert utf8 to ebcdic using xslt

    ‏2012-10-26T23:06:15Z  
    $ cat 930.xsl 
    <xsl:stylesheet version="1.0" 
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform" 
      xmlns:dp="http://www.datapower.com/extensions" 
      extension-element-prefixes="dp" 
    >
      <xsl:output omit-xml-declaration="yes"/>
     
      <dp:input-mapping href="temporary:///930.ffd" type="ffd"/>
     
      <xsl:template match="/">
        <xsl:copy-of select="."/>
      </xsl:template>
    </xsl:stylesheet>
    $ 
    $ cat 930.ffd 
    <?xml version="1.0" encoding="UTF-8" ?>
    <File version="2.1" name="Conversion" syntax="syn1">
        <Syntax name="syn1" encoding="930"/>
        <Field name="string" type="String" />
    </File>
    $
    
    Updated on 2014-03-25T02:46:53Z at 2014-03-25T02:46:53Z by iron-man
  • SunnyG87
    88 Posts

    Re: convert utf8 to ebcdic using xslt

    ‏2014-12-30T14:24:56Z  

    Hi Hermann,

    We also have a use case where we want to convert a UTF-8 encoded XML message to EBCDIC using XSLT. The code page on the mainframe platform is 37.

    My question is: what value should we use in the encoding attribute of the <xsl:output> statement for code page 37? Is there a table that lists the encoding attribute values for the various code pages?

    <xsl:output method="text" omit-xml-declaration="yes" encoding=?/>

    Thanks,

    Sunny

     

  • HermannSW
    4877 Posts

    Re: convert utf8 to ebcdic using xslt

    ‏2014-12-30T18:56:23Z  

    Hi Sunny,

    what you are looking for is this page (DataPower is based on the ICU library):
    http://demo.icu-project.org/icu-bin/convexp

    Search for the "37" row, and then take any of the encoding strings:
    IBM037, ibm-037, ebcdic-cp-us, ...

    You can see the codepage layout by clicking on the link for that row:
    http://demo.icu-project.org/icu-bin/convexp?conv=ibm-37_P100-1995&s=ALL


    Hermann.
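To get a feel for what code page 37 output looks like without a DataPower box, here is a small local sketch in Python. It assumes that Python's cp037 codec behaves like ICU's ibm-37 converter for plain letters; the sample string is only an illustration.

```python
# Local sketch, outside DataPower.
# Assumption: Python's "cp037" codec stands in for ICU's ibm-37 converter.
sample = "abc"  # illustrative input, same as earlier in the thread
cp37_bytes = sample.encode("cp037")

# Hex dump in the style of od -tx1.
print(" ".join(f"{b:02x}" for b in cp37_bytes))

# Round trip back to a Python (Unicode) string.
round_trip = cp37_bytes.decode("cp037")
print(round_trip)
```

For these plain lower case letters the bytes are the same as in the ebcdic-de sample at the top of the thread; code pages 37 and ebcdic-de differ only in some other positions.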

    Updated on 2014-12-30T18:56:50Z at 2014-12-30T18:56:50Z by HermannSW
  • SunnyG87
    88 Posts

    Re: convert utf8 to ebcdic using xslt

    ‏2014-12-30T20:20:46Z  

    Thanks Hermann. This is exactly what I was looking for. By the way, why are there so many aliases for one code page? Is there a specific reason?

    Thanks,

    Sunny

  • HermannSW
    4877 Posts

    Re: convert utf8 to ebcdic using xslt

    ‏2014-12-31T01:15:34Z  

    Hi Sunny,

    > Btw why there are so many Aliases for 1 code page?

    You raise an interesting question. I just accepted the fact that there are many aliases, because I knew where to search (the above link) and the ICU library simply makes all of them work (with DataPower, in <xsl:output>'s @encoding as well as in the FFD Transform Binary action's <Syntax>'s @encoding).

    Based on your question I did search and there is a huge amount of information, some links here:

    http://www-01.ibm.com/software/globalization/g11n-res.html

    http://site.icu-project.org/charts/charset

    http://source.icu-project.org/repos/icu/data/trunk/charset/data/ucm/


    From the last link these are two of your "37" codepages:

    http://source.icu-project.org/repos/icu/data/trunk/charset/data/ucm/ibm-37_P100-1999.ucm

    http://source.icu-project.org/repos/icu/data/trunk/charset/data/ucm/glibc-IBM037-2.1.2.ucm


    It seems that the different aliases arose from different computing environments, different uses, different organizations, ...
    ICU just listed and compared all and determined the aliases.
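The same alias idea shows up in other codec libraries as well. As a sketch only, and assuming Python's codec registry as a stand-in (it is not ICU and its alias list is not identical to ICU's), several names resolve to one and the same code page 37 codec:

```python
import codecs

# Sketch of alias resolution using Python's codec registry (not ICU).
# Assumption: these three spellings are among the aliases Python knows
# for code page 37; ICU accepts a longer list.
aliases = ("IBM037", "ebcdic-cp-us", "cp037")

resolved = {alias: codecs.lookup(alias).name for alias in aliases}
for alias, canonical in resolved.items():
    print(f"{alias} -> {canonical}")

# All aliases end up at a single canonical codec name.
distinct_codecs = set(resolved.values())
print(len(distinct_codecs))
```

Each alias maps to the same canonical codec, so which spelling goes into the encoding attribute is a matter of taste, as long as the library in use knows it.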


    Hermann.