These may be useful resources to you if you use a DataPower appliance or have an interest in XSLT or XML. The xpath++ tool described on this slide and available in this posting may be used even without DataPower.
The service accepts Non-XML traffic, attaches the binary input to a dummy SOAP document and therefore creates the SWA file needed for further processing.
The binary input of Non-XML transformations needs to be "consumed", otherwise the default behavior is that it will be copied to the output. Attaching the input to the dummy SOAP file does not consume the input.
Therefore stylesheet "bin.xsl" needs to invoke a <dp:input-mapping>, whose output is just discarded. In the posting above a separate FFD file is provided and used. Because its output is not used at all, bin.xsl's line
<dp:input-mapping href="bin.ffd" type="ffd"/>
can be simplified to refer to an FFD file that is officially shipped as part of the DataPower product:
Several years ago I had the need to create many little chess problem solution animations. With some shell scripts, netpbm tools and gifsicle I was able to create animated gifs like that on the left (of size 80x80 pixels for fitting on my Siemens S55 cell phone display at that time). The chess problems were so-called shortest construction tasks. Have a look at "Construction task" on this page http://en.wikipedia.org/wiki/Chess_problem and you will find my animation as well as the definition and a reference to Shortest construction tasks map http://stamm-wilbrandt.de/chess/en/sct_map.html (the solution shown in the animation is for problem "Black mates White by changing a pawn into a knight").
As seen above, animated gifs are good for uninterrupted animations. But later I wanted some kind of "solution player" like a CD player.
Animation (play) should be interruptible, and single stepping (back and forth) should be possible. The solution (on the left) was to have an iframe element with the following elements inside:
anim.gif (animated .gif file)
anim.html (referencing anim.gif)
uD.gif (for D=0,...,10, single position .gif file)
anim.D.html (for D=0,...,10, referencing uD.gif)
Each .html file defines an image map with the active areas making the left side controls (the well known symbols for "stop", "back to start", "single step back", ...,"play") active by defining the needed URLs to anim.html, anim.0.html, ..., anim.10.html. You may inspect the details by clicking on link anim.3.html and inspect the HTML page source.
To start generating chess animations with XML, you first have to create a single position picture. This can be done by a stylesheet putting pictures for each field on the chessboard together. Having a single .gif for each chess piece (bishop, king, (k)night, pawn, queen, rook, "empty"), piece color (black/white) and field color (black/white), with naming that allows easy referencing by XSLT, is helpful. Find below the .gifs "b.gif", "bbb.gif", "bbw.gif", "bkb.gif", ..., "wrb.gif", "wrw.gif":
This XML file is an encoding of the mate position of above chess animations:
This is a real customer problem and the first of several postings on CDATA sections.
The input XML document contained CDATA sections with an embedded XML document.
Both XML declarations (see here), the one of the XML file and the one inside the CDATA section, specify ISO-8859-1 encoding.
For ease of demonstration the XML file inside the CDATA section contains one element containing a single text character (German 'ü'):
Now what happens after the parser has parsed the XML file and passed it to the DataPower XSLT processor? The CDATA section is removed and the complete input document is converted to the DataPower internal data format. And this internal data format is UTF-8.
So now the problem is that a dp:parse() on the inner XML document finds a mismatch between the encoding in the XML declaration (ISO-8859-1) and the current (UTF-8) encoding of the document. While this is working as designed, there are five options to get parsing done correctly:
do not use CDATA section
use CDATA section without an XML declaration inside
use CDATA section with an XML declaration inside without encoding
use CDATA section with an XML declaration inside with UTF-8 encoding
skip the XML declaration when parsing the inner XML document
Skipping the XML declaration can be simply done by "substring-after(_, '?>')".
The stylesheet below demonstrates the difference. lenA is 2 because the UTF-8 encoding "c3 bc" for 'ü' will be counted as two (ISO-8859-1) characters. lenB is 1 because by skipping the XML declaration the UTF-8 default encoding is used, which perfectly matches the (internal) encoding.
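The same effect can be reproduced outside DataPower. The following is only a minimal Python sketch, not the DataPower stylesheet: it assumes an inner document whose bytes are already UTF-8 while its declaration still says ISO-8859-1, and it uses `bytes.partition` as a rough counterpart of substring-after(_, '?>'). The names `inner`, `len_a` and `len_b` are made up for this illustration.

```python
import xml.etree.ElementTree as ET

# Inner document as it might look after conversion to UTF-8 (assumption):
# the declaration still claims ISO-8859-1, but u-umlaut is now the bytes c3 bc.
inner = '<?xml version="1.0" encoding="ISO-8859-1"?><a>\u00fc</a>'.encode("utf-8")

# Parsing as-is honors the stale declaration, so c3 bc become two characters.
len_a = len(ET.fromstring(inner).text)

# Rough counterpart of substring-after(_, '?>'): drop the declaration,
# so the parser falls back to its UTF-8 default.
without_declaration = inner.partition(b"?>")[2]
len_b = len(ET.fromstring(without_declaration).text)

print(len_a, len_b)
```

Here `len_a` plays the role of lenA and `len_b` the role of lenB.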
During development sometimes the real backend service a DataPower service should connect to might not be available.
In this case we could set variable var://service/mpgw/skip-backside to 1 for a Multi-Protocol Gateway service. This makes the MPGW service a loopback service and we could provide dummy backend response data.
The difference to connecting to a real backend is just the latency -- no real backend has 0 latency.
While there is no specific DataPower command to add latency to a service or stylesheet (like Unix "sleep" command) we can use a simple workaround I learned from a colleague yesterday. Just try to open a network connection to a non-existent IP address with <dp:url-open ...> and specify the timeout you are interested in:
... <!-- do 10 second delay --> <dp:url-open target="http://126.96.36.199" response="ignore" timeout="10" /> ...
This has the big advantage that it does not affect CPU-usage (non-active wait).
Because most browsers support XSLT 1.0 only and I am interested in XSLT in browsers ("As of 2010, however, XSLT 1.0 is still widely used, as there are no products that support XSLT 2.0 running in the browser, ...", http://en.wikipedia.org/wiki/XSLT#Origins)
Because of the many EXSLT functions one could define "XSLT 1.0++ = XSLT 1.0 + EXSLT", but what do I mean by "XSLT 1.0+"?
Browser support for EXSLT is either not complete or minimal. The biggest deficiency of the XSLT 1.0 spec is that the result of a transformation is a so-called result tree fragment (RTF). Only a restricted set of operations is allowed on RTFs. The EXSLT function node-set() resolves this deficiency, and my definition is "XSLT 1.0+ = XSLT 1.0 + exslt:node-set()".
While the node-set() function is available in all browsers, it is not available in the EXSLT namespace for Microsoft Internet Explorer.
This looked interesting, but I thought that "even more" obfuscation was possible for his sample. I came up with this 992 bytes long 1-line version of Oleg's XSLT (line breaks here for FF browser display only):
<!--evenMore("http://www.tkachenko.com/blog/archives/000732.html")--><!DOCTYPE p [<!ENTITY e 'uri'><!ENTITY x 'string'><!ENTITY P 'number'><!ENTITY q 'before'>< !ENTITY w 'name'><!ENTITY L 'concat'><!ENTITY p '&w;space'><!ENTITY W 'translate' ><!ENTITY M 'length'><!ENTITY Y 'contains'><!ENTITY y 'div'><!ENTITY u 'sub&x;'> <!ENTITY Q 'not'><!ENTITY _ 'after'><!ENTITY m '-uri'><!ENTITY E 'document'>]><x :stylesheet version="1.0" xmlns:x="http://www.w3.org/1999/XSL/Transform"><x:temp late match="/"><x:variable name="_" select="&E;('')"/><x:variable name="_-_" sel ect="&P;(&Q;(_-_=_-_=_-_=_-_))"/><html><x:value-of select="&L;(&u;(&p;-&e;($_/*/ *[$_-_]),$_-_,$_-_),&u;(&w;($_/*/*[$_-_]),&x;-&M;(*>*)*2,$_-_),&u;(@_>_-,&x;-&M; (******* div @_),$_-_),&W;(&w;(($_//@*)),&W;(&w;(($_//@*)),'l',''),''),&u; (&w;($_/*/@*),6,$_-_),' ',&W;(&u;(&p;&m;($_/*),12,&x;-&M;('&_;')),'.3',''), &u;(_/_/_=//_//_,3,$_-_),&u;($_/*/*/*/@*[&Y;(.,'(')],$_-_,$_-_),'!')"/></ht ml></x:template></x:stylesheet>
So why is Obfuscated XSLT more difficult to create than Obfuscated C? See Cheat 2.
After my previous posting CDATA 1/x I wanted to make some comments on CDATA sections in general and then for CDATA and DataPower.
Doing a simple search on developerWorks for "CDATA" showed an interesting hit: "Dealing with data in XML". It talks about character data, parsed character data, ... and so with that link I am done with the first part.
As already said in the previous posting, XML data is stored internally as UTF-8 by the DataPower XSLT processor.
Document cdata-alpha-iso.xml is used to show that there is no difference for DataPower XSLT processor if the greek alpha character is given in the input
in clear inside CDATA section
(the hexdump shows that alpha is encoded as e1 in ISO-8859-7 encoding):
$ cat cdata-alpha-iso.xml
<?xml version="1.0" encoding="ISO-8859-7"?>
<!DOCTYPE root [ <!ENTITY alpha "α"> ]>
<tag>�<![CDATA[�]]>&alpha;</tag>
$
$ od -Ax -tcx1 cdata-alpha-iso.xml | tail -5
000060 341   <   !   [   C   D   A   T   A   [ 341   ]   ]   >   &   a
        e1  3c  21  5b  43  44  41  54  41  5b  e1  5d  5d  3e  26  61
000070   l   p   h   a   ;   <   /   t   a   g   >  \n
        6c  70  68  61  3b  3c  2f  74  61  67  3e  0a
00007c
$
In the previous CDATA posting I showed how to use string-length() to get an idea of the DataPower internal data representation.
Here I want to show the "magnifying glass" technique, which may be used to inspect more internals than just CDATA.
The extension function dp:binary-encode() Base64-encodes arbitrary binary data for the DataPower XSLT processor. Since base64 data is not that easy for humans to inspect, extension function dp:radix-convert(_,64,16) may be used to generate hexadecimal output from base64 data.
The second line is the "magnifying glass" output that demonstrates that every time CEB1 is stored internally, which is the UTF-8 encoding of alpha.
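The byte values behind this can be checked off the appliance. This is a hedged Python sketch of the idea only (base64-encode, then view as hex). It does not call any DataPower function, and the variable names are chosen just for this example.

```python
import base64

alpha = "\u03b1"  # Greek small letter alpha

# Byte as it appears in the ISO-8859-7 encoded input file.
iso_bytes = alpha.encode("iso8859_7")

# Bytes after conversion to UTF-8 (the internal representation discussed above).
utf8_bytes = alpha.encode("utf-8")

# "Magnifying glass": base64-encode the data, then show the same data as hex.
b64 = base64.b64encode(utf8_bytes).decode("ascii")
hex_view = base64.b64decode(b64).hex()

print(iso_bytes.hex(), b64, hex_view)
```

The input byte e1 matches the hexdump of cdata-alpha-iso.xml, and the hex view of the UTF-8 data is ceb1.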
One last comment on converting a base64 string to a hex string.
dp:radix-convert() is a "number" function, and therefore it will strip leading 0x00 bytes (no problem for above application).
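The effect of a "number" style conversion is easy to reproduce in plain Python. The sketch below only mimics the behavior described above and is not DataPower code. Both function names are hypothetical.

```python
import base64

def base64_to_hex_as_number(b64str: str) -> str:
    """Treat the decoded bytes as one big number: leading 0x00 bytes vanish."""
    number = int.from_bytes(base64.b64decode(b64str), "big")
    return format(number, "x")

def base64_to_hex_bytewise(b64str: str) -> str:
    """Convert byte by byte (2 hex digits per byte): the length is preserved."""
    return base64.b64decode(b64str).hex()

sample = base64.b64encode(b"\x00\x00\x01\xff").decode("ascii")
print(base64_to_hex_as_number(sample))   # leading zero bytes are lost
print(base64_to_hex_bytewise(sample))    # all four bytes are shown
```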
In case you want to preserve the length of the encoded base64 string use this function:
<!-- convert base64 string $b64str to hex string (2 hex digits per byte) --> <func:function name="ub:base64-to-hex"> <xsl:param name="b64str"/>
I asked myself back in 1983 at school:
What is the sum of the reciprocals of Pascal's triangle?
Of course the sum is infinite because of the 1's on left and right border.
Next question: What is the sum without the 1's?
Of course the sum is infinite again because of the harmonic series on left and right border.
Final question: What is the sum of Pascal's triangle reciprocals without the 1's and without the two harmonic series?
And the answer was (and is) 3/2 !
So based on the Theorem further below, the Corollary (sum being 3/2) can be proven pretty easily.
And here is the main Theorem, a nice decomposition of each unit fraction into binomial coefficient reciprocals.
Starting summation of binomial coefficient reciprocals at row j+1 for column j gives 1/(j-1).
It is left as exercise to the reader ;-)
(or as challenge for one being significantly shorter than two pages).
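One way the Corollary might follow from the Theorem is sketched below. This is not necessarily the derivation intended in the posting. It assumes the Theorem holds for every column $j \ge 2$ and uses the fact that the first entry of column $j$ counted by the Theorem, $\binom{j+1}{j} = j+1$, belongs to one of the two excluded harmonic series.

```latex
% Theorem as stated above, for j >= 2:
\sum_{n=j+1}^{\infty} \frac{1}{\binom{n}{j}} = \frac{1}{j-1}

% Interior entries are \binom{n}{j} with 2 <= j <= n-2, so per column
% the term 1/\binom{j+1}{j} = 1/(j+1) has to be removed again:
S = \sum_{j=2}^{\infty} \left( \frac{1}{j-1} - \frac{1}{j+1} \right)
  = 1 + \frac{1}{2}
  = \frac{3}{2}
```

The last step is a telescoping sum: all terms cancel except $1/1$ and $1/2$.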
So, why is this posting marked with XSLT tag?
Easier than proving it: the theorem's validity can just be "seen" by actually doing the summations.
Of course these summations are done by a stylesheet.
Click on Pascal.xml to compute the sums in your browser (by stylesheet Pascal.xsl). (interesting what pure HTML allows for in typesetting mathematical formulas itself)
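For readers without a browser at hand, the summations can also be approximated with a few lines of Python. This is only a sketch of the idea and not a port of Pascal.xsl. The function names and the cut-off rows are assumptions made for this example.

```python
from fractions import Fraction
from math import comb

def column_sum(j: int, last_row: int) -> Fraction:
    """Exact partial sum of 1/C(n, j) for rows n = j+1 .. last_row."""
    return sum(Fraction(1, comb(n, j)) for n in range(j + 1, last_row + 1))

def interior_sum(last_row: int) -> float:
    """Partial sum of 1/C(n, k) over entries with 2 <= k <= n-2, rows 4 .. last_row."""
    return sum(1 / comb(n, k)
               for n in range(4, last_row + 1)
               for k in range(2, n - 1))

# Column sums approach 1/(j-1) from below as more rows are added.
for j in (2, 3, 4):
    print(j, float(column_sum(j, 200)), 1 / (j - 1))

# The interior sum slowly approaches 3/2 from below.
print(interior_sum(300))
```

The column sums converge quickly, while the interior sum needs many rows because the entries next to the harmonic series only decrease quadratically.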
a "prepend" service making use of special binary data processing behavior mentioned in my previous Blog posting Sending zip archives to DataPower (this just converts Non-XML input data base64string to prepend=base64string)
a normal Non-XML service with a convert-http action to convert this "HTTP-Form" input to XML (convert-http action does not need DataGlue license)
This is just another example for "does not work out-of-the-box on DataPower but can be made working".
While embedding stylesheets is part of the spec, it is not supported by all browsers. In particular, Internet Explorer 6/7/8 browsers do not support it.
I found a solution on how to enable Internet Explorer browsers for embedded stylesheet processing. The first solution had the drawback that now stylesheet embedding was possible for FF and IE, but the solution broke the ability of the other big5 browsers to process embedded stylesheets although they could process them natively.
To give you an idea of an embedded stylesheet find listing of supportALL.xml (generates the table on the right) below:
<!-- This stylesheet is a modification from this posting: http://www.biglist.com/lists/lists.mulberrytech.com/xsl-list/archives/200010/msg01150.html --> <xsl:stylesheet id="style1" version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<!-- only needed by "Opera Mini"; output method here IS "html" even without explicit declaration, see http://www.w3.org/TR/xslt#output --> <xsl:output method="html"/>
<xsl:key name="b" match="@os" use="."/>
<xsl:template match="/"> <html> <body>
<i>ApplyStylesheetEmbedding.xsl technique</i> works for: <table border="1" cellspacing="0">
<!-- Generate the row of table header cells --> <tr> <th>Browser \ OS</th> <!-- | Generate a header cell for each unique os name | | See Chapter 9 of "Building Oracle XML Applications" from O'Reilly
| for a detailed explanation of how this <xsl:key> based technique works +--> <xsl:for-each select="//@os[generate-id(.)=generate-id(key('b',.))]"> <!-- Sort by the os name (the value of the current @os attribute -->
could have been titled "Processing binary data in DataPower stylesheets 1/x". This posting is the second of several postings on processing of binary data in DataPower stylesheets.
Again this is on CDATA as in the postings "CDATA .../x". This time the (customer) problem was as follows:
accept XML input data
do XML threat protection
if no threat, pass input unmodified to backend (preserve any CDATA elements).
While preserving of CDATA might be done with cdata-section-elements of <xsl:output ... />, you have to know which elements you want to have CDATA output for. In the case of DataPower as a generic proxy, any element might or might not contain CDATA sections.
If the service had XML request type, CDATA sections would be gone -- therefore we need a Non-XML service.
But XML threat protection does not work in Non-XML service -- therefore we need a second XML service.
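Why a parsed request loses its CDATA sections while a binary pass-through keeps them can be illustrated with a small Python sketch. It uses Python's standard XML library as a stand-in for "any XML parser plus serializer", which is an assumption made for illustration and says nothing about DataPower internals.

```python
import base64
import xml.etree.ElementTree as ET

raw = b"<a><![CDATA[1 < 2]]></a>"

# Parse and serialize again: the text survives, the CDATA section does not.
reparsed = ET.tostring(ET.fromstring(raw))

# Treat the input as opaque bytes (here via a base64 round trip): nothing changes.
round_trip = base64.b64decode(base64.b64encode(raw))

print(reparsed)
print(round_trip == raw)
```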
Now let's discuss the solution I came up with.
Set up an XML-FW with XML request type with its own XML Manager and any specific XML Parser settings. (I did set "XML Element Depth" to 1, service is on port 2060)
Set up an XML-FW with Non-XML request type, and these two actions: 1) binary transform action from INPUT to NULL with stylesheet check.xsl 2) results action from INPUT to OUTPUT
This is the binary transform stylesheet check.xsl I used.
The <dp:input-mapping ... /> uses Flat File Descriptor (FFD) store:///pkcs7-convert-input.ffd shipped with DataPower and used by five PKCS7 stylesheets located in store:/// folder. As documented in the FFD itself (see below) it generates structure <object><message>***binary data***</message></object> with a binaryNode as child of <message> element. Find the details on binaryNode in previous posting.
Now the "binary data" (which in this case is XML data, but we want to pass it unmodified) needs to be posted to the second XML (threat protection) service by <dp:url-open>. Posting binary data with <dp:url-open> is done with data-type="base64". Therefore we just convert the binaryNode data to base64 by dp:binary-encode(/object/message) in <dp:url-open>. We are not interested in the response of the service called but only in its response code, therefore response="responsecode". The result of the <dp:url-open> call is stored in variable $reponse. In case the HTTP response code is not equal to 200 we know that something is wrong, and for the service called we know that XML threat protection found a problem. Therefore we abort service execution by <dp:reject>.
So if service execution does not get aborted by <dp:reject> the results action from INPUT to OUTPUT just copies the unmodified INPUT to OUTPUT (preserving any CDATA section). Otherwise (in case of a detected XML threat) the service returns a generic error message to the client.
What we have seen is
how to process binary (Non-XML) input data (resulting in a binaryNode)
how to convert a binaryNode to a base64 encoded string
how to pass "binary" data to dp:url-open() by data-type="base64"
how to process response code of dp:url-open
By the arguments above there cannot be a solution without a side call.
And this is from stylesheet "store:///pkcs7-convert-input.ffd":
... <!-- This FFD converts the input into an XML tree like this: