stylesheet classes.xsl which generated that diagram is attached and discussed a little bit.
You might be interested in "graph extraction" from XML data and Depth-First-Search traversal of the extracted graph which was used to generate the "block" hierarchy diagram. Also the simple and nice heuristic which allows
to allocate exactly the vertical space needed for a node in the diagram is interesting -- in "XML speak" instead of "graph speak"
since quite some time there exists an Enhancement request for a rawTCP Non-XML Front Side Handler
1.5 years ago I developed a prototype of a rawTCP2HTTP converter in DataPower Firmware which allows to convert rawTCP Non-XML data to HTTP chunked data which can then be processed by any (DataPower) HTTP service (see IBM publication Bridging raw TCP binary data to HTTP, http://priorartdatabase.com/IPCOM/000197959)
So it seems that processing rawTCP binary data is not possible with DataPower appliances.
Even one week ago I would have said "it is impossible to process Non-XML rawTCP data on DataPower appliances". But that is not true, and this posting is just another one in the series of "does not work out-of-the-box on DataPower but can be made working"
So now that we know that it can be done the question is how.
It turns out to be really easy to create a rawTCP2HTTP service as MPGW service on DataPower. Even though XA35 and XS40 do not allow for binary data processing they at least allow for Non-XML UTF-8 data processing by the first link of above list.
We make use of the fact that binary input of Non-XML transformations needs to be "consumed", otherwise the default behavior is that it will be copied to the output (see 3rd and 1st link above). In addition we use a rawTCP XML Front Side Handler as that is the only one not complaining about protocol violations. And then we make sure that INPUT is guaranteed to not being consumed -- thats all.
So this is rawTCP2HTTP MPGW service:
Front Side Protocol: Stateless Raw XML Handler
listening on port1
Request Type: Non-XML
Type: static backend
Backend URL: http://127.0.0.1:port2
Response Type: Pass-Thru
And this is rawTCP2HTTP policy:
Results action from NULL to OUTPUT
In the example port1 is 2091 and port2 is 2092.
Sending data with a HTTP POST to DataPower can be easily done with with cURL or ApacheBench tool:
curl --data-binary @file ... http://dpbox:port1
ab -p file ... http://dpbox:port1
Sending rawTCP data can be done by Netcat tool (nc), this statement sends file to service listening on port at host:
nc host port < file
To have a basis for measurements I chose a very simple Non-XML example service (toHexb.xsl). It just returns the (Non-XML) input as hexadecimal string, see posting DataPower "binary" node type (binaryNode) for details:
Now sending a request is not that spectacular -- we get the same hexadecimal output for file te3t we can also get by octal dump (od) tool:
[stammw@bl3d2027@de ~]$ nc 192.168.10.1 2091 <te3t ; echo 74650374 [stammw@bl3d2027@de ~]$ od -Ax -tcx1 te3t 000000 t e 003 t 74 65 03 74 000004 [stammw@bl3d2027@de ~]$
But this is really over rawTCP! See the following screenshot of a "All Interfaces" (3.8.2 firmware) packet capture. The Filter for ports 2091 and 2092 just selects the "interesting" part of packect capture. At the bottom left you see the command execution and output. On the bottom right you see Follow TCP Stream for services on port 2091 (rawTCP) and 2092 (HTTP). Hexadecimally displayed content of packet 116 is just the HTTP post of 2091 service to 2092 service after FIN ACK has been received. This is the first restriction of rawTCP2HTTP service -- it cannot deal with HTTP persistent connections. (http://stamm-wilbrandt.de/en/blog/capture.all.pcap.gif)
Time stamps for packets 110 and 128 show that the transaction took 2.003ms.
But that is not the value which is important. Important is the average transaction overhead of rawTCP service on 2091 over HTTP service on 2092.
There are quite some tools like ApacheBench (ab) mentioned above for getting performance information of HTTP services. I was not able to find a similar tool for benchmarking rawTCP performance. Therefore I wrote one myself based on TCPClient.c (http://en.wikipedia.org/wiki/Berkeley_sockets#Client). It was able to benchamrk rawTCP as well as HTTP data (by sending the necessary HTTP Post data). For persistent and non-persistent connection HTTP benchmarking my numbers were similar to ApacheBench results (-k option for "keepalive").
These are the average results for sending 20000 requests with a concurrency of 20 to a 9004 DataPower from a directly connected Laptop:
0.1985ms per request for rawTCP service on 2091
0.0809ms per request for HTTP service on 2092 (without persistent connections)
0.0647ms per request for HTTP service on 2092 (with persistent connections)
So the overhead of rawTCP2HTTP service is 0.1985ms - 0.0647ms = 0.1338ms.
While for this very simple example service the overhead time dominates the total transaction time, 0.14ms overhead for rawTCP processing enablement is a good price for many real world Non-XML processing scenarios -- until now this is the only way to process rawTCP Non-XML data on DataPower appliances.
These are reasons why the "real" Non-XML rawTCP Front Side Handler of above mentioned Enhancement Request is still needed:
enabling persistent connections
reducing/eliminating 0.1338ms overhead of adding/removing of HTTP headers
having to use "XML rawTCP Front Side Handler" for dealing with "Non-XML data" sounds "interesting" at least :-)
Last, but not least, see that NonXML rawTCP processing on a XA35 or XS40 DataPower appliance is possible by above technique and the technique from 1st link posting above (by "prepend=" trick with convert-http action):
$ od -Ax -tcx1 test 000000 t e s t 1 2 3 t e s t 1 2 3 t e 74 65 73 74 31 32 33 74 65 73 74 31 32 33 74 65 000010 s t \n 73 74 0a 000013 $ $ nc cosmopolitan 2091 < test | tidy -q -xml <?xml version="1.0" encoding="utf-8"?> <request> <url>/basic.xml?transaction=1</url> <base-url>/basic.xml</base-url> <args src="url"> <arg name="transaction">1</arg> </args> <args src="body"> <arg name="prepend">test123test123test</arg> </args> </request>
$ soma/doSoma admin soma/version.xml cosmopolitan:5550 | xpath++ "//Version/text()" - Enter host password for user 'admin':
I tested rawTCP2HTTP on all supported release branches. Finally I just tested it on a box with 18.104.22.168 firmware without problems. 22.214.171.124 DataPower Firmware was released in 1H2009 -- since then rawTCP2HTTP service waited to be discovered ... ;-)
could have been titled "Processing binary data in DataPower stylesheets 1/x". This posting is the second of several postings on processing of binary data in DataPower stylesheets.
Again this is on CDATA as in the postings "CDATA .../x". This time the (customer) problem was as follows:
accept XML input data
do XML threat protection
if no threat, pass input unmodified to backend (preserve any CDATA elements).
While preserving of CDATA might be done with cdata-section-elements of <xsl:output ... /> you have to know which elements you want to have CDATA output for, In case of DataPowwer as generic proxy any element might contain CDATA section or might not contain CDATA sections.
If the service would have XML request type CDATA sections are gone -- therefore we need a Non-XML service.
But XML threat protection does not work in Non-XML service -- therefore we need a second XML service.
Now lets discuss the solution I came up with.
Setup XML-FW with XML request type with its own XML Manager and any specific XML Parser settings. (I did set "XML Element Depth" to 1, service is on port 2060)
Setup XML-FW with Non-XML request type, and these two actions: 1) binary transform action from INPUT to NULL with stylesheet check.xsl 2) results action from INPUT to OUTPUT
This is the binary transform stylesheet check.xsl I used.
The <dp:input-mapping ... /> uses Flat File Descriptor (FFD) store:///pkcs7-convert-input.ffd shipped with DataPower and used by five PKCS7 stylesheets located in store:/// folder. As documented in the FFD itself (see below) it generates structure <object><message>***binary data***</message></object> with a binaryNode as child of <message> element. Find the details on binaryNode in previous posting.
Now the "binary data" (which in this case is XML data, but we want pass it unmodified) needs to be posted to second XML (threat protection) service by <dp:url-open>. Posting binary data with <dp:url-open> is done with data-type="base64". Therefore we just convert the binaryNode data to base64 by dp:binary-encode(/object/message) in <dp:url-open>. We are not interested in the response of the service called but only in its response code, therefore response="responsecode". The result of <dp:url-open> call is stored in variable $reponse. In case the HTTP response code is not equal to 200 we know that something is wrong, and for the service called we know that XML threat protection found a problem. Therefore we abort service execution by <dp:reject>.
So if service execution does not get aborted by <dp:reject> the results action from INPUT to OUTPUT just copies the unmodified INPUT to OUTPUT (preserving any CDATA section). Otherwise (in case of a detected XML threat) the service returns a generic error message to the client.
What we have seen is
how to process binary (Non-XML) input data (resulting in a binaryNode)
how to convert a binaryNode to a base64 encoded string
how to pass "binary" data to dp:url-open() by data-type="base64"
how to process response code of dp:url-open
By the arguments above there cannot be a solution without a side call.
And this is from stylesheet "store:///pkcs7-convert-input.ffd":
... <!-- This FFD converts the input into an XML tree like this: