Passing files to a Web service

SOAP and binary data

The evolution of Web service protocols has gone from supporting very simple requests with simple parameters to fully supporting modern, object-oriented languages. XML-RPC, arguably one of the earliest forms of Web services, only supported simple types -- strings, integers, booleans, and the like. SOAP took this one step further with its encoding rules for objects. The last step -- efficient support for binary data -- came with SOAP with attachments.

SOAP with attachments was originally introduced as an extension to SOAP 1.1, and it is supported by the major SOAP kits. Although SOAP 1.2, the official W3C release, does not support attachments yet, work is under way to include them in the (ideally) near future.

Web services and binary data

I have little doubt that XML's success in application integration comes from its reliance on a textual encoding (as opposed to binary protocols such as CORBA, an object-oriented RPC standard, or RMI, a Java-specific RPC standard). Textual encoding is preferable for several reasons, the most critical of which may be that it is easier to debug and easier to write a special-purpose implementation when the need arises.

Still, the reliance on textual encoding has a darker side, and XML offers no efficient solution for including binary data. According to the W3C XML Schema specification, binary data should be encoded in base 64 or hexadecimal. Unfortunately, base 64-encoded data is about 33% larger than the original data (every 3 bytes become 4 characters), and hexadecimal encoding doubles the size. This overhead is acceptable for small pieces of binary data, but it is clearly an issue for larger sets.
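
The overhead is easy to measure. The following is a minimal sketch that uses the standard java.util.Base64 class; the EncodingOverhead class and its method names are made up for this illustration and are not part of the tip's sample code.

```java
import java.util.Base64;

// Hypothetical helper, for illustration only.
class EncodingOverhead {
    // Characters needed to base 64-encode the data (padded, no line breaks).
    static int base64Length(byte[] data) {
        return Base64.getEncoder().encodeToString(data).length();
    }

    // Characters needed to hex-encode the data: two per byte.
    static int hexLength(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (byte b : data) {
            sb.append(String.format("%02x", b));
        }
        return sb.length();
    }

    public static void main(String[] args) {
        byte[] data = new byte[3000];   // stand-in for a small binary file
        System.out.println("raw bytes:     " + data.length);
        System.out.println("base 64 chars: " + base64Length(data));
        System.out.println("hex chars:     " + hexLength(data));
    }
}
```

Base 64 maps every group of 3 bytes to 4 characters, so the 3000 bytes above need 4000 characters, while hexadecimal needs 6000.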

Binary data is useful in many applications. For example:

  • Security applications need keys, hashes, certificates, and the encrypted data itself.
  • Multimedia applications work with photos, music, and movies.
  • In some applications, an XML representation of the data is deemed too inefficient -- CAD/CAM comes to mind.
  • Thousands of file formats predate XML: word processing, spreadsheets, fonts, vector graphics, genealogy, and many others.

While it is possible to create XML versions of these file formats (similar to SVG for vector graphics), binary data has been around for a long time and will likely remain popular.

Finally, there is the issue of XML itself! It is not trivial to include an XML document inside another XML document (the syntactically correct solution relies on CDATA sections and character escaping).
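
To make the two techniques concrete, here is a small sketch of both. The XmlEmbedding class and its methods are hypothetical names chosen for this example; real applications would more likely rely on their XML library to do the escaping.

```java
// Hypothetical helper, for illustration only.
class XmlEmbedding {
    // Character escaping: replace the characters that are significant in
    // element content. The ampersand must be handled first.
    static String escape(String xml) {
        return xml.replace("&", "&amp;")
                  .replace("<", "&lt;")
                  .replace(">", "&gt;");
    }

    // CDATA section: a literal "]]>" in the embedded document would close
    // the section early, so it is split across two adjacent sections.
    static String cdata(String xml) {
        return "<![CDATA[" + xml.replace("]]>", "]]]]><![CDATA[>") + "]]>";
    }

    public static void main(String[] args) {
        String inner = "<note>Tom & Jerry</note>";
        System.out.println("<doc>" + escape(inner) + "</doc>");
        System.out.println("<doc>" + cdata(inner) + "</doc>");
    }
}
```

Either way, the receiver gets the inner document back as a string and has to parse it a second time.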

To address the needs of all these applications, Web services must support binary data efficiently. The proposed solution is SOAP with attachments which, in a nutshell, removes binary information from the XML payload and stores it directly in the HTTP request as multipart/related MIME content.

Your options, when designing a Web service that works with binary data, are:

  • If the dataset is small, you might consider base 64 encoding within the XML payload; the overhead is less of a problem with small datasets.
  • If the dataset is larger, an attachment is the only practical option.

Listing 1 is a SOAP request with a base 64-encoded parameter. Note the address element.

Listing 1. base 64-encoded parameter
POST /ws/retrieve HTTP/1.0
Content-Type: text/xml; charset=utf-8
Accept: application/soap+xml, multipart/related, text/*
Host: localhost:8080
SOAPAction: ""
Content-Length: 540

<?xml version="1.0" encoding="UTF-8"?>
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"
                  xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
 <soapenv:Body>
  <ps:retrieve 
           soapenv:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"
           xmlns:ps="http://psol.com/2004/ws/retrieve">
   <address xsi:type="xsd:base64Binary">d3d3Lm1hcmNoYWwuY29t</address>
  </ps:retrieve>
 </soapenv:Body>
</soapenv:Envelope>
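
On the receiving side, the content of the address element must be decoded before it can be used. A minimal sketch, assuming the decoded bytes are text in UTF-8 (the listing does not say which character encoding the sender used); AddressDecoder is a hypothetical name:

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper, for illustration only.
class AddressDecoder {
    // Decode a base64Binary value and interpret the bytes as UTF-8 text
    // (the character encoding is an assumption of this sketch).
    static String decode(String base64) {
        byte[] raw = Base64.getDecoder().decode(base64);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The value of the address element in Listing 1
        System.out.println(decode("d3d3Lm1hcmNoYWwuY29t"));  // prints www.marchal.com
    }
}
```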

Implementing attachments

Attachments are available to Java developers through both JAX-RPC (the Java API for XML-based RPC) and SAAJ (SOAP with Attachments API for Java). Don't let the SAAJ acronym fool you: JAX-RPC supports attachments (see Related topics for an example). The difference between JAX-RPC and SAAJ is the level of abstraction, not the capabilities.

JAX-RPC is a high-level API that's more abstract than SAAJ. It hides most of the protocol-oriented aspects of SOAP behind an RMI layer. The developer works on Java objects and the pre-processor turns them into SOAP nodes. JAX-RPC uses the java.awt.Image and javax.activation.DataHandler classes to represent attachments.

SAAJ is closer to the protocol. It takes more work to create a SOAP message with SAAJ than with JAX-RPC (and furthermore it offers no automatic link to WSDL), so in most cases you will want to use JAX-RPC. Still, the low-level aspects of SAAJ make it more suitable for illustrating how attachments really work. Listing 2 is a SOAP request with an attachment. The request asks the server to resize a photo; because photo files are large, an attachment is more efficient.

Listing 2. Attachment parameter
POST /ws/resize HTTP/1.0
Content-Type: multipart/related; type="text/xml"; 
     start="<EB6FC7EDE9EF4E510F641C481A9FF1F3>"; 
     boundary="----=_Part_0_7145370.1075485514903"
Accept: application/soap+xml, multipart/related, text/*
Host: localhost:8080
SOAPAction: ""
Content-Length: 1506005

------=_Part_0_7145370.1075485514903
Content-Type: text/xml; charset=UTF-8
Content-Transfer-Encoding: binary
Content-Id: <EB6FC7EDE9EF4E510F641C481A9FF1F3>

<?xml version="1.0" encoding="UTF-8"?>
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" 
                  xmlns:xsd="http://www.w3.org/2001/XMLSchema" 
                  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
 <soapenv:Body>
  <ps:resize 
          soapenv:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/" 
          xmlns:ps="http://psol.com/2004/ws/resize" 
          xmlns:SOAP-ENC="http://schemas.xmlsoap.org/soap/encoding/">
   <source href="cid:E1A97E9D40359F85CA19D1B8A7C52AA3"/>
   <percent>20</percent>
  </ps:resize>
 </soapenv:Body>
</soapenv:Envelope>

------=_Part_0_7145370.1075485514903
Content-Type: image/jpeg
Content-Transfer-Encoding: binary
Content-Id: <E1A97E9D40359F85CA19D1B8A7C52AA3>

note: binary data deleted...

------=_Part_0_7145370.1075485514903--
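
The layout of Listing 2 can be reproduced by hand, which helps show that nothing in it is specific to SOAP: each part is introduced by two hyphens followed by the boundary, carries its own headers, and the last boundary is followed by two more hyphens. The sketch below assembles such a body with the standard library only. The MultipartSketch class, its methods, and the sample identifiers are hypothetical, and a real SOAP kit would do this work for you.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
class MultipartSketch {
    static final String CRLF = "\r\n";

    // Write one MIME part: boundary line, headers, blank line, raw payload.
    static void writePart(ByteArrayOutputStream out, String boundary,
                          String contentType, String contentId,
                          byte[] payload) throws IOException {
        String headers = "--" + boundary + CRLF
            + "Content-Type: " + contentType + CRLF
            + "Content-Transfer-Encoding: binary" + CRLF
            + "Content-Id: <" + contentId + ">" + CRLF
            + CRLF;
        out.write(headers.getBytes(StandardCharsets.US_ASCII));
        out.write(payload);                 // binary data is copied unencoded
        out.write(CRLF.getBytes(StandardCharsets.US_ASCII));
    }

    // Assemble the request body: the SOAP envelope first, then the attachment.
    static byte[] build(String boundary, String rootId, byte[] envelope,
                        String attachmentId, String attachmentType,
                        byte[] attachment) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writePart(out, boundary, "text/xml; charset=UTF-8", rootId, envelope);
        writePart(out, boundary, attachmentType, attachmentId, attachment);
        // Closing delimiter: the boundary followed by two hyphens
        out.write(("--" + boundary + "--" + CRLF).getBytes(StandardCharsets.US_ASCII));
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] envelope = "<Envelope><source href=\"cid:photo1\"/></Envelope>"
            .getBytes(StandardCharsets.UTF_8);
        byte[] photo = { (byte) 0xFF, (byte) 0xD8, (byte) 0xFF };  // placeholder bytes
        byte[] body = build("----=_Part_0", "root1", envelope,
                            "photo1", "image/jpeg", photo);
        System.out.println(body.length + " bytes");
    }
}
```

The link between the two parts is the content identifier: the href attribute in the envelope holds a cid URL whose value matches the Content-Id header of the attachment, minus the angle brackets.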

Listing 3 illustrates the creation of the SOAP request. The request asks a server to resize an image. The procedure is as follows:

  • Create SOAP connection and SOAP message objects through factories.
  • Retrieve the message body from the message object (intermediary steps: retrieve the SOAP part and envelope).
  • Create a new XML element to represent the request and set the encoding style.
  • Create the attachment and initialize it with a DataHandler object.
  • Create more elements to represent the two parameters (source and percent).
  • Associate the attachment to the first parameter by adding an href attribute. The attachment is referred to through a cid (content-id) URL.
  • Set the value of the second parameter directly as text and call the service.

The service replies with the resized image, again as an attachment. Before retrieving it, test the response for a SOAP fault (which indicates an error). If there is no fault, retrieve the attachment as a file and process it.

Listing 3. Using SAAJ
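import java.io.File;
import java.util.Iterator;
import javax.activation.DataHandler;
import javax.activation.FileDataSource;
import javax.xml.soap.AttachmentPart;
import javax.xml.soap.MessageFactory;
import javax.xml.soap.SOAPBody;
import javax.xml.soap.SOAPBodyElement;
import javax.xml.soap.SOAPConnection;
import javax.xml.soap.SOAPConnectionFactory;
import javax.xml.soap.SOAPElement;
import javax.xml.soap.SOAPEnvelope;
import javax.xml.soap.SOAPException;
import javax.xml.soap.SOAPMessage;
import javax.xml.soap.SOAPPart;

public File resize(String endPoint,File file)
   throws SOAPException
{
   // create the connection and an empty message through factories
   SOAPConnection connection =
      SOAPConnectionFactory.newInstance().createConnection();
   SOAPMessage message = MessageFactory.newInstance().createMessage();
   SOAPPart part = message.getSOAPPart();
   SOAPEnvelope envelope = part.getEnvelope();
   SOAPBody body = envelope.getBody();
   // the element that represents the request
   SOAPBodyElement operation =
      body.addBodyElement(
         envelope.createName("resize",
                             "ps",
                             "http://psol.com/2004/ws/resize"));
   operation.setEncodingStyle("http://schemas.xmlsoap.org/soap/encoding/");

   // the attachment lives in the message, outside of the XML envelope
   DataHandler dh = new DataHandler(new FileDataSource(file));
   AttachmentPart attachment = message.createAttachmentPart(dh);
   SOAPElement source = operation.addChildElement("source",""),
               percent = operation.addChildElement("percent","");
   message.addAttachmentPart(attachment);
   // link the first parameter to the attachment through a cid URL
   source.addAttribute(envelope.createName("href"),
                       "cid:" + attachment.getContentId());
   // the second parameter is plain text
   percent.addTextNode("20");

   SOAPMessage result = connection.call(message,endPoint);
   connection.close();
   part = result.getSOAPPart();
   envelope = part.getEnvelope();
   body = envelope.getBody();
   if(!body.hasFault())
   {
      // the resized image comes back as the first attachment
      Iterator iterator = result.getAttachments();
      if(iterator.hasNext())
      {
         dh = ((AttachmentPart)iterator.next()).getDataHandler();
         String fname = dh.getName();
         if(null != fname)
            return new File(fname);
      }
   }
   return null;
}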

Note that Listing 3 makes it clear that the attachment is outside of the XML message! This is necessary for efficiency.
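
Listing 3 returns a File built from the name reported by the DataHandler, which presumably relies on the SOAP kit having already stored the attachment on disk. Where that cannot be assumed, one option is to copy the attachment's input stream (as returned by DataHandler.getInputStream()) to a temporary file. The sketch below takes a plain InputStream so that it depends only on the standard library; the AttachmentSaver class, the method name, and the file prefix are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, for illustration only.
class AttachmentSaver {
    // Copy the stream to a new temporary file and return its path.
    // The stream is closed when the copy completes or fails.
    static Path saveToTempFile(InputStream in, String suffix) throws IOException {
        Path target = Files.createTempFile("attachment-", suffix);
        try (InputStream source = in) {
            // createTempFile has already created the file, so replace it
            Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
        }
        return target;
    }

    public static void main(String[] args) throws IOException {
        byte[] fake = { 1, 2, 3, 4 };   // stand-in for the attachment content
        Path saved = saveToTempFile(new ByteArrayInputStream(fake), ".jpg");
        System.out.println("saved " + Files.size(saved) + " bytes to " + saved);
    }
}
```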

Speaking of efficiency, take a look at Listing 4, which illustrates the more common (and dramatically shorter) JAX-RPC version of Listing 3. The JAX-RPC precompiler generates a stub that greatly simplifies coding. You pass a DataHandler object as a method parameter and JAX-RPC automatically generates the attachment.

Listing 4. The more efficient JAX-RPC
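import java.io.File;
import java.rmi.RemoteException;
import javax.activation.DataHandler;
import javax.activation.FileDataSource;
import javax.xml.rpc.ServiceException;

// AttachmentService, AttachmentServiceLocator, and AttachmentTip are generated classes
public File resize(File file)
   throws ServiceException, RemoteException
{
   AttachmentService service = new AttachmentServiceLocator();
   AttachmentTip port = service.getAttachmentTip();   // get stub
   DataHandler dh = new DataHandler(new FileDataSource(file));
   DataHandler result = port.resize(dh,20);   // the DataHandler travels as an attachment
   return new File(result.getName());
}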

Conclusion

Choice is good, and SOAP gives you a choice when working with binary data: You can either encode it as base 64 within the XML payload, which is good for small datasets, or you can attach larger binary files, unencoded, to the request.


Related topics


ArticleTitle=Tip: Passing files to a Web service
publish-date=02132004