Tip: Passing files to a Web service

SOAP and binary data

In this tip, Benoît discusses the different solutions available for passing binary data (typically files) to a Web service.

Benoit Marchal (bmarchal@pineapplesoft.com), Consultant, Pineapplesoft

Benoît Marchal is a Belgian consultant. He is the author of XML by Example, Second Edition and other XML books. He works mostly on XML, Java technology, and e-commerce. For more on this topic, see www.marchal.com. You can contact him at bmarchal@pineapplesoft.com.



13 February 2004

The evolution of Web service protocols has gone from supporting very simple requests with simple parameters to fully supporting modern, object-oriented languages. XML-RPC, arguably one of the earliest forms of Web services, only supported simple types -- strings, integers, booleans, and the like. SOAP took this one step further with its encoding rules for objects. The last step -- improving support for binary data -- came with SOAP with attachments.

SOAP with attachments was originally introduced as an extension to SOAP 1.1, and it is supported by the major SOAP kits. Although SOAP 1.2, the official W3C release, does not support attachments yet, work is under way to include them in the (ideally) near future.

Web services and binary data

I have little doubt that XML's success in application integration comes from its reliance on a textual encoding (as opposed to binary protocols such as CORBA, an object-oriented RPC standard, or RMI, a Java-specific RPC standard). Textual encoding is preferable for several reasons, the most critical of which may be that it is easier to debug and easier to roll up a special implementation when the need arises.

Still, the reliance on textual encoding has a darker side: XML offers no efficient solution for including binary data. According to the W3C XML Schema specification, binary data should be encoded in base 64 or hexadecimal. Unfortunately, base 64-encoded data is roughly 33% larger than non-encoded data (every 3 bytes become 4 characters). Hexadecimal encoding doubles the size. This overhead is acceptable for small pieces of binary data, but it is clearly an issue for larger sets.
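
To make the overhead concrete, the following minimal sketch measures the text length of the same bytes in both encodings. The class name EncodingOverhead and the 3000-byte buffer are illustrative assumptions, not part of the original tip; the sketch uses only java.util.Base64 from the standard library.

```java
import java.util.Base64;

// Compares the size of raw bytes with their base 64 and hexadecimal text forms.
class EncodingOverhead {

    // Length of the base 64 text for the given bytes (padding included).
    static int base64Length(byte[] data) {
        return Base64.getEncoder().encodeToString(data).length();
    }

    // Length of the hexadecimal text: two characters per byte.
    static int hexLength(byte[] data) {
        StringBuilder sb = new StringBuilder(data.length * 2);
        for (byte b : data) {
            sb.append(String.format("%02x", b));
        }
        return sb.length();
    }

    public static void main(String[] args) {
        byte[] data = new byte[3000];   // stand-in for a small binary file
        System.out.println("raw:     " + data.length);
        System.out.println("base 64: " + base64Length(data));
        System.out.println("hex:     " + hexLength(data));
    }
}
```

For 3000 bytes, the base 64 form is 4000 characters long and the hexadecimal form 6000, which is why neither scales well to large files.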

Binary data is useful in many applications. For example:

  • Security applications need keys, hashes, certificates, and the encrypted data itself.
  • Multimedia applications work with photos, music, and movies.
  • In some applications, an XML representation of the data is deemed too inefficient -- CAD/CAM comes to mind.
  • Thousands of file formats predate XML: word processing, spreadsheets, fonts, vector graphics, genealogy, and many others.

While it is possible to create XML versions of these file formats (similar to SVG for vector graphics), binary data has been around for a long time and will likely remain popular.

Finally, there is the issue of XML itself! It is not trivial to include an XML document inside another XML document (the syntactically correct solution relies on CDATA sections and character escaping).
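
As a rough illustration of the two techniques, the sketch below prepares an XML fragment for inclusion in another document, either by escaping markup characters or by wrapping it in a CDATA section. The class and method names are hypothetical. Note that a CDATA section cannot itself contain the sequence ]]>, so the sketch splits it across two sections.

```java
// Two ways to carry an XML document as text inside another XML document.
class XmlInXml {

    // Escapes the markup characters so the fragment is treated as plain text.
    // The ampersand must be replaced first to avoid double escaping.
    static String escape(String xml) {
        return xml.replace("&", "&amp;")
                  .replace("<", "&lt;")
                  .replace(">", "&gt;");
    }

    // Wraps the fragment in a CDATA section, splitting any embedded "]]>"
    // so that it does not terminate the section early.
    static String cdata(String xml) {
        return "<![CDATA[" + xml.replace("]]>", "]]]]><![CDATA[>") + "]]>";
    }

    public static void main(String[] args) {
        String inner = "<note>fish &amp; chips</note>";
        System.out.println("<payload>" + escape(inner) + "</payload>");
        System.out.println("<payload>" + cdata(inner) + "</payload>");
    }
}
```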

MIME and base 64

To clear up a source of frequent confusion, MIME does not mandate base 64 encoding. Specifically, HTTP implementations do not encode attachments; only mail clients do so, to work around limitations in SMTP (in which case there is no size gain compared to base 64 inside XML).

To address the needs of all these applications, Web services must support binary data efficiently. The proposed solution is SOAP with attachments which, in a nutshell, removes binary information from the XML payload and stores it directly in the HTTP request as multipart/related MIME content.

Your options, when designing a Web service that works with binary data, are:

  • If the dataset is small, you might consider base 64 encoding within the XML payload; the overhead is less of a problem with small datasets.
  • If the dataset is larger, an attachment is the only practical option.

Listing 1 is a SOAP request with a base 64-encoded parameter. Note the address element.

Listing 1. base 64-encoded parameter
POST /ws/retrieve HTTP/1.0
Content-Type: text/xml; charset=utf-8
Accept: application/soap+xml, multipart/related, text/*
Host: localhost:8080
SOAPAction: ""
Content-Length: 540

<?xml version="1.0" encoding="UTF-8"?>
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"
                  xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
 <soapenv:Body>
  <ps:retrieve 
           soapenv:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"
           xmlns:ps="http://psol.com/2004/ws/retrieve">
   <address xsi:type="xsd:base64Binary">d3d3Lm1hcmNoYWwuY29t</address>
  </ps:retrieve>
 </soapenv:Body>
</soapenv:Envelope>
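
The content of the address element in Listing 1 can be checked with a few lines of Java. This is a minimal sketch with a hypothetical class name; it assumes the parameter carries UTF-8 text, which a real service would not necessarily guarantee for arbitrary binary data.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Encodes and decodes a base 64 parameter such as the address element in Listing 1.
class AddressParameter {

    // Encodes text as base 64, as a client would before building the request.
    static String encode(String text) {
        return Base64.getEncoder()
                     .encodeToString(text.getBytes(StandardCharsets.UTF_8));
    }

    // Decodes the element content back to text, as the server would.
    static String decode(String base64) {
        return new String(Base64.getDecoder().decode(base64),
                          StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // prints "www.marchal.com"
        System.out.println(decode("d3d3Lm1hcmNoYWwuY29t"));
    }
}
```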

Implementing attachments

Attachments are available to Java developers through both JAX-RPC (the Java API for XML-based RPC) and SAAJ (SOAP with Attachments API for Java). Don't let the SAAJ acronym fool you: JAX-RPC supports attachments (see Resources for an example). The difference between JAX-RPC and SAAJ is the level of abstraction, not the capabilities.

JAX-RPC is a high-level API that's more abstract than SAAJ. It hides most of the protocol-oriented aspects of SOAP behind an RMI layer. The developer works on Java objects and the pre-processor turns them into SOAP nodes. JAX-RPC uses the java.awt.Image and javax.activation.DataHandler classes to represent attachments.

SAAJ is closer to the protocol. It takes more work to create a SOAP message with SAAJ than with JAX-RPC (and, furthermore, it offers no automatic link to WSDL), so in most cases you will want to use JAX-RPC. Still, the low-level aspects of SAAJ make it more suitable for illustrating how attachments really work. Listing 2 is a SOAP request with an attachment. The request asks the server to resize a photo; because photo files are large, an attachment is more efficient.

Listing 2. Attachment parameter
POST /ws/resize HTTP/1.0
Content-Type: multipart/related; type="text/xml"; 
     start="<EB6FC7EDE9EF4E510F641C481A9FF1F3>"; 
     boundary="----=_Part_0_7145370.1075485514903"
Accept: application/soap+xml, multipart/related, text/*
Host: localhost:8080
SOAPAction: ""
Content-Length: 1506005

------=_Part_0_7145370.1075485514903
Content-Type: text/xml; charset=UTF-8
Content-Transfer-Encoding: binary
Content-Id: <EB6FC7EDE9EF4E510F641C481A9FF1F3>

<?xml version="1.0" encoding="UTF-8"?>
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" 
                  xmlns:xsd="http://www.w3.org/2001/XMLSchema" 
                  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
 <soapenv:Body>
  <ps:resize 
          soapenv:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/" 
          xmlns:ps="http://psol.com/2004/ws/resize" 
          xmlns:SOAP-ENC="http://schemas.xmlsoap.org/soap/encoding/">
   <source href="cid:E1A97E9D40359F85CA19D1B8A7C52AA3"/>
   <percent>20</percent>
  </ps:resize>
 </soapenv:Body>
</soapenv:Envelope>

------=_Part_0_7145370.1075485514903
Content-Type: image/jpeg
Content-Transfer-Encoding: binary
Content-Id: <E1A97E9D40359F85CA19D1B8A7C52AA3>

note: binary data deleted...

------=_Part_0_7145370.1075485514903--
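
In Listing 2, the source element and the image part are tied together by the content ID: the href is the Content-Id header value without its angle brackets, prefixed with cid:. The sketch below shows that mapping. The class and method names are hypothetical, and the sketch skips the URL escaping that a content ID containing special characters would need.

```java
// Maps a MIME Content-Id header value to the cid: URL used in an href attribute.
class ContentIdUrl {

    // Turns a header value such as "<abc>" into the URL "cid:abc".
    static String toCidUrl(String contentIdHeader) {
        String id = contentIdHeader.trim();
        if (id.startsWith("<") && id.endsWith(">")) {
            id = id.substring(1, id.length() - 1);
        }
        return "cid:" + id;
    }

    // Tells whether an href attribute points at the part with this Content-Id.
    static boolean refersTo(String href, String contentIdHeader) {
        return href.equals(toCidUrl(contentIdHeader));
    }

    public static void main(String[] args) {
        String header = "<E1A97E9D40359F85CA19D1B8A7C52AA3>";
        // prints "cid:E1A97E9D40359F85CA19D1B8A7C52AA3"
        System.out.println(toCidUrl(header));
    }
}
```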

Listing 3 illustrates the creation of the SOAP request. The request asks a server to resize an image. The procedure is as follows:

  • Create SOAP connection and SOAP message objects through factories.
  • Retrieve the message body from the message object (intermediary steps: retrieve the SOAP part and envelope).
  • Create a new XML element to represent the request and set the encoding style.
  • Create the attachment and initialize it with a DataHandler object.
  • Create more elements to represent the two parameters (source and percent).
  • Associate the attachment to the first parameter by adding an href attribute. The attachment is referred to through a cid (content-id) URL.
  • Set the value of the second parameter directly as text and call the service.

The service replies with the resized image, again as an attachment. To retrieve it, you can test for a SOAP fault (which indicates an error). If there are no faults, retrieve the attachment as a file and process it.

Listing 3. Using SAAJ
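// Imports required by the enclosing class
import java.io.File;
import java.util.Iterator;
import javax.activation.DataHandler;
import javax.activation.FileDataSource;
import javax.xml.soap.*;

public File resize(String endPoint,File file)
   throws SOAPException
{
   // create the connection and an empty message through factories
   SOAPConnection connection =
      SOAPConnectionFactory.newInstance().createConnection();
   try
   {
      SOAPMessage message = MessageFactory.newInstance().createMessage();
      // walk down to the body: SOAP part, envelope, body
      SOAPPart part = message.getSOAPPart();
      SOAPEnvelope envelope = part.getEnvelope();
      SOAPBody body = envelope.getBody();
      // the element that represents the request
      SOAPBodyElement operation =
         body.addBodyElement(
            envelope.createName("resize",
                                "ps",
                                "http://psol.com/2004/ws/resize"));
      operation.setEncodingStyle("http://schemas.xmlsoap.org/soap/encoding/");

      // the attachment wraps the file; it lives outside the XML payload
      DataHandler dh = new DataHandler(new FileDataSource(file));
      AttachmentPart attachment = message.createAttachmentPart(dh);
      SOAPElement source = operation.addChildElement("source",""),
                  percent = operation.addChildElement("percent","");
      message.addAttachmentPart(attachment);
      // link the first parameter to the attachment through a cid URL;
      // this assumes the SAAJ implementation has assigned a content ID,
      // otherwise call attachment.setContentId() first
      source.addAttribute(envelope.createName("href"),
                          "cid:" + attachment.getContentId());
      // the second parameter is plain text
      percent.addTextNode("20");

      SOAPMessage result = connection.call(message,endPoint);
      part = result.getSOAPPart();
      envelope = part.getEnvelope();
      body = envelope.getBody();
      // a fault indicates an error; otherwise read the first attachment
      if(!body.hasFault())
      {
         Iterator iterator = result.getAttachments();
         if(iterator.hasNext())
         {
            dh = ((AttachmentPart)iterator.next()).getDataHandler();
            String fname = dh.getName();
            if(null != fname)
               return new File(fname);
         }
      }
      return null;
   }
   finally
   {
      connection.close();
   }
}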
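
Listing 3 returns the file named by the DataHandler, which only works if the SOAP implementation has stored the attachment on disk under that name. Where that cannot be assumed, one alternative is to copy the attachment's content stream (available through DataHandler.getInputStream()) to a temporary file. The following sketch shows only that copy step with the standard library; the class name AttachmentSaver and the in-memory stream used in main are illustrative assumptions.

```java
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Copies the content of an attachment, given as a stream, to a temporary file.
class AttachmentSaver {

    // Writes the stream to a new temporary file with the given suffix
    // (for example ".jpg") and returns that file.
    static File save(InputStream content, String suffix) {
        try {
            Path target = Files.createTempFile("attachment", suffix);
            // createTempFile already created an empty file, so replace it
            Files.copy(content, target, StandardCopyOption.REPLACE_EXISTING);
            return target.toFile();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // stand-in for the stream returned by DataHandler.getInputStream()
        InputStream in = new ByteArrayInputStream(new byte[] {1, 2, 3});
        File saved = save(in, ".jpg");
        System.out.println(saved + " (" + saved.length() + " bytes)");
    }
}
```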

Note that Listing 3 makes it clear that the attachment is outside of the XML message! This is necessary for efficiency.

Speaking of efficiency, take a look at Listing 4, which illustrates the more common (and dramatically shorter) JAX-RPC version of Listing 3. The JAX-RPC precompiler generates a stub that greatly simplifies coding. You pass a DataHandler object as a method parameter and JAX-RPC automatically generates the attachment.

Listing 4. The more efficient JAX-RPC
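// Imports required by the enclosing class; AttachmentService,
// AttachmentServiceLocator, and AttachmentTip are generated stub classes
import java.io.File;
import java.rmi.RemoteException;
import javax.activation.DataHandler;
import javax.activation.FileDataSource;
import javax.xml.rpc.ServiceException;

public File resize(File file)
   throws ServiceException, RemoteException
{
   AttachmentService service = new AttachmentServiceLocator();
   AttachmentTip port = service.getAttachmentTip();   // get stub
   DataHandler dh = new DataHandler(new FileDataSource(file));
   DataHandler result = port.resize(dh,20);           // sent as an attachment
   return new File(result.getName());
}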

Conclusion

Choice is good, and SOAP gives you a choice when working with binary data: You can either encode it as base 64 within the XML payload, which is good for small datasets, or you can attach larger binary files, unencoded, to the request.

Resources
