XML Watch: WBXML and basic SyncML server requirements

Use SyncML to mobilize your data

Level: Intermediate

Edd Dumbill (edd@xml.com), Editor and publisher, xmlhack.com

01 Mar 2003

In the second installment of his quest to make his data available wherever and whenever he wants by using SyncML, Edd Dumbill encounters Wireless Binary XML (WBXML) and examines the minimum functionality required for a SyncML server.

In my previous column, I introduced my mission to investigate and deploy SyncML. Increasingly, people are becoming users of multiple devices, depending on location and occupation. When they travel or change devices, they want their data to come with them. This is the central function of the SyncML XML protocol, which is rapidly becoming a checkbox item on the feature lists of today's mobile phones.

Last time, I gave a high-level overview of SyncML, and showed what happened when I captured the first SyncML message sent from my Ericsson R520m mobile phone to my Web server. The most surprising thing about this message was that it was encoded not in XML, but in Wireless Binary XML (WBXML). WBXML is a standard developed by the WAP Forum, and is intended to provide a space- and CPU-efficient XML representation.

Many XML developers have never encountered WBXML before, as it is largely used in proprietary cell phone networks. However, supporting SyncML requires the ability to handle this encoding, as well as the plain XML encoding. In this installment, I give a brief overview of the WBXML encoding, the steps involved in processing the WBXML encoding of SyncML into XML, and what is required to go back the other way, from XML to WBXML. I also introduce the main elements of the SyncML protocol, to set the stage for creating a SyncML server.

WBXML overview

The topic of binary notations for XML is one of the more enduring permathreads that have continued through the five or so years of XML developer discussion. At a high level, opinion is divided into two camps: The first of these favors a custom encoding, such as that used by WBXML; the second maintains that compressing normal XML will achieve similar space savings. To me, the second approach has always seemed preferable, as it provides the ability to re-use common and well-known software components.

However, the WAP Forum decided to pursue a custom encoding scheme -- WBXML. Along with several other technical decisions made by the forum, this has received its share of criticism over time. (See Resources for links to critiques of these specifications.)

As Figure 1 from the previous column might indicate to you, WBXML takes a tokenizing approach to encoding XML. The most common constructs -- such as tags, attributes, and attribute values -- are reduced to one-byte tokens, with some literal text left in the clear. WBXML also allows common strings to be replaced by references into a string table, which is sent as part of the document preamble.

WBXML implements the equivalent of XML namespaces through code pages. As only 59 tokens are available for elements -- 6 bits, with the values 0 through 4 being reserved -- complex vocabularies need to be multiplexed by organizing token sets into separate code pages. Switching code pages is analogous to switching the default namespace. The WBXML encoding of SyncML uses a code page for each of the DTDs used in the protocol: SyncML, SyncML Meta Information, and SyncML Device Information.

Processing WBXML is reasonably simple: It is a matter of reading the document preamble, selecting a set of appropriate token tables (specific to the DTD of the application), and then consuming the tokens of the document. Possible tokens include start/end element, switch code page, entity, processing instruction, tokenized string, literal string, various extension tokens, and opaque data. The last of these, opaque data, is the WBXML equivalent of XML's CDATA. The extension tokens are used in different ways by different WBXML applications. The SyncML encoding doesn't use them at all, but WML uses them for including tokenized body-text strings from a string table sent in the preamble. It should be obvious by now that WBXML isn't a general-purpose binary encoding of XML. Every application requires at least a token table to perform lookups, and often needs code to interpret the extension tokens as well.
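As a rough illustration of the processing loop just described, here is a minimal Python sketch that reads the preamble and then consumes tokens. The global token values (SWITCH_PAGE, END, STR_I, OPAQUE) and the flag bits are taken from the WBXML specification; the DEMO_TAGS table is hypothetical, with token numbers invented for this example, and the function names are my own. It assumes a UTF-8 document whose opaque data is text, and it deliberately rejects the token types it does not handle.

```python
from io import BytesIO
from xml.sax.saxutils import escape

# Global tokens defined by the WBXML specification.
SWITCH_PAGE, END, STR_I, OPAQUE = 0x00, 0x01, 0x03, 0xC3

# HYPOTHETICAL token tables, keyed by code page. The numbers are invented
# for this sketch; real values come from the application's WBXML mapping.
DEMO_TAGS = {
    0: {0x05: "SyncHdr", 0x06: "SessionID"},
    1: {0x05: "MaxMsgSize"},
}


def read_mb_u_int32(stream):
    """Read a multi-byte integer: 7 bits per byte, high bit means 'more'."""
    value = 0
    while True:
        byte = stream.read(1)[0]
        value = (value << 7) | (byte & 0x7F)
        if not byte & 0x80:
            return value


def read_inline_string(stream):
    """Read a NUL-terminated inline string (assumes a UTF-8 document)."""
    chars = bytearray()
    while (byte := stream.read(1)) not in (b"\x00", b""):
        chars += byte
    return chars.decode("utf-8")


def wbxml_to_xml(data, tags=DEMO_TAGS):
    stream = BytesIO(data)
    # Preamble: version, public ID, charset, string table.
    stream.read(1)                           # version byte
    if read_mb_u_int32(stream) == 0:         # 0 means "ID is in string table"
        read_mb_u_int32(stream)              # skip the string table index
    read_mb_u_int32(stream)                  # charset (IANA MIBenum)
    stream.read(read_mb_u_int32(stream))     # string table, unused here
    out, open_tags, page = [], [], 0
    while raw := stream.read(1):
        token = raw[0]
        if token == SWITCH_PAGE:
            page = stream.read(1)[0]
        elif token == END:
            out.append(f"</{open_tags.pop()}>")
        elif token == STR_I:
            out.append(escape(read_inline_string(stream)))
        elif token == OPAQUE:
            length = read_mb_u_int32(stream)
            out.append(escape(stream.read(length).decode("utf-8")))
        elif (token & 0x3F) < 5 or token & 0x80:
            # Entities, PIs, literals, extensions, attributes: not in this sketch.
            raise NotImplementedError(f"token 0x{token:02X} not handled")
        else:
            name = tags[page][token & 0x3F]  # low six bits: tag identity
            if token & 0x40:                 # bit 6: element has content
                out.append(f"<{name}>")
                open_tags.append(name)
            else:
                out.append(f"<{name}/>")
    return "".join(out)
```

The sketch emits bare element names. A fuller converter would also attach the namespace that corresponds to each code page, so that the output matches the XML form of SyncML.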

For these SyncML investigations, I propose to convert the incoming WBXML representation of SyncML into its XML representation, to make debugging and module separation more convenient. This still leaves the issue of how to go back the other way.





Converting back to WBXML

If you thought interpreting WBXML was awkward, I'm afraid you will view creating WBXML even less favorably. Again, although there is a general-purpose algorithm for converting XML into WBXML, this is complicated by the application-specific utilization of extensions. To give you a good idea of what is involved, read this extract from the WAP Forum's WBXML specification (WAP-192-WBXML-20010725-a Version 1.3 -- see Resources for a link to this document).

The process of tokenising an XML document MUST convert all markup and XML syntax (i.e., entities, tags, attributes, etc.) into their corresponding tokenised format. All comments, the XML declaration, and the document type declaration MUST be removed. Processing instructions intended for the tokeniser MAY be removed; all other processing instructions MUST be preserved. All text and character entities MUST be converted to string (e.g., STR_I) or entity (ENTITY) tokens. All character entities in the textual markup (e.g., &amp;) which can be represented in the target character encoding MUST be converted to string form when tokenised. All others (i.e., those which can not be represented in the target character encoding) MUST be encoded using the ENTITY token. XML parsed entities (both internal and external) MUST be resolved before tokenisation. XML notations and unparsed entities are resolved on an application basis (e.g., using inline opaque data). Attribute names MUST be converted to an attribute start token (which, if so defined, will also specify all or part of the attribute value) or MUST be represented by a single LITERAL token. Attribute values MUST NOT be encoded using a LITERAL token.
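To make the extract more concrete, the following Python sketch tokenises a small, attribute-free XML fragment. It is only an illustration under stated assumptions: DEMO_PAGES and DEMO_TOKENS are hypothetical tables with invented token numbers, un-namespaced elements are assumed to live on code page 0, and all text is written as inline strings (STR_I). Comments and the XML declaration are dropped because the ElementTree parser discards them by default; processing instructions, entities, and text that follows a child element are not handled.

```python
import xml.etree.ElementTree as ET

# Global tokens defined by the WBXML specification.
SWITCH_PAGE, END, STR_I = 0x00, 0x01, 0x03

# HYPOTHETICAL mappings, invented for this sketch.
DEMO_PAGES = {"": 0, "syncml:metinf": 1}           # namespace -> code page
DEMO_TOKENS = {(0, "SyncHdr"): 0x05, (0, "Meta"): 0x06,
               (1, "MaxMsgSize"): 0x05}            # (page, name) -> token


def encode_mb_u_int32(value):
    """Encode an integer as 7-bit groups, high bit set on all but the last."""
    out = [value & 0x7F]
    value >>= 7
    while value:
        out.insert(0, (value & 0x7F) | 0x80)
        value >>= 7
    return bytes(out)


def split_tag(tag):
    """Split ElementTree's '{namespace}local' form into its two parts."""
    if tag.startswith("{"):
        namespace, _, local = tag[1:].rpartition("}")
        return namespace, local
    return "", tag


def encode_element(elem, state, out):
    namespace, local = split_tag(elem.tag)
    page = DEMO_PAGES[namespace]
    if page != state["page"]:                      # namespace change
        out += bytes([SWITCH_PAGE, page])
        state["page"] = page
    token = DEMO_TOKENS[(page, local)]
    text = (elem.text or "").strip()
    has_content = bool(text) or len(elem) > 0
    out.append(token | 0x40 if has_content else token)
    if text:
        out += bytes([STR_I]) + text.encode("utf-8") + b"\x00"
    for child in elem:
        encode_element(child, state, out)
    if has_content:
        out.append(END)


def xml_to_wbxml(xml_text):
    out = bytearray([0x03])                        # WBXML version 1.3
    out += encode_mb_u_int32(0x01)                 # public ID: unknown
    out += encode_mb_u_int32(106)                  # charset: UTF-8 (MIBenum)
    out += encode_mb_u_int32(0)                    # empty string table
    encode_element(ET.fromstring(xml_text), {"page": 0}, out)
    return bytes(out)
```

Note that the current code page is carried across elements, so returning from a Meta Information element to a SyncML element produces another SWITCH_PAGE token.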

One significant consideration with the WBXML encoding of SyncML is the length of each document. Wireless devices have relatively small amounts of memory, and clearly cannot process responses of arbitrary length. The most common example of this restriction can be seen in WML that's used for WAP pages, where each deck of pages should not exceed approximately 1500 bytes. Obviously, the chunking of output into appropriate sizes is a matter of negotiation between the application and the encoding module, as only the application knows the most appropriate place to break.

Rather than leaving it to guesswork, SyncML allows each device to indicate the maximum size of message it can handle. The Meta Information DTD's MaxMsgSize element is used for this purpose. For example, look at the extract in Listing 1, taken from the example.xml file in the accompanying download (see Resources).


Listing 1. Header information from a SyncML payload, showing meta information
		
<SyncHdr>
  <VerDTD>1.0</VerDTD>
  <VerProto>SyncML/1.0</VerProto>
  <SessionID>10</SessionID>
  <MsgID>1</MsgID>
  <Target>
    <LocURI>sync.example.com</LocURI>
  </Target>
  <Source>
    <LocURI>520327511080721</LocURI>
  </Source>
  <Cred>
    <Meta>
      <Format xmlns='syncml:metinf'>b64</Format>
      <Type xmlns='syncml:metinf'>syncml:auth-basic</Type>
    </Meta>
    <Data>ZDpk</Data>
  </Cred>
  <Meta>
    <MaxMsgSize xmlns='syncml:metinf'>2700</MaxMsgSize>
  </Meta>
</SyncHdr>
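A server might use this header roughly as follows. This is a sketch rather than a definitive implementation: the helper names are my own, the header is a trimmed copy of Listing 1, and the greedy batching strategy with a fixed per-message overhead is an assumption about how an application could respect MaxMsgSize, not something the protocol prescribes.

```python
import base64
import xml.etree.ElementTree as ET

METINF = "{syncml:metinf}"  # namespace of the Meta Information elements

# Trimmed copy of the header shown in Listing 1.
HEADER = """<SyncHdr>
  <SessionID>10</SessionID>
  <Cred>
    <Meta>
      <Format xmlns='syncml:metinf'>b64</Format>
      <Type xmlns='syncml:metinf'>syncml:auth-basic</Type>
    </Meta>
    <Data>ZDpk</Data>
  </Cred>
  <Meta>
    <MaxMsgSize xmlns='syncml:metinf'>2700</MaxMsgSize>
  </Meta>
</SyncHdr>"""


def max_msg_size(sync_hdr, default=None):
    """Return the client's declared maximum message size, if any."""
    node = sync_hdr.find(f"Meta/{METINF}MaxMsgSize")
    return int(node.text) if node is not None else default


def basic_credentials(sync_hdr):
    """Decode base64 'user:password' credentials sent with syncml:auth-basic."""
    cred = sync_hdr.find("Cred")
    if cred is None or cred.findtext(f"Meta/{METINF}Type") != "syncml:auth-basic":
        return None
    decoded = base64.b64decode(cred.findtext("Data")).decode("utf-8")
    user, _, password = decoded.partition(":")
    return user, password


def batch_items(items, limit, overhead=0):
    """Greedily group encoded items so each batch fits within `limit` bytes.

    `overhead` is a hypothetical allowance for the header and wrappers.
    """
    batches, current, size = [], [], overhead
    for item in items:
        if overhead + len(item) > limit:
            raise ValueError("item does not fit in a single message")
        if current and size + len(item) > limit:
            batches.append(current)
            current, size = [], overhead
        current.append(item)
        size += len(item)
    if current:
        batches.append(current)
    return batches


header = ET.fromstring(HEADER)
limit = max_msg_size(header)
credentials = basic_credentials(header)
```

With the header from Listing 1, the limit is 2700 bytes and the Data value ZDpk decodes to the user name and password pair d and d.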





Basic SyncML server requirements

That's enough bits and bytes for now. Let's conclude by looking at the basic features that a SyncML server is required to implement to provide useful data synchronization functionality.

At a minimum, the server must be able to understand the basic SyncML vocabularies. Additionally, it must support the vCard, vCalendar, vTodo, and RFC2822/RFC2045 specifications if it implements contacts, calendars, tasks, and e-mail respectively. (See Resources for links to these specifications.)

However, a server is not required to implement all of the SyncML protocol's functionality. Full details of conformance requirements can be found in section 7 of the SyncML Representation Protocol specification (see Resources for a link to the SyncML specifications). Table 1 describes the semantics of basic SyncML operations, and summarizes basic functionality.

Table 1: Description of minimum server commands for SyncML

| Command | Description in the context of a SyncML server |
| --- | --- |
| Add | Used to indicate to the server new additions made in the client's database (for example, a new entry in the phone book). |
| Alert | Used to carry notifications to the server. These are requests for synchronization that carry data about the state of the client's database. Refer to the Alerts with CmdID 2 and 3 in example.xml to see requests for synchronization of calendar and phone book. The associated code in the Data element specifies the type of request, in this case 201, which means "Slow Synchronization". A full list of codes can be found in the "Errata to SyncML Sync Representation" specification (see Resources). |
| Copy | Requests the creation of a copy of an item in a new location on the recipient's database. |
| Delete | Requests the permanent removal of an item from the server's database. |
| Get | Explicitly requests the retrieval of a data item with the requested URI from the server's database. Is used as a one-shot command outside of device synchronization. |
| Map | Used to maintain a map that correlates local resource identifiers to remote ones. For instance, an item on a phone might have a 2-byte identifier, while a 16-character string is used on the server as the same item's ID. |
| Put | Used to upload a data item to the server to the specified URI. For instance, in example.xml see the Put with CmdID 1. This requests the server to store the phone's capabilities (encoded using the SyncML Device Information DTD) at the relative URI ./devinf10. Put is used outside of device synchronization. |
| Replace | Requests the replacement of a specified object as part of synchronization. |
| Results | Used to carry the objects returned as a result of a request such as Get. |
| Status | Used to return status codes associated with requests. |
| Sync | Used to wrap a selection of commands (such as Add, Replace, and Delete) forming a synchronization. |
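To show how the commands wrapped by Sync might be applied on the server side, here is a small Python sketch. Everything in it is illustrative: DemoStore is a hypothetical in-memory database, commands are represented as plain tuples rather than parsed SyncML, and the status numbers (200, 201, 211, 406) reflect my reading of the SyncML status code list and should be checked against the specification before being relied on.

```python
class DemoStore:
    """Hypothetical in-memory stand-in for a server database keyed by URI."""

    def __init__(self):
        self.items = {}

    def add(self, uri, data):
        self.items[uri] = data
        return 201                      # assumed: "item added"

    def replace(self, uri, data):
        created = uri not in self.items
        self.items[uri] = data
        return 201 if created else 200  # assumed: added vs. plain OK

    def delete(self, uri, data=None):
        if uri in self.items:
            del self.items[uri]
            return 200
        return 211                      # assumed: "item not deleted"


def apply_sync(store, commands):
    """Apply (cmd_id, name, uri, data) tuples; return one status per command."""
    handlers = {
        "Add": store.add,
        "Replace": store.replace,
        "Delete": store.delete,
    }
    statuses = []
    for cmd_id, name, uri, data in commands:
        handler = handlers.get(name)
        # Commands this sketch does not implement get an assumed 406 status.
        code = handler(uri, data) if handler else 406
        statuses.append((cmd_id, name, code))
    return statuses


store = DemoStore()
statuses = apply_sync(store, [
    (1, "Add", "./contacts/1", "BEGIN:VCARD"),
    (2, "Replace", "./contacts/1", "BEGIN:VCARD;updated"),
    (3, "Delete", "./contacts/2", None),
])
```

Each tuple in the returned list corresponds to a Status element the server would send back, keyed by the CmdID of the command it answers.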

The basic requirements for a SyncML client are similar to those for a server. I'll explore these further as I get deeper into implementing the protocol itself in future installments of XML Watch.

SyncML employs the semantics of URIs to indicate items on the local and remote databases. This means that a file system would serve as a reasonable substrate for a synchronization database. With this in mind, the next installment will focus on the construction of a basic server that is able to use either WBXML or XML-encoded SyncML.
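As a rough idea of what a file-system substrate could look like, the sketch below maps relative URIs such as ./devinf10 onto paths under a root directory. The function names and the layout are assumptions made for illustration, and the sketch only covers relative URIs; it rejects empty URIs and any attempt to climb out of the root with "..".

```python
from pathlib import Path, PurePosixPath


def uri_to_path(root, loc_uri):
    """Map a relative URI onto a path below `root` (hypothetical layout)."""
    parts = [p for p in PurePosixPath(loc_uri).parts if p != "/"]
    if not parts or ".." in parts:
        raise ValueError(f"unsafe or empty URI: {loc_uri!r}")
    return Path(root).joinpath(*parts)


def put_item(root, loc_uri, data):
    """Store bytes at the location named by the URI, creating directories."""
    path = uri_to_path(root, loc_uri)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(data)


def get_item(root, loc_uri):
    """Return the bytes stored at the location named by the URI."""
    return uri_to_path(root, loc_uri).read_bytes()
```

A Put to ./devinf10 would then become a call such as put_item(root, "./devinf10", payload), and a later Get for the same URI would read the same file back.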



Resources



About the author

Edd Dumbill is managing editor of XML.com and the editor and publisher of the XML developer news site XMLhack. He is co-author of O'Reilly's Programming Web Services with XML-RPC, and co-founder and adviser to the Pharmalicensing life sciences intellectual property exchange. Edd was also program chair of the XML Europe 2002 conference. You can contact Edd at edd@xml.com.



