You hear a lot about XML messaging within the enterprise and over the Internet, but not a great deal about its use on the desktop. This isn't entirely surprising, as the various remote procedure call (RPC) systems in use in operating systems are pretty well entrenched. As a result, there isn't a big demand to change them. Additionally, developers involved with operating system messaging strategies tend to be suspicious of XML's verbosity.
But what if you had an application that needed to talk to other applications, and needed to achieve that interoperability in a very short period of time? XML suddenly has lots of advantages: Tool support is ubiquitous, it's easy to share a definition of permissible messages through a schema, and XML's plain-text format makes writing documentation for it simple.
That's pretty much the route that Nat Friedman and his fellow developers (see Resources) took when developing Dashboard, an open source application that provides a continuous real-time search of your desktop information to show you content relevant to what you are doing on your computer at any particular time. Dashboard's integration of other applications depends heavily on it being easy to alter those applications to communicate with it.
Dashboard receives contextual information from the focused application and uses it to query a selection of back-end stores. For instance, when reading an e-mail, your mail client will send Dashboard the e-mail address of your correspondent, along with the subject line and body text. The back-end stores then search for related information -- for instance, an address book back end will bring up an address card for your correspondent -- and feed that to the Dashboard user interface. As Figure 1 shows, the user's mobile phone receives a text message from Edd Dumbill, and Dashboard shows the user associated information it has stored about the sender:
Figure 1. Dashboard in action
It's easy to see useful applications for Dashboard. On the desktop, the user is currently lost among a collection of data islands, each bounded by the application the data belongs to. Dashboard provides a unifying channel for these applications and a chance for the user to get a lot more value out of each application's data. In a business setting, many daily operations depend heavily on context and related information. For example, in customer relationship management (CRM) it is vital that salespeople have at their fingertips all the data related to the customer they're dealing with.
The key thing about Dashboard is that it relies on the desktop's applications being able to send it hints -- or in Dashboard terminology, clues -- about what they are doing. From that perspective, using Dashboard is only practical either where you have strong leverage over the entire desktop (such as with Mac OS X) or in the open source world, where you can modify the relevant applications yourself. Falling into the latter of these categories, Dashboard is currently targeted at the GNOME open source desktop, and has been successful in creating a bubble of enthusiasm that has a lot of people (your author included) writing new code to interface with it.
Figure 2 shows an outline diagram of Dashboard's architecture, illustrating the flow of data between user applications, the main Dashboard engine, and the specialized queriable data store back ends:
Figure 2. A broad overview of Dashboard's architecture
Dashboard currently runs on x86 Linux, under Ximian's Mono .NET run time (see Resources), and is written in C#. It is experimental software and has not yet been released (something to bear in mind before embarking on too detailed a critique of its architecture). It is, however, capturing the imagination of a large number of developers and commentators. After Dashboard was demonstrated at the O'Reilly Open Source Convention this year, the room was abuzz, and even Linux desktop skeptic Tim O'Reilly was moved to wonder why Mac OS X hadn't come up with a Dashboard-like product yet.
How do applications tell Dashboard what they're doing? Enter the cluepacket, an XML document carrying typed context information. Cluepackets are sent from user applications, then distributed to each back end by the Dashboard engine. If any data from the back end matches data in the cluepacket, the back end returns a result in HTML. It then optionally augments the cluepacket with new metadata that might enable other back ends to get a better result.
Listing 1 shows an example cluepacket from Phone Manager, which was depicted earlier in Figure 1. Phone Manager is an application I wrote to send and receive text messages using a cell phone (see Resources).
Listing 1. An example cluepacket
<CluePacket>
<Frontend>Phone Manager</Frontend>
<Context>Text Message</Context>
<Focused>True</Focused>
<Clue Type="phone" Relevance="10">+447867093093</Clue>
<Clue Type="textblock" Relevance="10">
Will you email john@smith.com with today's
schedule? Thanks.
</Clue>
</CluePacket>
You can see that the main pieces of metadata a cluepacket carries are its source and -- for each contained clue -- its type, relevance, and name. The so-called cluetypes are currently an open-ended list that includes:
- date
- full_name
- chat IDs (aim_name, yahoo_name, msn_name, icq_name, jabber_name)
- org (the name of an organization)
- address
- url
You can find a link to a more complete list in the Resources section.
Some of these types are pretty free-form, and -- to the XML developer used to tying things down with schemas -- a source of discomfort. However, to imagine that you could unify information structuring among desktop applications overnight would be desperately naive, and even then you'd be faced with the inconsistency of humans entering that information. Dashboard's approach is to go for coarse-grained typing, and make the back end query engines take the weight of deciding what's significant. This seems logical enough, and draws some of its reasoning from the success of the approach that Google takes with the Web. For a metadata-obsessed person like myself, however, it seems somewhat of a shame not to record the richest metadata one can manage when it's actually available.
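To make the structure concrete, here is a minimal sketch in Python of assembling a cluepacket shaped like Listing 1. The element and attribute names are taken from the listing; the function name `build_cluepacket` and its tuple-based argument format are assumptions made for illustration and are not part of Dashboard.

```python
import xml.etree.ElementTree as ET


def build_cluepacket(frontend, context, clues, focused=True):
    """Assemble a cluepacket shaped like Listing 1 (hypothetical helper).

    clues is a list of (cluetype, relevance, text) tuples.
    """
    packet = ET.Element("CluePacket")
    ET.SubElement(packet, "Frontend").text = frontend
    ET.SubElement(packet, "Context").text = context
    ET.SubElement(packet, "Focused").text = "True" if focused else "False"
    for cluetype, relevance, text in clues:
        # Each clue carries its coarse-grained type and a relevance score.
        clue = ET.SubElement(packet, "Clue", Type=cluetype, Relevance=str(relevance))
        clue.text = text
    return ET.tostring(packet, encoding="unicode")


packet_xml = build_cluepacket(
    "Phone Manager",
    "Text Message",
    [
        ("phone", 10, "+447867093093"),
        ("textblock", 10, "Will you email john@smith.com with today's schedule? Thanks."),
    ],
)
print(packet_xml)
```

Because the typing is coarse, the sending application only has to pick a cluetype and a relevance; deciding what the text means is left to the back ends.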
The transmission protocol used for these cluepacket messages is currently a simple TCP/IP socket. Applications connect to the socket on localhost, spit the XML out, and close the socket again. The reasoning behind such a simple approach seems to be ease of implementation: The hardest problem to solve in writing an information aggregator of any sort is getting the buy-in from the data providers. Dashboard bundles several examples of how to do this. Adding support to Phone Manager, for instance, required only a few extra lines of C when I used Dashboard's convenience functions.
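The connect, write, and close sequence is short enough to sketch in a few lines of Python. This is an illustration only, not Dashboard's own convenience code: the function name `send_cluepacket` is hypothetical, the UTF-8 encoding is an assumption, and the port is left as a required argument because the port Dashboard listens on is not given here.

```python
import socket


def send_cluepacket(packet_xml, port, host="127.0.0.1", timeout=2.0):
    """Open a TCP connection, write the cluepacket, and close (hypothetical helper).

    The exchange is one-way: nothing is read back from the socket.
    """
    with socket.create_connection((host, port), timeout=timeout) as conn:
        # Assumption: the XML is sent as UTF-8 bytes.
        conn.sendall(packet_xml.encode("utf-8"))
```

An application would call something like `send_cluepacket(xml_text, port=dashboard_port)` whenever its focus or content changes, where `dashboard_port` stands for whatever port the local Dashboard instance is configured to use.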
Figure 3 shows a screenshot of how Web browsing can be augmented using Dashboard. The browser is viewing my personal weblog. It sends the URL and HTML body content to Dashboard. One of the back ends scans the HTML and finds some keywords in the <head> section, which it then links to my RDF Site Summary (RSS) and friend-of-a-friend (FOAF) files. The incoming clues are then augmented with clues of types keyword, rss, foafid, and rdfurl, which other back ends then use. The bookmarks back end then matches the keywords against my bookmarks, the RSS back end retrieves the latest four articles from my site, and the FOAF back end shows my personal information.
Figure 3. Enhanced Web browsing with the Epiphany Web browser and Dashboard
The communication taking place on the left side of Figure 2 is achieved differently from that on the right side. While the one-way communication between user applications and Dashboard is done through simple XML messaging, the communication on the right side is done by C# method calls. That is, all back ends must be implemented in .NET (for now, this means C#). Runtime reflection is used to instantiate and call each back end that Dashboard discovers when it starts. Back-end queries run in parallel, each in its own thread, because they might take some time to complete, especially if complex searching or network access is required to provide results.
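The fan-out to back ends can be pictured with a small Python stand-in. It is not Dashboard's C# back-end interface: the two back-end functions, their `(html, new_clues)` return convention, and the address book entry are all invented for illustration. It shows only the general pattern of running each query in its own thread, then collecting the HTML results and any clues added along the way.

```python
from concurrent.futures import ThreadPoolExecutor


def address_book_backend(clues):
    """Hypothetical back end: match phone clues against made-up sample data."""
    book = {"+447867093093": "Example Contact"}
    for cluetype, text in clues:
        if cluetype == "phone" and text in book:
            name = book[text]
            # Return an HTML result plus a new clue for other back ends.
            return "<p>%s</p>" % name, [("full_name", name)]
    return None, []


def keyword_backend(clues):
    """Hypothetical back end: flag text blocks that mention a schedule."""
    for cluetype, text in clues:
        if cluetype == "textblock" and "schedule" in text:
            return "<p>Mentions a schedule</p>", []
    return None, []


def query_backends(backends, clues):
    """Run every back end in a worker thread; gather HTML and added clues."""
    with ThreadPoolExecutor(max_workers=len(backends)) as pool:
        # pool.map returns results in the same order as the back-end list.
        outcomes = list(pool.map(lambda backend: backend(clues), backends))
    html_results = [html for html, _ in outcomes if html is not None]
    new_clues = [clue for _, extra in outcomes for clue in extra]
    return html_results, new_clues


clues = [
    ("phone", "+447867093093"),
    ("textblock", "Will you email john@smith.com with today's schedule? Thanks."),
]
html_results, new_clues = query_backends([address_book_backend, keyword_backend], clues)
```

In this sketch the added clues are simply returned to the caller; an engine could feed them into a second round of queries so that other back ends benefit from the augmentation.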
As I mentioned above, there is a curious imbalance in Dashboard: While any language can be used to send cluepackets, only a .NET language can be a back end. To address this and interface with Web applications, I wrote a small bridging class that rebroadcasts cluepackets to an HTTP server using HTTP POST. The cluepacket is sent as a parameter to the POST, and a return value of either HTML or XML is expected. (Currently, my code is rather inefficient and does the POST twice -- once for the HTML result and once for any XML clue augmentation.) The code for the bridge is available from GNOME CVS (see Resources). I've used it successfully to integrate Dashboard with FOAFbot, an aggregating spider for FOAF files that I have mentioned in several previous XML Watch columns (see Resources). Because many application frameworks can expose HTTP server interfaces, the HTTP bridge might prove useful for legacy integration.
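The bridge itself is a C# class, but the request it makes can be sketched in Python. Everything specific below is an assumption made for illustration: the form field name `cluepacket`, the form-encoded body, the UTF-8 response decoding, and the helper names. The sketch shows only the shape of the exchange, with a cluepacket going out as a POST parameter and HTML or XML text coming back.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_bridge_request(url, packet_xml, field="cluepacket"):
    """Wrap a cluepacket as a form-encoded POST parameter (hypothetical field name)."""
    body = urlencode({field: packet_xml}).encode("ascii")
    return Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


def post_cluepacket(url, packet_xml, timeout=5.0):
    """Send the cluepacket and return the response body as text."""
    with urlopen(build_bridge_request(url, packet_xml), timeout=timeout) as response:
        # Assumption: the server answers with UTF-8 encoded HTML or XML.
        return response.read().decode("utf-8")
```

On the receiving side, any Web framework that can read a POST parameter can act as a back end, which is what makes this kind of bridge attractive for applications that cannot be rewritten in a .NET language.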
Dashboard is a very good first attempt at trying to bring together islands of data on the desktop. Its use of XML might well be a harbinger of the way XML could liberate desktop data further in the future. Its authors have made some interesting choices in order to maximize its adoption:
- Dashboard has avoided the temptation of aiming for perfection, but instead espouses an experimental development approach.
- Over-engineering has been avoided in favor of simple solutions that work. Namespaces, RDF, and URIs could easily drift into an application like this, but focusing on that at the beginning would only make it less accessible for the all-important user application instrumentation.
- Dashboard's use of XML and a drop-dead simple network protocol brings many players in quickly -- having the source data available is vital to the experimentation needed to find out what architecture the Dashboard engine needs.
In the immature genre of unified desktops, Dashboard might well turn out to be the one you make to throw away after a sufficient amount of experimentation has been achieved. In the meantime, the combination of open source, a simple infrastructure, and XML means that it's attracting a lot of attention and contributors.
- Find the latest news and status information about Dashboard on the Dashboard weblog, maintained by Nat Friedman.
- Dashboard is written using the Mono .NET implementation for Linux.
- The Dashboard code can be browsed using the GNOME CVS Web view.
- Dashboard runs under the GNOME open source desktop.
- The author's Phone Manager application sends a notification to Dashboard when it receives a text message.
- A full list of cluetypes can be found in the Dashboard documentation.
- Dashboard's HTTP Bridge back end enables non-.NET legacy applications to act as back ends.
- Read more about FOAF in XML Watch: Finding friends with XML and RDF (developerWorks, June 2002), and XML Watch: Support online communities with FOAF (developerWorks, August 2002).
- Discover more about expressing metadata in RDF and XML in An introduction to RDF (developerWorks, December 2000).
- Find more XML resources on the developerWorks XML zone. Read previous installments in the XML Watch column series.
- IBM's DB2 database provides not only relational database storage, but also XML-related tools such as the DB2 XML Extender, which provides a bridge between XML and relational systems. Visit the DB2 Developer Domain to learn more about DB2.
- Find out how you can become an IBM Certified Developer in XML and related technologies.
Edd Dumbill is managing editor of XML.com and the editor and publisher of the XML developer news site XMLhack. He is co-author of O'Reilly's Programming Web Services with XML-RPC, and co-founder and adviser to the Pharmalicensing life sciences intellectual property exchange. Edd is also program chair of the XML Europe conference. You can contact him at edd@xml.com.





