We have some marshalling* code that takes objects and emits XML to send across the Internet. We've had a Java-based demarshaller for a long time, but I needed to write a demarshaller for the Ajax-based web client. The XML was challenging to parse because we'd used a structure that favored XML elements over attributes. For example, say we marshalled a list of "person" objects. Here's what they'd look like coming from the element-centric marshaller:
<person>
  <firstName>Bill</firstName>
  <lastName>Higgins</lastName>
  <emailAddress>email@example.com</emailAddress>
  <city>Durham</city>
  <state>NC</state>
</person>
<person>
  <firstName>Lou</firstName>
  <lastName>Gerstner</lastName>
  <emailAddress>firstname.lastname@example.org</emailAddress>
  <city>Armonk</city>
  <state>NY</state>
</person>

Listing 1: Element-based XML; easier for humans, harder for computer programs.
Now compare the above with XML generated by an attribute-centric marshaller:
<person firstName="Bill" lastName="Higgins" emailAddress="email@example.com" city="Durham" state="NC"/>
<person firstName="Lou" lastName="Gerstner" emailAddress="firstname.lastname@example.org" city="Armonk" state="NY"/>

Listing 2: Attribute-based XML; harder for humans, easier for computer programs.
I spoke with Balaji (our marshalling guru) about changing the element-centric version (Listing 1) to the attribute-centric version (Listing 2), and it turned out to be a single-line code change. So my natural question was "why were we using the element-centric version, which is much trickier for a programming language to parse?" The answer was that several people considered the element-centric version easier to read, which I must admit it is, especially once the objects start having many attributes and the lists of objects get long.
This is what puzzled me. Structurally, the element-based form is much more complex, yet most people considered it much easier to read. So why is it easier for humans to "parse" one version, and easier for a programming language to parse the other? Both are textual in nature, so it's not like the apples-to-oranges question "why is it easier for computers to understand binary than text, while binary is inscrutable to humans?" No, this one is more subtle. The only thing I can think of is that the human brain is sophisticated enough that, rather than being put off by the extra structure, it takes advantage of it to chunk the information into digestible nuggets. That is, the element-based form turns each "person" definition into something resembling a paragraph rather than a run-on sentence.
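To make the machine side of that comparison concrete, here is a rough sketch in Python (not the JavaScript of the actual Ajax client, and not the real demarshaller). The <people> wrapper element and the function names are assumptions made up for illustration; the wrapper is there only because an XML parser needs a single root element.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; the <people> root is an assumption, not part of the real output.
ELEMENT_XML = """<people>
  <person>
    <firstName>Bill</firstName>
    <lastName>Higgins</lastName>
    <city>Durham</city>
    <state>NC</state>
  </person>
</people>"""

ATTRIBUTE_XML = """<people>
  <person firstName="Bill" lastName="Higgins" city="Durham" state="NC"/>
</people>"""


def parse_element_centric(xml_text):
    """Each field is a child node, so every person needs its own inner walk."""
    people = []
    for person in ET.fromstring(xml_text).iter("person"):
        fields = {}
        for child in person:
            # The value lives in the child's text, which may be missing or padded.
            fields[child.tag] = (child.text or "").strip()
        people.append(fields)
    return people


def parse_attribute_centric(xml_text):
    """Each field is already a name/value pair on the <person> element."""
    return [dict(person.attrib) for person in ET.fromstring(xml_text).iter("person")]


print(parse_element_centric(ELEMENT_XML))
print(parse_attribute_centric(ATTRIBUTE_XML))
```

Both functions produce the same list of dictionaries, but the attribute-centric one never has to look below the person element. With a friendly tree library the gap is small; in a browser DOM, the element-centric walk also has to skip the whitespace text nodes between child elements, which is where much of the extra fiddliness tends to come from.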
Anyhow, I'm not really sure where this post was going, but I'm becoming more and more interested in how to make things easier, whether the "user" is an application user, a developer coding against your API, or a computer program consuming some data you're emitting. Based on this example, it seems pretty clear that the rules of simplicity vary based on the "user".
PS and FYI - a side-effect of the move to Roller is that readers no longer must register with developerWorks to comment, thus greatly simplifying the feedback process and hopefully resulting in more feedback!
* A marshaller is a program that converts information from typical in-memory data structures to a form (like XML) that is easy to transmit across the network; more on this in a later blog.