XML and Java technology: Sun's Java and XML APIs: Helping or hurting?

Is Sun wrapping APIs or taking them over?

Level: Introductory

Brett D. McLaughlin, Sr. (brett@newInstance.com), Author and Editor, O'Reilly Media, Inc.

10 Jul 2007

Long-time readers know that Brett McLaughlin loves SAX and doesn't much mind the DOM. What he doesn't particularly like is that Sun has gone from wrapping these APIs in packages like JAXP (early versions) to almost completely taking them over in the latest versions of Java™ technology and the Java 2 Platform, Standard Edition 1.2-1.4 (J2SE).

Remembering JAXP (in the old days)

For programmers newer to the Java and XML scene, or who come to XML through the lens of Sun and the J2SE, it's worth briefly remembering JAXP in its earliest days. At that stage, JAXP was the third API to the Java and XML party, following (well) after the introduction and popularity of SAX, the Simple API for XML, and the DOM, the Document Object Model. And JAXP's goal was simple: to make using SAX and the DOM easier, in particular in the area of vendor neutrality.

JAXP was a wrapper API

JAXP was initially intended merely to provide convenience and vendor-neutrality to SAX and DOM. In light of that, it was never intended as a replacement for SAX or the DOM; in fact, in its earlier versions, JAXP had very few methods, and two of those were getXMLReader() and getDOMParser(). These methods made it pretty obvious that JAXP's authors intended developers to use JAXP and then work with the underlying SAX and DOM implementation classes.

It's also important to note that while JAXP has added a lot of functionality over the years, those two methods were never modified. While some might argue that this was a decision based on backwards-compatibility, it really reflects that JAXP simply was never intended to replace SAX or DOM, but just to wrap them and let developers get into SAX or DOM without lots of vendor-specific code.

JAXP provided vendor-neutrality

In the early days of Java and XML programming, there were a lot of XML parsers (Xerces, XML4J in its prime, Sun's Crimson, Oracle's XML parser, and several others that few people today have ever heard of). When you wrote an application that worked and interacted with XML, you had to connect your SAX and DOM APIs to these parser implementations, usually by telling SAX or the DOM about the parser class name, sort of like this:

Parser parser = new org.apache.xerces.parsers.SAXParser();

Note: I used the older SAX 1 Parser interface intentionally; it was what we all worked with at the time JAXP became an issue.

JAXP introduced a system property, javax.xml.parsers.SAXParserFactory, which allowed you to specify the parser factory implementation that provided the parser you wanted to use. You can set it with System.setProperty(), or in a jaxp.properties file in the lib directory of your Java runtime; JAXP can also discover a factory through a META-INF/services entry in a JAR on your application's classpath.

However you chose to specify this property (or its DOM counterpart, javax.xml.parsers.DocumentBuilderFactory), you avoided ever putting any class names in your parsing code. That was the initial reason for JAXP to exist: to remove any chance of you putting that sort of information directly into your code. You can change out parsers easily by changing that property value, or even by keeping multiple versions of jaxp.properties around for different parser implementations and switching them out as needed.
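To make the vendor-neutral approach concrete, here is a minimal sketch of parsing through the JAXP factory rather than a vendor class. The class name VendorNeutralParse and the method rootElementName are hypothetical, invented for illustration; the commented-out factory class name is a placeholder, not a real implementation.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, for illustration only.
public class VendorNeutralParse {

    // Returns the qualified name of the document's root element.
    public static String rootElementName(String xml) throws Exception {
        // Optional: choose a specific factory without touching the parsing code.
        // The class name below is a placeholder, not a real implementation.
        // System.setProperty("javax.xml.parsers.SAXParserFactory",
        //                    "some.vendor.SAXParserFactoryImpl");

        // No vendor class name appears here; JAXP locates the implementation.
        SAXParserFactory factory = SAXParserFactory.newInstance();
        SAXParser parser = factory.newSAXParser();

        final String[] root = new String[1];
        parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                // Record only the first element seen, which is the root.
                if (root[0] == null) {
                    root[0] = qName;
                }
            }
        });
        return root[0];
    }
}
```

Calling VendorNeutralParse.rootElementName("<order><item/></order>") should return "order" with whichever parser the runtime supplies; swapping parsers then becomes a configuration change rather than a code change.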



JAXP today

You can argue about what Sun originally intended JAXP to be, and even disagree about its value, but all of that's a bit moot at this point. What isn't moot, and is worth continued discussion, is what JAXP is now and how it's being used. With JAXP now part of every Java release [J2SE and Java Platform, Enterprise Edition 5 (Java EE); Java Platform, Micro Edition (Java ME), is an obvious exception], probably 95% of all Java and XML developers use it.

JAXP as a substitute for SAX and DOM

More and more, JAXP no longer wraps SAX and DOM, but actually replaces them. Now keep in mind that JAXP is not a true parsing API on its own, in the sense that it requires a SAX or DOM parser to operate. So JAXP can never functionally replace SAX or DOM; however, it can practically replace them, in that developers stop using methods or classes in the SAX package (org.xml.sax) or the DOM package (org.w3c.dom).
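A short sketch may help show where JAXP ends and the DOM begins. JAXP supplies the factory and the builder, but everything you get back lives in org.w3c.dom and is backed by whatever parser implementation the runtime found. The class JaxpThenDom and its method names are hypothetical, chosen only for this illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical helper class, for illustration only.
public class JaxpThenDom {

    // JAXP gets you a builder; the result is a plain org.w3c.dom.Document.
    public static String rootTagName(String xml) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));

        // From here on, only DOM interfaces are involved.
        Element root = doc.getDocumentElement();
        return root.getTagName();
    }

    // Reports which concrete factory class JAXP located at run time.
    public static String factoryImplementation() {
        return DocumentBuilderFactory.newInstance().getClass().getName();
    }
}
```

The value returned by factoryImplementation() depends on the runtime and its configuration, which is the point: the parser doing the work is not JAXP itself.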

One way to verify this is to simply ask developers. Because Sun promotes (understandably so) its own APIs, and because JAXP just "comes with" current versions of Java technology, many developers are introduced to XML through JAXP. Naturally, they learn to use JAXP, even though most developers are better served by coming to JAXP after they already have a solid understanding of SAX and DOM on their own. In fact, many developers don't even realize that SAX and the DOM are the underpinnings of JAXP at all!

All of this adds up to JAXP increasingly obscuring SAX and the DOM from the view of many developers. Talking about a ContentHandler or a DOMImplementation is largely a thing of the past, or at least relegated to pretty high-end Java and XML programmers. That's very different from even five years ago, when JAXP was still evolving and taking off; at that point, developers were more balanced in that they generally were comfortable with at least one of SAX or the DOM (and in many cases both) in addition to JAXP.

Adding functionality, or giving it back?

Even more important than being balanced, though, is that using only JAXP (instead of using SAX or DOM directly, or using JAXP along with those APIs) dramatically limits XML programming and parsing functionality. Because JAXP really is just a wrapper API (no matter how people are using it), it simply can't support every option that SAX and the DOM provide. While you can set features and properties on parsers with JAXP, and deal with content and basic error handling, SAX in particular offers a lot of events related to grammars (DTDs and schemas) and more advanced lexical events like comments and CDATA section boundaries. To get at these events, you have to work directly with SAX, starting from the XMLReader interface.

Keep in mind that I don't advocate dropping JAXP altogether. You can use JAXP for its original purpose—access to a parser without dealing directly with vendor parser classes in your code—and then use JAXP's getXMLReader() method to get the SAX XMLReader interface. From there, it's easy to work with SAX directly—but all this assumes that you know how to work with the XMLReader interface in the first place.
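As a rough sketch of that combination, the following uses JAXP only to obtain an XMLReader and then works in SAX to receive comment events, which the basic DefaultHandler callbacks do not report. The class CommentCollector and its method are hypothetical; the sketch also assumes the underlying parser recognizes the standard SAX lexical-handler property, which a parser is permitted not to support.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

// Hypothetical helper class, for illustration only.
public class CommentCollector {

    // Returns the text of every XML comment found in the document.
    public static List<String> comments(String xml) throws Exception {
        // JAXP's only job here: hand over a parser without naming a vendor.
        XMLReader reader =
            SAXParserFactory.newInstance().newSAXParser().getXMLReader();

        final List<String> found = new ArrayList<>();

        // DefaultHandler2 implements LexicalHandler in addition to ContentHandler.
        DefaultHandler2 handler = new DefaultHandler2() {
            @Override
            public void comment(char[] ch, int start, int length) {
                found.add(new String(ch, start, length));
            }
        };

        reader.setContentHandler(handler);
        // Standard SAX 2 property ID for registering a LexicalHandler.
        // Assumption: the parser in use supports this optional property.
        reader.setProperty("http://xml.org/sax/properties/lexical-handler", handler);

        reader.parse(new InputSource(new StringReader(xml)));
        return found;
    }
}
```

The JAXP portion is one line; the rest is ordinary SAX, which is why knowing the XMLReader interface and its handlers still matters.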

So the point of all this is that JAXP ostensibly exists to add functionality (vendor-neutrality, plus some convenience and helper methods) but could in practice be removing it. If developers become too dependent upon JAXP, which is probably already the case, then it's very easy to forget, or never even realize, that JAXP doesn't expose a lot of the functionality in SAX and the DOM. So while JAXP is meant to offer functionality, it can actually reduce the tools in a Java and XML programmer's toolbox.

Open source issues?

I left this issue for last, largely because it gets into ethics, legalities, and all sorts of things that most programmers find dry and boring; and as if that weren't enough, it mixes in some philosophy of open source. In general, though, I wonder if Sun, while abiding by the letter of the law, isn't violating the spirit of the law (of open source and community). When they took the SAX API under their roof and then began to add functionality to the JAXP portion of things, why did they not simply roll this functionality into SAX itself? It certainly would still be possible to add JAXP methods that called into that new functionality (that's what most of their API is). But could not some of this have been submitted back into the SAX API as a whole?

In fact, if JAXP is as useful as Sun wants us all to believe (and I'm not necessarily arguing its usefulness, to be clear), then why would they not make that functionality available to those of you (us!) who, at least sometimes, prefer to work with SAX alone and not involve JAXP? Lest there be any confusion, you'd still have to use Java technology, so Sun wouldn't lose any business (and even that is a joke, as Sun doesn't sell Java technology) by making that functionality available not just through JAXP, but in SAX itself. Again, this is a fairly minor point, but it's worth thinking about: if JAXP offers so much value, might not some of that value be donated back to the underlying APIs? JAXP would still have the benefit of making translation between SAX and DOM easy, but this would be a nice nod to the XML community.



In conclusion

I don't want to start a rant here (although I wouldn't mind it if some of you did on the discussion forums), but I do question JAXP—and other Sun APIs that have followed in the same vein—and its evolution from being a wrapper API to a "you don't need anything else" parser API. I think that JAXP has obscured the value of learning SAX and the DOM APIs on their own, without really providing significant value in return. I'd much rather Sun keep JAXP as a light layer for vendor-neutrality (for those that need it), and leave the parsing methods and behavior firmly in the territory of XML parser and API vendors.

Of course, Sun hasn't called me asking for my opinion, and Scott McNealy certainly has never heard of me, so this isn't going to ring any bells over at Sun. However, I imagine if you're a Java and XML programmer, it has at least made you question your use of JAXP—that's certainly the intention. If programmers would (re-)learn SAX and the DOM, we could begin to use JAXP intelligently again, and in fact write better applications, because we're able to work with XML documents at a lower level, and with more competence. And then even Scott might notice the killer apps we're writing, wouldn't he? So you tell me... is any of this making sense? I hope I'll see you over at the developerWorks forums, and you'll let me know what you think.




About the author

Brett McLaughlin has worked in computers since the Logo days. (Remember the little triangle?) In recent years, he's become one of the most well-known authors and programmers in the Java and XML communities. He's worked for Nextel Communications, implementing complex enterprise systems; at Lutris Technologies, actually writing application servers; and most recently at O'Reilly Media, Inc., where he continues to write and edit books that matter. Brett's upcoming book, Head Rush Ajax, brings the award-winning and innovative Head First approach to Ajax. His last book, Java 1.5 Tiger: A Developer's Notebook, was the first book available on the newest version of Java technology. And his classic Java and XML remains one of the definitive works on using XML technologies in the Java language.




Java and all Java-based trademarks are trademarks of Sun Microsystems, Inc. in the United States, other countries, or both. Other company, product, or service names may be trademarks or service marks of others.