Long-time readers of this column are probably expecting the third part in my coverage of JaxMe. As you may recall, I was going to show you how to use JaxMe to insert and retrieve data from a database, and work through JaxMe's other enhancements to the basic data binding facilities that it also offers.
And that's still coming -- in my next column. Right now, I want to take just a little time to explore some of the issues that have been increasingly prominent in my inbox, and that of course have relevance to this data binding series. So read on, and know that JaxMe is still on tap for next time.
While my last several articles have been useful (I hope) from the perspective of a specific implementation, my goal is to make this article down-to-earth and practical. Most of the stuff I read in books and magazines makes me wonder if the authors have actually used the things they're talking about. I mean, they document the API correctly, and nothing is technically wrong -- but that might apply to handing the keys of a car to a four-year-old. Just tell him how to start it up and let him rip, right? Well, of course not!
What I value is not just documentation about an API, but information on how best to use that API. When does it work well? When does it bog down and kill your application? Even better, when is it just not worth the time and effort required to introduce a new API into your application framework? These are the questions that I care about, and I'm going to go out on a limb and assume you do, as well.
So I want to begin with those areas in data binding that can make you run screaming. For the most part, these are not API-specific (I note the exceptions), but apply to all data binding implementations. They are all common problems that people encounter when working with a data binding API -- for that matter, many apply to all new and cool APIs.
Here's the single most common mistake that's made with any new and exciting API -- confusing it with a fad. This generally involves two classic blunders:
- Throwing an API in without proper thought and testing
- Using an API for every possible task
These are both bad ideas for a number of reasons. Take the first problem, for example. When a new API comes along and it looks like it might be helpful, many developers tend to simply ignore all the normal steps involved with adding technology. For example, you wouldn't just change from JDBC to EJB without some long meetings, a ton of planning, and an enormous amount of forethought. In the same vein, it's pretty irresponsible to rip out all of your SAX and DOM code and replace it with JAXB or JaxMe (or Coins, or Zeus, or XMLBeans, or anything else you come up with) on a whim.
Surprisingly, though, this is just what many developers are doing. SAX is a pain to code in; callbacks aren't very OO, and it takes a long time to get good at it. DOM is easy to use (except when it's unnecessarily hard, like getting the text from an element), but can be bloated and a memory hog. Still, that's no reason to toss these APIs out just because data binding is the newest XML-related toy on the shelf. Instead, take your time and look into the new technology. Read articles like this, play with sample code, and generally be cautious. It may slow things down, but you will save yourself a ton of headaches.
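To make the DOM complaint concrete, here is a minimal sketch of what "getting the text from an element" involves when you walk the child nodes by hand. The class and method names (DomTextSketch, parse, directText) are made up for this illustration; only the standard javax.xml.parsers and org.w3c.dom calls come from the JDK.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

class DomTextSketch {

    // Hypothetical helper: parses an XML string into a DOM tree.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    // Collects the text directly inside an element by visiting each child node
    // and keeping only text and CDATA nodes (nested elements are skipped).
    static String directText(Element element) {
        StringBuilder text = new StringBuilder();
        NodeList children = element.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            short type = child.getNodeType();
            if (type == Node.TEXT_NODE || type == Node.CDATA_SECTION_NODE) {
                text.append(child.getNodeValue());
            }
        }
        return text.toString();
    }

    public static void main(String[] args) {
        Document doc = parse("<address><city>Dallas</city></address>");
        Element city = (Element) doc.getElementsByTagName("city").item(0);
        System.out.println(directText(city)); // prints "Dallas"
    }
}
```

That is a lot of ceremony for one string. DOM Level 3 added Node.getTextContent(), which shortens the common case, but the loop above is the kind of code that makes a generated getCity() method look so attractive.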
The other temptation is to use a new technology everywhere. In other words (using the specific example of a data binding API), don't assume that you want to use data binding every time your code works with XML. This is naive, an over-simplification, and will almost certainly hurt your application in the long run. Instead, you need to look at each possible usage, and evaluate it separately. For example, it may not make sense to add the overhead of transitioning to data binding because you have some proprietary Perl scripts that read XML just fine -- in which case you should just leave it. Save the work and effort for tasks that don't work as well as they should.
I know I'll catch all sorts of flak here, but I'm going to say it anyway: Going with a standards-based solution isn't always the best idea. Now to be clear, I get that Sun is putting all sorts of work into Java technology, its Java Specification Requests (JSRs), and the API -- I'm not knocking any of that. I'm simply saying that Sun-endorsed isn't the be-all and end-all of a technology. If it were, nobody would care about (for example) Struts or Spring or Hibernate. But people do care about those things, because they work. None of these are standards, but they're all crucial parts of thousands of applications.
Of course, that isn't an admonition to go to the other extreme -- don't avoid something just because it is a standard. I'm as much an open source advocate as the next guy -- probably more, in fact -- but I have several cases where I prefer Sun's solutions to equivalent open source projects. You've got to choose the best tool for the job, whatever it may be (and whomever it may be affiliated with). For your application, JAXB (from Sun) may be just what you need; or you might hate JAXB, but JaxMe is just the ticket (open source); or, you might move away from specifications entirely and use something like Quick or XMLBeans. The choice is certainly yours.
Standards can be the best option
I know -- you're thinking, "Didn't he just say that standards don't matter?" No, I said that standards aren't a trump card that makes all other considerations worthless. However, choosing a standard does have its benefits. Here are the two most important (at least in my esteemed opinion!):
- Some (at least minimal) guarantee of stability and conformity
- Some (at least minimal) degree of cross-platform compatibility
You should note that in both cases, I've added "at least minimal." These are factors to consider, but unless you're looking at a feature in the core JDK (like generics, or regular expressions), these guarantees are not super-substantial. They are important, however, and I'll take each in turn.
First, a standard is -- well -- a standard. It's an established set of rules and reasoning that you can count on. And when you use an API that adheres to a standard, you get those benefits. You don't have to worry, for example, about a JSP-compliant container properly interpreting your JSP pages, if they are also JSP-compliant. If something goes wrong, it's a bug, and will (presumably) be fixed quickly. That's quite nice, and not always the case with projects that aren't based on standards, and don't have as high a bar to maintain. For example, JAXB code works with JAXB-compliant APIs, like JAXB's reference implementation and JaxMe. (You have to change a line or two, but that's it.) Again, this isn't the only thing to consider, but having that degree of stability and some expectation of behavior is important. You'll often find that it's particularly salient when you have a skeptical manager.
Second, when you work with standards -- and particularly implementations of a standard -- you increase your chances of translating your code quickly to other platforms, machines, and development environments. For example, if you deploy an application based on XMLBeans, you'll of course need to ensure that the XMLBeans JAR files go with the application. While that's pretty normal, things get trickier if you begin to cooperate with other companies, rather than just deploy applications.
Consider your data binding code, based on a project that is non-standard like Quick or Castor -- you get it working, you get your XML behaving properly and formatted for use in your data binding API, and you're ready to send that generated XML to some other cooperating company. Of course, they're using some other API (whether it's based on a standard or not is more or less inconsequential), so it doesn't interpret that XML you've worked so hard on in the same way, and things don't work. This is the classic contract issue -- how do you ensure that two ends of a communication chain can work with the data coherently? A standardized API, like JAXB, can really help out in these situations.
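Whatever API sits on each end, one way to make the contract explicit is to agree on a schema and check outgoing documents against it. The sketch below is only an illustration under that assumption: the schema, the element names, and the class ContractCheckSketch are all hypothetical, and it uses the JDK's javax.xml.validation package rather than any particular data binding framework.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

class ContractCheckSketch {

    // Hypothetical schema that both companies have agreed on.
    static final String ADDRESS_XSD =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='address'>"
        + "<xs:complexType><xs:sequence>"
        + "<xs:element name='street' type='xs:string'/>"
        + "<xs:element name='city' type='xs:string'/>"
        + "</xs:sequence></xs:complexType>"
        + "</xs:element>"
        + "</xs:schema>";

    // Returns true only if the document is well-formed and matches the schema.
    static boolean conforms(String xml) {
        try {
            SchemaFactory factory =
                    SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema =
                    factory.newSchema(new StreamSource(new StringReader(ADDRESS_XSD)));
            Validator validator = schema.newValidator();
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the document breaks the agreed contract
        } catch (IOException e) {
            throw new IllegalStateException("could not read the in-memory document", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(conforms(
                "<address><street>1 Main St</street><city>Dallas</city></address>"));
        System.out.println(conforms("<address><city>Dallas</city></address>"));
    }
}
```

The first document has both required elements in order; the second is missing street, so the validator rejects it. A check like this catches a mismatch on your side of the wire, before the other company's code has a chance to misread the data.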
So now that I've been suitably negative, let me talk some about the right way to use these technologies. The things detailed here are all pretty simple; there's nothing particularly brilliant about them, but they can make an enormous difference in how your applications run, and how they are maintained.
You should also understand that these are not comprehensive solutions; you may find that in some cases, one or two of these ideas won't fit with your architecture. In other cases, you may simply not agree on their value. These are perfectly legitimate thoughts and courses of action -- the key thing is that you want to spend significant time thinking about your infrastructure. Don't just drop in a solution or technology; instead, carefully plan and evaluate the consequences of every decision you make. Not only will your boss appreciate it, but you'll find that you're not in the office at 2 AM as frequently.
I've talked about abstraction in this column, and in many of my other articles, numerous times. But it's so important that I think it's worth mentioning again. Simply, you can find no substitute -- no amount of clever programming -- that is worth as much as modular design. If you are going to use data binding, don't assume that you'll always use data binding. Instead, abstract the functionality that you want to implement. For example, data binding is most commonly used to read and write data to some storage medium. In other words, it's largely about data persistence. So don't think about "the part of my application that does data binding." Instead, think about "the part of my application that handles persistence of this particular bit of data."
The difference between these two approaches will completely change the way you think about application design. If you think about it in terms of data binding, you'll have all sorts of code spread throughout your application, calling directly into the data binding API (whether that be JAXB, Zeus, or something else). As a result, if you ever change the specific API you're using, or -- worse yet -- move away from data binding to something like JDO or another new technology, you're going to have to make huge changes in your application code. That's bad, bad, bad.
On the other hand, suppose you simply consider this to be an API (of your own) that persists data. When you look at things this way, you'll probably realize that you want a thin application-specific API. It might have calls like writeAddress(Address address) or loadUserCredentials(int userID). Your application will make calls into this API. Then, your shell API is going to use something -- perhaps data binding -- to accomplish its job. If you change that mechanism to another project or technology, you're only changing your shell classes, which are all (presumably) in a single package or two, and certainly consolidated in your application. The rest of your application happily chugs along, calling that same API -- blissfully unaware of how the loading and storing is being accomplished. Now that's good design.
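A thin shell API of this kind might look something like the following sketch. Only writeAddress(Address address) comes from the discussion above; loadAddress, the Address fields, the in-memory implementation, and the ShippingLabels caller are hypothetical stand-ins chosen to keep the example self-contained. A call such as loadUserCredentials(int userID) would follow the same shape.

```java
import java.util.HashMap;
import java.util.Map;

// The data being persisted (fields are assumptions for this sketch).
final class Address {
    final int userID;
    final String street;
    final String city;

    Address(int userID, String street, String city) {
        this.userID = userID;
        this.street = street;
        this.city = city;
    }
}

// The thin, application-specific persistence API. Nothing here mentions
// data binding, XML, or any other storage mechanism.
interface AddressStore {
    void writeAddress(Address address);

    // Hypothetical counterpart to writeAddress; returns null if nothing is stored.
    Address loadAddress(int userID);
}

// One interchangeable implementation. A JAXB- or JaxMe-backed class could
// replace it without touching any caller.
final class InMemoryAddressStore implements AddressStore {
    private final Map<Integer, Address> addresses = new HashMap<>();

    public void writeAddress(Address address) {
        addresses.put(address.userID, address);
    }

    public Address loadAddress(int userID) {
        return addresses.get(userID);
    }
}

// Application code depends only on the interface.
final class ShippingLabels {
    private final AddressStore store;

    ShippingLabels(AddressStore store) {
        this.store = store;
    }

    String labelFor(int userID) {
        Address a = store.loadAddress(userID);
        return a == null ? "(no address on file)" : a.street + ", " + a.city;
    }
}

class PersistenceSketch {
    public static void main(String[] args) {
        AddressStore store = new InMemoryAddressStore();
        store.writeAddress(new Address(7, "1 Main St", "Dallas"));
        System.out.println(new ShippingLabels(store).labelFor(7)); // prints "1 Main St, Dallas"
    }
}
```

Swapping the storage mechanism means writing another AddressStore implementation and changing the one line that constructs it; ShippingLabels never notices.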
I realize that testing isn't a popular word these days, but it's still an important one. At the last software company where I worked, I was lucky enough to have an entire division that did nothing but testing. Those guys could break anything! But the applications that rolled out after that testing were nearly indestructible. It was incredibly valuable, and well worth the extra time, effort, and money it took to release a product.
Fortunately, concepts like extreme programming and agile development have made it possible to get a lot of these same benefits without an entire team of people whacking on your code. This presumes that you are actually using one of these approaches to development. Time and space obviously don't permit me to get into the specifics of team development, test-driven development, use-cases, JUnit, and the like. That said, there is a wealth of resources on the subject, and you really owe it to yourself to check them out.
If you're not sure if you're doing enough testing, here's a really great rule of thumb: If you don't have a test that at some point fails, you're probably not doing things right. In other words, if you can code up your module, test it, and nothing goes wrong the first time, I daresay your tests aren't stringent enough. I certainly realize that you can go on some sort of rush with coding, akin to poker (Brunson did win the World Series of Poker twice with 10-deuce), but that's rather rare. Put together an extensive set of tests -- even before you start writing the code that will be tested -- and make sure all the required functionality is tested. You'll be annoyed when those tests fail, but pleased at rollout when you're confident that you won't come back later to fix some obvious bug.
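As a rough illustration of tests that are stringent enough to fail at least once, here is a sketch of a round-trip check for a tiny, hand-rolled XML writer and reader. Everything in it (RoundTripSketch, toXml, fromXml, check) is hypothetical, and it uses plain Java instead of JUnit only so that it runs without extra libraries.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class RoundTripSketch {

    // Escapes the characters that would otherwise break element content.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Writes a one-field document.
    static String toXml(String city) {
        return "<address><city>" + escape(city) + "</city></address>";
    }

    // Reads the field back out of the document.
    static String fromXml(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getElementsByTagName("city").item(0).getTextContent();
        } catch (Exception e) {
            throw new IllegalArgumentException("not a valid address document", e);
        }
    }

    // The test itself: a value must come back unchanged after a full round trip.
    static void check(String city) {
        String back = fromXml(toXml(city));
        if (!city.equals(back)) {
            throw new AssertionError(
                    "round trip changed \"" + city + "\" to \"" + back + "\"");
        }
    }

    public static void main(String[] args) {
        check("Dallas");             // the case that always works
        check("");                   // empty value
        check("Smith & Sons <HQ>");  // markup characters
        check("  padded  ");         // leading and trailing whitespace
        System.out.println("all round-trip checks passed");
    }
}
```

The first case passes with almost any implementation. The third is the useful one: take the escape() call out of toXml and it fails immediately, which is exactly the sort of failure you want to see on your own machine and not in production.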
Assume really stupid -- and really smart -- users
One more for today and I'll call it quits. This is pretty basic, and has some relation to testing, but bears mentioning on its own. I literally can't count the number of times I thought that I had finished an application module -- largely because it worked for me every time. Then, I rolled it out only to discover hundreds of little problems almost immediately. It honestly took me a while to realize that not all users are just like me!
Once I understood that, I began to assume two things:
- I'm going to have really dumb users, who manage to find every possible way to break my application, by doing things out of order, in ridiculous ways, and without any thought at all.
- I'm going to have really brilliant users, who think of shortcuts, odd ways to approach things, and who may ultimately want to break things for the sake of breaking them, or breaking into them.
Both of these are really legitimate issues that you have to deal with. Part of the solution goes back to testing, but testing functionality isn't going to catch abusive users -- whether they mean to be abusive or not. Once you've passed all your tests, I recommend the gorilla test. That's where you get a few people who know nothing about your application to try it out. The less computer-savvy these folks are, the better; they'll find the strangest ways to do things, and break your module. But this will allow you to correct problems before they cost you money or credibility.
Then, you've got to do some hacker testing. Find the smartest security whiz you know, and ask him or her to break into your application. You'll probably have no idea how it's done, but it almost always happens. Now, you just need to get that same hacker to help you figure out what's wrong, close the hole, and smile all the way to the bank. Simple enough, right? Well worth the time and effort.
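Real people are the point of both exercises, but a crude automated stand-in can shake out the obvious problems first. The sketch below assumes a hypothetical parseUserID method that guards an entry point, plus a hammer method that throws random junk at it; the names and the MAX_USER_ID limit are invented for the example.

```java
import java.util.Random;

class InputGuardSketch {

    static final int MAX_USER_ID = 1_000_000; // hypothetical upper bound

    // Accepts only a plain decimal ID in range; everything else is rejected
    // with one well-defined exception type.
    static int parseUserID(String raw) {
        if (raw == null) {
            throw new IllegalArgumentException("user ID is missing");
        }
        String trimmed = raw.trim();
        if (trimmed.isEmpty() || trimmed.length() > 7) {
            throw new IllegalArgumentException("user ID has a bad length");
        }
        for (int i = 0; i < trimmed.length(); i++) {
            char c = trimmed.charAt(i);
            if (c < '0' || c > '9') {
                throw new IllegalArgumentException("user ID must contain only digits");
            }
        }
        int id = Integer.parseInt(trimmed); // safe: at most 7 ASCII digits
        if (id < 1 || id > MAX_USER_ID) {
            throw new IllegalArgumentException("user ID is out of range");
        }
        return id;
    }

    // Feeds random ASCII junk to parseUserID. Any exception other than
    // IllegalArgumentException escapes this method and signals a bug.
    static void hammer(long seed, int attempts) {
        Random random = new Random(seed);
        for (int n = 0; n < attempts; n++) {
            StringBuilder junk = new StringBuilder();
            int length = random.nextInt(12);
            for (int i = 0; i < length; i++) {
                junk.append((char) random.nextInt(128));
            }
            try {
                int id = parseUserID(junk.toString());
                if (id < 1 || id > MAX_USER_ID) {
                    throw new AssertionError("accepted an out-of-range ID: " + id);
                }
            } catch (IllegalArgumentException expected) {
                // Rejected cleanly, which is the desired outcome for junk.
            }
        }
    }

    public static void main(String[] args) {
        hammer(42L, 10_000);
        System.out.println(parseUserID(" 42 ")); // prints 42
    }
}
```

Random input is no substitute for a confused newcomer or a determined attacker, but it is cheap, and it tends to find the crash you were sure could never happen.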
While this has been a departure from my recent discussion on JaxMe, I hope you find this article helpful nonetheless. These problems and recommendations about using data binding are not things I sat around and made up; they represent lots of e-mail from readers and posters who have asked about these very things. I culled those that seemed to come up over and over again, and figured they would be worth dealing with in some detail.
I haven't covered every possible problem or scenario. While I'll return to JaxMe in my next installment, I'd love to hear about your particular use-cases. If you've got some ideas or questions, mail them to me, and maybe they'll appear in a subsequent article on pitfalls and recommendations. I hope you've learned something new, made your code better, and left a little smarter. So until next month, I'll see you -- as always -- online!
- Visit the JaxMe Web site, and learn more about this new API.
- Visit Sun's Java Architecture for XML Binding (JAXB) page.
- Explore Castor, a data binding project that isn't based on Sun's standards, but is still very useful.
- Check out Enhydra Zeus, another non-standard, but useful, data binding project.
- Read Ronald Bourret's excellent overview of numerous data binding projects and resources.
- Check out the Apache Incubator, where new and ingenious projects like JaxMe are coming online all the time.
- Browse for books on these and other technical topics.
- Look at several XML data binding approaches using code generation from W3C XML Schema or DTD grammars for XML documents in Dennis Sosnoski's article "Data binding, Part 1: Code generation approaches -- JAXB and more" (developerWorks, January 2003).
- Obtain text parsing utilities from the Jakarta Commons package.
- Get the scoop on Sun's XML APIs from Sun's Web site.
- Learn how to use JAXB to develop enterprise applications with WebSphere Studio Application Developer V5.1 in this article by Tilak Mitra (developerWorks, February 2004).
- Discover more data binding resources on the developerWorks XML and Java technology zones.
- Find out how you can become an IBM Certified Developer in XML and related technologies.

Brett McLaughlin has worked in computers since the Logo days. (Remember the little triangle?) In recent years, he's become one of the most well-known authors and programmers in the Java and XML communities. He's worked for Nextel Communications, implementing complex enterprise systems; at Lutris Technologies, actually writing application servers; and most recently at O'Reilly Media, Inc., where he continues to write and edit books that matter. His most recent book, Java 1.5 Tiger: A Developer's Notebook, is the first book available on the newest version of Java technology, and his classic Java and XML remains one of the definitive works on using XML technologies in the Java language.