First, what's the difference?
Serialization usually means binary serialization, a Java facility for writing an object to a byte stream. The object's class must implement java.io.Serializable. ObjectOutputStream#writeObject(Object) writes the serializable object to a binary stream. Deserialization (ObjectInputStream#readObject()) converts the binary data back into an instance. CORBA calls this marshalling and demarshalling.
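A minimal sketch of that round trip, using an in-memory byte array in place of a file or socket. The BinaryRoundTrip and Person classes are made up for illustration; only the java.io calls are the real API.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class BinaryRoundTrip {
    // Hypothetical example class; it only needs to implement Serializable.
    static class Person implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        final int age;

        Person(String name, int age) {
            this.name = name;
            this.age = age;
        }
    }

    // Writes the object to a byte stream and returns the resulting bytes.
    static byte[] serialize(Serializable obj) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        return bytes.toByteArray();
    }

    // Reads the bytes back into a new instance; the class must be on the classpath.
    static Object deserialize(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] data = serialize(new Person("Ada", 36));
        Person copy = (Person) deserialize(data);
        System.out.println(copy.name + ", " + copy.age + " (" + data.length + " bytes)");
    }
}
```

Note that the copy is a new instance, not the original object, so compare fields rather than references.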
Java objects can also be serialized to XML, a text (character) format. XML libraries and APIs like JDOM (JSR 102), dom4j, the Transformation API for XML (TrAX), and the Streaming API for XML (StAX, JSR 173) can make this easier. JAXB was supposed to provide declarative tools for mapping XML schemas to Java object structures, but it doesn't ever seem to have gotten off the ground. (See JAX-WS Improves JAX-RPC with Better O/X Mapping.) What has stuck better is SAAJ, an object model for SOAP messages, whose implementation usually contains something JAXB-like. (Keep in mind that SOAP is just one kind of XML, a specific XML schema.)
In theory, XML serialization could be built right into Java the way binary serialization is. This is why binary serialization is implemented as a specific set of classes separate from the rest of Java: so that you can substitute another set of classes that implement another serialization scheme. Instead of ObjectInputStream, you could have XML equivalents. These would convert any java.io.xml.XMLSerializable (there is no such interface) object to some default XML that specified the class's full name, version, and each of its (non-transient) instance variables. But somehow this idea has never taken off.
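To make the idea concrete, here is a rough sketch of what such a default XML mapping might look like, written with reflection, DOM, and TrAX. Everything specific here is an assumption made for illustration: the DefaultXmlSketch and Person classes, the object/field element layout, and the choice to use serialVersionUID as the "version". It only handles fields declared directly on the class, and it only writes; it does not read the XML back.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;
import java.io.StringWriter;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class DefaultXmlSketch {
    // Hypothetical example class.
    static class Person implements Serializable {
        private static final long serialVersionUID = 1L;
        String name = "Ada";
        int age = 36;
        transient String password = "secret"; // transient: left out of the XML
    }

    // Emits the class name, its serialVersionUID, and each non-static,
    // non-transient field declared on the object's own class.
    static String toXml(Serializable obj) throws Exception {
        Class<?> type = obj.getClass();
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();

        Element root = doc.createElement("object"); // made-up element name
        root.setAttribute("class", type.getName());
        root.setAttribute("version",
                Long.toString(ObjectStreamClass.lookup(type).getSerialVersionUID()));
        doc.appendChild(root);

        for (Field field : type.getDeclaredFields()) {
            int mods = field.getModifiers();
            if (Modifier.isStatic(mods) || Modifier.isTransient(mods)) {
                continue;
            }
            field.setAccessible(true);
            Element element = doc.createElement("field"); // made-up element name
            element.setAttribute("name", field.getName());
            element.setTextContent(String.valueOf(field.get(obj)));
            root.appendChild(element);
        }

        // TrAX identity transform: DOM tree in, XML text out.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(toXml(new Person()));
    }
}
```

A real scheme would also need nested objects, collections, superclass fields, and a reader, which is a good part of why the hand-written approach gets tedious.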
So which should you use, binary or XML serialization?
Binary serialization is much more efficient than XML and easier to get working. Writing out an object in binary form is a little faster than text/XML form. Binary form is much more compact and so saves memory and bandwidth. And binary form is much faster and easier to parse than XML.
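A quick way to get a feel for the size difference is to write the same data both ways and count bytes. This is only a sketch for one data shape (an array of ints), and the XML layout is a made-up, hand-rolled one, so treat the result as an illustration rather than a benchmark. For very small objects the gap can narrow, because the binary stream also carries a header and a class description.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.nio.charset.StandardCharsets;

class SizeComparison {
    // Size of the array when written with Java binary serialization.
    static int binarySize(int[] values) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(values);
        }
        return bytes.size();
    }

    // Size of the same array in a hypothetical XML layout, encoded as UTF-8.
    static int xmlSize(int[] values) {
        StringBuilder xml = new StringBuilder("<ints>");
        for (int value : values) {
            xml.append("<int>").append(value).append("</int>");
        }
        xml.append("</ints>");
        return xml.toString().getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) throws IOException {
        int[] values = new int[1000];
        for (int i = 0; i < values.length; i++) {
            values[i] = i * 1000;
        }
        System.out.println("binary: " + binarySize(values) + " bytes");
        System.out.println("xml:    " + xmlSize(values) + " bytes");
    }
}
```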
So why would anyone use XML instead of binary serialization?
Flexibility. Binary serialization requires the serializer and the deserializer be Java (but see the comments for details), and both Java program(s) must have the classes in their classpaths for the objects being serialized. There are also class version issues. XML is not Java-specific and so works with any language. Its text data can more easily be displayed and read by people.
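The class version issues usually come down to serialVersionUID: if the reader's class has a different value than the one recorded in the stream, deserialization fails with an InvalidClassException. A small sketch of declaring the value explicitly and checking what the serialization runtime sees; VersionSketch and Account are hypothetical names used only for this example.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

class VersionSketch {
    // Hypothetical example class.
    static class Account implements Serializable {
        // Declared explicitly so that recompiling or adding a method does not
        // silently change the version the way a computed default can.
        private static final long serialVersionUID = 2L;
        String owner;
        transient String sessionToken; // not written to the stream
    }

    // Returns the version the serialization runtime associates with the class.
    static long versionOf(Class<? extends Serializable> type) {
        return ObjectStreamClass.lookup(type).getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println("Account version: " + versionOf(Account.class));
    }
}
```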
I like to say this: If you find that your apps are too efficient and don't burden your hardware enough, use more XML. There is no code so inefficient that it can't be made even more inefficient using XML. As a keynote speaker said at OOPSLA a couple of years ago (I think it was Alfred Spector): "We just never thought that the programming community would be so accepting of a format as inefficient as XML."
There are ways to make XML more efficient. Don't include whitespace. Use short element names and shallow namespace trees with short names. There's even a move afoot for "binary XML" (practically an oxymoron); see Better Web Services Performance.
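To see how much the first two tips buy you, here is a small sketch that renders the same points twice: once with long element names and whitespace, once with short names and none. The XmlSizeTips class, the render method, and the element names are all invented for the example.

```java
class XmlSizeTips {
    // Renders points as XML. 'pretty' adds newlines and indentation;
    // listTag and itemTag control how long the element names are.
    static String render(int[][] points, String listTag, String itemTag, boolean pretty) {
        String newline = pretty ? "\n" : "";
        String indent = pretty ? "    " : "";
        StringBuilder xml = new StringBuilder();
        xml.append('<').append(listTag).append('>').append(newline);
        for (int[] point : points) {
            xml.append(indent).append('<').append(itemTag)
               .append(" x=\"").append(point[0])
               .append("\" y=\"").append(point[1])
               .append("\"/>").append(newline);
        }
        xml.append("</").append(listTag).append('>');
        return xml.toString();
    }

    public static void main(String[] args) {
        int[][] points = {{1, 2}, {3, 4}, {5, 6}};
        String verbose = render(points, "coordinateList", "coordinate", true);
        String compact = render(points, "cl", "c", false);
        System.out.println("verbose: " + verbose.length() + " chars");
        System.out.println("compact: " + compact.length() + " chars");
    }
}
```

The trade-off is the obvious one: the compact form gives up much of the human readability that was the reason for choosing XML in the first place.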
So, bottom line: if all the apps writing and reading your data are implemented in Java, use binary serialization. Compared to XML, it's easier to get working and has better performance. If and when you need interoperability with other languages or better support for people reading the serialized data, then go through the extra effort to implement the code for XML marshalling and demarshalling.
A few references for more detail:
- XML, Java, and the future of the Web -- The mother of all Java and XML articles: "XML gives Java something to do."
- XML Serialization of Java Objects -- What the title says.
- Designing For Object Serialization -- What the title says; one of the articles I've written.
- Improving HttpSession Performance with Smart Serialization -- Applies to more than just