Level: Intermediate
Dennis Sosnoski (dms@sosnoski.com), President, Sosnoski Software Solutions, Inc.
18 Apr 2003
JiBX lead developer Dennis Sosnoski shows you how to work with his new framework for XML data binding in Java applications. With the binding definitions used by JiBX, you control virtually all aspects of marshalling and unmarshalling, including handling structural differences between your XML documents and the corresponding Java language objects. Want to refactor your code without changing the XML representation? With JiBX you can...
Part 3 of this series gave you an introduction to the architecture of the
JiBX data binding framework. That included a quick overview of JiBX's
Java-centric approach to data binding, as contrasted with the
XML-centric approach used by most other data binding frameworks. Now in
Part 4, you'll find out how to use the power of this Java-centric approach to
data binding in your applications.
Most other data binding frameworks for the Java language force you to supply a
DTD or W3C XML Schema grammar for your documents, then generate a collection of
classes from that grammar. You need to work with these generated classes
to use the frameworks, but in most cases you have little or no control over the
classes -- they're basically JavaBean-type wrappers around simple data structures
along with some added framework code. The whole point of these generated classes
is to provide an interface for working with data from your XML documents.
The JavaBean wrapper approach is sometimes presented as object-oriented
because of the use of get/set methods to access data. In reality, it's about as
far from true object-oriented development as you can get, because of the lack of
extensibility in the data objects. True object-oriented programming means that
objects hide their internal state and implement their own behaviors for working
with that state information. This is typically not possible with the generated code approaches.
With JiBX, binding to XML is treated as an aspect that applies to your
classes, not as the primary purpose of those classes. Thus, you use
object-oriented designs that are appropriate to your application. It also gives you the
freedom to refactor your classes without needing to change the bound XML
document structure. This aspect-oriented approach even lets you define multiple
bindings to be used with the same classes, so that you can work with multiple
input or output XML document structures without having to change your code.
Of XML bondage
The core of JiBX's aspect-oriented approach to binding is the use of binding
definitions to control how your Java objects are converted to and from XML
documents. To see how this works, think of XML documents as tree structures,
where the nesting of elements defines branches of the tree. Data binding
converts these XML trees to and from trees of objects (or sometimes
graphs of objects, with links up or across the tree structure). A JiBX binding
definition is a third tree that represents a merger of the structure of the XML
and object trees (which, in JiBX, can be different). This merged structure tells
JiBX how to convert the XML tree to and from the object tree.
Figure 1 gives a simple example of how the binding definition is used in the JiBX framework.
In this case the XML document and the bound classes use the same structure -- a
customer has a name as a child, and the name has a pair of simple text values as
children. The binding definition simply duplicates this
structure, supplying the necessary information at each level to relate XML
elements to the corresponding object properties.
Figure 1. Binding definition role
Binding components
Several types of elements are used within a binding definition. The purposes
of these elements and the types of child elements they can contain are shown
in Table 1.
Table 1. Elements used in a binding definition
| Element | Purpose |
|---|---|
| binding | The root element of the binding definition, with optional attributes for binding name and global settings. Children: namespace, format, mapping (at least 1 mapping required) |
| namespace | Namespace declaration that defines a namespace URI and associated prefix (with the prefix used for marshalling). Children: none |
| format | Format definition for converting simple values to and from text. This is needed only if you want to use nonstandard conversions. Children: none |
| mapping | Defines how objects of a particular class are converted to and from XML. Each mapping is a reusable component that can be referenced wherever an object of that type needs to be handled within the binding definition. Mapping definitions that are children of the binding element are called global mappings. Children: namespace, format, mapping, followed by any combination of value, structure, and collection elements |
| value | Gives the conversion handling for a simple value (a primitive, or an object type with a format supplied) to convert it to and from text. The XML representation can be an attribute, a simple element, or in some cases a plain text or CDATA node. Children: none |
| structure | Structure component of binding, which can represent a Java object, an XML element, or both. Usually, this represents both a Java object and an XML element linked to that object. A structure mapping is defined when either the object or the XML element is missing from the definition. Children: any combination of value, structure, and collection elements |
| collection | Similar to a structure element, but specifically for representing Java collection objects (added in JiBX Beta 2). Children: any combination of value, structure, and collection elements |
The binding element is always the root element of a binding
definition. As children it can have namespace, format, and mapping
elements, which must be in that order (with the first two types
optional). Each mapping element can in turn have these same types of
elements as children for nested definitions, followed by a mixture of
value, structure, and collection elements that define the
details of the relationship between XML and a Java class.
The value elements represent simple value components of the XML
document, which can be attributes, simple child elements (with only text content),
text, or CDATA. The structure elements are more involved. In the most
common case (as in the Figure 1 example), a structure element relates
a child element with complex content (the name element, in the example)
to an object-value property of a Java object (the name field of the
Customer object). Both sides of the relationship are optional,
though. This allows the structure element to define an XML element with
no corresponding object, or an object with no corresponding element. I'll show
how this works in the following examples.
A simple binding
In Part 3, I gave some examples of the flexibility JiBX provides with structure
mapping. I'll go through the actual binding definitions here. Figure 2 shows
the first example, with a direct correspondence between the structure of the XML
documents and the Java objects. This is just the full version of the same document
and class structure used in Figure 1. Listing 1 gives a full binding definition for this
correspondence.
Figure 2. Direct correspondence to XML
Listing 1. Binding definition for direct correspondence
<binding>
  <mapping name="customer" class="Customer">
    <structure name="name" field="name">
      <value name="first-name" field="firstName"/>
      <value name="last-name" field="lastName"/>
    </structure>
    <value name="street1" field="street1"/>
    <value name="city" field="city"/>
    <value name="state" field="state"/>
    <value name="zip" field="zip"/>
    <value name="phone" field="phone"/>
  </mapping>
</binding>
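To make the correspondence concrete, here is a sketch of the classes this binding assumes, along with the document shape it describes. The field names come from Listing 1; the String field types, the sample values, and the hand-written toXml method are assumptions added for illustration. With JiBX the equivalent conversion code is generated from the binding definition, so you would not write toXml yourself.

```java
// Sketch of the classes assumed by the Listing 1 binding. Field names come
// from the binding definition; the String types are an assumption.
final class DirectBindingSketch {

    static class Name {
        String firstName;
        String lastName;
    }

    static class Customer {
        Name name;
        String street1;
        String city;
        String state;
        String zip;
        String phone;
    }

    // Wraps simple text content in an element (no escaping; sketch only).
    static String element(String tag, String text) {
        return "<" + tag + ">" + text + "</" + tag + ">";
    }

    // Hand-written stand-in for what the binding describes: the nested Name
    // object becomes a nested <name> element, and values appear in the order
    // the binding lists them.
    static String toXml(Customer c) {
        return "<customer>"
            + "<name>"
            + element("first-name", c.name.firstName)
            + element("last-name", c.name.lastName)
            + "</name>"
            + element("street1", c.street1)
            + element("city", c.city)
            + element("state", c.state)
            + element("zip", c.zip)
            + element("phone", c.phone)
            + "</customer>";
    }

    // Made-up sample data, not taken from the article's figures.
    static Customer sample() {
        Customer c = new Customer();
        c.name = new Name();
        c.name.firstName = "Jane";
        c.name.lastName = "Doe";
        c.street1 = "1 Main St";
        c.city = "Springfield";
        c.state = "WA";
        c.zip = "98000";
        c.phone = "555-0100";
        return c;
    }

    public static void main(String[] args) {
        System.out.println(toXml(sample()));
    }
}
```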
Compacting the definition
Listing 1 gives the full form of the binding definition. This isn't the only
way of specifying a binding, though. If requested, JiBX
(starting with Beta 2) will map unspecified
simple properties of Java objects automatically. The properties to be mapped may
take the form of either fields or JavaBean-style get/set methods. Taking
advantage of this default mapping allows Listing 2 to be used as a (much shorter)
alternative to the Listing 1 definition.
Listing 2. Compact version of binding definition
<binding auto-link="fields">
  <mapping name="customer" class="Customer">
    <structure name="name" field="name"/>
  </mapping>
</binding>
This compact approach does have some limitations. The automatically generated
property bindings will always follow any definitions given explicitly, and will
occur in the order they're defined. In the case of the Figure 1 binding this is just
what I want -- the name element is the first child of the customer
element, and the field definitions within both the Name and
Customer classes use the same order as the corresponding XML
child elements. When the Java data matches the XML structure as closely as in
this case, automatically generating bindings can make the binding
definition very simple.
JiBX does provide some specialized options for customizing the automatic
property binding generation. These let you give a prefix and suffix to be
stripped from field or JavaBean property names when generating the corresponding
XML element or attribute names, control the style of XML names, and set the
access level for fields or methods included in the automatic generation. You
can even list field or property names to be specifically included or excluded in
the automatic generation. However, for the rest of the examples in this article I'll just
stay with the full binding definition format, in order to clearly show
exactly what values are being bound.
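The prefix and suffix stripping and name-style conversion described above can be pictured with a small sketch. This is not JiBX's own code or API: the helper name toXmlName and its exact rules (strip an optional prefix and suffix, then turn camelCase into lowercase words joined by hyphens) are assumptions chosen to match the element names used in the listings.

```java
// Sketch of deriving an XML name from a Java field name. This is NOT JiBX's
// implementation; the method name and conversion rules are assumptions.
final class XmlNameSketch {

    // Strips an optional prefix and suffix, then converts camelCase to
    // lowercase words separated by hyphens (firstName -> first-name).
    static String toXmlName(String fieldName, String prefix, String suffix) {
        String base = fieldName;
        if (!prefix.isEmpty() && base.startsWith(prefix)) {
            base = base.substring(prefix.length());
        }
        if (!suffix.isEmpty() && base.endsWith(suffix)) {
            base = base.substring(0, base.length() - suffix.length());
        }
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < base.length(); i++) {
            char ch = base.charAt(i);
            if (Character.isUpperCase(ch)) {
                // Start a new hyphen-separated word at each uppercase letter,
                // except at the very beginning of the name.
                if (out.length() > 0) {
                    out.append('-');
                }
                out.append(Character.toLowerCase(ch));
            } else {
                out.append(ch);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(toXmlName("firstName", "", ""));
        System.out.println(toXmlName("m_lastName", "m_", ""));
    }
}
```

With these rules, a field named firstName maps to first-name, and a field named m_lastName with the prefix m_ stripped maps to last-name.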
Flattening the tree
The simple binding example doesn't really do justice to the flexibility of JiBX.
In Part 3 I also showed a pair of examples of structure mapping, which
handles structural differences between the XML document and the bound Java
classes. Figure 3 shows the first example of this type, with the same XML
document structure bound to a single class rather than the pair of classes used
previously.
Figure 3. Binding to single class
In the Figure 3 example, the Java class structure is a flattened version of
the XML document. Rather than using a separate class for the values within the
XML name element, this just includes the values directly within the
class that corresponds to the parent customer element. Listing 3 gives a
full binding definition for this structure mapping.
Listing 3. Structure mapping to single class
<binding>
  <mapping name="customer" class="Customer">
    <structure name="name">
      <value name="first-name" field="firstName"/>
      <value name="last-name" field="lastName"/>
    </structure>
    <value name="street1" field="street1"/>
    <value name="city" field="city"/>
    <value name="state" field="state"/>
    <value name="zip" field="zip"/>
    <value name="phone" field="phone"/>
  </mapping>
</binding>
If you compare Listing 1 and Listing 3 you'll see that the change to the
binding definition for this flattened mapping is trivial. Only one line of the binding definition is
different -- the field attribute has been removed from the original version.
This tells JiBX that the structure element of the binding definition defines an
element in the XML (as shown by the name attribute) that maps to some
properties of the current object.
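Here is a rough picture of what this flattened mapping means on the unmarshalling side. The sketch reads the same document into a single Customer class using the standard DOM API rather than JiBX, purely to show where each value lands. The String field types and the fromXml helper are assumptions for illustration; JiBX generates its own conversion code from the binding and does not go through DOM.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Sketch of the flattened structure bound by Listing 3, populated by hand
// with DOM. Illustration only; this is not how JiBX does its unmarshalling.
final class FlattenedBindingSketch {

    // Flattened class: the values from <name> live directly on Customer.
    static class Customer {
        String firstName;
        String lastName;
        String street1;
        String city;
        String state;
        String zip;
        String phone;
    }

    // Text content of the first descendant of parent with the given tag name.
    static String text(Element parent, String tag) {
        return parent.getElementsByTagName(tag).item(0).getTextContent();
    }

    static Customer fromXml(String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse customer XML", e);
        }
        Element root = doc.getDocumentElement();
        // The <name> element exists in the XML but has no object of its own:
        // its children are stored on the enclosing Customer.
        Element name = (Element) root.getElementsByTagName("name").item(0);
        Customer c = new Customer();
        c.firstName = text(name, "first-name");
        c.lastName = text(name, "last-name");
        c.street1 = text(root, "street1");
        c.city = text(root, "city");
        c.state = text(root, "state");
        c.zip = text(root, "zip");
        c.phone = text(root, "phone");
        return c;
    }

    public static void main(String[] args) {
        // Made-up sample document.
        Customer c = fromXml("<customer><name><first-name>Jane</first-name>"
            + "<last-name>Doe</last-name></name><street1>1 Main St</street1>"
            + "<city>Springfield</city><state>WA</state><zip>98000</zip>"
            + "<phone>555-0100</phone></customer>");
        System.out.println(c.firstName + " " + c.lastName + ", " + c.city);
    }
}
```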
Warping the tree
Figure 4 gives a second example of structure mapping. This time the Java
class structure uses a pair of classes, but the breakdown of data values doesn't
match the structure of the XML document -- data values from the customer
element of the XML document are split between the two classes, and the values
from the name child element are included directly in the class corresponding
to its parent element.
Figure 4. Binding to split classes
Listing 4 gives the full form of a binding definition for the Figure 4 binding.
The only difference from the Listing 3 binding definition is that I've added a
structure element corresponding to the new Address class.
This structure element includes a field attribute but no name
attribute, telling JiBX that the structure element is defining an object
property with no corresponding element in the XML document.
Listing 4. Structure mapping to split classes
<binding>
  <mapping name="customer" class="Customer">
    <structure name="name">
      <value name="first-name" field="firstName"/>
      <value name="last-name" field="lastName"/>
    </structure>
    <structure field="address">
      <value name="street1" field="street1"/>
      <value name="city" field="city"/>
      <value name="state" field="state"/>
      <value name="zip" field="zip"/>
    </structure>
    <value name="phone" field="phone"/>
  </mapping>
</binding>
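A sketch of the split classes this binding assumes may help here. The field names come from Listing 4; the String field types and the hand-written toXml method are assumptions for illustration only, standing in for code that JiBX would generate. Note that the output has a name element with no matching object, and an Address object with no matching element.

```java
// Sketch of the split-class structure bound by Listing 4. Field names come
// from the binding; String types and toXml are assumptions for illustration.
final class SplitBindingSketch {

    static class Address {
        String street1;
        String city;
        String state;
        String zip;
    }

    static class Customer {
        String firstName;   // written inside the <name> element
        String lastName;    // written inside the <name> element
        Address address;    // an object with no element of its own
        String phone;
    }

    // Wraps simple text content in an element (no escaping; sketch only).
    static String element(String tag, String text) {
        return "<" + tag + ">" + text + "</" + tag + ">";
    }

    static String toXml(Customer c) {
        return "<customer>"
            + "<name>"
            + element("first-name", c.firstName)
            + element("last-name", c.lastName)
            + "</name>"
            // Address values are written as direct children of <customer>.
            + element("street1", c.address.street1)
            + element("city", c.address.city)
            + element("state", c.address.state)
            + element("zip", c.address.zip)
            + element("phone", c.phone)
            + "</customer>";
    }

    // Made-up sample data.
    static Customer sample() {
        Customer c = new Customer();
        c.firstName = "Jane";
        c.lastName = "Doe";
        c.address = new Address();
        c.address.street1 = "1 Main St";
        c.address.city = "Springfield";
        c.address.state = "WA";
        c.address.zip = "98000";
        c.phone = "555-0100";
        return c;
    }

    public static void main(String[] args) {
        System.out.println(toXml(sample()));
    }
}
```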
Tying up loose ends
JiBX binding definitions offer many options beyond what I've covered here. Some
of these are hinted at in the list of binding definition elements, such as
custom serialization and deserialization using the format element, and
easy namespace handling with the namespace element. Other options are
controlled by attributes of the binding definition elements. These include
defaults for optional values, methods to be called before marshalling or after
unmarshalling objects, and identifier values for referencing objects.
JiBX also includes a general extension hook in the form of custom marshal and unmarshal method definitions. These let you take over complete control of the marshalling
and unmarshalling process, working directly with the low-level methods defined by
the JiBX context classes. This type of low-level operation is not intended for
general usage. It does provide some interesting possibilities for JiBX add-ons,
though. One potential use is to allow portions of a document to be mapped
to and from a document model (such as DOM, JDOM, or dom4j). This would provide
easy handling of special cases, such as XHTML fragments embedded within XML
documents.
Compiling the binding
Once you've got your binding definition, you need to actually compile it into the
class files. JiBX supplies a binding compiler for this purpose. To use it,
you just set up the Java class path so both the jar files included in the JiBX
distribution and your own classes are accessible to the JVM, then run the
org.jibx.binding.Compile program with one or more binding definition
file paths as arguments.
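As a rough illustration of that command line, the sketch below assembles the arguments from Java. Only the org.jibx.binding.Compile class name comes from the text above; the jar paths, the classes directory, and the binding.xml file name are hypothetical placeholders that you'd replace with the locations in your own setup.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch: builds the command line for running the JiBX binding compiler.
// All paths used in main() are hypothetical placeholders.
final class BindingCompileCommand {

    static List<String> build(List<String> classPathEntries, List<String> bindingFiles) {
        List<String> cmd = new ArrayList<String>();
        cmd.add("java");
        cmd.add("-cp");
        // The class path must cover both the JiBX jars and your own classes.
        cmd.add(String.join(File.pathSeparator, classPathEntries));
        cmd.add("org.jibx.binding.Compile");
        // One or more binding definition file paths follow as arguments.
        cmd.addAll(bindingFiles);
        return cmd;
    }

    public static void main(String[] args) {
        List<String> cmd = build(
            Arrays.asList("lib/jibx-bind.jar", "lib/jibx-run.jar", "classes"),
            Arrays.asList("binding.xml"));
        System.out.println(String.join(" ", cmd));
        // To actually launch the compiler you could pass the list to
        // java.lang.ProcessBuilder, for example:
        //   new ProcessBuilder(cmd).inheritIO().start().waitFor();
    }
}
```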
The binding compiler adds JiBX binding code to your class files, preparing them
for use with the JiBX runtime. It's smart about how it does this: If the same
added code is needed by more than one binding, the binding compiler only
generates the code once. Likewise, if you rerun the compiler with a modified
binding definition it replaces the methods added for the old binding rather
than just adding new ones. The compiler even removes methods and classes
that were previously added for a binding you're no longer using. Finally, it
only writes to class files that have actually been changed. This makes it safe
to rerun the binding compiler after changing (and compiling) some of your Java
source code files, without needing to recompile all your Java source files.
Running the binding
Once the binding compiler has modified your Java class files, you're ready to use
the JiBX runtime for marshalling and unmarshalling documents. There's just one
hitch, though: The actual binding code isn't added until after your Java source
code is compiled to class files and run through the JiBX binding compiler, so
you can't access this binding code directly in your source code. Instead, you need
to work through a portion of the JiBX runtime that tracks the binding definitions
you're using and connects you to the proper code at runtime.
This uses the org.jibx.runtime.BindingDirectory class that's
included in the JiBX runtime jar, along with a class that JiBX generates in the
same package as your code (or as the first class file it modifies, if your code
is spread across multiple packages). You don't need to worry about the details of
getting at this generated class, though; instead, you access it by passing one of the
classes defined by a global mapping (one that's a child of the root binding
element) in your binding to the BindingDirectory (if you've
compiled more than one binding into the code, you'll also need to pass the name
of the binding you want to use). The code is simple:
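// Look up the binding factory through a class that has a global mapping.
// This call can throw org.jibx.runtime.JiBXException.
IBindingFactory bfact =
    BindingDirectory.getFactory(Customer.class);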
Here, Customer is the name of a class with a global mapping in
the binding. The org.jibx.runtime.IBindingFactory interface that gets
returned provides methods to construct marshalling and unmarshalling contexts,
which in turn allow you to do the actual marshal and unmarshal operations.
Here's an unmarshal example:
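// Create an unmarshalling context, then read the document from a file.
// The second argument is the encoding; null lets the parser determine it.
IUnmarshallingContext uctx = bfact.createUnmarshallingContext();
Object obj = uctx.unmarshalDocument
    (new FileInputStream("filename.xml"), null);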
This is just one of several variations of an unmarshal call -- in this case to
unmarshal an XML document in the file filename.xml. You can pass a
reader instead of a stream as the source of the document data, and you can also
specify an encoding for the document -- see the JavaDocs on the JiBX site for details. The
returned object is an instance of one of your classes defined with a global
mapping in the binding -- you can either check the type with instanceof
or cast directly to your object type, if you know what it is.
Marshalling is just as easy. Here's an example:
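// Create a marshalling context, set 4-space indentation, and write the
// object out as a UTF-8 encoded document.
IMarshallingContext mctx = bfact.createMarshallingContext();
mctx.setIndent(4);
mctx.marshalDocument(obj, "UTF-8", null,
    new FileOutputStream("filename.xml"));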
As with the unmarshal example, this is just one of several variations that
can be used for the marshal call. It first sets the indentation of the
output XML to 4 spaces per nesting level, then marshals the object to an XML
document written to the file filename.xml with UTF-8 character
encoding (the most common choice for XML). You can pass a writer instead of a
stream, as well as some other variations -- again, see the JavaDocs on the JiBX site for details. The
object to be marshalled must be an instance of a class that's defined with a global
mapping in the binding.
Future directions
JiBX provides a number of advantages over other available XML
data binding frameworks for Java applications. These include very fast operation,
a compact runtime distribution, and greater isolation between XML document
formats and Java language object structures. As JiBX nears initial production
release, it's looking like a great alternative for many applications.
JiBX does still have some areas of weakness. One is the current
lack of support for code generation from an XML grammar. That may seem a
surprising comment after my earlier remarks on the limitations of the XML-centric
code generation approaches. However, JiBX actually offers the means to avoid many of these
limitations. If an XML grammar could be used for generating an initial set of classes
and a corresponding binding definition, this would provide the benefit of getting
working code quickly. At the same time, users would still have the long-term
flexibility to independently refactor either the code or the grammar while
modifying the binding definition to keep everything working in harmony.
Another very useful feature would be a tool to verify a binding definition
against an XML grammar. A grammar provides most of the information necessary to
say whether a binding definition will actually handle the intended documents
properly. A tool to actually check the combination for compatibility would help
prevent potential surprises in testing or deployment.
The current byte code enhancement approach that adds binding framework methods
to your compiled classes is also an area where more flexibility would be
useful. Byte code enhancement offers the advantage of keeping your source code
clean, but at the costs of an added step in the build process and potential
confusion in tracking problems in your code accessed during marshalling or
unmarshalling. It'd be great to offer an alternative for cases where these costs
outweigh the benefits.
I think there's a relatively simple solution to this issue. It should be fairly
easy to decompile the code added by JiBX back to source code, and merge this into
the original Java language source files. Once this is done the methods needed by
JiBX will be compiled-in automatically, and as long as the user doesn't tinker
with the code added by JiBX there should be no need to rerun the binding compiler
until the binding definitions change. I'm currently investigating adding support
for this type of operation to JiBX, though it probably won't be until after the
initial production release.
As a final note on its limitations, JiBX currently offers relatively weak
validation support compared to many of the other data binding frameworks. It's
possible that this will change in the future. JiBX does include support for
methods to be called before an object is marshalled or after an object is
unmarshalled, and these methods could be used to handle most forms of validation.
For many XML applications, full validation support is secondary to the main goal
of fast and convenient access to data from XML documents, and for this goal JiBX
already offers a great alternative.
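As a sketch of how such a hook might handle validation, the class below checks its own state in a method that a binding could designate to be called after unmarshalling. The method name validate, the five-digit zip rule, and the choice of IllegalStateException are all assumptions made for illustration; how the method gets named in the binding definition is not shown here.

```java
// Sketch of validation in a post-unmarshal hook. The method name, the rules
// it enforces, and the exception type are assumptions for illustration.
final class ValidationHookSketch {

    static class Customer {
        String zip;
        String phone;

        // Hypothetical hook: intended to be called once all values have
        // been set on the object during unmarshalling.
        void validate() {
            if (zip == null || !zip.matches("\\d{5}")) {
                throw new IllegalStateException("zip must be five digits: " + zip);
            }
            if (phone == null || phone.isEmpty()) {
                throw new IllegalStateException("phone is required");
            }
        }
    }

    public static void main(String[] args) {
        Customer c = new Customer();
        c.zip = "98000";
        c.phone = "555-0100";
        c.validate();   // passes silently for valid data
        System.out.println("valid");
    }
}
```

Because the check lives in the class itself, it also runs for objects built in code, not only for those read from XML, as long as the caller invokes it.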
Conclusion
In this article I've shown you the basics of working with the JiBX data binding
framework. I personally feel JiBX has a lot to offer (which isn't too surprising, since
I am the author of JiBX!). It's especially useful for applications that need to
adapt existing object structures to XML, and for any applications where you want
to decouple your code from the actual XML structure. But JiBX is definitely
not a solution to all XML requirements for Java applications.
XML is used for many different purposes, and the ever-increasing
number of tools for working with XML in Java applications reflects the fact that
different purposes require different tools. The toughest part of working with
XML is often just knowing which tool is best for a specific application. JiBX
is designed to suit applications that need to interpret XML documents as data
and work with that data in memory, where the focus is more on the use of that
data by the application than on the XML documents themselves.
If JiBX sounds like a match for your needs I encourage you to download the
current distribution and give it a try. Since it's an open source project with a
BSD-style license, you're free to modify the code to suit your requirements --
and you don't even need to make your modifications public. I naturally hope that
many people find it useful, and that they do contribute extensions and added tools
to help the project grow in the future.
In my next article, I'll close out this series on data binding with a look at the
recently released JAXB data binding standard and reference implementation. This
will include trying out some of the customization options JAXB provides. JAXB is
strongest in the very areas where JiBX is weakest, so this final article will be
a nice wrap-up to a series that's covered a full range of tools and techniques.
Resources
- Learn more about the new JiBX framework for mapped bindings.
- Part 1 of this series on data binding provides background on why you'd want to use data binding for XML, along with an overview of the available Java frameworks for data binding. Part 2 gives performance comparisons between the data binding frameworks, including the new JiBX framework (developerWorks, January 2003). Part 3 introduces the JiBX architecture and discusses the reasons behind the choices made in JiBX (developerWorks, April 2003).
- If you need background on XML, try the developerWorks
Introduction to XML tutorial (August 2002).
- Review the author's previous developerWorks articles covering performance (September 2001) and usage (February 2002) comparisons for Java XML document models.
About the author
Dennis Sosnoski (dms@sosnoski.com) is the founder and lead consultant of
Seattle-area Java consulting company Sosnoski Software Solutions, Inc.,
specialists in J2EE, XML, and Web services support. Dennis's professional software development
experience spans over 30 years, with the last several years focused on server-side Java technologies.
He's a frequent speaker on XML in Java and J2EE technologies at
conferences nationwide, and chairs the Seattle Java-XML SIG.