JiBX 1.2, Part 2: XML schema to Java code

Generate cleaner, customized Java code from XML schema

developerWorks
Trying a real-world schema

Working with a stand-alone schema definition is great for a simple demonstration, but it doesn't give much of a feeling for how a tool functions when applied to the complex schema definitions widely used in enterprise applications. Now it's time to move on to a more realistic example, in the form of one of the industry-standard HR-XML schema definitions.

HR-XML TimeCard schema

The HR-XML Consortium is an organization established to develop open standards for XML representations for human resources. It represents more than 110 corporate members, and almost 50 technology firms are certified to meet its standards.

The HR-XML schemas used for this tutorial consist of 157 schemas, including a mixture of top-level document definitions and common components. CodeGen can easily handle this number of schemas, but the number of generated classes and the complexity of the interrelationships would obscure the more interesting aspects of the schema handling. To focus in on these details, the subset of HR-XML used here consists of a single top-level document definition, for the TimeCard element, along with the common components referenced as part of the TimeCard definition — a total of seven schema definitions.

You can find the subset of HR-XML schema definitions used in this tutorial under the hrxml/schemas directory. Listing 7 shows an edited version of the main schema for the TimeCard element definition. This gives a sample of the HR-XML schema style, which uses a mixture of nested and global type definitions and contains a wider range of schema structures than the first example, including:

  • <xs:choice> compositors (as shown in some of the embedded complexTypes within the TimeCardType definition)
  • <xs:any> particles (see the AdditionalDataType definition near the start of the listing)
  • <xs:simpleType> <xs:union>s (see the TimeCardDuration definition at the end of the listing)
  • Non-enumeration <xs:simpleType> restrictions

Listing 7. HR-XML TimeCard schema
<xs:schema targetNamespace="http://ns.hr-xml.org/2007-04-15" ...
  elementFormDefault="qualified" version="2007-04-15">
  <xs:import namespace="http://www.w3.org/XML/1998/namespace" .../>
  <xs:include schemaLocation="../CPO/EntityIdType.xsd"/>
  ...
  <xs:complexType name="AdditionalDataType" mixed="true">
    ...
    <xs:sequence minOccurs="0" maxOccurs="unbounded">
      <xs:any namespace="##any" processContents="strict" minOccurs="0"
          maxOccurs="unbounded"/>
    </xs:sequence>
    <xs:attribute name="type" type="xs:string"/>
  </xs:complexType>
  ...
  <xs:element name="TimeCard" type="TimeCardType"/>
  <xs:complexType name="TimeCardType">
    <xs:sequence>
      <xs:element name="Id" type="EntityIdType" minOccurs="0"/>
      <xs:element name="ReportedResource">
        <xs:complexType>
          <xs:choice>
            <xs:element name="Person" type="TimeCardPersonType"/>
            <xs:element name="Resource">
              <xs:complexType>
                <xs:sequence>
                  <xs:element name="Id" type="EntityIdType"
                      minOccurs="0" maxOccurs="unbounded"/>
                  <xs:element name="ResourceName" type="xs:string" minOccurs="0"/>
                  <xs:element name="AdditionalData" type="AdditionalDataType"
                      minOccurs="0" maxOccurs="unbounded"/>
                </xs:sequence>
                <xs:attribute name="type" type="xs:string"/>
              </xs:complexType>
            </xs:element>
          </xs:choice>
        </xs:complexType>
      </xs:element>
      <xs:element name="ReportedTime" maxOccurs="unbounded">
        <xs:complexType>
          <xs:sequence>
            <xs:element name="PeriodStartDate" type="AnyDateTimeType"/>
            <xs:element name="PeriodEndDate" type="AnyDateTimeType"/>
            <xs:element name="ReportedPersonAssignment" minOccurs="0">
              <xs:complexType>
                <xs:sequence>
                  <xs:element name="Id" type="EntityIdType" minOccurs="0"/>
                </xs:sequence>
              </xs:complexType>
            </xs:element>
            <xs:choice maxOccurs="unbounded">
              <xs:element name="TimeInterval">
                <xs:complexType>
                  <xs:sequence>
                    <xs:element name="Id" type="EntityIdType" minOccurs="0"/>
                    <xs:element name="StartDateTime" type="AnyDateTimeType"/>
                    <xs:choice>
                      <xs:sequence>
                        <xs:element name="EndDateTime" type="AnyDateTimeType"/>
                        <xs:element name="Duration" type="TimeCardDuration"
                            minOccurs="0"/>
                      </xs:sequence>
                      <xs:element name="Duration" type="TimeCardDuration"/>
                    </xs:choice>
                    <xs:element name="PieceWork" minOccurs="0" maxOccurs="unbounded">
                      ...
                    </xs:element>
                    <xs:element name="RateOrAmount" minOccurs="0" maxOccurs="unbounded">
                      ...
                    </xs:element>
                    <xs:element name="Allowance" minOccurs="0" maxOccurs="unbounded">
                      ...
                    </xs:element>
                    ...
                  </xs:sequence>
                  <xs:attribute name="type" type="xs:string" use="required"/>
                  ...
                </xs:complexType>
              </xs:element>
              <xs:element name="TimeEvent">
                ...
              </xs:element>
              <xs:element name="Expense">
                ...
              </xs:element>
              <xs:element name="Allowance">
                ...
              </xs:element>
            </xs:choice>
            ...
          </xs:sequence>
          ...
        </xs:complexType>
      </xs:element>
      ...
    </xs:sequence>
    <xs:attribute ref="xml:lang"/>
  </xs:complexType>
  ...
  <xs:simpleType name="TimeCardDuration">
    <xs:union memberTypes="xs:duration xs:decimal"/>
  </xs:simpleType>
</xs:schema>





Generated code for TimeCard

The Ant build.xml file in the hrxml directory defines Ant targets for trying out the basic code generation for the TimeCard schema, including both the default generation and a couple of customization examples (discussed later). The sample directory also contains a test program, org.jibx.hrxml.Test, which unmarshals sample documents using the generated data-model classes, marshals the documents back out, and compares the result with the original document. A set of test documents from the HR-XML distribution is in the samples directory. The codegen target runs CodeGen using defaults, compile compiles the generated code and test code, bind compiles the JiBX binding, and roundtrip runs the test program on the sample documents. You can also use the full target to run all of these steps in sequence.

Most forms of code generation from schema generate a separate class for each complexType definition and for enumeration simpleTypes. CodeGen often is able to reduce the number of generated classes by examining references and inlining definitions where possible and by ignoring unused definitions in included and imported schema definitions. In the case of the TimeCard schema, there are a total of 10 global (named) complexTypes and an additional 23 local (anonymous) complexTypes, along with 8 enumeration simpleTypes. The generated default data model consists of 15 top-level classes and 23 inner classes, just a few fewer than the number you'd expect to see based on the schema component counts. You'll see later some ways of using customizations to further simplify the data model in cases in which not all the schema components are needed.





<xs:choice> handling

Listing 8 shows how CodeGen handles a choice between two elements in the TimeCardType complexType definition. CodeGen by default uses a selection variable to track which choice is currently active. The set methods for values included in the choice allow you to write a new value for the current selection but prevent changing the selection directly (throwing an IllegalStateException if you try). To change the current selection once it has been set, you first need to call a clear method (here clearReportedResourceSelect()) which resets the selection state.


Listing 8. HR-XML TimeCard-generated code sample
/**
 * Schema fragment(s) for this class:
 * <pre>
 * <xs:complexType xmlns:ns="http://ns.hr-xml.org/2007-04-15" 
 *    xmlns:ns1="http://www.w3.org/XML/1998/namespace" 
 *    xmlns:xs="http://www.w3.org/2001/XMLSchema" name="TimeCardType">
 *   <xs:sequence>
 *     <xs:element type="ns:EntityIdType" name="Id" minOccurs="0"/>
 *     <xs:element name="ReportedResource">
 *       <xs:complexType>
 *         <xs:choice>
 *           <xs:element type="ns:TimeCardPersonType" name="Person"/>
 *           <xs:element name="Resource">
 *             <!-- Reference to inner class Resource -->
 *           </xs:element>
 *         </xs:choice>
 *       </xs:complexType>
 *     </xs:element>
 *     ...
 */
public class TimeCardType
{
    private EntityIdType id;
    private int reportedResourceSelect = -1;
    private final int REPORTED_RESOURCE_PERSON_CHOICE = 0;
    private final int RESOURCE_CHOICE = 1;
    private TimeCardPersonType reportedResourcePerson;
    private Resource resource;
    ...
    private void setReportedResourceSelect(int choice) {
        if (reportedResourceSelect == -1) {
            reportedResourceSelect = choice;
        } else if (reportedResourceSelect != choice) {
            throw new IllegalStateException(
                "Need to call clearReportedResourceSelect() before changing existing choice");
        }
    }

    /**
     * Clear the choice selection.
     */
    public void clearReportedResourceSelect() {
        reportedResourceSelect = -1;
    }

    /**
     * Check if ReportedResourcePerson is current selection for choice.
     *
     * @return <code>true</code> if selection, <code>false</code> if not
     */
    public boolean ifReportedResourcePerson() {
        return reportedResourceSelect == REPORTED_RESOURCE_PERSON_CHOICE;
    }

    /**
     * Get the 'Person' element value.
     *
     * @return value
     */
    public TimeCardPersonType getReportedResourcePerson() {
        return reportedResourcePerson;
    }

    /**
     * Set the 'Person' element value.
     *
     * @param reportedResourcePerson
     */
    public void setReportedResourcePerson(
            TimeCardPersonType reportedResourcePerson) {
        setReportedResourceSelect(REPORTED_RESOURCE_PERSON_CHOICE);
        this.reportedResourcePerson = reportedResourcePerson;
    }

    /**
     * Check if Resource is current selection for choice.
     *
     * @return <code>true</code> if selection, <code>false</code> if not
     */
    public boolean ifResource() {
        return reportedResourceSelect == RESOURCE_CHOICE;
    }

    /**
     * Get the 'Resource' element value.
     *
     * @return value
     */
    public Resource getResource() {
        return resource;
    }

    /**
     * Set the 'Resource' element value.
     *
     * @param resource
     */
    public void setResource(Resource resource) {
        setReportedResourceSelect(RESOURCE_CHOICE);
        this.resource = resource;
    }
    ...
}

For most applications, this type of choice handling works well, preventing the user from trying to set more than one alternative in a choice. If you don't like this form of choice handling, though, you can easily change it with customizations. The choice-check attribute controls how the selection state for an <xs:choice> is checked in the generated code. The choice-check="disable" value disables all checking and does not track a selection state, leaving it up to the user to set one and only one value for each choice. choice-check="checkset" matches the default handling shown in Listing 8, where only the set methods check for a current setting and throw an exception. choice-check="checkboth" also checks the selection state when a get method is called, throwing an exception if the get method does not match the current selection state. Finally, choice-check="override" changes the default handling so that setting any value in the choice always changes the current state, rather than throwing an exception when a different state was previously set.
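The difference between the checkset and override behaviors can be illustrated with a hand-written stand-in for the generated class. The sketch below is not CodeGen output: the ChoiceHolder class, its String-valued fields, and the boolean override flag are hypothetical simplifications, and only the selection-tracking logic is modeled on Listing 8.

```java
// Hand-written sketch, not CodeGen output: ChoiceHolder and the override
// flag are hypothetical stand-ins for the pattern shown in Listing 8.
class ChoiceHolder {
    private static final int PERSON_CHOICE = 0;
    private static final int RESOURCE_CHOICE = 1;

    private final boolean override;   // true mimics choice-check="override"
    private int select = -1;          // -1 means no alternative is selected
    private String person;
    private String resource;

    ChoiceHolder(boolean override) {
        this.override = override;
    }

    private void setSelect(int choice) {
        if (select == -1 || override) {
            select = choice;
        } else if (select != choice) {
            throw new IllegalStateException(
                "Need to call clearSelect() before changing existing choice");
        }
    }

    void clearSelect() { select = -1; }
    boolean ifPerson() { return select == PERSON_CHOICE; }
    boolean ifResource() { return select == RESOURCE_CHOICE; }
    String getPerson() { return person; }
    String getResource() { return resource; }

    void setPerson(String person) {
        setSelect(PERSON_CHOICE);
        this.person = person;
    }

    void setResource(String resource) {
        setSelect(RESOURCE_CHOICE);
        this.resource = resource;
    }
}

class ChoiceDemo {
    public static void main(String[] args) {
        ChoiceHolder checked = new ChoiceHolder(false);
        checked.setPerson("Pat");
        try {
            checked.setResource("Forklift 7");   // different alternative: rejected
        } catch (IllegalStateException e) {
            System.out.println("checkset-style: " + e.getMessage());
        }
        checked.clearSelect();                   // reset, then the change is allowed
        checked.setResource("Forklift 7");
        System.out.println("resource selected: " + checked.ifResource());

        ChoiceHolder overriding = new ChoiceHolder(true);
        overriding.setPerson("Pat");
        overriding.setResource("Forklift 7");    // replaces the selection, no exception
        System.out.println("resource selected: " + overriding.ifResource());
    }
}
```

As in Listing 8, clearing the selection in this sketch resets only the state variable. The previously stored value stays in its field until it is overwritten.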

The choice-exposed customization attribute works in combination with the choice-check settings, which track a current selection state. A value of choice-exposed="false" keeps the selection state constants, state variable value, and state change method all private, matching the default code generation shown in Listing 8. choice-exposed="true" makes these all publicly accessible, adding a get method for the state variable. This allows you to use a Java switch statement easily to execute different code depending on the current state, in place of multiple if statements.
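A rough sketch of the switch-based style that an exposed selection state allows follows. The tutorial does not show the code generated for choice-exposed="true", so the class name, the constant names, and the getSelect() accessor are assumptions made for illustration only.

```java
// Hypothetical sketch of what choice-exposed="true" makes possible; the
// class, constant, and accessor names are assumptions, not CodeGen output.
class ExposedChoice {
    public static final int PERSON_CHOICE = 0;
    public static final int RESOURCE_CHOICE = 1;

    private int select = -1;
    private String person;
    private String resource;

    public int getSelect() { return select; }   // exposed state accessor

    public void setPerson(String person) {
        select = PERSON_CHOICE;
        this.person = person;
    }

    public void setResource(String resource) {
        select = RESOURCE_CHOICE;
        this.resource = resource;
    }

    public String getPerson() { return person; }
    public String getResource() { return resource; }
}

class ExposedChoiceDemo {
    // One switch on the exposed state replaces a chain of if tests.
    static String describe(ExposedChoice c) {
        switch (c.getSelect()) {
            case ExposedChoice.PERSON_CHOICE:
                return "person: " + c.getPerson();
            case ExposedChoice.RESOURCE_CHOICE:
                return "resource: " + c.getResource();
            default:
                return "nothing selected";
        }
    }

    public static void main(String[] args) {
        ExposedChoice c = new ExposedChoice();
        System.out.println(describe(c));
        c.setPerson("Pat");
        System.out.println(describe(c));
    }
}
```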

Both these attributes can be used at any level of customization, allowing you to set the behavior for all the generated code on the outermost customization easily while still retaining the ability to do something different on a case-by-case basis.





<xs:any> and mixed="true" handling

Like many enterprise schemas, the HR-XML schemas use <xs:any> schema components to create extension points for data that can be defined by users independently of the original schema. CodeGen by default handles <xs:any> schema components using an org.w3c.dom.Element object (or list of Element, if the maxOccurs value on the <xs:any> is greater than 1). The Element object can be used to represent any arbitrary XML element (including all attributes, namespace declarations, and content), so it provides all the flexibility needed to work with any document matching the schema definition.

Listing 9 shows the generated code matching an <xs:any> component in the Listing 7 schema sample. Because the <xs:any> uses maxOccurs="unbounded", the generated code uses a list of Elements.


Listing 9. <xs:any>-generated code sample
/**
 * ...
 * Schema fragment(s) for this class:
 * <pre>
 * <xs:complexType xmlns:xs="http://www.w3.org/2001/XMLSchema" mixed="true" 
 *    name="AdditionalDataType">
 *   <xs:sequence>
 *     <xs:any minOccurs="0" maxOccurs="unbounded" processContents="strict" 
 *        namespace="##any"/>
 *   </xs:sequence>
 *   <xs:attribute type="xs:string" name="type"/>
 * </xs:complexType>
 * </pre>
 */
public class AdditionalDataType
{
    private List<Element> anyList = new ArrayList<Element>();
    private String type;

    /**
     * Get the list of sequence items.
     *
     * @return list
     */
    public List<Element> getAny() {
        return anyList;
    }

    /**
     * Set the list of sequence items.
     *
     * @param list
     */
    public void setAny(List<Element> list) {
        anyList = list;
    }
   ...
}
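Application code that receives the list from getAny() works with the standard DOM API. The following sketch uses only JDK classes and does not depend on the generated data model. It parses a small fragment to stand in for unmarshalled extension content; the urn:example:ext namespace and the element names are made up for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class AnyContentDemo {
    // Parse a fragment and collect the child elements of its root, standing in
    // for the List<Element> that getAny() holds after unmarshalling.
    static List<Element> childElements(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            List<Element> result = new ArrayList<Element>();
            NodeList children = doc.getDocumentElement().getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node node = children.item(i);
                if (node.getNodeType() == Node.ELEMENT_NODE) {
                    result.add((Element) node);   // text nodes are skipped
                }
            }
            return result;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML fragment", e);
        }
    }

    // Summarize one extension element: namespace, local name, text content.
    static String describe(Element element) {
        return "{" + element.getNamespaceURI() + "}" + element.getLocalName()
            + " = " + element.getTextContent().trim();
    }

    public static void main(String[] args) {
        // Hypothetical extension content; not taken from HR-XML.
        String xml = "<AdditionalData xmlns:ext='urn:example:ext'>"
            + "<ext:CostCenter>4711</ext:CostCenter>"
            + "<ext:Shift code='N'>night</ext:Shift>"
            + "</AdditionalData>";
        for (Element element : childElements(xml)) {
            System.out.println(describe(element));
        }
    }
}
```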

Some aspects of the schema definition in Listing 9 are ignored or only partially handled by CodeGen. First, the enclosing <xs:complexType> definition specifies mixed="true", which means that character data is allowed to be intermixed with the elements represented by the <xs:any> particle. The data model generated by CodeGen has no place to hold such character-data content, so it'll just be discarded when a document is unmarshalled. Second, the <xs:any> uses processContents="strict", meaning any elements present in instance documents need to have their own schema definitions. CodeGen ignores this attribute, though it is possible to get a similar effect using a different form of <xs:any> handling (discussed below). CodeGen also ignores <xs:any> namespace restrictions. Listing 9 uses namespace="##any", meaning elements matching the <xs:any> are not namespace-restricted, but if the value had instead been namespace="##other", for example, the result would be the same.

You can use the any-handling customization attribute at any level of customizations to select other ways of handling <xs:any>. The value any-handling="discard" simply ignores the <xs:any> in the generated data model and discards any elements corresponding to the <xs:any> when unmarshalling occurs. any-handling="dom" matches the default handling, using org.w3c.dom.Element to represent an element matching the <xs:any>. Finally, any-handling="mapped" generates code that requires a global schema definition for each element matching the <xs:any> (roughly corresponding to the processContents="strict" schema condition). In this last case, the data model uses java.lang.Object to represent an element, with the actual runtime type of the object matching the global schema definition.
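With any-handling="mapped", code that consumes the data model typically has to dispatch on the runtime type of each object. The sketch below shows that pattern in isolation. CostCenter and ShiftInfo are hypothetical classes standing in for classes bound to global element definitions; they are not part of HR-XML or of any CodeGen output shown in this tutorial.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical data classes standing in for classes bound to global
// element definitions.
class CostCenter {
    final String code;
    CostCenter(String code) { this.code = code; }
}

class ShiftInfo {
    final String name;
    ShiftInfo(String name) { this.name = name; }
}

class MappedAnyDemo {
    // Dispatch on the runtime type of an object taken from the xs:any list.
    static String describe(Object item) {
        if (item instanceof CostCenter) {
            return "cost center " + ((CostCenter) item).code;
        } else if (item instanceof ShiftInfo) {
            return "shift " + ((ShiftInfo) item).name;
        }
        return "unrecognized " + item.getClass().getName();
    }

    public static void main(String[] args) {
        List<Object> anyList = new ArrayList<Object>();
        anyList.add(new CostCenter("4711"));
        anyList.add(new ShiftInfo("night"));
        for (Object item : anyList) {
            System.out.println(describe(item));
        }
    }
}
```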





<xs:simpleType> handling

Like most forms of code generation from schema, CodeGen ignores or only partially handles many aspects of <xs:simpleType> definitions. <xs:simpleType> restrictions are one example of this limited support. Of the many varieties of simpleType restrictions defined by schema (which include length restrictions, value ranges, and even regular expression patterns), only <xs:enumeration> restrictions are currently enforced in the generated data model.

<xs:union> simpleTypes are also currently ignored by CodeGen, which treats them as plain strings. Listing 10 shows the generated code matching an <xs:union> reference, along with the original schema fragments matching the code (at the bottom of the listing). You can see in Listing 10 that each of the references to a union type (including both the TimeCardDuration type shown in the listing and the AnyDateTimeType) is represented by a simple String value in the generated code.


Listing 10. <xs:union>-generated code sample and original schema
    /**
     * Schema fragment(s) for this class:
     * <pre>
     * <xs:element xmlns:ns="http://ns.hr-xml.org/2007-04-15" 
     *    xmlns:xs="http://www.w3.org/2001/XMLSchema" name="TimeInterval">
     *   <xs:complexType>
     *     <xs:sequence>
     *       <xs:element type="ns:EntityIdType" name="Id" minOccurs="0"/>
     *       <xs:element type="xs:string" name="StartDateTime"/>
     *       <xs:choice>
     *         <xs:sequence>
     *           <xs:element type="xs:string" name="EndDateTime"/>
     *           <xs:element type="xs:string" name="Duration" minOccurs="0"/>
     *         </xs:sequence>
     *         <xs:element type="xs:string" name="Duration"/>
     *       </xs:choice>
     *       ...
     * </pre>
     */
    public static class TimeInterval
    {
        private EntityIdType id;
        private String startDateTime;
        private int choiceSelect = -1;
        private final int END_DATE_TIME_CHOICE = 0;
        private final int DURATION_CHOICE = 1;
        private String endDateTime;
        private String duration;
        private String duration1;
        ...

    ...
    <xsd:element name="TimeInterval">
      <xsd:complexType>
        <xsd:sequence>
          <xsd:element name="Id" type="EntityIdType" minOccurs="0"/>
          <xsd:element name="StartDateTime" type="AnyDateTimeType"/>
          <xsd:choice>
            <xsd:sequence>
              <xsd:element name="EndDateTime" type="AnyDateTimeType"/>
              <xsd:element name="Duration" type="TimeCardDuration" minOccurs="0"/>
            </xsd:sequence>
            <xsd:element name="Duration" type="TimeCardDuration"/>
          </xsd:choice>
          ...
<xsd:simpleType name="TimeCardDuration">
  <xsd:union memberTypes="xsd:duration xsd:decimal"/>
</xsd:simpleType>
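Because the union arrives in the data model as a String, interpreting it is left to application code. One possible approach, sketched below with JDK classes only, tries the member types of TimeCardDuration in their declared order: xs:duration first, then xs:decimal. The TimeCardDurationParser class is a hypothetical helper, not something CodeGen generates.

```java
import java.math.BigDecimal;
import javax.xml.datatype.DatatypeConfigurationException;
import javax.xml.datatype.DatatypeFactory;
import javax.xml.datatype.Duration;

// Hypothetical helper, not generated by CodeGen.
class TimeCardDurationParser {
    // xs:decimal lexical form: optional sign, digits, optional fraction.
    // Checked explicitly because BigDecimal also accepts exponent notation.
    private static final String DECIMAL_PATTERN = "[+-]?(\\d+(\\.\\d*)?|\\.\\d+)";

    // Returns a javax.xml.datatype.Duration or a BigDecimal, depending on
    // which member type of the union the text matches.
    static Object parse(String text) {
        String value = text.trim();
        try {
            return DatatypeFactory.newInstance().newDuration(value);
        } catch (IllegalArgumentException notADuration) {
            // Not an xs:duration; fall through and try xs:decimal.
        } catch (DatatypeConfigurationException e) {
            throw new IllegalStateException("No DatatypeFactory available", e);
        }
        if (value.matches(DECIMAL_PATTERN)) {
            return new BigDecimal(value);
        }
        throw new IllegalArgumentException(
            "Not an xs:duration or xs:decimal value: " + text);
    }

    public static void main(String[] args) {
        for (String sample : new String[] { "PT8H", "7.5" }) {
            Object parsed = parse(sample);
            String kind = (parsed instanceof Duration) ? "duration" : "decimal";
            System.out.println(sample + " parsed as " + kind);
        }
    }
}
```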





Schema modifications

If you compare the schema fragments embedded in the Javadoc at the top of Listing 10 with the actual schema fragments at the bottom of the listing, you'll see that the union simpleType references in the original schema have been replaced by xs:string references in the Javadoc version. This is deliberate, and it's representative of several types of transformations to schema structures performed by CodeGen. Some transformations, such as the elimination of <xs:union> simpleTypes and of simpleType restriction facets other than <xs:enumeration>, are hardcoded into the CodeGen operation. Other transformations are controlled by customizations. Either way, schema fragments included in Javadocs always show the transformed schema because this is what is actually used to generate the code.

You'll see more types of transformations controlled by customizations in the following sections of the tutorial.





