Clean up your schema for SOAP

Updating XML schemas to be SOAP-friendly

Level: Introductory

Shane Curcuru (shane_curcuru@us.ibm.com), Advisory Software Engineer, IBM Research

29 Apr 2003

More and more projects are using XML schemas to define the structure of their data. As your repository of schemas grows, you need tools to manipulate and manage your schemas. The Eclipse XSD Schema Infoset Model has powerful querying and editing capabilities. In this article, Shane Curcuru will show how you can update a schema for use with SOAP by automatically converting attribute uses into element declarations.

This article assumes an understanding of XML schemas and of how SOAP works. The sample code included in the zip file works standalone or in an Eclipse workbench.

Introduction: the cleanup example

If you've built a library of schemas, you might want to reuse them for new applications. If you already have a data model for an internal purchase order, you may need to update it for use with SOAP as you move towards Web services. SOAP allows you to transport an XML message across a network; the XML body can also be constrained with a schema. However, a SOAP message typically uses element data for its XML body, not attribute data. You'll explore a program that can automatically update an existing schema document, converting its attribute declarations into roughly "equivalent" element declarations.


Figure 1. Changing attributes into elements




Overview: how you're going to do it

Given the complexity of XML schemas, you certainly don't want to use Notepad to edit the .xsd files. A good XML editor is not much of a step up -- while it may organize your elements and attributes nicely, it can't show the many abstract Infoset relationships that are defined in the Schema specification. That's where the Schema Infoset Model comes in; it expresses both the concrete DOM representation of a set of schema documents, and the full abstract Infoset model of a schema. Both of these representations are shown through the programmatic API of the Model as well as in the Model's built-in sample editor for schemas.

Visual Schema Editing

If you've installed the XSD Schema Infoset Model and Eclipse Modeling Framework (EMF) plugins into Eclipse, you can see the sample editor at work in your Workbench. (Note: Exploring the editor is not required to follow this article). Simply right-click on a schema.xsd file in the Navigator, and select Open With... then Sample XML Schema Editor. You will get a standard Eclipse editor that shows the usual Source view -- this is the concrete DOM representation of the .xsd file that you just opened.

At the bottom of the editor there are two more tabs after the Source view -- Semantics and Syntax. These two graphical tree views show the various abstract Infoset relationships between schema components. For example, in the Semantics view, you will see a top-level item for Types -- this is all of the types (simple and complex) declared anywhere in the schema itself, not just at the top level, and not just in this document (this becomes more obvious when the schema document you opened uses includes and imports).

For the purposes of the example, I'm going to simplify this problem slightly. MakeSoapCompatible.java is a program that will attempt to take most attributeDeclarations in a schema and turn them into roughly-equivalent elementDeclarations, with a few caveats.

First, you obviously can't convert an attribute into an element if there already is an element with that name in the particles of the complexType. Therefore, I will detect name conflicts first, and refuse to change these attributes. To keep the example simpler, I also arbitrarily declare some other conditions that make a schema incompatible with this program. I won't change schemas that have wildcards, since you would have to do some complex name-checking to ensure that any changed attributes wouldn't conflict with elements. I also won't change any groups that use all or choice as a compositor, since this could change the meaning of the group in ways you can't easily predict.





Evaluating the schema for incompatibilities

Using the power of abstract Infoset relationships

First, you need to find all of the complexTypes in the schema, since those are the only places that attributes can actually be used. You don't need to search for all attribute declarations, because you can always ask each complexType later what attributes it actually uses. Querying your schema document is a simple process: iterate through the contents and look for complex types. Note that there is a wide variety of querying methods for examining schema contents; this is just one way to do it.


Listing 1. Finding complexTypes
// Find type definitions: for our purposes, the simplest 
//  way to get all complexTypes is to drop down to the 
//  underlying EMF model of a schema to iterate through 
//  all concrete components contained within this schema
List complexTypeDefinitions = new ArrayList();
for (Iterator iter = schema.eAllContents(); iter.hasNext(); )
{
    XSDConcreteComponent concreteComponent = (XSDConcreteComponent)iter.next();
    if (concreteComponent instanceof XSDComplexTypeDefinition)
    {
        complexTypeDefinitions.add(concreteComponent);
    }
}
// An alternate method would be to use the abstract Infoset 
//  relationship of schema.getTypeDefinitions(), which would 
//  get all globally-visible typedefs (simple and complex) 
//  within the whole schema, however that would miss any 
//  types that were nested inside of other components
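
To make the difference between "all types" and "top-level types" concrete, here is a minimal sketch that uses only the JDK's DOM parser, not the Schema Infoset Model. The class name, method names, and the sample schema are all hypothetical. It counts complexType elements at any depth in a single document versus those directly under the schema element; unlike the Model, it cannot see types pulled in through includes or imports.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: a DOM-only view of one schema document.
class ComplexTypeScanSketch {
    static final String XSD_NS = "http://www.w3.org/2001/XMLSchema";

    // Illustrative schema: one global type containing one nested anonymous type.
    static final String SAMPLE =
        "<xsd:schema xmlns:xsd=\"" + XSD_NS + "\">"
        + "<xsd:complexType name=\"PurchaseOrder\">"
        + "<xsd:sequence><xsd:element name=\"item\">"
        + "<xsd:complexType>"
        + "<xsd:attribute name=\"sku\" type=\"xsd:string\"/>"
        + "</xsd:complexType>"
        + "</xsd:element></xsd:sequence>"
        + "</xsd:complexType>"
        + "</xsd:schema>";

    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // needed for the namespace-based lookups below
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse schema text", e);
        }
    }

    // Counts xsd:complexType elements at any depth in this one document.
    static int countAllComplexTypes(Document doc) {
        return doc.getElementsByTagNameNS(XSD_NS, "complexType").getLength();
    }

    // Counts only xsd:complexType elements that are direct children of the root.
    static int countTopLevelComplexTypes(Document doc) {
        int count = 0;
        NodeList children = doc.getDocumentElement().getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE
                    && XSD_NS.equals(child.getNamespaceURI())
                    && "complexType".equals(child.getLocalName())) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        Document doc = parse(SAMPLE);
        System.out.println("all: " + countAllComplexTypes(doc));
        System.out.println("top-level: " + countTopLevelComplexTypes(doc));
    }
}
```

The nested anonymous type is exactly the kind of component that a top-level-only query would miss.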

Now that you've got a list of all the complexTypes you need to update, let's exclude any types that you have decided are incompatible with your sample program. Since you are just querying information about various schema components, you can make effective use of the many abstract Infoset relationships and methods that the Model exposes. These abstract methods automatically take into account things like base types and derivation types; references to declarations elsewhere; and effects of imported, included, or redefined schema documents.


Listing 2. Looking for incompatibilities

// Detect name collisions between top-level elems and attrs
List elementNames = getElementNames(complexType);
List attributeNames = getAttributeNames(complexType);
attributeNames.retainAll(elementNames);
if (!attributeNames.isEmpty()) {
    // Report the name collision and return...
}

// Now check for any attribute wildcards, which we 
//  can't really change into elements
XSDWildcard attributeWildcard = complexType.getAttributeWildcard();
if (null != attributeWildcard) {
    // Report an incompatible wildcard and return...
}

// Check the content for other incompatible conditions like 
//  groups with choice or all or a simpleType
XSDComplexTypeContent complexTypeContent = complexType.getContent();
if (complexTypeContent instanceof XSDSimpleTypeDefinition) {
    // Report a simple type as incompatible and return...
}
else if (null != complexTypeContent)
{
    XSDTerm particleTerm = ((XSDParticle)complexTypeContent).getTerm();

    if (particleTerm instanceof XSDModelGroup)
    {
        XSDCompositor compositor = ((XSDModelGroup)particleTerm).getCompositor();
        if ((XSDCompositor.ALL_LITERAL == compositor)
                || (XSDCompositor.CHOICE_LITERAL == compositor)) {
            // Report an incompatible group type and return...
        }
    }
    // more checks for wildcards, etc.
}

Note: not all code for detecting incompatibilities is shown here; please download the samples zip file (see Resources) to see the full story! The MakeSoapCompatible.java program is carefully designed and thoroughly commented to showcase how to manipulate schemas with the Model, and should be your next stop if you want to learn more.
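
One detail of the name-collision check is worth a closer look: retainAll() modifies the list it is called on, so the attribute name list in Listing 2 ends up holding only the colliding names. If you still need the full list afterwards, work on a copy. The following is a minimal sketch with plain JDK collections; the class and method names are hypothetical, and the name lists stand in for whatever getElementNames() and getAttributeNames() return in the full sample.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: finds names used by both elements and attributes.
class NameCollisionSketch {

    // Returns the names present in both lists, leaving both inputs unmodified.
    static List<String> findCollisions(List<String> elementNames, List<String> attributeNames) {
        List<String> collisions = new ArrayList<>(attributeNames); // defensive copy
        collisions.retainAll(elementNames); // keeps only names that also appear as elements
        return collisions;
    }

    public static void main(String[] args) {
        List<String> elements = Arrays.asList("shipTo", "billTo", "comment");
        List<String> attributes = Arrays.asList("orderDate", "comment");
        System.out.println(findCollisions(elements, attributes)); // prints [comment]
    }
}
```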





Creating element declarations

Getting concrete when adding and manipulating components

Once you've found some complexTypes that you want to update, you need to get concrete. For each complexType you'll iterate over the getAttributeContents() list of concrete attribute uses that the type has. For each attribute use you first ensure that you're pointing at the real declaration of the attribute -- even if it's a reference to a declaration elsewhere. Then you want to create an elementDeclaration that has the same name and type as each attribute use -- this turns out to be a straightforward process. You'll also set the elementDeclaration as the content of a new particle, since you will want to add this particle to your complexType later.


Listing 3. Changing attributes to elements
if (attrDecl.isAttributeDeclarationReference())
    attrDecl = attrDecl.getResolvedAttributeDeclaration();

// Create a blank element and simply copy over the 
//  pertinent data about the attribute
XSDElementDeclaration elemDecl = XSDFactory.eINSTANCE.createXSDElementDeclaration();
elemDecl.setName(attrDecl.getName());
// attrType is the attribute's type, looked up earlier 
//  (not shown here; see the full sample)
elemDecl.setTypeDefinition(attrType);

// Note that since an annotation's elements are only modeled 
//  in the concrete tree, we must explicitly ask to clone them
if (null != attrDecl.getAnnotation()) {
    cloneAnnotation(attrDecl, elemDecl);
}
// Wrap this element in a particle
XSDParticle particle = XSDFactory.eINSTANCE.createXSDParticle();
particle.setContent(elemDecl);

This is an area that clearly shows the difference between the concrete model and the abstract model. You may be wondering what a particle is if you're looking at your schemaDocument.xsd files, since you probably don't see any xsd:particle elements. You can read the specification for a particle, although it's quite detailed. A particle is essentially the abstract container of an element declaration, model group, or any (wildcard); the particle is what defines its min/maxOccurs constraints at that particular place in the schema. Since the Model can express both the concrete and abstract representations of a schema, it's easy to work with either kind of representation.
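
In the concrete document, a particle's occurrence constraints show up as minOccurs and maxOccurs attributes on the xsd:element itself. The following minimal sketch works at that concrete level with the JDK's DOM API only, not with the Model. The class and method names are hypothetical, and the mapping of use="required" to the default minOccurs of 1 (and anything else to minOccurs="0") is an assumption made for this illustration, not something the sample program is stated to do. It also ignores ref, default, fixed, and use="prohibited".

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper: builds an xsd:element from an xsd:attribute DOM node.
class AttributeToElementSketch {
    static final String XSD_NS = "http://www.w3.org/2001/XMLSchema";

    static Document newDocument() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().newDocument();
        } catch (Exception e) {
            throw new IllegalStateException("Could not create a DOM document", e);
        }
    }

    // Creates (but does not attach) an element declaration mirroring the attribute.
    static Element toElementDeclaration(Element attribute) {
        Document doc = attribute.getOwnerDocument();
        // Reuse whatever prefix the attribute declaration was written with
        String prefix = attribute.getPrefix();
        String qualifiedName = (prefix == null) ? "element" : prefix + ":element";
        Element element = doc.createElementNS(XSD_NS, qualifiedName);

        element.setAttribute("name", attribute.getAttribute("name"));
        if (attribute.hasAttribute("type")) {
            element.setAttribute("type", attribute.getAttribute("type"));
        }
        // Assumption: an attribute that is not required becomes an optional element;
        // a required one keeps the schema default of minOccurs="1".
        if (!"required".equals(attribute.getAttribute("use"))) {
            element.setAttribute("minOccurs", "0");
        }
        return element;
    }

    public static void main(String[] args) {
        Document doc = newDocument();
        Element attr = doc.createElementNS(XSD_NS, "xsd:attribute");
        attr.setAttribute("name", "orderDate");
        attr.setAttribute("type", "xsd:date");
        Element elem = toElementDeclaration(attr);
        System.out.println(elem.getTagName() + " name=" + elem.getAttribute("name")
                + " minOccurs=" + elem.getAttribute("minOccurs"));
    }
}
```

Working through the Model's particle instead of raw attributes spares you this kind of string handling, which is the point of the abstract representation.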

Annotations are the one kind of schema component that is modeled only in the concrete representation of the model, so they require slightly special handling. In this code sample, you will copy over any annotations from the attribute declaration into the new element declaration you just created. You use the DOM cloneNode() method to copy the contents of the annotation component, and then add the annotation itself to the new element declaration.


Listing 4. Cloning concrete annotations

XSDAnnotation oldAnnotation = attrDecl.getAnnotation();
XSDAnnotation newAnnotation = XSDFactory.eINSTANCE.createXSDAnnotation();
try {
    Element oldAnnElem = oldAnnotation.getElement();
    // Use the DOM method to do a deep clone of the element
    Element newAnnElem = (Element)oldAnnElem.cloneNode(true);
    newAnnotation.setElement(newAnnElem);
    elemDecl.setAnnotation(newAnnotation);
} 
catch (Exception e) {
    // Report the error and return
}
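
If you haven't used cloneNode() before, the following minimal sketch shows its behavior with the JDK's DOM API alone. The class name, method names, and sample annotation are hypothetical. A deep clone (true) copies the whole subtree, a shallow clone (false) copies only the element and its attributes, and either way the copy has no parent until you attach it somewhere. To copy a node into a different document, importNode() is the DOM method to reach for.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical helper: demonstrates DOM cloning on an annotation element.
class DeepCloneSketch {
    static final String SAMPLE =
        "<xsd:annotation xmlns:xsd=\"http://www.w3.org/2001/XMLSchema\">"
        + "<xsd:documentation>Ship date</xsd:documentation>"
        + "</xsd:annotation>";

    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML text", e);
        }
    }

    // Deep copy within the same owner document; the result is detached (no parent).
    static Element deepCopy(Element source) {
        return (Element) source.cloneNode(true);
    }

    // Deep copy into another document; the result is owned by the target document.
    static Element copyInto(Document target, Element source) {
        return (Element) target.importNode(source, true);
    }

    public static void main(String[] args) {
        Element original = parse(SAMPLE).getDocumentElement();
        Element copy = deepCopy(original);
        System.out.println(copy.getTextContent());             // text of the copied subtree
        System.out.println(copy.getParentNode() == null);      // the copy is not attached yet
        System.out.println(original.cloneNode(false).hasChildNodes()); // shallow copy has no children
    }
}
```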





Swapping the elements for attributes

Now that you've got a new element declaration to replace your attribute use, you need to swap the two in your complexType component. Since you were careful to use the concrete containment relationship of complexType.getAttributeContents() in your loop, you can simply add the new elementDeclaration, and then call remove() on the list iterator to remove the actual attribute use from your type.


Listing 5. Using concrete lists to delete items

// Use this concrete relationship, since we're going to 
//  actually remove the attributes from this type
for (ListIterator iter = 
        complexType.getAttributeContents().listIterator(); 
        iter.hasNext(); /* no-op */ ) {

    if (changeAttributeIntoElement(complexType, 
            (XSDAttributeGroupContent)iter.next(), changedAttrs)) {
        // Note that list manipulation calls like remove() 
        //  will only work properly on concrete lists; 
        //  attempting to manipulate 'abstract' lists will 
        //  either throw an exception or will silently fail
        iter.remove();
    }
    else {
        // Report the error and continue...
    }
}
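
The same remove-while-iterating pattern can be tried out with plain JDK lists. In this minimal sketch, the class and method names are hypothetical, a set of names stands in for the result of changeAttributeIntoElement(), and an unmodifiable view is used as a rough analogy for a list you are not allowed to edit. How the Model's own abstract lists react to remove() is described in the comment in Listing 5; this sketch does not reproduce that behavior exactly.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.ListIterator;
import java.util.Set;

// Hypothetical helper: removes items from a list during iteration.
class ListIteratorRemovalSketch {

    // Removes every name found in 'convertible' and returns the removed names in order.
    static List<String> removeConverted(List<String> attributeNames, Set<String> convertible) {
        List<String> removed = new ArrayList<>();
        for (ListIterator<String> iter = attributeNames.listIterator(); iter.hasNext(); ) {
            String name = iter.next();
            if (convertible.contains(name)) {
                iter.remove(); // safe: removes the item last returned by next()
                removed.add(name);
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<>(Arrays.asList("orderDate", "comment", "id"));
        Set<String> convertible = new HashSet<>(Arrays.asList("orderDate", "id"));
        System.out.println(removeConverted(names, convertible)); // prints [orderDate, id]
        System.out.println(names);                               // prints [comment]

        // A read-only view rejects the same call
        List<String> readOnly = Collections.unmodifiableList(new ArrayList<>(Arrays.asList("id")));
        try {
            removeConverted(readOnly, convertible);
        } catch (UnsupportedOperationException e) {
            System.out.println("read-only list refused the removal");
        }
    }
}
```

Calling remove() on the iterator, rather than on the list itself, is what avoids a ConcurrentModificationException in the middle of the loop.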





Writing out the schema

The MakeSoapCompatible sample program will print its status to System.out as it executes. If you detect that there are no attributes actively used in the schema, you report that no changes were made, and exit. Otherwise, you write out a modified schema document to a new name, and report your status -- either a list of attributes successfully changed into elements, or a warning about which attributes had name conflicts, and were left unchanged.

Presuming that you've changed at least some attribute declarations into equivalent element declarations, you want to save the schema for later use with your SOAP application. The EMF framework that the Model is built on top of provides Resource handling services for loading and saving schema documents in a variety of ways. This code sample shows one very simple way to serialize directly to a URI, in this case an output file on disk named after the original input file.


Listing 6. Writing a schema to a file
File outFile = new File(newLocation);
FileOutputStream fos = new FileOutputStream(outFile);
// Ensure that the abstract model is synchronized with the 
//  concrete tree: this will ensure that the Model has 
//  updated the concrete Element in the schema document 
//  with any changes that may have been made in the 
//  abstract model
schema.updateElement();

// Simply ask the XSDResourceImpl to serialize the schema to 
//  a document for us; this is just one way we can easily use 
//  the XSD/EMF framework to manage resources for us
XSDResourceImpl.serialize(fos, schema.getElement());
fos.close();
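
If you only have the concrete DOM Element in hand, the JDK can serialize it without any Model classes. The following minimal sketch is an alternative to XSDResourceImpl.serialize(), not a description of how that method works internally; the class name, method names, and the tiny sample schema are hypothetical. It uses an identity Transformer, and a try-with-resources block so the stream is closed even when serialization fails partway through.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper: writes a DOM element out as XML text.
class DomSerializeSketch {
    static final String XSD_NS = "http://www.w3.org/2001/XMLSchema";

    // Serializes the element and its subtree to the given stream.
    static void serialize(Element root, OutputStream out) throws TransformerException {
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        identity.setOutputProperty(OutputKeys.INDENT, "yes");
        identity.transform(new DOMSource(root), new StreamResult(out));
    }

    // Writes to a file; the stream is closed whether or not serialization succeeds.
    static void writeToFile(Element root, Path file) {
        try (OutputStream out = Files.newOutputStream(file)) {
            serialize(root, out);
        } catch (IOException | TransformerException e) {
            throw new IllegalStateException("Could not write " + file, e);
        }
    }

    // Convenience for inspecting the output in memory.
    static String serializeToString(Element root) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            serialize(root, out);
        } catch (TransformerException e) {
            throw new IllegalStateException("Could not serialize element", e);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    // Builds a tiny illustrative schema: <xsd:schema><xsd:complexType name="PurchaseOrder"/></xsd:schema>
    static Element sampleRoot() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().newDocument();
            Element schema = doc.createElementNS(XSD_NS, "xsd:schema");
            Element type = doc.createElementNS(XSD_NS, "xsd:complexType");
            type.setAttribute("name", "PurchaseOrder");
            schema.appendChild(type);
            doc.appendChild(schema);
            return schema;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("Could not create a DOM document", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(serializeToString(sampleRoot()));
    }
}
```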





Conclusion

As you've seen, performing a conceptually simple editing operation on schema documents (turning attributes into elements) can entail a fair amount of work. However, the power of the Schema Infoset Model's representation of both the abstract Infoset of a schema and its concrete representation of the schema documents makes this a manageable task. The Model also includes simple tools for loading and saving schema documents to a variety of sources, making it a complete solution for managing your schema repository programmatically.

Some users might ask, "Why not use XSLT or another XML-aware application to edit schema documents?" While XSLT can easily process the concrete model of a set of schema documents, it can't easily see any of the abstract relationships within the overall schema that they represent. For example, suppose that you need to update any enumerated simpleTypes to include a new UNK enumeration value meaning unknown. Of course, you only want to update enumerations that fit this format of using strings of length three; you would not want to update numeric or other enumerations.

While XSLT could find all of the simpleType declarations, it cannot understand the relationship between types and base types, or easily evaluate the meanings of facets on those types. The Model's representation of the abstract Infoset relationships within a schema includes methods like simpleType.getEffectiveEnumerationFacets() that take into account base types, references, and other relationships within the schema. It returns the complete list of enumerations on that simpleType, which you can easily query and update with your new value, if appropriate. The Model also includes powerful support for managing namespaces and resolving other types at any point within your schema, which would be very difficult to do with other tools.
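
Once you have the effective list of enumeration values for a type, the decision about whether to add UNK is ordinary list logic. Here is a minimal sketch of that decision with plain JDK collections. The class and method names are hypothetical, the value lists stand in for whatever you read from the Model, and treating "fits this format" as "every value is exactly three letters" is an assumption made for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: decides whether an enumeration should gain an UNK value.
class EnumerationUpdateSketch {
    static final String UNKNOWN = "UNK";

    // True if the list is non-empty and every value is exactly three letters.
    static boolean isThreeLetterCodeList(List<String> values) {
        if (values.isEmpty()) {
            return false;
        }
        for (String value : values) {
            if (value.length() != 3 || !value.chars().allMatch(Character::isLetter)) {
                return false;
            }
        }
        return true;
    }

    // Returns a copy of the values, with UNK appended only when the format fits
    // and UNK is not already present.
    static List<String> withUnknown(List<String> values) {
        List<String> result = new ArrayList<>(values);
        if (isThreeLetterCodeList(values) && !values.contains(UNKNOWN)) {
            result.add(UNKNOWN);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(withUnknown(Arrays.asList("USD", "EUR")));     // prints [USD, EUR, UNK]
        System.out.println(withUnknown(Arrays.asList("100", "200")));     // prints [100, 200]
        System.out.println(withUnknown(Arrays.asList("small", "large"))); // prints [small, large]
    }
}
```

The hard part, which this sketch leaves out, is collecting the right values in the first place; that is where the Model's knowledge of base types and references matters.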





Sample Code

A sample program, MakeSoapCompatible.java, shows the example in this article; anyone interested in exploring further should definitely read the comments there. A simple schema document MakeSoapCompatible.xsd is also included showing a basic purchase order with attributes to be changed into elements; the sample program can also be run against any other schema documents you have. The sample program requires at least the XSD Schema Infoset Model and the Eclipse Modeling Framework to run standalone.

You can download the sample program and the following utility programs in the sample Zip file (see Resources).

Copies of two other utility .java files normally shipped with the XSD Schema Infoset Model (version 1.0.1 and later) are also attached with commented code showing several other useful techniques. These include:

XSDSchemaQueryTools.java showcases a number of other ways to perform advanced queries on schema components.

XSDSchemaBuildingTools.java has convenience methods for building schemas programmatically.






Download

Name: ws-clean.zip
Download method: FTP




Resources





About the author

Shane Curcuru has been a developer and quality engineer at Lotus and IBM for 12 years and is a member of the Apache Software Foundation. He has worked on such diverse projects as Lotus 1-2-3, Lotus eSuite, Apache's Xalan-J XSLT processor, and a variety of XML Schema tools. You can contact Shane Curcuru at shane_curcuru@us.ibm.com.



