XML recently celebrated its fourth anniversary. In these four years, it has achieved more than perhaps any other group of technologies under the umbrella of a single buzzword. It is now routine for all sorts of software systems to be deployed with XML interchange facilities. Most development tools, from database management systems to programming languages, now come with XML processing facilities.
When XML 1.0 was introduced, it was billed not just as another valuable tool in the hacker's toolbox, but also as a system for encouraging people to share information more freely. XML promised to standardize the formats that underlie electronic commerce between organizations and to lay the foundations of a more intelligent Web with smarter search engines and support for next-generation distributed applications.
At the heart of these promises lies the ability to go beyond sharing simple tagged documents to sharing some of the concepts behind the tags in the documents. This is the idea of sharing semantics, and it has been the focal point of this column. I have discussed organizational efforts to establish dictionaries and maps of XML formats so that semantics can be shared by thorough and formal documentation (for example, RosettaNet and ebXML). I have also discussed practical techniques for expressing semantics in programs using RDF. The foundations of the technology are there for such work, as I take care to demonstrate, but the question is whether the softer foundations will ever be laid.
By softer foundations, I mean agreement. All the consortia, registries, repositories, schema languages, and vocabularies in the world will be for naught without a critical mass of people relying on them to make and formalize their agreements. There are hundreds of definitions of what makes up a purchase order, but is there enough will to match up one purchase order format with another using the tools that XML-related technologies offer?
In the rest of this column, I'll make my usual survey of recent developments in semantic transparency. The fact that there is still a great deal of activity along these lines shows that there is still a chance of meeting the grand promises of XML.
The Open Applications Group (OAG) is another pioneering group trying to standardize XML formats for interchange between organizations. Their core work product has been the Open Applications Group Interoperability Standard (OAGIS), of which a new release, version 8.0, was recently announced. OAGIS contains a collection of message sets for common business transactions, formatted in XML. These are called Business Object Documents (BODs). The sequences of messages that form common business interactions are called scenarios. OAG encourages users to exchange each BOD in a scenario directly, using enterprise application integration (EAI) features, basic or secure HTTP or SMTP, or SOAP, or even within ebXML frameworks.
Version 8.0 of OAGIS offers the same collection of BODs as version 7.2.1. Support for more descriptive tag names has been added, but most of the changes are additions of technical feature specifications. In particular, BODs now have W3C XML Schema Definition Language (XSDL) definitions, and there are mappings defined using XPath for integrating application execution based on message content. All the documentation of the BODs in version 8.0 is provided in strongly cross-referenced HTML.
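To make the idea of XPath-driven integration a little more concrete, here is a minimal sketch in Python. It is not taken from the OAGIS distribution: the element names (`Message`, `Header`, `Verb`, `Noun`), the sample document, and the routing table are all hypothetical. It only illustrates how an application might choose what to execute by evaluating path expressions against message content, using the limited XPath subset in the standard library's `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET

# Hypothetical message; these element names are illustrative, not real OAGIS markup.
SAMPLE = """<Message>
  <Header>
    <Verb>Process</Verb>
    <Noun>PurchaseOrder</Noun>
  </Header>
  <Body>
    <OrderId>PO-1001</OrderId>
  </Body>
</Message>"""

# Hypothetical routing table: (verb, noun) found in the message -> handler name.
ROUTES = {
    ("Process", "PurchaseOrder"): "order_entry",
    ("Cancel", "PurchaseOrder"): "order_cancellation",
}


def route(xml_text):
    """Return the handler name selected by the message's verb and noun."""
    root = ET.fromstring(xml_text)
    # findtext() accepts a path in ElementTree's XPath subset, relative to root.
    verb = root.findtext("./Header/Verb", default="").strip()
    noun = root.findtext("./Header/Noun", default="").strip()
    # Messages that match no entry fall through to a catch-all name.
    return ROUTES.get((verb, noun), "unrouted")


print(route(SAMPLE))
```

Keeping the path expressions and the routing table as data, rather than hard-coding them in application logic, is what lets such mappings be published alongside a schema.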
OAG also seems to take the idea of interoperability with other groups very seriously. The 7.2.1 release had already provided support for ebXML, RosettaNet Implementation Framework (RNIF), and Microsoft's BizTalk. With the addition of core XML technology mappings, the path to supporting OAGIS within all manner of generic XML tools is now made rather clear.
OAGIS 8.0 is a free download. You get a big zip file, and after unzipping it you can start by viewing the included index.html in a Web browser. The archive also includes XML schemata (in XSDL only -- no DTDs in sight) and example BODs. The schemata are huge and spread out into a maze of interconnected files. OAG recommends an XML IDE, perhaps on the assumption that such a tool can help make sense of the sprawling schema documents. The Xerces parser, Saxon XSLT engine, and XSV XSDL validator are bundled for convenience.
There are also examples for all the 200 or so BODs. These examples are a bit less useful than they could be because they have numerous placeholders, such as the text "String," presumably so as to say "place some string here." Perhaps this is suitable for a template, but in an example, it is not very helpful. It seems unlikely that these samples are meant to be merely templates, since OAG recommends XML tools, any of which would be able to easily generate a template from a schema. Rather, as examples they should help the user understand the structure and intent of the document as a whole in a friendly way. The incomplete examples are thus a bit frustrating, but given that BODs can be pretty large, complete examples were probably judged to be too much work to put into a freely available package.
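As a rough illustration of the placeholder problem, the following Python sketch scans a document for elements whose text is just a placeholder. The sample document and its element names are hypothetical, not copied from an OAGIS example, and the placeholder set is an assumption containing only the "String" value mentioned above.

```python
import xml.etree.ElementTree as ET

# Hypothetical example document; element names are illustrative only.
EXAMPLE = """<PurchaseOrder>
  <Id>String</Id>
  <Buyer>
    <Name>String</Name>
    <Country>US</Country>
  </Buyer>
</PurchaseOrder>"""

# Assumed placeholder set; "String" is the only value the column mentions.
PLACEHOLDERS = {"String"}


def find_placeholders(xml_text):
    """Return, in document order, the tags of elements holding placeholder text."""
    root = ET.fromstring(xml_text)
    return [
        el.tag
        for el in root.iter()
        if el.text is not None and el.text.strip() in PLACEHOLDERS
    ]


print(find_placeholders(EXAMPLE))
```

A check like this could tell you how much of a given example you would have to fill in yourself before it reads as a realistic message.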
The U.S. Government, through the various defense agencies, has been responsible for establishing many common technologies -- from the Global Positioning System (GPS) to the Internet itself. This is particularly true when the developing technologies require the complicated deployments that private companies shy away from investing in because of the risk and the possibility of unwittingly benefiting competitors and late players who didn't contribute. A recent initiative of the U.S. Department of Defense (DoD) is a very hopeful sign for XML semantic transparency.
The DoD assigned the Defense Information Systems Agency (DISA) to establish a repository of XML components primarily for use within defense agencies, but apparently with broader potential reach (see Resources). Do not confuse this DISA (http://www.disa.mil) with the Data Interchange Standards Association (http://www.disa.org), which I mentioned in the first installment of this column. Among other things, the civilian DISA is host of the Accredited Standards Committee (ASC) for the X12 EDI standards.
The impetus behind the effort is twofold: A variety of government agencies increasingly use XML, and the benefits of using XML are expected to multiply with standardization -- or at least by registration of the various formats, best practices, and relationships with vendors and industry groups. (This varied mix is generically called components in the DISA memos.) DISA already manages the DoD XML Registry v2.1, which, according to the definition on its home page, "enables the consistent use of XML, both vertically within projects and horizontally across organizations." Specifically, the registry defines XML information items from a variety of vocabularies, organized, for example, by namespace (here in an organizational sense, not the technical sense of Namespaces in XML 1.0). Each namespace has a manager, and is cross-referenced with related items in other namespaces.
For example, you could go to this registry, under the Personnel namespace, and look up the information item for defining marital status (Con_MaritalStatus (1.0)). You could even download an XML document that expresses the formal values in the marital status enumeration (see Listing 1).
<?xml version="1.0"?>
<!DOCTYPE DomainValueSet SYSTEM "http://diides.ncr.disa.mil/xmlreg/DTD/registry_domain_values.dtd">
<DomainValueSet>
  <EffectiveDate>10/15/2001</EffectiveDate>
  <SecurityClassification>Unclassified</SecurityClassification>
  <Definition>
    The code that represents the Marital Status of a Person
  </Definition>
  <Namespace>PER</Namespace>
  <InformationResourceName>Con_MaritalStatus</InformationResourceName>
  <InformationResourceVersion>1.0</InformationResourceVersion>
  <DomainValues>
    <DomainValue>
      <KeyValue>D</KeyValue>
      <Description>Divorced</Description>
    </DomainValue>
    <DomainValue>
      <KeyValue>I</KeyValue>
      <Description>Interlocutory</Description>
    </DomainValue>
    <DomainValue>
      <KeyValue>L</KeyValue>
      <Description>Legally Separated</Description>
    </DomainValue>
    <DomainValue>
      <KeyValue>A</KeyValue>
      <Description>Marriage Annulled</Description>
    </DomainValue>
    <DomainValue>
      <KeyValue>M</KeyValue>
      <Description>Married</Description>
    </DomainValue>
    <DomainValue>
      <KeyValue>N</KeyValue>
      <Description>Never Married</Description>
    </DomainValue>
    <DomainValue>
      <KeyValue>W</KeyValue>
      <Description>Widowed</Description>
    </DomainValue>
  </DomainValues>
</DomainValueSet>
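A registry entry like this is straightforward to consume in a program. The following is a minimal sketch in Python, assuming the document has already been downloaded; it embeds an abbreviated copy of Listing 1 (three of the seven values, with the DOCTYPE omitted so that nothing depends on the external DTD). The function name `load_domain_values` is my own and not part of any DISA tooling.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of Listing 1: three of the seven values, DOCTYPE omitted.
LISTING = """<?xml version="1.0"?>
<DomainValueSet>
  <Namespace>PER</Namespace>
  <InformationResourceName>Con_MaritalStatus</InformationResourceName>
  <InformationResourceVersion>1.0</InformationResourceVersion>
  <DomainValues>
    <DomainValue>
      <KeyValue>D</KeyValue>
      <Description>Divorced</Description>
    </DomainValue>
    <DomainValue>
      <KeyValue>M</KeyValue>
      <Description>Married</Description>
    </DomainValue>
    <DomainValue>
      <KeyValue>W</KeyValue>
      <Description>Widowed</Description>
    </DomainValue>
  </DomainValues>
</DomainValueSet>"""


def load_domain_values(xml_text):
    """Map each KeyValue code to its Description (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    return {
        dv.findtext("KeyValue", default="").strip():
            dv.findtext("Description", default="").strip()
        for dv in root.iter("DomainValue")
    }


codes = load_domain_values(LISTING)
print(codes["M"])
```

With a table like this in hand, an application can validate incoming marital status codes or translate them for display without hard-coding the enumeration.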
Now DISA has a charter to expand even more on this system, which is already a powerful tool for interoperability and for gleaning network benefits. One hopes this is yet another government project that is a good investment of taxpayer money. You can find more information on the DISA registry and other resources related to the use of XML technology in government at the XML.gov Web site (see Resources).
XML is already extraordinarily successful, but one gets the sense that the critical mass is there to make a fundamental leap in the efficiency of information systems using XML as the stepping stone. Continued effort on shared semantics based on XML (such as the long-standing effort of the OAG) is an essential part of making this happen. A dramatic breakthrough to success will probably require a determined effort by a party with deep pockets, one that has more to gain from interoperability than it has to risk. The U.S. Government might just be that party, and the DISA effort shows the potential for taking XML to the next level.
In the next column in this series, I shall go back to the issue tracker. Now that many of the fundamental pieces are in place -- XML source, RDF schema, and query -- you can start to shape the application in a middleware setting, illustrating how RDF can enhance the value of the XML that is already in your applications.
- Read about the Open Applications Group work on "Best Practices and XML Content for eBusiness and Application Integration."
- Discover the Defense Information Systems Agency's rich registry of XML information resources. Keep watching for the planned expansion of this work.
- Explore the numerous resources at XML.gov regarding the use of XML in the U.S. government.
- Review previous installments of this column that discuss initiatives for semantic transparency:
  - "XML meets semantics: The reality" (developerWorks, February 2001)
  - "XML meets semantics: Meet the new kids on the block, and one more from the old neighborhood" (developerWorks, May 2001)
  - "Walking the semantic beat" (developerWorks, May 2001)
  - "Once again around the block" (developerWorks, January 2002)
- Read "WebSphere Business Components and Web services architectures" as Brent Carlson and others discuss the use of WebSphere to plug in business components defined as XML schema such as OAGIS and RosettaNet (developerWorks, October 2000).
- Check out Thinking XML's previous columns.
- Find out how you can become an IBM Certified Developer in XML and related technologies.
- Find many more XML resources on the developerWorks XML zone.
- Finally, take a look at IBM WebSphere Studio Application Developer, an easy-to-use, integrated development environment for building, testing, and deploying J2EE applications, including generating XML documents from DTDs and schemas.
Uche Ogbuji is a consultant and co-founder of Fourthought Inc., a software vendor and consultancy specializing in XML solutions for enterprise knowledge management. Fourthought develops 4Suite, an open source platform for XML, RDF and knowledge-management applications. Mr. Ogbuji is a Computer Engineer and writer born in Nigeria, living and working in Boulder, Colo., USA. You can contact Mr. Ogbuji at firstname.lastname@example.org.