In an earlier Thinking XML installment, I introduced Universal Business Language. UBL is a library of documents for business transactions that's designed to accommodate organizations of all sizes and locales. Recently OASIS, the consortium behind UBL, announced that its members have ratified version 1.0 as an OASIS Standard. The close association of the OASIS UBL effort with the OASIS/United Nations Electronic Business XML (ebXML) work means UBL is likely to rise quickly to formal international standardization now that it is complete. Indeed, it is wending its way through the ISO standards track under the auspices of ISO Technical Committee 154 (a UN/CEFACT group), which in their words specializes in "open trade data interchange" and is the group behind the EDIFACT standards. The UBL group maintains formal liaisons with dozens of other standards organizations and consortia, from the horizontal (UN/CEFACT, ANSI ASC X12, ISO, RosettaNet, OAG) to the vertical (ACORD, HL7, SWIFT, XBRL). I've covered all of these groups previously in this column.
UBL's core non-technical principles include addressing the needs of a diversity of organization types and locales, and its commitment to freedom from royalties and intellectual property encumbrances. Now, after three long years of development, it is being presented as a complete product. This means you will soon learn whether these admirable principles, combined with the attention to technical detail I discuss below, are enough to establish UBL as a foundation for XML document exchange.
UBL is the most visible artifact of a sizeable and very formal framework for electronic business transactions. Several collaborating organizations, primarily OASIS and UN/CEFACT, are developing UBL. UBL was designed as a payload format for ebXML messaging (ebMS), to be administered through business process and agreement (contract) standards developed as ebXML Business Process Specification Schema (BPSS) and Collaboration Protocol Profile and Agreement (CPPA), respectively. UBL is semantically anchored in ebXML Core Components, which are reusable data elements that identify some abstract concept. Core Components require business context before they can be used in practice, and UBL defines Business Information Entities (BIEs), which are contextualized Core Components. BIEs make up the UBL conceptual model, organizing business concepts into classes and associations. UBL takes the BIEs that make up a business transaction and translates them to an XML format for the actual message. This entire framework is held in place in the ebXML Registry/Repository. Much of this ebXML framework is now standardized as ISO 15000.
The ebXML Core Components Technical Specification (ISO 15000-5) is a system for expressing business information in a reusable yet flexible way. It defines Core Components (CCs), which I've mentioned several times before in this column. In the characterization I set up in "Semantic anchors for XML," CCs are a bottom-up initiative, defining terms and concepts at the discrete level, independently of the documents in which they are to appear (the document layer is handled by UBL or similar specifications). UBL is the first fully-conformant implementation of CCTS.
A CC may be atomic (also known as basic) or aggregate. An example of an aggregate component is a postal address, which makes up a coherent, abstract concept; it is composed of several atomic components such as province and postal code. To get an idea of where UBL comes in, think of the many contexts in which addresses are used. For example, in the case of a company making a purchase order, you'll often see a distinction between the billing address and the shipping address. This distinction is reflected in business rules such as a rule that a shipping address may not be a post office box. These distinctions make up the business context that turns the component into a UBL BIE. Several aspects of the context may affect mapping to a BIE. For example, a US address has a specific information structure that's specialized from the abstract concept of an address. It has a ZIP code, which can be recognized and constrained as being, say, different from a United Kingdom postal code. The former is made up entirely of digits (apart from an optional hyphen), while the latter is a mix of letters and digits (among other differences).
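The way business context narrows an abstract address can be sketched in a few lines of Python. This is only an illustration under assumptions: the `Address` class, the `check_in_context` function, the role names, and the deliberately simplified postal patterns are all hypothetical, and none of them comes from UBL or CCTS.

```python
import re
from dataclasses import dataclass

# Hypothetical, simplified patterns; real postal formats have more cases.
POSTAL_PATTERNS = {
    "US": re.compile(r"\d{5}(-\d{4})?"),                    # ZIP or ZIP+4
    "GB": re.compile(r"[A-Z]{1,2}\d[A-Z\d]? ?\d[A-Z]{2}"),  # letters and digits
}

# Loose match for "PO Box", "P.O. Box", and similar spellings.
PO_BOX = re.compile(r"\bP\.?\s?O\.?\s+Box\b", re.IGNORECASE)


@dataclass
class Address:
    """An abstract aggregate component built from atomic parts."""
    street: str
    city: str
    postal_code: str
    country: str


def check_in_context(address, role):
    """Return the rule violations for an address used in a given role."""
    problems = []
    # Context 1: the country constrains the shape of the postal code.
    pattern = POSTAL_PATTERNS.get(address.country)
    if pattern and not pattern.fullmatch(address.postal_code):
        problems.append("postal code does not fit country " + address.country)
    # Context 2: the business role adds the rule mentioned in the text.
    if role == "shipping" and PO_BOX.search(address.street):
        problems.append("shipping address may not be a post office box")
    return problems
```

The same `Address` value can pass as a billing address and fail as a shipping address, which is the kind of distinction that context adds to a Core Component.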
The Core Components Technical Specification is designed according to the ISO/IEC 11179 specification for metadata registries (generally data dictionaries). The ISO 11179 specifications are extremely thorough and meticulous, and there is no doubt that UBL is founded on the most rigorous semantic basis one could expect. I personally wonder whether all these layers of bedrock might impose excessive rigidity and complexity, but to be fair UBL tries to encapsulate things so that you don't have to drill down into the semantic sediment any further than your application immediately requires. Most users will merely be concerned with what tags and content to put in what order to construct a valid UBL document.
In my earlier look at UBL, I mentioned the UBL Naming and Design Rules (NDR), which seek to bring extreme care to the translation from BIEs to actual XML components. The UBL developers detail the BIEs in spreadsheets, and create XML schemas by applying NDRs. Again, it's all so rigorous as to be intimidating (the NDR document is certainly an imposing text), but all you probably need to care about is the end product: XML schemata.
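To give a feel for what a naming rule does, here is a much-simplified Python sketch of deriving an element name from a dictionary entry name of the form "Object Class. Property Term. Representation Term". The three rules in it are my own paraphrase for illustration, not text quoted from the NDR document, and the dictionary entry names in the usage note are assumptions chosen to line up with element names that appear in UBL 1.0 samples.

```python
def element_name(dictionary_entry_name):
    """Derive an XML element name from a three-part dictionary entry name."""
    # The object class is parsed but unused: in this sketch it does not
    # contribute to the name of a basic (leaf) element.
    object_class, property_term, representation = [
        part.strip() for part in dictionary_entry_name.split(".")
    ]
    words = property_term.split()
    rep_words = representation.split()
    # Assumed rule: a bare "Text" representation term adds nothing, so drop it.
    if rep_words == ["Text"]:
        rep_words = []
    # Assumed rule: do not repeat words that already end the property term.
    if rep_words and words[-len(rep_words):] == rep_words:
        rep_words = []
    # Assumed rule: abbreviate "Identifier" as "ID"; join words without spaces.
    return "".join("ID" if w == "Identifier" else w for w in words + rep_words)
```

Under these assumptions, `element_name("Address. Street Name. Name")` yields `StreetName` and `element_name("Order. Buyers Identifier. Identifier")` yields `BuyersID`. The real NDR covers far more ground (types, namespaces, code lists), so treat this only as a way to picture the mechanical nature of the translation.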
To have a look at this XML product, I downloaded the UBL 1.0 package, which contains everything from the spreadsheets that define the BIEs to ASN.1 schema rules to the XML schemata for all the document types (unfortunately, all in WXS; no RELAX NG) to sample XML and generated documentation. The sample XML is much easier to find than it was the last time I looked at UBL packaging. Looking over the sample documents, I could immediately see that some things have changed -- so in Listing 1, I present an example that corresponds to the code listing in my previous UBL article, this time illustrating UBL 1.0.
Listing 1. Excerpt from a UBL 1.0 sample order transaction document.
<?xml version="1.0" encoding="UTF-8"?>
<Order xmlns="urn:oasis:names:specification:ubl:schema:xsd:Order-1.0"
       xmlns:cbc="urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-1.0"
       xmlns:cac="urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-1.0">
  <!-- Trimmed some spurious namespaces, as well as the xsi:schemaLocation attribute -->
  <BuyersID>20031234-1</BuyersID>
  <cbc:IssueDate>2003-01-23</cbc:IssueDate>
  <cbc:LineExtensionTotalAmount amountCurrencyCodeListVersionID="0.3"
      amountCurrencyID="USD">438.50</cbc:LineExtensionTotalAmount>
  <cac:BuyerParty>
    <cac:Party>
      <cac:PartyName>
        <cbc:Name>Bills Microdevices</cbc:Name>
      </cac:PartyName>
      <cac:Address>
        <cbc:StreetName>Spring St</cbc:StreetName>
        <cbc:BuildingNumber>413</cbc:BuildingNumber>
        <cbc:CityName>Elgin</cbc:CityName>
        <cbc:PostalZone>60123</cbc:PostalZone>
        <cac:CountrySubentityCode>IL</cac:CountrySubentityCode>
      </cac:Address>
      <cac:Contact>
        <cbc:Name>George Tirebiter</cbc:Name>
      </cac:Contact>
    </cac:Party>
  </cac:BuyerParty>
  <!-- Cut to one of the order lines -->
  <cac:OrderLine>
    <cac:LineItem>
      <cac:BuyersID>1</cac:BuyersID>
      <cbc:Quantity quantityUnitCode="PKG">5</cbc:Quantity>
      <cbc:LineExtensionAmount amountCurrencyCodeListVersionID="0.3"
          amountCurrencyID="USD">12.50</cbc:LineExtensionAmount>
      <cac:Item>
        <cbc:Description>Pencils, box #2 red</cbc:Description>
        <cac:SellersItemIdentification>
          <cac:ID>32145-12</cac:ID>
        </cac:SellersItemIdentification>
        <cac:BasePrice>
          <cbc:PriceAmount amountCurrencyCodeListVersionID="0.3"
              amountCurrencyID="USD">2.50</cbc:PriceAmount>
        </cac:BasePrice>
      </cac:Item>
    </cac:LineItem>
  </cac:OrderLine>
</Order>
The biggest difference is the use of namespaces for partitioning the elements into framework elements, elements based on basic CCs, and those based on aggregate CCs. For example, if you look at cac:Address, you can see how it is marked as an aggregate concept (based on the namespace bound to the cac prefix), and that it contains mostly subelements that are based on basic CCs (in the namespace bound to the cbc prefix). UBL 1.0 also has a few minor structural and naming differences, but the namespaces really jump out. Sample UBL documents appear to be the creation of designers who are a bit too strongly influenced by WXS, especially in the way that namespaces are used (and occasionally not used).
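A quick way to see this partitioning is to group element names by the namespace they live in. The following Python sketch uses the standard library's ElementTree; the namespace URIs come from Listing 1, while the layer labels, the function name, and the trimmed `SAMPLE` fragment are my own for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Namespace URIs as they appear in Listing 1; the short labels are my own.
LAYERS = {
    "urn:oasis:names:specification:ubl:schema:xsd:Order-1.0": "document",
    "urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-1.0": "basic (cbc)",
    "urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-1.0": "aggregate (cac)",
}

# A trimmed fragment of Listing 1, kept small for the example.
SAMPLE = """\
<Order xmlns="urn:oasis:names:specification:ubl:schema:xsd:Order-1.0"
       xmlns:cbc="urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-1.0"
       xmlns:cac="urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-1.0">
  <BuyersID>20031234-1</BuyersID>
  <cac:BuyerParty><cac:Party><cac:Address>
    <cbc:StreetName>Spring St</cbc:StreetName>
    <cbc:CityName>Elgin</cbc:CityName>
    <cac:CountrySubentityCode>IL</cac:CountrySubentityCode>
  </cac:Address></cac:Party></cac:BuyerParty>
</Order>
"""


def elements_by_layer(xml_text):
    """Group the local names of all elements by the layer of their namespace."""
    groups = defaultdict(set)
    for elem in ET.fromstring(xml_text).iter():
        # ElementTree spells qualified names as "{namespace-uri}local-name".
        if elem.tag.startswith("{"):
            uri, local = elem.tag[1:].split("}", 1)
        else:
            uri, local = "", elem.tag
        groups[LAYERS.get(uri, "other")].add(local)
    return dict(groups)


if __name__ == "__main__":
    for layer, names in sorted(elements_by_layer(SAMPLE).items()):
        print(layer, sorted(names))
```

Note that in the Listing 1 excerpt `CountrySubentityCode` carries a plain code value yet sits in the cac namespace, so it lands in the aggregate group here.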
It's the sort of format that makes those who (like me) are more grounded in the ISO DSDL world-view think "I guess I understand why they had to come up with Namespace Routing Language (NRL)." It will be very interesting to see how best practices emerge for modular processing of UBL documents.
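One plausible shape for such modular processing, sketched here in Python with the standard library's ElementTree, is a small reader function per aggregate component that can be reused wherever that component appears. The function names, the result dictionaries, and the trimmed `SAMPLE` fragment are hypothetical; only the element names and namespace URIs are taken from Listing 1.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# Prefix-to-URI map for path expressions; the "ord" prefix is my own choice.
NS = {
    "ord": "urn:oasis:names:specification:ubl:schema:xsd:Order-1.0",
    "cbc": "urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-1.0",
    "cac": "urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-1.0",
}

# A trimmed fragment of Listing 1, kept small for the example.
SAMPLE = """\
<Order xmlns="urn:oasis:names:specification:ubl:schema:xsd:Order-1.0"
       xmlns:cbc="urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-1.0"
       xmlns:cac="urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-1.0">
  <BuyersID>20031234-1</BuyersID>
  <cac:BuyerParty><cac:Party><cac:Address>
    <cbc:StreetName>Spring St</cbc:StreetName>
    <cbc:CityName>Elgin</cbc:CityName>
    <cbc:PostalZone>60123</cbc:PostalZone>
  </cac:Address></cac:Party></cac:BuyerParty>
  <cac:OrderLine><cac:LineItem>
    <cbc:Quantity quantityUnitCode="PKG">5</cbc:Quantity>
    <cbc:LineExtensionAmount amountCurrencyID="USD">12.50</cbc:LineExtensionAmount>
  </cac:LineItem></cac:OrderLine>
</Order>
"""


def read_address(address):
    """Handle one cac:Address element on its own, wherever it appears."""
    return {
        "street": address.findtext("cbc:StreetName", namespaces=NS),
        "city": address.findtext("cbc:CityName", namespaces=NS),
        "postal_zone": address.findtext("cbc:PostalZone", namespaces=NS),
    }


def read_line_item(item):
    """Handle one cac:LineItem element; amounts stay exact as Decimal."""
    quantity = item.find("cbc:Quantity", NS)
    amount = item.find("cbc:LineExtensionAmount", NS)
    return {
        "quantity": Decimal(quantity.text),
        "unit": quantity.get("quantityUnitCode"),
        "amount": Decimal(amount.text),
        "currency": amount.get("amountCurrencyID"),
    }


def read_order(xml_text):
    """Walk the document layer and delegate to the component readers."""
    root = ET.fromstring(xml_text)
    return {
        "buyers_id": root.findtext("ord:BuyersID", namespaces=NS),
        "addresses": [read_address(a) for a in root.iter("{%s}Address" % NS["cac"])],
        "lines": [read_line_item(i)
                  for i in root.findall("cac:OrderLine/cac:LineItem", NS)],
    }
```

Because `read_address` knows nothing about orders, the same function could serve any other document type that reuses the address component. This sketch assumes the elements it looks for are present; a production reader would also validate against the schemata.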
The UBL TC is not working in a vacuum. Earlier, I mentioned my disappointment with the lack of an official RELAX NG schema set for UBL 1.0. Hiroshi Naito and his colleagues at the Osaka Institute of Technology have released draft RELAX NG schemata for UBL. At present the work is based on the UBL 1.0 beta rather than the final release, and it is documented in Japanese. I mentioned in my last article the free UBL XSL-FO stylesheets from Crane Softwrights Ltd., which allow for XSLT conversion of UBL documents to PDF, TeX, and other print-ready formats. Ambrosoft has announced a free Java formatter that makes the Crane UBL stylesheets available in a single Java .jar file for easy usage (see Resources).
As I re-read the conclusion from my last UBL article, I find that almost all the points are still relevant. UBL has a lot of promise, but after three years of work it still just scratches the surface of the range of document and transaction sets needed to cover electronic business needs. Even all the effort that's put into extensibility of UBL doesn't do all that much to lighten this load. The group acknowledges this in the UBL FAQ:
UBL does not attempt in its first release to reproduce the functionality of existing EDI message sets or even of the xCBL specification with which it began. Instead of the 40-50 document types in these older standards, UBL in its first release focuses on simple procurement and contains just eight basic document types (in addition to the component library upon which they are based). It is explicitly an 80/20 solution that emphasizes a fit with small-business practices and inexpensive generic software.
I can certainly appreciate this good, practical sense. As the UBL TC's ambition grows in contemplating UBL 2.0, the liaison with other groups such as UN/CEFACT, OAG, and ANSI X12 should help. But even before UBL 2.0, a UBL 1.1 is planned as a fairly minor release addressing some outstanding issues that were deferred for the 1.0 release. Meanwhile, the basics are in place and readily available in the UBL 1.0 package. I encourage you to explore them for yourself. As UBL and sister technologies continue to mature and develop, I shall keep an eye on them in this column.
- Participate in the discussion forum.
- Browse the huge amount of information put forth by the OASIS Universal Business Language TC, including the UBL 1.0 release. The UBL FAQ includes an overview of the contents of the UBL 1.0 package.
- Find useful UBL support material, including a RELAX NG schema set among the UBL downloads maintained by Hiroshi Naito (page in Japanese).
- Learn more about the ISO groups connected to UBL. UBL taps into ISO wherever it can, seeking maximum accord with international standards, and particularly ISO/TC154. This page also links to the ebXML specifications being standardized as ISO 15000.
- Try out Crane Softwrights Ltd.'s UBL stylesheet libraries and their packaging in Ambrosoft's UBL 1.0 transformation library.
- In an earlier Thinking XML installment, the author introduced Universal Business Language (UBL) (developerWorks, February 2003).
- Confused by all the XML standards out there? Uche Ogbuji's developerWorks article series on XML standards can help you sort through it all:
  - Part 1 -- The core standards (January 2004)
  - Part 2 -- XML processing standards (February 2004)
  - Part 3 -- The most important vocabularies (February 2004)
  - Part 4 -- Detailed cross-reference of the most important XML standards (March 2004)
- Find more XML resources on the developerWorks XML zone, including previous installments of the Thinking XML column. If you have comments on this article, please post them on the Thinking XML forum.
- Learn how you can become an IBM Certified Developer in XML and related technologies.
Uche Ogbuji is a consultant and co-founder of Fourthought Inc., a software vendor and consultancy specializing in XML solutions for enterprise knowledge management. Fourthought develops 4Suite, an open source platform for XML, RDF, and knowledge-management applications. Mr. Ogbuji is also a lead developer of the Versa RDF query language. He is a computer engineer and writer born in Nigeria, living and working in Boulder, Colorado, USA. You can contact Mr. Ogbuji at firstname.lastname@example.org.