XML, or Extensible Markup Language, is a platform-independent way to represent data. Simply put, XML enables you to create data that can be read by any application on any platform. You can even create and edit it by hand, because it is based on the same tag-based technology that underlies HTML.
For example, suppose you want to use XML to store information about a transaction. This
transaction originates on your salesman's iBook, so you'll want to store it there. But
it will then be sent to the data application on your Windows server, and ultimately
archived on your mainframe, so it needs to be very flexible. XML enables you to create
something like that shown in Listing 1.
Listing 1. XML
<?xml version="1.0"?>
<transaction ID="THX1138">
  <salesperson>bluemax</salesperson>
  <order>
    <product productNumber="3263827">
      <quantity>1</quantity>
      <unitprice currency="standard">3000000</unitprice>
      <description>Medium Trash Compactor</description>
    </product>
  </order>
  <return></return>
</transaction>
Serialized this way, as text, the information is available in any environment in which you might need it. Even without a special application, you can see both the content and the markup that describes it.
XML is fairly straightforward to use, once you understand its structure. It also provides several different methods by which you can control the structure, and even the content, of your data. Once you begin to use XML, you'll also have questions about the best way to design your XML structures, but it doesn't have to be a complicated process.
Get started with these resources:
- XML basics for new users: An introduction to proper markup
- Introduction to XML
- The Java XML Validation API
The flexibility of XML makes it useful for many kinds of applications, such as configuration files, Web services, and data storage.
Since its introduction, developers have found numerous uses for XML. Here are some resources that give you an idea of how you can put XML to work.
The most obvious use of XML is to store data. XML provides advantages for both data-centric information (such as the data you find in a database) and document-centric information (such as data you store in XML so you can display it differently in different environments).
Learn more about XML as a data-centric storage medium in these resources.
- An introduction to XQuery: Debunking XQuery myths and misunderstandings
- Query DB2 XML data with XQuery
- Develop Java applications for DB2 XML data
If you're interested in storing XML data, you should know that IBM provides a no-charge version of the new DB2 9, IBM DB2 Express-C 9. You should also check out the new DB2 Developer Workbench, which makes it easier to use XQuery and SQL/XML with DB2 9.
Web services began as a way to pass non-HTML information over HTTP. They have grown to be the foundation for fields ranging from Ajax, which is used to add interactivity to Web sites, to today's Service Oriented Architectures (SOA), which are complex message-based applications. XML is integral to the field of Web services. All of the leading approaches to Web services (SOAP, REST, and even XML-RPC) are based on XML.
See the section below on XML and Web services for more information.
Podcasting and other data syndication
One of the most common uses of XML today is in the realm of syndication. Millions of blog readers use RSS feeds to keep up with the latest information on their favorite blogs. Commercial interests have also begun taking an interest in podcasting, the distribution of audio and video over the Internet to devices such as iPods, which relies on XML as well.
See what the syndication landscape looks like in these resources:
A common place to find XML is behind the scenes of your favorite applications and development environments, where it serves as a common means for creating files of configurations or instructions. Providing configuration instructions in a human-readable XML file enables users to control the behavior of applications much more easily than before.
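As a rough illustration of the configuration use case, the following Python sketch reads settings from a small XML document using the standard library's xml.etree.ElementTree module. The element and attribute names (config, server, logging, feature) are invented for this example and do not correspond to any particular application.

```python
import xml.etree.ElementTree as ET

# Hypothetical configuration document; the element and attribute names
# are invented for illustration only.
CONFIG_XML = """<?xml version="1.0"?>
<config>
  <server host="localhost" port="8080"/>
  <logging level="debug"/>
  <feature name="autosave" enabled="true"/>
  <feature name="telemetry" enabled="false"/>
</config>"""


def load_settings(xml_text):
    """Parse the configuration text into a plain dictionary."""
    root = ET.fromstring(xml_text)
    server = root.find("server")
    return {
        "host": server.get("host"),
        "port": int(server.get("port")),
        "log_level": root.find("logging").get("level"),
        # Collect the names of the features the user has switched on.
        "features": [
            feature.get("name")
            for feature in root.findall("feature")
            if feature.get("enabled") == "true"
        ],
    }


settings = load_settings(CONFIG_XML)
print(settings)
```

A user can change the behavior of such an application by editing the attribute values in a text editor, with no need to touch the program itself.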
Although the tags you see in Listing 1 are the most common serialization of XML, it is very common to deal with XML data in the context of an application. In that case, you will typically use one of several models, including the following.
The Document Object Model (DOM)
The Document Object Model, or DOM, is an object-based, tree-like way to view XML data. For example, in Listing 1, the salesperson, order, and return elements are children of the transaction element, meaning that they are contained below it in the hierarchy. DOM is the primary way in which most XML-based applications deal with XML.
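To make the parent and child relationships concrete, here is a minimal sketch that loads the transaction from Listing 1 with Python's standard xml.dom.minidom module. It is only one of many DOM implementations, and the variable names are chosen for this example.

```python
from xml.dom import minidom

# The transaction document from Listing 1.
LISTING_1 = """<?xml version="1.0"?>
<transaction ID="THX1138">
  <salesperson>bluemax</salesperson>
  <order>
    <product productNumber="3263827">
      <quantity>1</quantity>
      <unitprice currency="standard">3000000</unitprice>
      <description>Medium Trash Compactor</description>
    </product>
  </order>
  <return></return>
</transaction>"""

document = minidom.parseString(LISTING_1)
transaction = document.documentElement

# Whitespace between tags becomes text nodes in the DOM, so keep only
# the element nodes when listing the children of <transaction>.
child_names = [
    node.tagName
    for node in transaction.childNodes
    if node.nodeType == node.ELEMENT_NODE
]
print(child_names)

# Because the whole tree is in memory, it can be changed in place.
quantity = document.getElementsByTagName("quantity")[0]
quantity.firstChild.data = "2"
print(quantity.toxml())
```

The first print shows the three children of the transaction element; the second shows the quantity element after its text has been modified in memory.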
The Simple API for XML (SAX)
DOM is useful when you are trying to manipulate data, because everything resides in memory. On the other hand, it can be quite a resource hog, because everything resides in memory.
The Simple API for XML, or SAX, solves the problem of having everything in memory at one time by analyzing data from the beginning of the document to the end, and notifying your application of every event, such as "start element" or "characters". It's more resource friendly than DOM, but you can't manipulate the data in quite the same way.
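The event-driven style can be sketched with Python's standard xml.sax module. The EventLogger class below is a hypothetical handler written for this example; it simply records each event that the parser reports while it reads a small fragment of Listing 1.

```python
import xml.sax


class EventLogger(xml.sax.ContentHandler):
    """Record each parsing event in the order the parser reports it."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start element", name))

    def characters(self, content):
        # Skip whitespace-only text that appears between tags.
        if content.strip():
            self.events.append(("characters", content.strip()))

    def endElement(self, name):
        self.events.append(("end element", name))


SAMPLE = b"<order><quantity>1</quantity></order>"

handler = EventLogger()
xml.sax.parseString(SAMPLE, handler)
for event in handler.events:
    print(event)
```

Nothing is retained unless the handler chooses to keep it, which is why this approach uses less memory than DOM but makes it harder to go back and change data that has already been read.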
Start to understand SAX with these resources:
DOM and SAX are the most common ways of programmatically interacting with XML, but sometimes you don't need to build an application to manipulate XML data.
Sometimes the manipulation you want to do with XML doesn't even require programming. You can manipulate XML using Extensible Stylesheet Language Transformations, or XSLT. XSLT enables you to transform an XML document into a different XML structure, or even into a non-XML structure. XSLT is extremely powerful, and very commonly used.
XML is platform and programming-language independent, so you can use it with virtually any programming language, as long as the underlying software is available. The most important piece is a parser, which reads the text file of tags and creates the XML document for manipulation. Learn how to work with XML using various programming languages with these resources:
XML parsing and other capabilities are built directly into Java.
PHP support for XML started out a bit rough; early implementations weren't quite in sync with the DOM specification. These days, however, the situation is much better, with more standards-compliant support.
Perl was designed to work with text, so the temptation is sometimes to work on the text directly rather than use XML methods, but the benefits of using proper XML tools are definitely there.
With Python's ease of use and XML's emphasis on cross-platform availability, the pair is a match made in heaven.
C++ programmers can also get their hands on XML capabilities.
The REXML library provides XML support for the Ruby programming language.
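Whatever the language, the basic pattern is similar: hand the text to a parser, then navigate the resulting structure. As one hedged illustration, this Python sketch uses the standard xml.etree.ElementTree module to total the order from Listing 1. The variable names are chosen for this example.

```python
import xml.etree.ElementTree as ET

# The <order> portion of Listing 1.
ORDER_XML = """<order>
  <product productNumber="3263827">
    <quantity>1</quantity>
    <unitprice currency="standard">3000000</unitprice>
    <description>Medium Trash Compactor</description>
  </product>
</order>"""

order = ET.fromstring(ORDER_XML)

total = 0
for product in order.findall("product"):
    # Element text is always a string, so convert before doing arithmetic.
    quantity = int(product.findtext("quantity"))
    unit_price = int(product.findtext("unitprice"))
    total += quantity * unit_price
    print(product.get("productNumber"), product.findtext("description"))

print("Order total:", total)
```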
As developers began to use XML for various applications, standard vocabularies, or XML applications, began to emerge. For example, XHTML is an XML version of HTML, and podcasting takes place using various flavors of an XML vocabulary called RSS. The Scalable Vector Graphics (SVG) language provides a way to define graphic images using XML so that browsers such as Firefox can render them.
Some examples of XML in action are discussed below.
RSS and syndication
Bloggers often provide external feeds that show their most recent posts and provide links back to the original material. These feeds have turned into big business, with advertisers taking note, and the distribution of audio and/or video, or podcasting, becoming the focus of major media companies such as the broadcast television networks. These feeds are in the form of XML, either in one of the varieties of RSS, or Atom.
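Because a feed is ordinary XML, reading one takes very little code. The sketch below parses a minimal RSS 2.0 document with Python's standard xml.etree.ElementTree module. The blog title, links, and posts are invented for this example, and a real feed would normally be downloaded rather than embedded as a string.

```python
import xml.etree.ElementTree as ET

# A minimal RSS 2.0 feed; the blog, links, and posts are invented.
FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Blog</title>
    <link>http://example.com/</link>
    <description>Recent posts</description>
    <item>
      <title>First post</title>
      <link>http://example.com/posts/1</link>
    </item>
    <item>
      <title>Second post</title>
      <link>http://example.com/posts/2</link>
    </item>
  </channel>
</rss>"""

channel = ET.fromstring(FEED).find("channel")

# Each <item> describes one post and links back to the original material.
posts = [
    (item.findtext("title"), item.findtext("link"))
    for item in channel.findall("item")
]
for title, link in posts:
    print(f"{title}: {link}")
```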
Scalable Vector Graphics (SVG)
SVG tries to do for graphics what HTML did for desktop publishing: provide a way to specify graphics using small, simple text instructions. SVG enables you to create complex graphics that are both small in terms of bandwidth and controllable programmatically.
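As a small sketch of what programmatic control can look like, the following Python code generates a three-bar chart as SVG text using the standard xml.etree.ElementTree module. The data values, dimensions, and fill color are arbitrary choices made for this example.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
# Serialize SVG elements with a default namespace instead of a prefix.
ET.register_namespace("", SVG_NS)

values = [30, 80, 55]  # hypothetical data to chart

svg = ET.Element(f"{{{SVG_NS}}}svg", {"width": "150", "height": "100"})
for index, value in enumerate(values):
    # One rectangle per value, drawn upward from the bottom edge.
    ET.SubElement(svg, f"{{{SVG_NS}}}rect", {
        "x": str(index * 50),
        "y": str(100 - value),
        "width": "40",
        "height": str(value),
        "fill": "steelblue",
    })

svg_text = ET.tostring(svg, encoding="unicode")
print(svg_text)
```

Saving the printed text to a file with an .svg extension should produce an image that an SVG-capable browser can display; changing the values list changes the drawing.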
Think of XForms as the next generation of HTML forms, providing a way to specify the information to be collected in a presentation-independent way. This enables you to not only add more functionality more easily than before, but also to easily reuse forms in other mediums, such as cell phones, where the information is the same, but the presentation might be totally different.
More XML in action
You can find XML in a variety of places, such as publishing, encoding semantic data, and even those voice recognition units you talk to over the telephone. Here are some more examples:
Although you can implement Service Oriented Architectures (SOA) using a variety of technologies, the most common approach is to use Web services, and that means XML. The two most popular means to implement Web services, SOAP and REST, are both based on XML.
For example, you can make a request to the Google Web service by sending this SOAP
document as a Web request (see Listing 2).
Listing 2. Making a request to the Google Web service by sending a SOAP document
<?xml version='1.0' encoding='UTF-8'?>
<q xsi:type="xsd:string">death star trash compactor</q>
Here you see the SOAP envelope, a standard format the Web service engine can
understand. The contents of this message, in this case the
doGoogleSearch element, are known as the payload, and consist of the
information to be processed by the Web service.
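The relationship between envelope and payload can be sketched in Python with the standard xml.etree.ElementTree module. This is an illustration of the general SOAP 1.1 structure, not a reproduction of the full request: the payload namespace urn:GoogleSearch and the gs prefix are assumptions made for this example, type annotations are omitted for brevity, and only the q parameter mentioned in the listing is included.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope
SERVICE_NS = "urn:GoogleSearch"  # assumed namespace for the payload

ET.register_namespace("SOAP-ENV", SOAP_ENV)
ET.register_namespace("gs", SERVICE_NS)  # "gs" is an arbitrary prefix

# The envelope and body form the standard wrapper the engine understands.
envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")

# The payload is the application-specific content inside the body.
payload = ET.SubElement(body, f"{{{SERVICE_NS}}}doGoogleSearch")
query = ET.SubElement(payload, "q")
query.text = "death star trash compactor"

message = ET.tostring(envelope, encoding="unicode")
print(message)
```

The resulting text would then be sent as the body of an HTTP request to the service endpoint, which is outside the scope of this sketch.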
The overall Web services picture
In fact, most of the standards surrounding Web services -- and there are many -- are essentially XML vocabularies. Web Services Description Language (WSDL), for example, is an XML vocabulary for writing a file that describes a service.
Get started with XML and Web services with these resources:
- Tip: SOAP 1.2 and
- Tip: Make SOAP and Web servers cohabit peacefully
- Tip: Passing files to a Web service
You can get more information on XML and Web services on the New to SOA and Web services page.
XML is at the heart of many of today's nascent technologies. For example, as search engines improve and the world moves towards the Semantic Web, XML is how webmasters can add meaningful information to their pages. Grid computing and autonomic computing continue to gain ground, and XML figures prominently in these technologies as well. Database vendors continue to look at storing XML more efficiently, and the XML Query Language (XQuery) is gaining steam.
In the following sections are resources to help you glimpse the future of XML:
RDF, microformats, and other semantic technologies
The Semantic Web doesn't require XML, but you'd be hard-pressed to see that from the way the technology currently looks. Most information is encoded in some form of XML, whether it is the Resource Description Framework (RDF) or independent microformats. This is because XML is readable and understandable nearly everywhere.
- Introduction to Jena
- Thinking XML: Semantic anchors for XML
- Thinking XML: XML meets semantics, Part 1 of a four part series
Grid and autonomic computing
The world becomes smaller, and computer systems get bigger. Specifically, researchers, companies, and other organizations are beginning to see the advantage in melding their systems together into a single larger system, either to provide enhanced computing power or to save money by eliminating waste. Because of its platform independence, XML is perfect for exchanging information between disparate systems.
- Meet the Experts: Susan Malaika on XML Standards and Grid Computing
- Policy Management for Autonomic Computing: Solve a business problem using PMAC
- The Ajax transport method
- User annotations in Ajax
As more information becomes available through Web services, more enterprising developers find more things to do with it. One way much of this data has been utilized is in the mashup, a rapidly growing type of application that combines data from multiple sources into a single view.
If you want to improve your XML skills, the best way to do it is to get a grounding in the essentials, and then simply use it. Start with the resources listed under "What is XML?" and move on to those under "Does XML lend itself to application development?" From there, you can move on to any of the other areas that interest you.