Add XML structure to the résumé
Put HR-XML, stylesheets, formatting objects, and namespaces to work
You can quickly and easily compose a résumé in a What You See Is What You Get (WYSIWYG) editor and with a couple of mouse clicks translate it to a PDF file for transmission to a prospective employer. So why put the extra effort into storing the data in an XML file first? Complicating the process with extra steps can introduce errors, so you need a good reason for the additional effort.
The justification lies in the separation of data from presentation and in the structure that a backend such as XML brings. When the data becomes more complex and output requirements more varied, XML offers accuracy, portability, and adaptability. Data enthusiasts try to store all data in a database of some kind. Whether a complex data structure is overkill for a plain résumé depends on your needs and how often the data changes.
Many employers react negatively to an incomplete résumé. Structure is good—elements act as reminders of what must appear in the document. You can use XML on a wide variety of platforms, and one single XML data backend can provide a résumé (short version) or curriculum vitae (long version) according to the employer's requirements simply by using a different stylesheet.
The process described here uses Apache FOP (see Related topics) to generate a PDF file from an XML data file using an Extensible Stylesheet Language (XSL) stylesheet. The stylesheet controls the presentation of the data and follows the standard format as described in the W3C document (see Related topics).
You can store the résumé data in plain XML format using your own unique schema. But a standard format such as HR-XML has advantages. If you have special requirements not covered by the standard, it is a simple matter to take what you need from the standard and extend it by creating a personal namespace for the additional material.
HR-XML and OAGIS
HR-XML and OAGIS (see Related topics) are two open source projects that combine to offer the kind of structure that many large organizations consider important in human resources and business contexts.
HR-XML is the result of much thinking by specialists in the field of human resources. These specialists view the issue from an employer's point of view, so the schema contains the scaffolding for a lot more information than is required at the interview stage. Managing people is a complex business. From determining staffing requirements through recruiting, background checks, competency assessment, and hiring to ongoing time reporting and compensation, benefits management, performance goals, and assessment, HR-XML offers schemas to cover them all.
While HR-XML is dedicated to the human resources industry, OAGIS looks at cross-industry data exchange standards. It deals with ideas and concepts common to industries in general but leaves the industry-specific elements to specialist groups from within the industry that have the expertise.
HR-XML is careful not to reinvent ideas already created by the broader OAGIS set of elements—it simply adds new material in its own namespace. The result is a schema based on what to store given the human resources context (elements) and how to store it (attributes, hierarchy), so why not benefit from their work? To get more detail about the schema that HR-XML uses, download it or view it online at the web site (registration required). In the case of the version 3.1 download, here is a path to the documentation related to the
Online, a good starting point is at the following URL:
The data file
Listing 1 is an example of a basic data file (a fragment from a larger file) that employs the Candidate element and some of its children.
Listing 1. Example data file
<?xml version="1.0" encoding="UTF-8"?>
<hr:Candidate xmlns:hr="http://www.hr-xml.org/3"
    xmlns:ccts="urn:un:unece:uncefact:documentation:1.1"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:oa="http://www.openapplications.org/oagis/9">
  <hr:DocumentID>000000001</hr:DocumentID>
  <hr:CandidatePerson>
    <hr:PersonName>
      <hr:FormattedName>Blimpo Togwer</hr:FormattedName>
      <oa:GivenName>Blimpo</oa:GivenName>
      <hr:FamilyName>Togwer</hr:FamilyName>
    </hr:PersonName>
    <hr:Communication>
      <hr:ChannelCode>Mail</hr:ChannelCode>
      <hr:Address>
        <oa:AddressLine sequence="1">5555 Yellow Brick Road</oa:AddressLine>
        <oa:AddressLine sequence="2">RR #1</oa:AddressLine>
        <oa:CityName>Lesser Village</oa:CityName>
        <oa:CountrySubDivisionCode>KKK</oa:CountrySubDivisionCode>
        <hr:CountryCode>XX</hr:CountryCode>
        <oa:PostalCode>AAA BBB</oa:PostalCode>
      </hr:Address>
    </hr:Communication>
  </hr:CandidatePerson>
</hr:Candidate>
This code fragment, which stands as a complete but rather simple example, shows a number of details:
- The XML declaration is followed by the root element, Candidate.
- Candidate here has the meaning that is defined in the hr namespace signified by that prefix.
- The hr namespace is associated with the label http://www.hr-xml.org/3.
- Each of the elements is preceded by a namespace prefix that removes all ambiguity as to what the element represents.
- Some of the elements are defined in the hr namespace (HR-XML) and some in the oa namespace (OAGIS). They are mixed and matched as required.
- CountryCode requires a two-character code; Listing 1 uses the placeholder XX.
- CountrySubDivisionCode represents a state, province, department, or other major administrative region within a country.
- Hierarchy is important. For example, to get to the city name in Listing 1, the path runs through Candidate, CandidatePerson, Communication, and Address before it reaches CityName.
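To see the hierarchy point in action outside the stylesheet, here is a minimal Python sketch that walks the same path to the city name. It is an illustration only and not part of the FOP workflow: it assumes Python's standard xml.etree.ElementTree module, and the DATA string is a trimmed copy of Listing 1 that keeps just the branch leading to the city.

```python
import xml.etree.ElementTree as ET

# Namespace URIs copied from Listing 1; the prefixes are local shorthand.
NS = {
    "hr": "http://www.hr-xml.org/3",
    "oa": "http://www.openapplications.org/oagis/9",
}

# Trimmed copy of Listing 1 (assumption: only the address branch is kept).
DATA = """<hr:Candidate xmlns:hr="http://www.hr-xml.org/3"
    xmlns:oa="http://www.openapplications.org/oagis/9">
  <hr:CandidatePerson>
    <hr:Communication>
      <hr:ChannelCode>Mail</hr:ChannelCode>
      <hr:Address>
        <oa:CityName>Lesser Village</oa:CityName>
      </hr:Address>
    </hr:Communication>
  </hr:CandidatePerson>
</hr:Candidate>"""

# fromstring returns the root element, which is hr:Candidate itself.
root = ET.fromstring(DATA)

# ElementTree paths are relative to the element being searched, so the
# path starts one level below hr:Candidate. Every step carries its prefix.
city = root.findtext(
    "hr:CandidatePerson/hr:Communication/hr:Address/oa:CityName",
    namespaces=NS,
)
print(city)
```

Note that the last step switches from the hr prefix to the oa prefix, because CityName is one of the elements borrowed from OAGIS.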
Use the online schema resource from HR-XML to get the names of additional elements such as CandidateProfile that allow you to add more information such as Certifications, and so on.
Namespaces are a structure that addresses possible ambiguities when giving names to XML elements. See Related topics for more information about getting started with namespaces. They impose good discipline; however, they require careful use to ensure that the correct data is retrieved, otherwise errors might occur—many of them silently. For example, if you refer to your
education section and do not specify the namespace, there is a good chance that because the data cannot be found the processor prints nothing at all in that section, without warning.
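The silent failure described above is easy to reproduce outside XSLT. The following Python sketch is an analogy rather than the stylesheet's own behavior: it assumes the standard xml.etree.ElementTree module and a trimmed copy of Listing 1, and it shows that a lookup with a missing or wrong namespace simply finds nothing instead of raising an error.

```python
import xml.etree.ElementTree as ET

# Namespace URIs copied from Listing 1.
NS = {
    "hr": "http://www.hr-xml.org/3",
    "oa": "http://www.openapplications.org/oagis/9",
}

# Trimmed copy of Listing 1 (assumption: only the name branch is kept).
DATA = """<hr:Candidate xmlns:hr="http://www.hr-xml.org/3"
    xmlns:oa="http://www.openapplications.org/oagis/9">
  <hr:CandidatePerson>
    <hr:PersonName>
      <hr:FormattedName>Blimpo Togwer</hr:FormattedName>
      <oa:GivenName>Blimpo</oa:GivenName>
    </hr:PersonName>
  </hr:CandidatePerson>
</hr:Candidate>"""

root = ET.fromstring(DATA)

# No prefixes at all: nothing matches, and no error is raised.
unqualified = root.find("CandidatePerson/PersonName/GivenName")

# Wrong namespace on the last step: GivenName belongs to oa, not hr.
wrong_ns = root.find("hr:CandidatePerson/hr:PersonName/hr:GivenName", NS)

# Correct prefixes on every step.
right = root.find("hr:CandidatePerson/hr:PersonName/oa:GivenName", NS)

print(unqualified, wrong_ns, right.text)
```

A check for a missing result (here, a comparison against None) is the only way to notice the first two mistakes, which is why a stylesheet with a wrong prefix can produce an empty section without any warning.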
To make changes to the XML files, because both the data file and the stylesheet are pure XML, use your favorite XML or text editor. For example, get Eclipse (see Related topics), open a new project, copy and paste the code from Listing 1 into a new document, edit, and you are well on your way to a structured résumé data file.
For a selection of tutorials about how to build and use stylesheets, see the W3C XSL web page (see Related topics).
Listing 2 is an example of a basic stylesheet in the résumé context.
Listing 2. Example stylesheet
<?xml version="1.0" encoding="UTF-8" ?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:fo="http://www.w3.org/1999/XSL/Format"
    xmlns:oa="http://www.openapplications.org/oagis/9"
    xmlns:hr="http://www.hr-xml.org/3">
  <xsl:output method="xml" indent="yes"/>
  <xsl:template match="/">
    <fo:root>
      <fo:layout-master-set>
        <fo:simple-page-master master-name="page1">
          <fo:region-body margin="1in" />
        </fo:simple-page-master>
      </fo:layout-master-set>
      <fo:page-sequence master-reference="page1">
        <fo:flow flow-name="xsl-region-body">
          <fo:block text-align="right" font-size="12pt" font-family="serif">
            DocumentID: <xsl:value-of select="hr:Candidate/hr:DocumentID" />
          </fo:block>
          <fo:block>
            <fo:leader leader-pattern="dots" leader-length="100%" />
          </fo:block>
          <fo:block font-size="12pt" font-family="serif">
            Curriculum Vitae - Résumé
          </fo:block>
          <fo:block font-size="20pt" font-family="Arial" font-weight="bold">
            <xsl:value-of select="hr:Candidate/hr:CandidatePerson/hr:PersonName/hr:FormattedName" />
          </fo:block>
          <fo:block font-size="12pt" font-family="serif">
            Contact
          </fo:block>
          <xsl:for-each select="hr:Candidate/hr:CandidatePerson/hr:Communication[hr:ChannelCode='Mail']">
            <fo:block font-size="10pt" font-family="Arial" font-weight="normal">
              <xsl:value-of select="hr:Address/oa:AddressLine[@sequence=1]" />,
              <xsl:value-of select="hr:Address/oa:AddressLine[@sequence=2]" />
            </fo:block>
            <fo:block font-size="10pt" font-family="Arial" font-weight="normal">
              <xsl:value-of select="hr:Address/oa:CityName" />,
              <xsl:value-of select="hr:Address/oa:CountrySubDivisionCode" />
            </fo:block>
            <fo:block font-size="10pt" font-family="Arial" font-weight="normal">
              <xsl:value-of select="hr:Address/oa:PostalCode" />,
              <xsl:value-of select="hr:Address/hr:CountryCode" />
            </fo:block>
          </xsl:for-each>
        </fo:flow>
      </fo:page-sequence>
    </fo:root>
  </xsl:template>
</xsl:stylesheet>
- The document needs four different namespaces. All references to data explicitly state the namespace at each node, avoiding the confusion that can arise when relying on the default namespace, where no prefix is used.
- The template match is a forward slash (/), indicating that searches start at the root of the data document, one level above the Candidate element.
- The stylesheet specifies a layout master set that defines the page layouts available to the document and then a page sequence element that holds the content for pages using a given layout.
- Each page requires a series of block elements that instruct the processor where to place an item on the page and how to display it, including font and font size.
- The stylesheet uses for-each statements to iterate over groups of elements. For example, there might be multiple communication channels: mail, email, phone, and so on. Using square bracket ([]) notation, you can specify a filter; in this case, the stylesheet filters for Communication elements whose ChannelCode is Mail.
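The square-bracket filter from Listing 2 can be tried on its own with a short Python sketch. This is an illustration under stated assumptions, not the stylesheet's mechanism: it uses the limited XPath support in Python's standard xml.etree.ElementTree module, and the second Communication element with the ChannelCode value Email is hypothetical data added here so that the filter has something to exclude.

```python
import xml.etree.ElementTree as ET

# Namespace URIs copied from Listing 1.
NS = {
    "hr": "http://www.hr-xml.org/3",
    "oa": "http://www.openapplications.org/oagis/9",
}

# Trimmed copy of Listing 1 plus one hypothetical extra channel ("Email").
DATA = """<hr:Candidate xmlns:hr="http://www.hr-xml.org/3"
    xmlns:oa="http://www.openapplications.org/oagis/9">
  <hr:CandidatePerson>
    <hr:Communication>
      <hr:ChannelCode>Mail</hr:ChannelCode>
      <hr:Address>
        <oa:CityName>Lesser Village</oa:CityName>
      </hr:Address>
    </hr:Communication>
    <hr:Communication>
      <hr:ChannelCode>Email</hr:ChannelCode>
    </hr:Communication>
  </hr:CandidatePerson>
</hr:Candidate>"""

root = ET.fromstring(DATA)

# All channels, unfiltered.
all_channels = root.findall("hr:CandidatePerson/hr:Communication", NS)

# Same predicate as the for-each in Listing 2: keep only the channels
# whose hr:ChannelCode child has the text "Mail".
mail_channels = root.findall(
    "hr:CandidatePerson/hr:Communication[hr:ChannelCode='Mail']", NS
)

for channel in mail_channels:
    # Paths inside the loop are relative to the current channel,
    # just as they are inside xsl:for-each.
    print(channel.findtext("hr:Address/oa:CityName", namespaces=NS))
```

As in the stylesheet, the paths inside the loop are written relative to the element that the filter selected, which keeps the inner expressions short.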
Output using Apache FOP
Apache FOP uses the data file together with the stylesheet to produce the PDF. FOP is not limited to PDF output—you can also generate Rich Text Format (RTF), Printer Command Language (PCL), PostScript (PS), Advanced Function Presentation (AFP), Tagged Image File Format (TIFF), and Portable Network Graphics (PNG), as well as plain text files.
Getting and installing FOP is as simple as downloading and unpacking the binary version (see Related topics). FOP is then ready to run from the downloaded location.
Here is an example command-line instruction to fop. In this case, the data, style, and configuration files are located in one directory. With that directory as the working directory, you call fop from its own location:
/path/to/fop/fop -c fop.xconf -xml exx.xml -xsl exx.xsl -pdf exx.pdf
This instruction tells the fop executable file to do the following:
- Look for configuration information in the fop.xconf file
- Look for data in the exx.xml file
- Use the exx.xsl stylesheet to produce the exx.pdf output
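If you regenerate the PDF often, the command above can be assembled by a small script. The following Python sketch is one possible approach, not something the FOP distribution provides: the helper name build_fop_command is hypothetical, the file names and the /path/to/fop/fop placeholder are the ones from the example command, and only the -c, -xml, -xsl, and -pdf options shown above are used.

```python
import shlex


def build_fop_command(fop_path, config, data, stylesheet, output):
    """Return the fop argument list for one XML-to-PDF transformation.

    Hypothetical helper: it only arranges the options shown in the
    example command and does not run anything itself.
    """
    return [
        fop_path,
        "-c", config,        # configuration file
        "-xml", data,        # XML data file
        "-xsl", stylesheet,  # XSL stylesheet
        "-pdf", output,      # PDF file to write
    ]


command = build_fop_command(
    "/path/to/fop/fop",  # placeholder location, as in the example command
    "fop.xconf",
    "exx.xml",
    "exx.xsl",
    "exx.pdf",
)

# Show the command as a single shell-style line for inspection.
print(shlex.join(command))

# To run it for real (requires FOP at the given path), pass the list to
# subprocess.run(command, check=True) from the directory holding the files.
```

Keeping the arguments in a list rather than one long string avoids quoting problems if a file name ever contains spaces.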
The configuration file is important and appears as shown in Listing 3.
Listing 3. FOP configuration file
<?xml version="1.0"?>
<fop version="1.0">
  <base>.</base>
  <source-resolution>72</source-resolution>
  <target-resolution>72</target-resolution>
  <default-page-settings height="11in" width="8.26in"/>
  <renderers>
    <renderer mime="application/pdf">
      <filterList>
        <value>flate</value>
      </filterList>
      <fonts>
        <auto-detect />
      </fonts>
    </renderer>
  </renderers>
</fop>
In this configuration, the filterList element controls how objects are compressed in the PDF output, and the fonts element instructs the processor to use fonts that are already known to the operating system.
Figure 1, which is a screen capture from a PDF reader of the output from the earlier listings, shows the result of running the transformation.
Figure 1. The PDF output
The stylesheet can contain simple decoration items:
- Rows of dots appear in the example, and the following code generates them:
<fo:block>
  <fo:leader leader-pattern="dots" leader-length="100%" />
</fo:block>
- You can make blank lines appear using the techniques included in Nicholas Chase's developerWorks Tip (see Related topics) or with the following code:
See the FOP documentation (see Related topics) for further possibilities including borders, margins, padding, color, images, and tables.
Generating a résumé or curriculum vitae from an XML file involves a little more work but imposes a disciplined structure that helps ensure that the document is as complete as is necessary.
Creating documents using a text editor is still a valid possibility in simple situations. Alternatively, using an XML file as a common source of information for different versions of a résumé suits the more intricate data source. The choice comes down to one question: is it more efficient to maintain multiple copies of a document together with markup in an editor, or to maintain multiple stylesheets that operate on the same data? Both approaches arrive at the same result by different paths.
- Principles of XML design: Use XML namespaces with care (Uche Ogbuji, developerWorks, Apr 2004): Read about some of the difficulties of working with namespaces and minimize problems as you incorporate namespaces into XML design.
- Tip: Control white space in an XSLT style sheet (Nicholas Chase, developerWorks, Nov 2002): Understand whitespace and space stripping in transformation and create the document you want.
- Apache FOP: Learn more about this print formatter driven by XSL formatting objects (XSL-FO) and an output independent formatter.
- Apache FOP Compliance Page: Visit this page to explore the formatting possibilities in a FOP document.
- HR-XML: Check out an HR-XML implementation tool.
- Open Applications Group: Go to the website for this standards development organization that builds process-based business standards for eCommerce, Cloud Computing, Service Oriented Architecture (SOA), Web Services, and Enterprise Integration.
- OASIS: Learn more about the Organization for the Advancement of Structured Information Standards.
- XSL: Delve into this family of recommendations for defining XML document transformation and presentation.
- Eclipse: Try this open-source, XML development environment.
- More articles by this author (Colin Beckingham, developerWorks, March 2009-current): Read articles about XML, voice recognition, XHTML, PHP, SMIL, and other technologies.
- IBM certification: Find out how you can become an IBM-Certified Developer.
- XML technical library: See the developerWorks XML Zone for a wide range of technical articles and tips, tutorials, standards, and IBM Redbooks. Also, read more XML tips.
- IBM product evaluation versions: Get your hands on application development tools and middleware products.