
XML Matters: OASIS Election Markup Language

Standardization of XML formats for voting and elections

David Mertz (mertz@gnosis.cx), Bean Counter, Gnosis Software, Inc.
To David Mertz, all the world is a stage, and his career is devoted to providing marginal staging instructions. David may be reached at mertz@gnosis.cx; his life pored over at http://gnosis.cx/dW/. Suggestions and recommendations on this, past, or future columns are welcomed. Check out David's book Text Processing in Python.

Summary:  The Organization for the Advancement of Structured Information Standards (OASIS) has developed many XML standards in use within government, law, and business. Election Markup Language (EML) is OASIS' foray into the world of elections -- with an emphasis on voting within governmental jurisdictions. In this installment, David gives readers an introductory look at the structure and purpose of EML, with an eye toward how this standard, which is now used largely in Europe, will substantially influence future data standards in the United States.


Date:  15 Oct 2004
Level:  Intermediate

Readers of my previous XML Matters installment on the use of XML in an open source voting machine will recognize my motivation for investigating the OASIS standard for EML. My direct interest has been further piqued by my recent membership in the still-fledgling IEEE Project 1622 (Voting Systems Electronic Data Interchange -- see Resources). Actually, OASIS' EML covers quite a bit more ground than the Open Voting Consortium's narrow demo system, or even than is anticipated for P-1622.

Specifically, EML is intended to:

  • Be rich enough to accommodate governmental elections across many jurisdiction levels, as well as elections within many different kinds of organizations (community or corporate, for example)
  • Allow voting over many channels, both traditional voting booths (perhaps electronic) and remote systems like Web pages, telephone voting, kiosks, and so on
  • Enable many tabulation and voting rules, such as ranked preference and cumulative voting
  • Handle security, encryption, and authentication requirements
  • Record and convey information about voter registration, organization membership, and other voter metadata

EML has seen significant real world use in European government, and in some non-governmental organizations worldwide.

EML, in my opinion, suffers somewhat (but not outrageously) from an over-engineering common among XML technologies (think SOAP, W3C XML Schemas, or even XSLT). Committees have a tendency to produce standards with too many details, handling too many corner cases centrally, and with too many levels of indirection. Of course, having joined another standards committee myself, I suppose I too will soon be guilty of participating in feature creep. Nonetheless, our tentative plan in IEEE P-1622 is to start with a simpler data model provided by a commercial election system vendor (but released on non-proprietary terms), rather than adopt EML wholesale as the basis for standardizing elections data in the United States. Our target in P-1622 is only to accommodate the needs of governmental elections, rather than every possible voting scenario; moreover, the fifty-some US states and territories have somewhat less procedural variation than do the 45 member nations in the Council of Europe (for example). Even so, the fact that we have several other contributed data models to reconcile into the final design already makes for a nascent featuritis.

What does EML include?

To give you a sense of the scope of EML version 3.0, here's a quote from the Executive Summary to the standard:

The primary deliverable of the committee is the Election Markup Language (EML). This is a set of data and message definitions described as XML schemas. At present EML includes specifications for:
* Candidate Nomination, Response to Nomination and Approved Candidate Lists
* Voter Registration information, including eligible voter lists
* Various communications between voters and election officials, such [as] polling information, election notices, etc.
* Logical Ballot information (races, contests, candidates, etc.)
* Voter Authentication
* Vote Casting and Vote Confirmation
* Election counts and results
* Audit information pertinent to some of the other defined data and interfaces

Many distinct data requirements are addressed by the various aspects of EML. The schemas associated with the logical aspects of an election process are given numeric prefixes to indicate general category. So the 400 series schemas are associated with voting as such; the 500 series with tabulation (also known as canvassing in American terminology); the 100 series with an overall election specification; the 200 series with candidates; the 300 series with voters (eligibility and so forth). Within each schema series, one or more W3C XML Schemas are provided to describe documents that meet those requirements.

Some of the included schemas are:

  • 110-electionevent.xsd
  • 230-candidatelist.xsd
  • 310-voterregistration.xsd
  • 340-pollinginformation.xsd
  • 410-ballots.xsd
  • 420-authentication.xsd
  • 440-castvote.xsd
  • 510-count.xsd

Information on the naming scheme, along with the schemas themselves, can be found at the OASIS site (see Resources).
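As a rough illustration of the numbering convention, here is a minimal Python sketch that maps a schema filename to its series. The function name schema_series and the category labels are my own paraphrase of the description above, not anything defined by EML itself.

```python
import re

# Series meanings as described in the text; the label wording is this
# sketch's paraphrase, not official EML terminology.
SERIES = {
    1: "election specification",
    2: "candidates",
    3: "voters",
    4: "voting",
    5: "tabulation",
}

def schema_series(filename):
    """Return (number, category) for a name like '410-ballots.xsd'."""
    match = re.match(r"(\d{3})-[\w-]+\.xsd$", filename)
    if match is None:
        # Supporting schemas such as emlcore.xsd carry no numeric prefix.
        raise ValueError("not a numbered EML schema name: %r" % filename)
    number = int(match.group(1))
    return number, SERIES.get(number // 100, "unknown")

print(schema_series("410-ballots.xsd"))
```

Supporting schemas without a numeric prefix are deliberately rejected rather than guessed at.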

In addition to the numbered schema families, EML contains a collection of supporting schemas that mainly deal with common datatypes. For example, most or all include the schema emlcore.xsd (in some cases indirectly through some other include). Such a schema will have a line like this:

<xsd:include schemaLocation="emlcore.xsd"/>

The EML core, in turn, includes emlexternals.xsd and imports emltimestamp.xsd and the W3C's xmldsig-core-schema.xsd. I have not listed everything that's incorporated, but this illustrates the style. The lines for including or importing the mentioned schemas are:


Listing 1. External resources used by emlcore.xsd
<xsd:include schemaLocation="emlexternals.xsd"/>
<xsd:import namespace="urn:oasis:names:tc:evs:schema:eml:ts"
            schemaLocation="emltimestamp.xsd"/>
<xsd:import namespace="http://www.w3.org/2000/09/xmldsig#"
            schemaLocation="xmldsig-core-schema.xsd"/>

So far, so good. Now for a closer look. The schema emlexternals.xsd only defines formats for addresses and personal details about voting-eligible citizens. But my feeling is that the includes are currently structured with an eye toward expanding the element and type definitions within emlexternals.xsd when or if the need arises. In the main, emlexternals.xsd does its work with yet more external references, this time imports:


Listing 2. Citizen information datatypes imported to emlexternals.xsd
<xsd:import
     namespace="http://www.govtalk.gov.uk/people/AddressAndPersonalDetails"
     schemaLocation="AddressTypes-v1.xsd"/>
<xsd:import
     namespace="http://www.govtalk.gov.uk/people/AddressAndPersonalDetails"
     schemaLocation="PersonalDetailsTypes-v1.xsd"/>

Of course, once you follow the path still further into AddressTypes-v1.xsd, you find still more external definitions -- not as includes or imports, but through namespaces like those for the Dublin Core Metadata Initiative.
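If you want to trace these dependencies mechanically rather than by reading each file, something like the following Python sketch may help. It uses only the standard library's xml.etree.ElementTree. The SAMPLE string is a cut-down stand-in assembled from the lines in Listing 1, not the real emlcore.xsd, and the helper name external_references is hypothetical.

```python
import xml.etree.ElementTree as ET

XSD = "{http://www.w3.org/2001/XMLSchema}"

def external_references(schema_text):
    """List (kind, namespace, schemaLocation) for each top-level
    xsd:include or xsd:import in a schema document."""
    root = ET.fromstring(schema_text)
    refs = []
    for child in root:
        if child.tag in (XSD + "include", XSD + "import"):
            kind = child.tag[len(XSD):]
            refs.append((kind,
                         child.get("namespace"),      # None for includes
                         child.get("schemaLocation")))
    return refs

# Stand-in for emlcore.xsd: only the lines shown in Listing 1.
SAMPLE = """<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:include schemaLocation="emlexternals.xsd"/>
  <xsd:import namespace="urn:oasis:names:tc:evs:schema:eml:ts"
              schemaLocation="emltimestamp.xsd"/>
  <xsd:import namespace="http://www.w3.org/2000/09/xmldsig#"
              schemaLocation="xmldsig-core-schema.xsd"/>
</xsd:schema>"""

for ref in external_references(SAMPLE):
    print(ref)
```

Running the same function over each schemaLocation in turn would walk the whole chain. Definitions pulled in purely by namespace, as in AddressTypes-v1.xsd, would not show up this way.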


What makes up a ballot?

The schema 410-ballots.xsd specifies the format for an un-cast ballot. This format is relatively unremarkable, but it is worth noticing that it includes a number of features that accommodate ballots in general, not merely governmental elections. For example, I am not familiar with any governmental elections that provide a "Reason" for Election/Contest qualification. However, in this case it may be that a reason (such as "Initiative met signature threshold") is worth conveying to elections officials, even while not displaying it to voters.

The schema 440-castvote.xsd specifies an actual vote made in response to a ballot. In the Open Voting Consortium (OVC) design that I presented in an earlier installment, I called these root elements <ballot> and <cast_ballot> to emphasize their connection. In contrast to the OVC (preliminary) design, EML does not create any particular relationship between <Ballots> and <CastVote>. Recall that the OVC design approximately generates a <cast_ballot> simply by removing non-supported selections from a <ballot>. For example, if a <ballot> contains several selections for a <contest name="Mayor">, a <cast_ballot> is just the same XML fragment with all but one selection (candidate) removed.
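The OVC-style derivation of a cast ballot from a ballot can be sketched in a few lines of Python. This is only an approximation under stated assumptions: the <ballot> layout below is reduced to the elements named in the paragraph above, the candidate names are made up for illustration, and the function cast is hypothetical rather than part of any OVC code.

```python
import copy
import xml.etree.ElementTree as ET

# A hypothetical un-cast ballot with invented candidate names.
BALLOT = """<ballot>
  <contest name="Mayor">
    <selection>Alice Adams</selection>
    <selection>Bob Brown</selection>
    <selection>Carol Clark</selection>
  </contest>
</ballot>"""

def cast(ballot, choices):
    """Copy a <ballot> and drop every <selection> the voter did not choose.

    `choices` maps a contest name to the set of selection texts to keep.
    """
    cast_ballot = copy.deepcopy(ballot)   # leave the original ballot intact
    cast_ballot.tag = "cast_ballot"
    for contest in cast_ballot.findall("contest"):
        keep = choices.get(contest.get("name"), set())
        # Iterate over a copy of the list, since we remove while looping.
        for selection in list(contest.findall("selection")):
            if selection.text not in keep:
                contest.remove(selection)
    return cast_ballot

ballot = ET.fromstring(BALLOT)
vote = cast(ballot, {"Mayor": {"Bob Brown"}})
print([s.text for s in vote.iter("selection")])
```

The point of the sketch is that both documents share one structure, so a single pass of deletions is all the transformation needs.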

I believe the independent design of schemas within EML leads to certain pitfalls -- albeit minor ones. For example, in 410-ballots.xsd <Options> may contain either a list of <Candidate> elements or a list of <Option> elements. Fair enough -- this is helpful in distinguishing political offices from referenda. But over in 440-castvote.xsd, every vote is listed as an <Option> and never as a <Candidate>. I see no good reason to distinguish the semantic models of cast and un-cast ballots in this way (if you want the information in one XML instance, you want it in the other; if it is superfluous, it is so in both places).

To give you a feel for EML, I decided to prepare a <CastVote> that matches the <cast_ballot> presented in my earlier installment. I have condensed the sample document by leaving out optional security tokens and <AuditInformation>. On the latter, I have some initial doubts about including the auditing record within the cast vote itself, since that has the potential to compromise anonymity; but I have not looked at this matter closely enough to evaluate whether a genuine security issue exists. However, within IEEE P-1622 -- and within OVC -- I will probably push to keep audit records as separate documents (which might be a Federal Election Commission requirement; I'm not giving legal advice here). Recall that the OVC-format cast ballot looked like this:


Listing 3. v-20081104-US-CA-Santa_Clara_County-2216-1274.xml
<cast_ballot election_date="2008-11-04" country="US" state="CA"
             county="Santa Clara County" precinct="2216"
             number="1274" serial="213" source="voting_machine">
  <contest ordered="No" coupled="Yes" name="Presidency">
    <selection writein="No" name="President">V. I. Lenin</selection>
    <selection writein="No" name="Vice President">Karl Marx</selection>
  </contest>
  <contest ordered="No" coupled="No" name="Senator">
    <selection writein="No">William Lloyd Garrison</selection>
  </contest>
  <contest ordered="No" coupled="No" name="Transportation Initiative">
    <selection writein="No">Yes</selection>
  </contest>
  <contest ordered="Yes" coupled="No" name="County Commissioner">
    <selection writein="Yes">David Packard</selection>
    <selection writein="No">Gordon Moore</selection>
    <selection writein="No">William Hewlett</selection>
  </contest>
</cast_ballot>

This vote contains the rather unusual case of the US President and Vice President where you cast a common vote for two different candidates running for two different offices. Parliamentary party-slate votes are somewhat similar, conceptually, but in those cases you vote for a single party, not multiple candidates. Other than that, I find this XML minimal and self-explanatory. EML's version tends to nest data more deeply, and does not seem to contemplate the Presidency case directly. As near as I can tell, you might represent this vote as:


Listing 4. EML-20081104-US-CA-Santa_Clara_County-2216-1274.xml
<?xml version="1.0" encoding="UTF-8"?>
<CastVote xmlns="440-castvote.xsd">
<ElectionEvent>
  <Event>
    <EventName Id="n1274s213">
      Santa Clara County, CA, USA (2008-11-04)
    </EventName>
    <EventQualifier>Precinct 2216</EventQualifier>
  </Event>
  <Election>
    <ElectionName>Presidency</ElectionName>
    <Contest>
      <ContestName>President</ContestName>
      <Selection>
        <Option>
          <OptionName>V. I. Lenin</OptionName>
        </Option>
      </Selection>
    </Contest>
  </Election>
  <Election>
    <ElectionName>Presidency</ElectionName>
    <Contest>
      <ContestName>Vice President</ContestName>
      <Selection>
        <Option>
          <OptionName>Karl Marx</OptionName>
        </Option>
      </Selection>
    </Contest>
  </Election>
  <Election>
    <ElectionName>Senate</ElectionName>
    <Contest>
      <ContestName>Senator</ContestName>
      <Selection>
        <Option>
          <OptionName>William Lloyd Garrison</OptionName>
        </Option>
      </Selection>
    </Contest>
  </Election>
  <Election>
    <ElectionName>Local Initiative</ElectionName>
    <Contest>
      <ContestName>Transportation Initiative</ContestName>
      <Selection>
        <Option>
          <OptionName>Yes</OptionName>
        </Option>
      </Selection>
    </Contest>
  </Election>
  <Election>
    <ElectionName>Local Office</ElectionName>
    <Contest>
      <ContestName>County Commissioner</ContestName>
      <Selection>
        <Option>
          <WriteinOptionName>David Packard</WriteinOptionName>
          <Value>1</Value>
        </Option>
        <Option>
          <OptionName>Gordon Moore</OptionName>
          <Value>2</Value>
        </Option>
        <Option>
          <OptionName>William Hewlett</OptionName>
          <Value>3</Value>
        </Option>
      </Selection>
    </Contest>
  </Election>
</ElectionEvent>
</CastVote>

I am not entirely certain I have the semantics of <Election>, <Contest>, <Selection>, and <Option> right, but given the cardinalities of elements, this seems to be the required arrangement. Exactly how <ElectionName> and <ContestName> relate is also not wholly clear to me.
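To show what the deeper nesting means for a consumer of such a document, here is a hedged Python sketch that flattens a <CastVote> shaped like Listing 4 into simple tuples. It assumes the placeholder namespace and the element arrangement used in that listing, which may not match a schema-valid EML instance. The function name selections and the SAMPLE fragment (an excerpt of Listing 4) are for illustration only.

```python
import xml.etree.ElementTree as ET

NS = "{440-castvote.xsd}"  # the placeholder namespace used in Listing 4

def selections(cast_vote_text):
    """Yield (election, contest, name, rank, is_writein) per <Option>."""
    root = ET.fromstring(cast_vote_text)
    for election in root.iter(NS + "Election"):
        election_name = election.findtext(NS + "ElectionName")
        for contest in election.findall(NS + "Contest"):
            contest_name = contest.findtext(NS + "ContestName")
            for option in contest.iter(NS + "Option"):
                writein = option.findtext(NS + "WriteinOptionName")
                if writein is not None:
                    name = writein
                else:
                    name = option.findtext(NS + "OptionName")
                rank = option.findtext(NS + "Value")
                yield (election_name, contest_name, name,
                       int(rank) if rank is not None else None,
                       writein is not None)

# Excerpt of Listing 4: the ranked County Commissioner contest.
SAMPLE = """<CastVote xmlns="440-castvote.xsd">
<ElectionEvent>
  <Election>
    <ElectionName>Local Office</ElectionName>
    <Contest>
      <ContestName>County Commissioner</ContestName>
      <Selection>
        <Option>
          <WriteinOptionName>David Packard</WriteinOptionName>
          <Value>1</Value>
        </Option>
        <Option>
          <OptionName>Gordon Moore</OptionName>
          <Value>2</Value>
        </Option>
      </Selection>
    </Contest>
  </Election>
</ElectionEvent>
</CastVote>"""

for row in selections(SAMPLE):
    print(row)
```

Four levels of element lookup are needed to recover what the OVC format carries in one <selection> element and its parent's attributes.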


Final canvassing

I have looked at just a few details of EML version 3 in this installment, but it should be enough to give you a feel for what the system of schemas aims for. In particular, this installment has only really looked at the subset of EML that's concerned with ballots and votes, not all the other portions that deal with voter registration, candidate nomination, or vote canvassing (matching the coverage of my prior related installment).

In Europe, EML is a standard in relatively wide (and growing) usage, and programmers who develop elections systems -- or even systems that touch on them peripherally -- need to become familiar with EML. Moreover, as an OASIS standard, EML is certainly a specification that organizations should consider in conducting private elections. Bringing a common data format to a large swath of elections usage will allow for interoperability among tools, including tools dedicated to audit and security analysis of elections.


Resources

