Technical standards in education, Part 5: Take advantage of metadata

Structure description to enhance managing and sharing educational content

Metadata provides key information about resources to support their management and discovery. Educational metadata is a specific type of metadata for describing the educational use of a resource. This article gives an overview of metadata, illustrates its basic concepts, and discusses educational metadata. You'll learn about vocabularies, tools, and the current use of educational description.


R. John Robertson, Learning Technology Advisor, CETIS, University of Strathclyde

R. John Robertson works for the Centre for Educational Technology and Interoperability Standards (CETIS), one of the Innovation Support Centres funded by the Joint Information Systems Committee (JISC) to represent Further and Higher Education in the development of relevant standards and specifications, nurture relevant developer communities and innovation, and provide guidance and support for JISC development programmes. John currently supports JISC's work in Open Educational Resources. A librarian by training, his professional background is in development and support projects around the management of digital assets, repositories, digital libraries, and metadata.

08 March 2011

Introduction to metadata

Simply defined, metadata is data about data: a piece of information that describes another piece of information. Neither the description nor the thing being described has to be digital, although the term most commonly refers to a digital description.

Metadata can be part of the thing that is being described, or it can be associated with it in some other way. For example, a library catalog record is a metadata record about a resource, such as a book. It is distinct from the book. In comparison, many modern cameras embed information about an image (aperture, time, GPS location) in the image file as Exchangeable image file format (Exif) metadata (see Resources).

To describe a thing, metadata can record many types of information, such as:

  • Technical information
  • Bibliographic information
  • Educational information
  • Usage and rating information
  • Licensing and rights information
  • Structural information (for complex objects)
  • Information about provenance

Most metadata standards focus on a limited domain, and delineate a selection of information elements to describe a thing, a structure for those elements, and a binding of the structure and elements typically in HTML, XHTML, or XML. Standards also come with usage guidelines stipulating which elements you may, should, or must record about a thing in order to comply with the standard. The guidelines may stipulate the uniqueness or repeatability of information elements and may provide further recommendations or restrictions on how to use an element.
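
To make the idea of usage guidelines concrete, the following is a minimal Python sketch of how obligation (may, should, must) and repeatability rules could be checked against a record. The profile, the element names, and the check_record function are invented for illustration and are not taken from any real standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ElementRule:
    name: str
    obligation: str   # "must", "should", or "may"
    repeatable: bool


# Hypothetical profile, invented for illustration only.
PROFILE = [
    ElementRule("title", "must", repeatable=False),
    ElementRule("creator", "should", repeatable=True),
    ElementRule("subject", "may", repeatable=True),
]


def check_record(record, profile=PROFILE):
    """Return (errors, warnings) for a record given as {element: [values]}."""
    errors, warnings = [], []
    known = {rule.name for rule in profile}
    for rule in profile:
        values = record.get(rule.name, [])
        if not values:
            # Only "must" elements make a record non-compliant when absent.
            if rule.obligation == "must":
                errors.append(f"missing required element: {rule.name}")
            elif rule.obligation == "should":
                warnings.append(f"missing recommended element: {rule.name}")
        elif len(values) > 1 and not rule.repeatable:
            errors.append(f"element not repeatable: {rule.name}")
    # Elements outside the profile are reported rather than silently kept.
    for name in record:
        if name not in known:
            errors.append(f"element not in profile: {name}")
    return errors, warnings


record = {"title": ["Virtual Maths", "Cuboid quiz"], "subject": ["excavation"]}
errors, warnings = check_record(record)
print(errors)
print(warnings)
```

In this sketch a repeated title is an error because the hypothetical profile declares it non-repeatable, while a missing creator only produces a warning because it is recommended rather than required.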

An illustration of the basic concepts

To illustrate and review some of the basic concepts, consider a box of chocolates (see Figure 1).

Figure 1. Box of chocolates

The chocolates wrapped up inside a box are the thing we want to describe. The creator of the box of chocolates provides descriptive information about it for a number of different purposes. There is a leaflet attached describing the chocolates in the box, the weight of the box and chocolates, nutritional information, and a bar code. A detailed list of ingredients either is printed on the underside of the box or may be part of the leaflet.

We can observe that:

  • The leaflet could be independent of the box or it could be inside the box.
  • Different types of information are provided about the chocolates for different purposes.
  • Some of the information is to be read by a person and some is for automated use by a machine.
  • Some of the information could be easily derived from the box of chocolates itself (weight), other bits would require effort to derive (caloric content), while others might not be possible to derive from the box itself (source of cocoa beans).
  • It is possible to make a decision whether or not to purchase the box of chocolates based on the leaflet and related information.

As with the leaflet about the chocolates, metadata can:

  • Be associated with or part of the thing being described
  • Be intended to meet different types of information needs
  • Be intended to be understood by a person or by a machine
  • Be derived in part from the thing itself, sometimes only with considerable effort, although not all of it can necessarily be derived this way
  • Act in place of the thing for some types of interaction
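
The split between derivable and non-derivable information can be sketched in a few lines of Python. This is only an illustration: the derive_technical_metadata function and the field names are our own assumptions, and the educationLevel value stands in for any information that a person has to supply because it cannot be read from the file.

```python
import hashlib
import mimetypes
import os
import tempfile


def derive_technical_metadata(path):
    """Derive what can be read from the file itself (hypothetical field names)."""
    with open(path, "rb") as handle:
        data = handle.read()
    mime_type, _encoding = mimetypes.guess_type(path)
    return {
        "filename": os.path.basename(path),
        "size_bytes": len(data),
        "format": mime_type,  # guessed from the extension; may be None
        "sha256": hashlib.sha256(data).hexdigest(),
    }


# Create a small throwaway file so the sketch is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "notes.txt")
    with open(path, "wb") as handle:
        handle.write(b"hello")
    derived = derive_technical_metadata(path)

# Information that cannot be derived from the file has to be supplied.
supplied = {"educationLevel": "higher education"}
record = {**derived, **supplied}
print(record)
```

The file size and format correspond to the weight of the box in the analogy, while the supplied value corresponds to the source of the cocoa beans.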

Current practice in education

In the educational sector, we use metadata in a wide range of content and tools, including library catalogs, digital libraries, e-journal systems, virtual learning environments (VLEs), institutional repositories, and content management systems.

We use metadata to describe the things these systems create, store, and manage, and it provides services to students, staff, faculty members, and the public. Library catalogs have records describing books and other materials the library manages. Digital libraries may hold photographs, videos, and many other types of materials. E-journal systems support the discovery of and manage access to digital versions of journal articles that publishers make available online. VLEs hold things like lecture notes, slides, formative assessment materials (for self-testing and checking understanding), and learning objects. Although practically anything can be described and treated as a learning object, the term typically refers to a digital thing that teaches, tests, or illustrates a point and that runs in a VLE or browser. Institutional repositories can hold information about journal articles written by the institution's academics (or about unedited preprints of those articles), electronic copies of theses and dissertations, and other relevant institutional materials such as learning materials and scientific data sets.

Although nearly all of these systems are based on appropriate standards or specifications, not all of them use standards for metadata; some rely instead on a locally created information structure for descriptive purposes. In addition, not all use metadata standards that are specific to education or that include educational metadata. Regardless of the approach, metadata (the descriptive information) is there to provide a service: to enhance the discoverability and use of digital content by allowing the creation of indexes used to search and browse these digital assets.

What is metadata?

Let's examine key questions around metadata in education.

Supplement or surrogate?

The types of use of metadata in education outlined in the last section highlight very different communities and corresponding uses of metadata. In education, there are at least two different patterns of use that we can trace to the different original understandings of metadata — one originating in the library community and one in the IT community. The educational technology community in the past tended to demonstrate both patterns. To simplify these patterns, consider the following:

  • The IT community views metadata as part of a digital object, and although metadata may provide additional information, it is often supplementary and derivable from the object. Metadata in this view is a bit like the <meta> information in an HTML page, supplementary comments in code, or file properties. To go back to the box of chocolates analogy, it is a bit like the list of ingredients and weight of the box, useful but not entirely critical for normal use.
  • The library community views metadata as a description that acts as a surrogate for the object. Metadata in this view is a direct parallel to library cataloging where users may make decisions about which books or articles to read based not initially on the book itself but on the metadata record. Typically, the creation of this metadata is seen as an intellectual work in itself. In the box of chocolates analogy, it is the leaflet, the catalog of the chocolates that allows you to choose a chocolate to eat before opening the box.

These simplifications distinguish two historical patterns in the use and perception of metadata. Fortunately, they are only simplifications, and many systems developers and repository administrators work with a blend of both perspectives. The patterns are still worth knowing about in order to grasp the difference between supplemental and surrogate views of metadata records.

Standard or added tag?

As mentioned previously, some of the current practices in education store metadata as part of the object itself, whereas others store it as a separate file. A related issue is whether the information is recorded using a metadata standard (or specification) designed for that purpose, recorded in a suitable part of a more generic standard, or simply recorded in some custom information container, such as a user-created database field or XML field.

Although we touch on the other types of use, for the rest of the article we focus more on metadata standards in education, particularly those that support educational metadata.

Educational description

In the introduction, we outlined some different types of information such as bibliographic information, technical information, and annotation or usage data. Educational description and metadata standards are distinctive to the domain of education, whether K-12 education, tertiary education, or work-based training (including education in healthcare or military settings).

Educational description is primarily concerned with how a resource is used to teach or support learning. It may look for specific information such as which courses this "thing" has been designed for, what age range or prior knowledge it assumes, and similar questions. This type of information is distinct from bibliographic information (which asks what this thing is about, who created it, and when), from technical information (what type of file is this, how big is it, what do I need to display it), and from user annotation (do I think this is a good resource, what feedback would I like to give, what did other users who downloaded this also download). However, educational description shares some common interest with these other questions, and educational metadata standards include other types of information.

Metadata standards for education

The two most common educational metadata standards are the IEEE Learning Object Metadata standard, which is a specific standard for educational description, and a generic metadata standard called Dublin Core, which has some educational elements.

IEEE Learning Object Metadata

The IEEE Learning Technology Standards Committee developed the IEEE Learning Object Metadata (LOM) standard and released it as a finished standard in 2002. The committee developed LOM to meet a perceived need for the management of learning materials, in particular discrete and concrete learning objects used in virtual learning environments (VLEs) or course management systems. The standard itself is available for purchase from IEEE and the final draft is freely available for consultation. As noted in the first article in this series, application profiles developed for the LOM in many geographic regions and education sectors try to reduce the number of ways digital education assets could be described and to make sharing interoperable content easier (see Resources). One example of an application profile is the UK LOM Core. Find links to the IEEE LOM standard, previous articles in this series, and the UK LOM Core in Resources.

An important feature of the LOM is that it uses a hierarchical container-like structure to manage different types of metadata. It has a basic high-level structure that corresponds to different types of metadata, with categories for each type. The categories are:

  • general
  • lifecycle
  • meta-metadata
  • technical
  • educational
  • rights
  • relation
  • annotation
  • classification

Each category contains a set of elements to store information relevant to that category, with some elements containing sub-elements. Another feature of the LOM is that, within the standard, it stores information about people and organizations as vCards. This is potentially problematic for transforming the data into other metadata standards and may raise some implementation issues.
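
As a rough illustration of this container-like structure, the following Python sketch builds a small LOM-style fragment with the standard library. The el helper is our own, the namespace URI is assumed to be the one used by the IEEE LOM XML binding, and only a few elements from the general and educational categories are shown.

```python
import xml.etree.ElementTree as ET

# Assumption: namespace URI of the IEEE LOM XML binding.
LOM_NS = "http://ltsc.ieee.org/xsd/LOM"
ET.register_namespace("lom", LOM_NS)


def el(parent, tag, text=None, **attrs):
    """Hypothetical helper: append a namespaced child element and return it."""
    child = ET.SubElement(parent, f"{{{LOM_NS}}}{tag}", attrs)
    if text is not None:
        child.text = text
    return child


lom = ET.Element(f"{{{LOM_NS}}}lom")

# general category: the title and each keyword wrap a language-tagged string.
general = el(lom, "general")
el(el(general, "title"), "string",
   "Virtual Maths, Cuboid - Excavation quiz1", language="en")
for word in ("ukoer", "Virtual Maths", "cubic capacity", "excavation"):
    el(el(general, "keyword"), "string", word, language="en")

# educational category: the context holds a vocabulary value.
educational = el(lom, "educational")
el(el(educational, "context"), "value", "higher education")

print(ET.tostring(lom, encoding="unicode"))
```

Each piece of information sits two or three containers deep, which is what makes the LOM expressive but also more verbose than a flat element set.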

Listing 1 is an example of a LOM record for a learning object.

Listing 1. Sample LOM record
    <!--General Section-->
              <lom:string language="en">Virtual Maths,
               Cuboid - Excavation quiz1</lom:string>
              <lom:string language="en">Interactive simulation
                explaining how to calculate cubic capacity of a
                truck for carrying excavated materials</lom:string>
              <lom:string language="en">ukoer</lom:string>
              <lom:string language="en">Virtual Maths</lom:string>
              <lom:string language="en">cubic capacity</lom:string>
              <lom:string language="en">excavation</lom:string>
    <!--Lifecycle Section-->[...]
    <!--Metametadata Section-->[...]
    <!--Technical Section-->[...]
    <!--Educational Section-->
              <lom:value>higher education</lom:value>
    <!--Rights Section-->
    <!--Relations Section-->[...]
    <!--Classification Section-->
              <lom:string language="en">HEA</lom:string>
                <lom:string language="en">Built Environment</lom:string>
              <lom:string language="en">JACS</lom:string>
                <lom:string language="en">Architecture</lom:string>
                <lom:string language="en">Building</lom:string>
                <lom:string language="en">Building Surveying</lom:string>

In Listing 1, the technical, meta-metadata, relation, and lifecycle sections of the record have been edited out to reduce the length of the code snippet. The entire record is available in the Resources section. The container-like nature of the LOM is seen by looking at the nested sections in the <lom:rights> container. There's the <lom:copyrightAndOtherRestrictions> sub-container with two elements (<lom:source> and <lom:value>). The <lom:description> container contains another <lom:string> element that gives the semantic information of the container.

Dublin Core

Dublin Core Metadata began through a workshop held in Dublin, Ohio in 1995. The workshop sought to define a core group of metadata elements that reflected the common needs of libraries, computer scientists, museum curators, and the early web. This initial Dublin Core Metadata Element Set (DCMES) has been incorporated into other specifications and ratified by a variety of standards bodies (IETF RFC 5013, ANSI/NISO Standard Z39.85-2007, and ISO Standard 15836:2009).

The 15 DCMES elements are:

  • Title
  • Creator
  • Subject
  • Description
  • Publisher
  • Contributor
  • Date
  • Type
  • Format
  • Identifier
  • Source
  • Language
  • Relation
  • Coverage
  • Rights

After the creation of the Metadata Element Set, the work of Dublin Core continued through the creation of application profiles of the standard that would extend and enhance its suitability for particular communities. This work and ongoing efforts to develop Dublin Core for use in linked and semantic data led to a new approach to metadata elements and less emphasis on the DCMES. The new metadata element set is referred to as Dublin Core Metadata Initiative (DCMI) Metadata Terms and is designed to enable the ongoing addition and revision of terms (see Resources).

One of the problems with the original DCMES was that its simplicity and wide use led to a great diversification of application profiles and extensions for it without an inherent structure to standardize any of the extensions. Consequently, many groups developed application profiles for similar uses that were different enough to make interoperability and sharing records cumbersome.

Although work is ongoing in the development of a Dublin Core application profile for educational metadata, there are already a number of DCMI Metadata Terms specifically for educational description. These include:

  • educationLevel: "A class of entity, defined in terms of progression through an educational or training context, for which the described resource is intended."
  • instructionalMethod: "A process, used to engender knowledge, attitudes and skills, that the described resource is designed to support."
  • mediator: "An entity that mediates access to the resource and for whom the resource is intended or useful."

These descriptions are from the DCMI website; see the Resources section for the link to the complete definitions of these terms.

As the use of DC Terms is still developing, the example in Listing 2 uses oai_dc, which is an application profile of DCMES (its element set is identical to the DCMES elements) used as part of the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH). The OAI-PMH is a widely implemented protocol for repositories to share metadata (see Resources).
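
To show roughly how a harvester works with oai_dc, the following Python sketch builds an OAI-PMH ListRecords request URL and reads the Dublin Core elements out of a small oai_dc fragment. The repository address and the sample fragment are invented for illustration, and no network request is made. The verb and metadataPrefix parameters and the two namespace URIs are those defined by OAI-PMH and Dublin Core.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

NS = {
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build a ListRecords request for a repository's OAI-PMH endpoint."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


# Hypothetical endpoint; a real harvester would use the repository's base URL.
url = list_records_url("https://repository.example.org/oai")
print(url)

# Invented fragment in the shape of an oai_dc record, not a real response.
SAMPLE = """
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:creator>Leeds Metropolitan University</dc:creator>
  <dc:subject xml:lang="en">Virtual Maths</dc:subject>
  <dc:subject xml:lang="en">cubic capacity</dc:subject>
</oai_dc:dc>
"""

root = ET.fromstring(SAMPLE)
creators = [e.text for e in root.findall("dc:creator", NS)]
subjects = [e.text for e in root.findall("dc:subject", NS)]
print(creators, subjects)
```

Because every DCMES element is optional and repeatable, the reading code treats each element as a list rather than as a single value.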

Listing 2. A Dublin Core metadata record for the same digital object as Listing 1
       <oai_dc:dc xmlns:oai_dc="[...]/OAI/2.0/oai_dc/" xsi:schemaLocation="[...]">
         <dc:title xmlns:dc=""
           xml:lang="en">Virtual Maths, Cuboid - Excavation quiz1</dc:title>
         <dc:language xmlns:dc="">[...]</dc:language>
         <dc:description xmlns:dc=""
           xml:lang="en">Interactive simulation explaining how to
           calculate cubic capacity of a truck for carrying
           excavated materials</dc:description>
         <dc:creator xmlns:dc="">
           Leeds Metropolitan University</dc:creator>
         <dc:date xmlns:dc="">[...]</dc:date>
         <dc:format xmlns:dc="">[...]</dc:format>
         <dc:identifier xmlns:dc="">[...]</dc:identifier>
         <dc:identifier xmlns:dc="">[...]</dc:identifier>
         <dc:type xmlns:dc="">[...]</dc:type>
         <dc:rights xmlns:dc="">[...]</dc:rights>
         <dc:subject xmlns:dc=""
           xml:lang="en">ukoer</dc:subject>
         <dc:subject xmlns:dc=""
           xml:lang="en">Virtual Maths</dc:subject>
         <dc:subject xmlns:dc=""
           xml:lang="en">cubic capacity</dc:subject>
         <dc:subject xmlns:dc=""
           xml:lang="en">excavation</dc:subject>
         <dc:subject xmlns:dc="">
           Built Environment</dc:subject>
         <dc:subject xmlns:dc="">
           Architecture, Building, Building Surveying</dc:subject>
         <dc:relation xmlns:dc=""
           xml:lang="en">Virtual Maths collection</dc:relation>
       </oai_dc:dc>

In Listing 2, the same object is being described as that shown in Listing 1. For the rights example, the DC record is a single <dc:rights> tag. Note how the LOM had six sets of XML tags to convey the same information. Keep in mind, however, that the richer structure of the LOM supports recording both the subject term and the controlled vocabulary or taxonomy from which it comes. (We discuss controlled vocabularies and taxonomies later in this article.)
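
The trade-off between the two records can be sketched as a simple crosswalk in Python. The dictionary layout and the function names below are our own simplification of a LOM record, not part of either standard. The values are those that appear in the two listings.

```python
# Simplified, hypothetical in-memory form of the LOM record's subject data.
lom_record = {
    "general": {
        "keyword": ["ukoer", "Virtual Maths", "cubic capacity", "excavation"],
    },
    "classification": [
        {"source": "HEA", "entries": ["Built Environment"]},
        {"source": "JACS",
         "entries": ["Architecture", "Building", "Building Surveying"]},
    ],
}


def lom_to_dc_subjects(record):
    """Flatten LOM keywords and taxon paths into a list of dc:subject values."""
    subjects = list(record["general"]["keyword"])
    for path in record["classification"]:
        # Each taxon path collapses into one comma-separated subject string.
        subjects.append(", ".join(path["entries"]))
    return subjects


def sources_dropped(record):
    """Vocabulary names that have no place in the flat dc:subject values."""
    return [path["source"] for path in record["classification"]]


dc_subjects = lom_to_dc_subjects(lom_record)
print(dc_subjects)
print(sources_dropped(lom_record))
```

The flat list is easy to index, but a consumer of the Dublin Core record can no longer tell which subjects came from HEA, which came from JACS, and which were free keywords.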

Community dimensions of educational metadata creation

One challenge in creating educational metadata is that it requires information and skill sets that are usually distributed among a number of people. A given asset will potentially need information from the lecturer about how the material was used, and information from someone regarding what it is about and who created it. It will also need recorded rights information and permissions for any third-party rights connected to other resources it contains. It may benefit from being classified according to an established subject scheme such as the Joint Academic Coding System (JACS) used in the United Kingdom or the Library of Congress Subject Headings (LCSH) used globally but predominantly in North America (see Resources). Often the resource creator is asked to address all of these issues, but where there are clear benefits to rich information, a more distributed approach involving learning technologists and librarians may be of benefit, though it increases the associated cost of sharing.

Another category of educational metadata considered invaluable is information contributed by others who have read and used the shared digital asset. This user-contributed information is often in the form of comments, ratings, and recommendations. It should be noted, however, that this type of information is stored and created within particular systems and there are few good ways to share such information at this time.

Adoption in mainstream education

Despite a significant repository infrastructure in education, initiatives to share learning resources have a mixed history. Institutional repositories often focus on sharing research materials. They also have a policy of sharing learning materials, but these resources are rarely offered using education-specific metadata standards. Instead, institutions tend to use DCMES, and, if adopted at all, it is usually to add greater support for scholarly articles. However, more institutions are choosing to manage their learning materials in repositories, especially in the context of sharing learning materials as part of Open Educational Resource initiatives. One example of this is the OpenStaffs project at Staffordshire University.

There have been a number of successful subject, sector, or regionally based initiatives to share learning materials, such as Merlot and BCCampus. Some of these initiatives have used IEEE LOM, such as the Latin American Community of Learning Objects (LACL). Not all such initiatives have proved successful or sustainable. For example, the Campus Alberta Repository of Educational Objects (CAREO) was a key early e-learning repository that made a significant contribution to the development of a Canadian profile of the LOM (CanCore), but proved unsustainable.

A number of tools can be used to create or package learning content (such as Reload or eXe), and these tools can create LOM records as part of the process (although the records may be contained within IMS content packages). See Resources for links to more information about all of the examples above.

Educational metadata standards such as the educational elements of Dublin Core or the IEEE LOM have not seen the same type of pervasive use in education that the 'regular' Dublin Core has achieved.

The user perspective

It is worth noting that for those searching for resources, most systems will only occasionally expose the details of metadata. User feedback tends to favor simple search and browse interfaces, which rightly display only key parts of a metadata record to the end user. As an example, for the object described in the metadata records (see Listing 1 and Listing 2), the user of the repository sees the following search result (see Figure 2).

Figure 2. The record seen by the user for the learning object described in Listing 1

Ontologies, taxonomies, and controlled vocabularies

One way to make the use of a standard more precise (by improving indexing, and thus search and browse functionality for a set of results) is to require the use of a limited set of words as possible descriptions to put into a particular metadata element. There are a number of approaches for creating and managing what are effectively complex word lists. The word list can be:

  • A controlled vocabulary
  • An ontology
  • A taxonomy
  • A folksonomy

Controlled vocabularies are highly structured hierarchical sets of words in which any given term can be related to other terms through relations like: broader term, narrower term, and use for (useful for synonyms and spelling variants). Controlled vocabularies, such as LCSH, require extensive intellectual effort to create and as much effort to maintain and update. Controlled vocabularies are often applied using complex interfaces or in consultation with reference books (physical or digital).
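
These relations can be sketched as a small data structure in Python. The terms, the hierarchy, and the synonyms below are invented for illustration and do not reproduce LCSH, JACS, or any other published vocabulary.

```python
# Hypothetical vocabulary: each preferred term records its broader term,
# its narrower terms, and the non-preferred labels it is "used for".
VOCAB = {
    "Architecture": {
        "broader": None,
        "narrower": ["Building"],
        "use_for": ["Architectural design"],
    },
    "Building": {
        "broader": "Architecture",
        "narrower": ["Building Surveying"],
        "use_for": ["Construction"],
    },
    "Building Surveying": {
        "broader": "Building",
        "narrower": [],
        "use_for": [],
    },
}


def preferred_term(label):
    """Map a label or synonym to its preferred term, or None if unknown."""
    wanted = label.strip().lower()
    for term, entry in VOCAB.items():
        synonyms = [s.lower() for s in entry["use_for"]]
        if wanted == term.lower() or wanted in synonyms:
            return term
    return None


def broader_chain(term):
    """List the broader terms from the immediate parent up to the top."""
    chain = []
    parent = VOCAB[term]["broader"]
    while parent is not None:
        chain.append(parent)
        parent = VOCAB[parent]["broader"]
    return chain


def expand(term):
    """Return the term plus all narrower terms, for widening a search."""
    result = [term]
    for child in VOCAB[term]["narrower"]:
        result.extend(expand(child))
    return result


print(preferred_term("construction"))
print(broader_chain("Building Surveying"))
print(expand("Architecture"))
```

A search interface could use preferred_term to normalize what a cataloger or user types, and expand to include records indexed under narrower terms.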

Ontologies and taxonomies are similar to each other in that they both create word lists intended to describe a particular domain or a section of it. In the context of educational metadata, the usage of the terms is not particularly precise. If any distinction can be made, it's that an ontology will tend to impose a particular set of words on a particular context and a taxonomy is more likely to focus on identified words from a context. An example of a taxonomy is UK Educational Levels. Taxonomies and ontologies are often applied using drop-down lists or other similar interfaces that control input (see Resources).

Folksonomies are a type of taxonomy created by aggregating user input. Users can assign free text keywords to resources as tags. As the system gathers tags, it also begins to suggest commonly occurring tags to users as they create or add new information. The idea is that the users of a system create a more effective and cheaper word list because the words reflect how those users describe the resources. This way, the record creators do not need to spend effort adding extra descriptions. Free text tagging is typically used to support subject or topic information.
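
A minimal Python sketch of this kind of tag aggregation follows. The TagIndex class, its method names, and the sample tags are hypothetical; a real system would also handle spelling variants, spam, and persistence.

```python
from collections import Counter


class TagIndex:
    """Hypothetical in-memory folksonomy: counts tags and suggests popular ones."""

    def __init__(self):
        self.counts = Counter()
        self.by_resource = {}

    def add(self, resource_id, tags):
        # Normalize case and whitespace so "Maths" and "maths " count together.
        cleaned = {t.strip().lower() for t in tags if t.strip()}
        self.by_resource.setdefault(resource_id, set()).update(cleaned)
        for tag in cleaned:
            self.counts[tag] += 1

    def suggest(self, prefix, limit=3):
        """Suggest existing tags starting with prefix, most used first."""
        prefix = prefix.strip().lower()
        matches = [(t, n) for t, n in self.counts.items() if t.startswith(prefix)]
        matches.sort(key=lambda item: (-item[1], item[0]))
        return [tag for tag, _count in matches[:limit]]


index = TagIndex()
index.add("r1", ["Maths", "cuboid"])
index.add("r2", ["maths", "excavation"])
index.add("r3", ["mathematics"])
print(index.suggest("ma"))
```

Suggesting the most frequently used matching tag first is what nudges users toward a shared word list without anyone having to design it in advance.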

Recent developments

As noted in the first article in this series, some recent developments in resource sharing and management take a different approach. Rather than relying on detailed metadata, social tools (such as SlideShare or Flickr) or blogs are used to share resources. These approaches rely on three factors: user contributed information (rankings, comments, tags created by the public), automatic processing of the file to derive additional inherent information (file format, file size, GPS location), and the search and indexing capabilities of search engines (such as Google or Bing).

Another related issue for educational metadata is just how much discovery can be supported with relatively little metadata and detailed descriptive text. For example, in Figure 3 iTunesU provides basic bibliographic metadata with a good description (see Resources).

Figure 3. Screenshot of a course in iTunesU

There are many efforts underway to develop further standards for educational metadata. These include ISO Metadata for Learning Resources, IMS Learning Object Discovery and Exchange, and work on a fuller education-specific application profile for Dublin Core. The time and effort being put into these standards show that educational metadata remains a key area of interest for many of those in the field.

Metadata products and tools

This section is a reference list of some metadata-related tools. It is not exhaustive, but it illustrates the types of tools in use.

Repositories

Much educational metadata is created in repositories. There are two relevant types of repositories: those created primarily for managing learning objects and those created primarily for managing scholarly works.

Successful learning object repositories tend to be commercial products; some examples include HarvestRoad Hive, Equella, and Intralibrary (see Resources). These repositories are designed around supporting complex educational metadata standards and related educational technical standards.

Learning object repositories are, however, a relatively niche market when compared with the repository software designed to support scholarly communications. In this sector three key open source repository platforms, Fedora, ePrints, and DSpace, are widely used. Bepress, a commercially hosted platform, is used extensively in North America (see Resources).

While we can easily use most learning object repositories for scholarly communications, the reverse is not as true. Currently, many open source repositories lack some of the features and native support for educational standards. This is improving, however, as more institutional repositories include learning materials and put effort into developing their software.

Tagging tools and text editors

Although many learning materials are created in ordinary office software, such software does not offer the ability to add standardized metadata to the material. Tools such as the previously mentioned Reload and eXe allow the editing of educational metadata alongside or as part of their support for packaging educational content. Similarly, LOMPAD is a metadata editing tool supporting IEEE LOM and a number of application profiles (see Resources).

It is also worth noting that most metadata can be included in web pages. Dublin Core in particular is used in this way. For example, Listing 3 shows the metadata in the header of a page on the Dublin Core website.

Listing 3. An example of Dublin Core metadata in a web page
<link rel="schema.DC" href="" />
<meta name="DC.title" content="Metadata Basics" />
<meta name="" content="2009-10-05" />
<meta name="DC.format" content="text/html" /> 
<meta name="DC.contributor" content="Dublin Core Metadata Initiative" /> 
<meta name="DC.language" content="en" />
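
Tags like these are often generated from a stored record rather than typed by hand. The following Python sketch shows one way this could be done. The dc_meta_tags function is hypothetical, and the sample values simply repeat some of those in Listing 3.

```python
from html import escape


def dc_meta_tags(fields):
    """Render (element, value) pairs as Dublin Core <meta> tags for a page header."""
    lines = []
    for name, value in fields:
        # Escape both parts so quotes or angle brackets cannot break the markup.
        safe_name = escape(name, quote=True)
        safe_value = escape(value, quote=True)
        lines.append(f'<meta name="DC.{safe_name}" content="{safe_value}" />')
    return "\n".join(lines)


fields = [
    ("title", "Metadata Basics"),
    ("format", "text/html"),
    ("language", "en"),
]
header = dc_meta_tags(fields)
print(header)
```

A list of pairs is used instead of a dictionary because Dublin Core elements are repeatable, so the same element name may appear more than once.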

You can edit metadata records in most text editors too. Text editors, rather than word processing programs, are recommended as they won't add formatting, and many text editors support nested layout structures and color code them.


Summary

In this article, we explored what metadata is and how it is used. We focused on the usage of metadata in education and examined a number of metadata standards that support educational description. We covered the various types of word lists used with standards and concluded with a review of some recent developments and relevant types of tools.

When considering the use of educational metadata standards the importance of repositories should not be underestimated. The final article in the series will return to some of the challenges and opportunities created by Web 2.0 tools and various 'open' initiatives, both around sharing learning materials under open licenses and creating more community based open standards. Part of this consideration will be a discussion of their impact on educational metadata standards.

John Casey (Digitalinsite) and Gareth Waller (AGW Software) developed the original outlines for the series of articles.


