Generate URIs and IRIs from Templates

An introduction to the URI Template specification

The Uniform Resource Identifier (URI) Template specification provides a mechanism for describing how to construct URIs for a broad variety of applications. This article introduces you to basic URI Template syntax and shows you how a template is expanded into a URI. It also illustrates the use of two URI Template implementations, for JavaScript and Java language programs, and covers concepts related to the production of Internationalized Resource Identifiers (IRIs).

James Snell (jasnell@us.ibm.com), Software Engineer, Emerging Technologies, IBM

James M. Snell is a software engineer working for IBM's WebAhead group, focusing on the development and practical application of key emerging technologies for IBM's own use.



22 January 2008

Uniform Resource Identifiers (URIs) are, without question, one of the most important characteristics of Web-based applications. URIs provide a simple, consistent and persistent means of identifying and locating resources wherever they may exist online.

Historically, URIs have been hidden behind Web-browser interfaces and forms. They are generally treated as opaque tokens whose internal structures and data are more important to the server than to the browser. In such cases, an application is usually creating the URIs that are intended for that application's own consumption. There are times, however, when it becomes necessary for one application to allow other systems to construct the URIs that are to be consumed.

For instance, a Web-based service could allow clients to request the current weather conditions for any given longitude and latitude on Earth. Each coordinate is specified by a URI like http://geoweather.example.org?long=-119.654846&lat=36.330892. Given this design, it would be impractical for the server to enumerate every possible URI for the service. Instead, the service needs a way of describing the URI pattern so that client applications can construct the appropriate URI for the resource being accessed.

The URI Template Specification proposes a syntax that can be used to describe how meaningful URIs can be constructed.

URI Templates

A URI Template is a string value that includes a mixture of literal character sequences and specialized tokens that, when processed with a set of input values, yields a URI. For example:

Listing 1. Sample URI Template
http://{-append|.|host}example.org{-opt|/|segments}{-listjoin|/|segments}
/page/{id=~home}{-opt|?|a,b,c}{-join|&|a,b,c}

The templating syntax has been intentionally designed to be both easy to process and easy to read. Looking at this example, anyone familiar with basic URI syntax should be able to quickly decipher the kinds of URIs the template has been designed to produce.

To process the template, I first split it into each of its component parts -- essentially an array of literal values and tokens that have been delimited using the "curly bracket" characters "{" and "}" (ASCII codepoints 0x7B and 0x7D). Literal values are copied directly into the produced URI. Tokens, on the other hand, must be processed in order to determine an appropriate replacement value. Listing 2 shows the sample URI Template divided into its distinct components:

Listing 2. Distinct components of the sample URI Template
[http://] [{-append|.|host}] [example.org] [{-opt|/|segments}] [{-listjoin|/|segments}]
[/page/] [{id=~home}] [{-opt|?|a,b,c}] [{-join|&|a,b,c}]
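
The splitting step described above can be sketched in JavaScript as follows. This is only an illustration: splitTemplate is a hypothetical helper name, not a function of template.js or Abdera, and the sketch assumes that curly brackets never appear inside a token.

```javascript
// Split a URI Template into an ordered list of literal and token parts.
// Hypothetical helper for illustration; assumes tokens contain no braces.
function splitTemplate(template) {
  const parts = [];
  const pattern = /\{[^{}]*\}/g; // one token: "{", any non-brace characters, "}"
  let last = 0;
  let match;
  while ((match = pattern.exec(template)) !== null) {
    if (match.index > last) {
      // Text between the previous token and this one is a literal.
      parts.push({ type: "literal", text: template.slice(last, match.index) });
    }
    parts.push({ type: "token", text: match[0] });
    last = pattern.lastIndex;
  }
  if (last < template.length) {
    // Trailing literal after the final token.
    parts.push({ type: "literal", text: template.slice(last) });
  }
  return parts;
}

const sampleTemplate =
  "http://{-append|.|host}example.org{-opt|/|segments}{-listjoin|/|segments}" +
  "/page/{id=~home}{-opt|?|a,b,c}{-join|&|a,b,c}";

// Print the parts in the bracketed style used by Listing 2.
console.log(splitTemplate(sampleTemplate).map((p) => "[" + p.text + "]").join(" "));
```

Joining the text of every part, in order, gives back the original template, which is a convenient sanity check for a splitter like this one.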

In this example there are six distinct tokens. Listing 3 shows the tokens that appear in the sample URI Template, in order of appearance:

Listing 3. Tokens that appear in the sample URI Template
{-append|.|host}
{-opt|/|segments}
{-listjoin|/|segments}
{id=~home}
{-opt|?|a,b,c}
{-join|&|a,b,c}

Listing 4 shows the three basic forms that template tokens can take:

Listing 4. The basic forms of URI Template tokens
{variable}
{variable=default}
{-operation|arg|variable(s)}

The leading and trailing curly brackets are used to distinguish the token from literal values in the template.

Tokens using either the form {variable} or {variable=default} represent a simple replacement operation. That is, the token is replaced with the value associated with the name of the variable or, if no value has been defined, with either the default value or an empty string. For example, if you have a template http://example.org/{foo=baz}, and your application has associated the value bar with the variable named foo, the {foo=baz} token is replaced with bar, resulting in an expanded result of http://example.org/bar. Had a value not been assigned to the variable named foo, the default value baz would have been selected, producing an expanded result of http://example.org/baz. Had a default value not been specified either, the token would have been replaced with an empty string, producing an expanded result of http://example.org/.
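
As a rough sketch of these simple replacement rules, the following JavaScript handles only the {variable} and {variable=default} forms. The name expandSimple is hypothetical, operation tokens are deliberately left untouched, and values are not percent-encoded, so this is not how either implementation discussed later actually works.

```javascript
// Expand only {variable} and {variable=default} tokens.
// Hypothetical helper; skips percent-encoding and operation tokens.
function expandSimple(template, values) {
  // name: anything except braces, "=" and "|"; optional "=default" part.
  return template.replace(/\{([^{}=|]+)(?:=([^{}]*))?\}/g, (token, name, def) => {
    const hasValue =
      Object.prototype.hasOwnProperty.call(values, name) &&
      values[name] !== undefined &&
      values[name] !== null;
    if (hasValue) return String(values[name]); // a defined value wins
    return def !== undefined ? def : "";       // else the default, else empty
  });
}

console.log(expandSimple("http://example.org/{foo=baz}", { foo: "bar" }));
// → http://example.org/bar
console.log(expandSimple("http://example.org/{foo=baz}", {}));
// → http://example.org/baz
console.log(expandSimple("http://example.org/{foo}", {}));
// → http://example.org/
```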

Tokens using the form {-operation|arg|variable(s)} are a bit more complicated. The -operation identifies one of six distinct types of operations used to generate the replacement value for the token. The arg specifies a sequence of literal characters that will be inserted into the resulting URI depending on the semantics of the -operation. The variable(s) component is a listing of one or more named variables that serve as input for the -operation.

The -operation must be one of the following:

  1. -append -- indicating that the value of arg is to be appended to the value of the variable, if defined.
  2. -prefix -- indicating that the value of arg is to be prepended to the value of the variable, if defined.
  3. -opt -- indicating that the value of arg is to be used as the token's replacement value only if a value for any of the named variables has been defined.
  4. -neg -- indicating that the value of arg is to be used as the token's replacement value only if none of the named variables has a defined value.
  5. -listjoin -- indicating that the value of the named variable is a list, whose members are to be concatenated together using the value of arg.
  6. -join -- indicating that multiple variable=value pairs, concatenated using the value of arg, are to be used as the token's replacement value.

For the -append, -prefix and -listjoin operations, only a single named variable may be specified in the token, for example, {-prefix|/|foo}. For -append and -prefix, the value associated with the named variable must be a single string value. For -listjoin, the value associated with the named variable can either be a single string value or a list of zero or more string values.

For the -opt, -neg and -join operations, multiple named variables may be listed in the token, for example, {-opt|/|foo,bar,baz}. For -opt, the value of arg is used as the replacement value if a value has been defined for at least one of the listed variables. For -neg, the value of arg is used as the replacement value only if none of the listed variables has a defined value.

The -join operation is a special case that has been designed to make it easy to produce traditional-style URI query parameters. Each of the listed variables, and their associated values or defaults, are used to produce a string of name=value pairs delimited using the value of arg. For example, given the token {-join|&|a,b=bar}, and a value of foo associated with the variable a, the replacement value for the token is a=foo&b=bar.

To better understand how these operations work, it helps to walk through the example given previously.

In the template, there are six variables named host, segments, id, a, b and c. Table 1 lists values associated with each of the named variables.

Table 1. Listing of values associated with the sample URI Template variables
Variable    Value                  Description
host        www                    The string value www
segments    ["foo","bar","baz"]    A list of three string values foo, bar and baz
id                                 Undefined
a           x                      String value x
b           y                      String value y
c           z                      String value z

Given these values, the replacements for each component of the URI Template can be calculated, as Table 2 illustrates:

Table 2. Calculating the replacement values for the sample URI Template
Literal      Token                   Replacement   Description
http://                              http://       Insert the literal value http://
             {-append|.|host}        www.          Insert the value www and append the literal value . (ASCII codepoint 0x2E)
example.org                          example.org   Insert the literal value example.org
             {-opt|/|segments}       /             Because a value for segments has been defined, insert the literal value / (ASCII codepoint 0x2F)
             {-listjoin|/|segments}  foo/bar/baz   Concatenate the values foo, bar, baz with the literal value / (ASCII codepoint 0x2F)
/page/                               /page/        Insert the literal value /page/
             {id=~home}              ~home         Insert the default literal value ~home
             {-opt|?|a,b,c}          ?             Because values for the variables a, b and c have been defined, insert the literal value ? (ASCII codepoint 0x3F)
             {-join|&|a,b,c}         a=x&b=y&c=z   Concatenate a=x, b=y and c=z with the literal value & (ASCII codepoint 0x26)

After determining each of the replacement values, what you end up with is the following:

Listing 5. Reassembling the components of the expanded URI Template
[http://] [www.] [example.org] [/] [foo/bar/baz] [/page/] [~home] [?] [a=x&b=y&c=z]

Which, when the individual components are recombined, forms the following complete URI:

Listing 6. The resulting URI
http://www.example.org/foo/bar/baz/page/~home?a=x&b=y&c=z
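
The whole walk-through can be reproduced with a small expander. The sketch below is hypothetical code written for this illustration, not the API of template.js or Abdera. It assumes that -opt emits arg when at least one listed variable has a value and that -neg emits arg only when none does, treats an empty list as undefined, and omits percent-encoding and error handling for malformed tokens.

```javascript
// Minimal sketch of an expander for the token forms described in this article.
function isDefined(value) {
  return value !== undefined && value !== null &&
    !(Array.isArray(value) && value.length === 0);
}

// Parse "name" or "name=default" into { name, def }.
function parseVar(item) {
  const eq = item.indexOf("=");
  return eq < 0
    ? { name: item, def: undefined }
    : { name: item.slice(0, eq), def: item.slice(eq + 1) };
}

// body is the token text without its surrounding curly brackets.
function expandToken(body, values) {
  const valueOf = (v) => (isDefined(values[v.name]) ? values[v.name] : v.def);
  if (!body.startsWith("-")) {
    // {variable} or {variable=default}
    const value = valueOf(parseVar(body));
    return isDefined(value) ? String(value) : "";
  }
  // {-operation|arg|variable(s)}
  const [op, arg, spec] = body.split("|");
  const vars = spec.split(",").map(parseVar);
  const first = valueOf(vars[0]);
  const anyDefined = vars.some((v) => isDefined(values[v.name]));
  switch (op) {
    case "-append": return isDefined(first) ? String(first) + arg : "";
    case "-prefix": return isDefined(first) ? arg + String(first) : "";
    case "-opt": return anyDefined ? arg : "";
    case "-neg": return anyDefined ? "" : arg;
    case "-listjoin": return isDefined(first) ? [].concat(first).join(arg) : "";
    case "-join":
      return vars
        .filter((v) => isDefined(valueOf(v)))
        .map((v) => v.name + "=" + valueOf(v))
        .join(arg);
    default: throw new Error("Unknown operation: " + op);
  }
}

function expand(template, values) {
  // Literals pass through unchanged; each {token} is replaced in place.
  return template.replace(/\{([^{}]*)\}/g, (_, body) => expandToken(body, values));
}

// The values from Table 1 (id is left undefined).
const table1 = { host: "www", segments: ["foo", "bar", "baz"], a: "x", b: "y", c: "z" };
const listing1 =
  "http://{-append|.|host}example.org{-opt|/|segments}{-listjoin|/|segments}" +
  "/page/{id=~home}{-opt|?|a,b,c}{-join|&|a,b,c}";

console.log(expand(listing1, table1));
// → http://www.example.org/foo/bar/baz/page/~home?a=x&b=y&c=z
```

Calling expand with an empty values object on the same template gives http://example.org/page/~home, because every operation token collapses to an empty string and only the default for id survives.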

URI Templates are useful in a variety of contexts and applications. For this reason, having multiple implementations available in as many programming languages as possible is important. In the sections that follow, I demonstrate the use of JavaScript and Java implementations that I have put together for the Apache Abdera project.


URI Templates in JavaScript

The URI Template JavaScript library available at http://www.snellspace.com/public/template.js provides a complete implementation of the current URI Template specification. Use of the library is a simple matter of importing the library and creating an instance of the Template class as Listing 7 illustrates:

Listing 7. Using the JavaScript URI Template implementation
<html>
<head>
<script src="template.js"></script>
</head>
<body>
<script>
  var template = new Template("http://example.org/{foo.bar.baz}");
  var context = {
    "foo" : {
      "bar" : {
        "baz" : "t\u00e9st"
      }
    }
  } ;
  document.write( template.expand( context ) );
</script>
</body>
</html>

The output of the code in Listing 7 is the URI http://example.org/t%c3%a9st. Note that the template processor was able to extract the replacement value from the context object by following the object's properties and automatically converted the Unicode character \u00E9 (lowercase e with acute) to the appropriate UTF-8 percent-encoding.
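
The two behaviors noted here, following a dotted variable name through nested objects and percent-encoding the UTF-8 bytes of a non-ASCII character, can be approximated with standard JavaScript. In this sketch resolvePath is a hypothetical helper, and the built-in encodeURIComponent stands in for whatever encoding routine the library uses internally, which may treat some punctuation differently.

```javascript
// Follow a dotted variable name such as "foo.bar.baz" through nested objects.
// Hypothetical helper; returns undefined if any step is missing.
function resolvePath(context, path) {
  return path.split(".").reduce(
    (obj, key) => (obj !== undefined && obj !== null ? obj[key] : undefined),
    context
  );
}

const nestedContext = { foo: { bar: { baz: "t\u00e9st" } } };
const resolved = resolvePath(nestedContext, "foo.bar.baz");

// encodeURIComponent percent-encodes the UTF-8 bytes of non-ASCII characters.
console.log("http://example.org/" + encodeURIComponent(resolved));
// → http://example.org/t%C3%A9st
```

encodeURIComponent emits uppercase hexadecimal digits, whereas the output quoted above uses lowercase. RFC 3986 treats the two forms as equivalent.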


URI Templates in Java

For Java developers, an implementation of URI Templates has been provided as part of the Apache Abdera project. Like the JavaScript library, the Java implementation provides a complete implementation of the current URI Template specification. It also provides a few additional capabilities such as the ability to use java.util.Map instances or ordinary Java objects as the source of values for variable replacements. Listing 8 illustrates the basic use of the Abdera URI Template implementation:

Listing 8. Using the Abdera URI Template implementation
Template template = new Template("http://example.org/{foo}/{bar}/{baz}");

Map map = new HashMap();
map.put("foo","a");
map.put("bar","b");
map.put("baz","c");

System.out.println(template.expand(map));

The Context interface, illustrated in Listing 9, allows an application to provide customized resolution of the template variable replacement values.

Listing 9. Using a custom context interface implementation to provide variable replacement values
final Template template = new Template("http://example.org/{foo}/{bar}/{baz}");

Context context = new CachingContext() {
  protected <T> T resolveActual(String var) {
    return (T) (
      var.equals("foo") ? "a" :
      var.equals("bar") ? "b" :
      var.equals("baz") ? "c" : null
    );
  }
  public Iterator<String> iterator() {
    return template.iterator();
  }
};

System.out.println(template.expand(context));

Alternatively, developers can use the public fields and getter methods from ordinary Java objects to provide values for the template expansion as Listing 10 illustrates:

Listing 10. Using a Java object to provide variable replacement values
public class MyObject {
  public String getFoo() { return "a"; }
  public String getBar() { return "b"; }
  public String getBaz() { return "c"; }
}
Template template = new Template("http://example.org/{foo}/{bar}/{baz}");
MyObject myobj = new MyObject();
System.out.println(template.expand(myobj));

In many cases, when developers use Java objects to provide replacement values, it is convenient to be able to associate the template directly with the object class being used. For this purpose, the Abdera implementation provides two special annotations, which require JDK 1.5 or higher.

The @URITemplate annotation, shown in Listing 11, associates a template with a Java class:

Listing 11. Using the @URITemplate annotation
@URITemplate("http://example.org/{foo}/{bar}/{baz}")
public static class MyObject {
  public String getFoo() { return "a"; }
  public String getBar() { return "b"; }
  public String getBaz() { return "c"; }
}

Instances of the class can then be expanded into URIs using the static expandAnnotated method of the Template class, as Listing 12 shows:

Listing 12. Using the expandAnnotated method
MyObject myobj = new MyObject();
System.out.println(Template.expandAnnotated(myobj));

By default, the Template processor automatically attempts to map template variable names to public fields and getter methods on the Java class. For instance, a variable named foo is mapped to either a field named foo or a method named either foo or getFoo. The mapping is case insensitive and only matches if the method has a return value and no input parameters.

For most applications, the default mapping is more than adequate. However, in some cases, the names of the fields or methods will not match the names of the template variables. To handle those situations, the @VarName annotation can be used to associate an alternative variable name mapping, as shown in Listing 13:

Listing 13. Specifying an alternative variable name mapping using the @VarName annotation
@URITemplate("http://example.org/{foo}/{bar}/{baz}")
public static class MyObject {
  public String getFoo() { return "a"; }
  public String getBar() { return "b"; }
  @VarName("baz") public String getSomething() { return "c"; }
}

Internationalized Templates

As defined by the current draft specification, templates can only produce URIs that contain characters from the US-ASCII character set, as allowed by RFC 3986, the URI specification. For many applications, however, support for characters from extended character sets is critical. To meet the needs of those applications, both the Java and JavaScript implementations support the notion of "Internationalized Templates" capable of producing IRIs as defined by RFC 3987.

Internationalized Templates are generally identical to URI Templates, except that variable names may contain any character allowed by the IRI specification's iunreserved production, templates may contain bi-directional characters, and expansion can produce IRIs. The following example shows an Internationalized Template that uses Greek variable names and literal components, followed by an IRI it can produce:

Listing 14. An Internationalized Template containing characters from the Greek alphabet and literal components
http://παραδειγμα.org/{ονομα}/{-listjoin|/|τμηματα}
http://παραδειγμα.org/ευρωπη/α/β/γ

The most difficult aspect of using non-ASCII characters in a template is supporting bi-directional languages such as Hebrew and Arabic. Because rendering strings that mix left-to-right and right-to-left characters can produce complex and confusing results, it is very important for implementors to follow a simple set of rules when constructing bi-directional templates.

The first rule is that Internationalized Templates must be stored and transmitted in logical order. In other words, regardless of whether the characters contained in the template are to be displayed right-to-left, the logical ordering of characters is always left-to-right.

When rendered for display, however, Internationalized Templates should be rendered left-to-right, as if preceded by the Unicode bi-directional formatting character U+202D and followed by U+202C. Template variable names, on the other hand, must be rendered using a left-to-right embedding; that is, as if they were preceded by the Unicode bi-directional formatting character U+202A and followed by U+202C. These rules ensure that when displayed, Internationalized Templates render properly and consistently while allowing variable names to still be read naturally.

The following examples illustrate the difference between the Logical and Presentation ordering of characters in an Internationalized Template. Capital letters represent right-to-left characters.

Listing 15. Logical ordering of characters in an Internationalized Template
http://example.org/{-prefix|XYZ|ABCD}

Listing 16. Presentation ordering of characters in an Internationalized Template
http://example.org/{-prefix|XYZ|DCBA}
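
One way to approximate these display rules is to insert the formatting characters literally before handing the string to a renderer. The sketch below is an assumption about how that could be done, not code from either implementation. forDisplay, wrapNames and showControls are hypothetical names, and the ASCII capitals stand in for right-to-left characters as in the listings above.

```javascript
const LRO = "\u202D"; // LEFT-TO-RIGHT OVERRIDE
const LRE = "\u202A"; // LEFT-TO-RIGHT EMBEDDING
const PDF = "\u202C"; // POP DIRECTIONAL FORMATTING

// Wrap each variable name in a list such as "a,b=default" in LRE...PDF.
function wrapNames(spec) {
  return spec
    .split(",")
    .map((item) => {
      const eq = item.indexOf("=");
      const name = eq < 0 ? item : item.slice(0, eq);
      const rest = eq < 0 ? "" : item.slice(eq); // "=default" stays unwrapped
      return LRE + name + PDF + rest;
    })
    .join(",");
}

// Produce a display-only copy: names embedded, whole template overridden.
function forDisplay(template) {
  const marked = template.replace(/\{([^{}]*)\}/g, (_, body) => {
    if (!body.startsWith("-")) return "{" + wrapNames(body) + "}";
    const [op, arg, spec] = body.split("|");
    return "{" + [op, arg, wrapNames(spec)].join("|") + "}";
  });
  return LRO + marked + PDF;
}

// Make the invisible formatting characters visible when logging.
function showControls(text) {
  return text.replace(/[\u202A-\u202E]/g,
    (c) => "<U+" + c.charCodeAt(0).toString(16).toUpperCase() + ">");
}

console.log(showControls(forDisplay("http://example.org/{-prefix|XYZ|ABCD}")));
// → <U+202D>http://example.org/{-prefix|XYZ|<U+202A>ABCD<U+202C>}<U+202C>
```

The string returned by forDisplay is meant for presentation only. The stored and transmitted form of the template remains the logical-order original.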

Internationalized templates may directly contain any Unicode bi-directional formatting characters necessary to ensure that the template can be rendered properly but the template processor will remove those characters from the template prior to processing. URIs and IRIs produced by the template processor will not contain any bi-directional formatting codes.
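
Stripping the formatting characters before processing might look like the following sketch. The exact set of characters that the processors remove is not stated here, so this example assumes the left-to-right and right-to-left marks (U+200E, U+200F) plus the embedding, override and pop characters (U+202A through U+202E). stripBidiFormatting is a hypothetical name.

```javascript
// Assumed set of bi-directional formatting characters:
// LRM, RLM, LRE, RLE, PDF, LRO, RLO.
const BIDI_FORMATTING = /[\u200E\u200F\u202A-\u202E]/g;

// Remove formatting characters so that processing sees only the logical text.
function stripBidiFormatting(template) {
  return template.replace(BIDI_FORMATTING, "");
}

const decorated = "\u202Dhttp://example.org/{\u202Afoo\u202C}\u202C";
console.log(stripBidiFormatting(decorated));
// → http://example.org/{foo}
```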


Conclusion

The URI Template specification is an evolving project. Hopefully this article has given you a basic understanding of URI Template syntax and how to expand the template into a URI. You should have gained an understanding of the URI Template implementations for JavaScript and Java language programs and should be familiar with some of the concepts related to producing IRIs. If this article has sparked any questions or comments for you about the URI Template specification, please direct your feedback and discussion to the W3C URI mailing list at http://lists.w3.org/Archives/Public/uri/. Feedback and discussion of the JavaScript and Java URI Template implementations should be directed to the Apache Abdera developers mailing list located at http://incubator.apache.org/abdera/project.html#lists.

