Two modes of implementing an XML-based localization pack: embed and extend

A globalization technique for supporting multiple languages

Bei Shu (shub@cn.ibm.com), Staff Software Engineer, User Technologies, IBM China Software Development Laboratory
Bei Shu is a software engineer in the Globalization Certification Laboratory in Shanghai, China. She joined IBM in April, 2001 and has been working on designing and implementing globalization solutions for e-business systems. She co-authored the IBM Redbook SG24-6851-00 e-Business Globalization Solution Design Guide: Getting Started, which offers a comprehensive overview of globalization technologies and demonstrates how to realize cost-effective globalization. She has extensive experience in XML and its related technologies. You can reach her at shub@cn.ibm.com.

Summary:  In this article, IBM software engineer Bei Shu shows you, from an architect's perspective, how to enable multiple language support in your Web applications using different XML technologies. She presents two approaches to implementing XML-based localization pack managers using XPath and XSLT -- embed and extend.

Date:  18 Jun 2003
Level:  Intermediate
Also available in:   Japanese


The localization pack is one of the key elements in the globalization architecture. XML is the recommended source form for localization packs because it is cross-platform, Unicode-encoded, and flexibly structured. This article presents two approaches to implementing XML-based localization pack managers using XPath and XSLT: embed and extend. With embed, the localization pack manager module is embedded in the main programs, while with extend, the localization pack manager module works outside the main programs. Both have their advantages and can be applied to actual development according to different requirements.

Note: To better understand this article, you should be familiar with XML, XSL, XPath, and XSLT. In addition, if you are not familiar with globalization-related terms -- such as locale, localization pack, and localization pack manager -- please refer to "Globalization Architecture Imperatives" in Resources for a quick overview.

Going global

In today's global e-business environment, businesses need to support customers and suppliers from many different countries. It is no longer sufficient for software to support one language and territory at a time, as defined by the former national language support model. In the software world, the trend is toward globalization, especially for e-business systems. This article presents an important globalization technique for supporting multiple languages -- that is, localization pack implementation based on XML.

What's a localization pack?

In the conventional programming model, a non-globalized software product contains various language and culture-dependent elements in the executable, such as national language translation resources, language- and culture-dependent templates for formatting messages, and culture-dependent business logic. These elements are not isolated from the program code.

A localization pack is a standardized approach for software to support multiple languages and locales through a single executable. Language- and culture-dependent elements are separated from the core logic of the software at the source code level, as well as at the compiled and statically linked module level. The language- and culture-independent portion is called the core module, and the language- and culture-dependent portions are called localization packs. In situations where the software is required to support multiple locales simultaneously, the software needs locale-dependent services that can be switched based on a locale ID that's designated explicitly by the user or implicitly by the application.

Here's a very simple example of a localization pack: In a Windows program, a single key ("Msg1") is associated with a single string value ("Hello") that varies by language version (for example, "Bonjour" in the French version).
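The key-to-string idea can be sketched in a few lines of Java. This is only a toy model under stated assumptions: the class name LocalizationPackSketch, the in-memory map, and the fr_FR entry are illustrative and do not come from any IBM API; a real localization pack would live in external files, as described in the rest of this article.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Toy illustration: one key, one value per locale, one executable. */
class LocalizationPackSketch {
    // locale ID -> (message key -> translated string); contents are illustrative.
    private static final Map<String, Map<String, String>> PACKS = new HashMap<>();

    static {
        Map<String, String> english = new HashMap<>();
        english.put("Msg1", "Hello");
        PACKS.put("en_US", english);

        Map<String, String> french = new HashMap<>();
        french.put("Msg1", "Bonjour");
        PACKS.put("fr_FR", french);
    }

    /** Returns the string stored under key for the locale, or null if none exists. */
    static String lookup(Locale locale, String key) {
        Map<String, String> pack = PACKS.get(locale.toString());
        return pack == null ? null : pack.get(key);
    }
}
```

The core logic only ever asks for "Msg1"; which string comes back depends entirely on the locale passed in.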

XML source format

Because a localization pack needs a platform-independent format and an all-in-one character repository, IBM's globalization organization recommends using XML as the source format for localization packs. XML is cross platform and Unicode encoded, and thus capable of holding multilingual data without data loss. It also provides a flexible, tree-like structure to accommodate the need for various kinds of structured locale data. XML is a W3C recommendation and an Internet standard, so it can meet most Web application requirements.

XML localization pack examples

Using XML, you can describe complex locale data with multiple hierarchical structures. Here is a simple example of two XML-based localization packs that contain greeting messages.


Figure 1. Localization pack for the United States (en_US)

Figure 2. Localization pack for China (zh_CN)

What's a localization pack manager?

A localization pack manager is the code module that manages the locating, loading, and accessing of a localization pack's resources. Existing platform services provide basic support for many localization pack functions. Figure 3 depicts the localization pack manager workflow.


Figure 3. Localization pack manager workflow
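As a rough illustration of the "locating" step, the sketch below maps a locale to candidate localization pack file names, most specific first. The naming scheme (baseName_language_COUNTRY.xml) and the fallback order are my assumptions, modeled on how java.util.ResourceBundle searches for bundles; the article does not prescribe a file naming convention, and the class name PackLocatorSketch is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical helper: derives candidate pack file names from a locale. */
class PackLocatorSketch {
    /**
     * Returns file names from most to least specific, for example
     * greetings_zh_CN.xml, greetings_zh.xml, greetings.xml.
     */
    static List<String> candidateFileNames(String baseName, Locale locale) {
        List<String> names = new ArrayList<>();
        String language = locale.getLanguage();
        String country = locale.getCountry();
        if (!language.isEmpty() && !country.isEmpty()) {
            names.add(baseName + "_" + language + "_" + country + ".xml");
        }
        if (!language.isEmpty()) {
            names.add(baseName + "_" + language + ".xml");
        }
        // Language-neutral pack used as the last resort.
        names.add(baseName + ".xml");
        return names;
    }
}
```

A localization pack manager could try each name in turn and load the first file that exists.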

You can implement localization pack managers with various approaches, even for a single localization pack format such as XML. A localization pack manager uses XSLT or XPath path expressions to access the designated nodes in the localization pack XML files. Many public programs or APIs implement the XSLT and XPath recommendations; in this article, I use the open source Xalan-Java package from the Apache XML Project (see Resources), which fully implements XSLT and XPath, as the basic building block of localization pack managers. Xalan-Java is a feature-rich and robust XSLT processor for transforming XML documents into HTML, text, or other XML document types. You can use it from the command line, in an applet or servlet, or as a module in another program. Here I use it as the localization pack manager in presentation programs.

A typical e-business application mainly uses HTML as the interface that's shown to users. In a multilingual e-business environment, the XML-based localization pack manager, although it is essentially an XML parser, can be implemented differently depending on how the HTML is generated. This article discusses two implementations for a localization pack manager: embed mode and extend mode. Both work well in a typical J2EE environment, where the application presentation layer is composed of JSPs, servlets, or portlets that get the program data from the back end (or business logic) and create the HTML layout (or another markup language).


Embed mode implementation

This section discusses the implementation of the XML-based localization pack manager in the application environment described above. The code that accesses the localization pack XML is embedded in the programs -- such as the JSP, servlet, or portlet -- where the HTML source is produced line by line. Figure 4 illustrates the workflow of this approach.


Figure 4. Embed localization pack workflow

The programs for UI presentation get their program data from the back-end business logic, get UI messages from the localization pack XML files, and then combine the two to create the HTML layout.

XPathAPI introduction

Presentation programs need an XML parser to access the nodes in localization pack XML files. Such programs can use the Java classes XPathAPI and (the more advanced) CachedXPathAPI in Xalan-Java as an XPath processor, which works as a standalone unit within Xalan-Java.

XPathAPI includes several static methods such as selectNodeIterator, selectNodeList, and selectSingleNode to fetch a node or node list from an XML document tree through an XPath string:

NodeIterator nl = XPathAPI.selectNodeIterator(doctree, xpathstring);
String myvalue = nl.nextNode().getNodeValue();

You can also use the eval method to evaluate an XPath string against a document tree and inspect the result, for example to check whether the expression actually selects one or more nodes.
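To try the same kind of lookup without installing Xalan-Java, here is a minimal sketch that uses the standard javax.xml.xpath API shipped with the JDK instead of Xalan's XPathAPI class. The pack layout (a greetings root element with a Msg1 child) is an assumption inferred from the XPath string "/greetings/Msg1" used later in this article, and the class name XPathLookupSketch is hypothetical.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

/** Sketch: select a message from a localization pack with the JDK's XPath API. */
class XPathLookupSketch {
    // Assumed pack layout; the real packs in Figures 1 and 2 may differ.
    static final String EN_US_PACK = "<greetings><Msg1>Hello</Msg1></greetings>";

    /**
     * Evaluates the XPath against the pack and returns the string value of the
     * first matching node, or an empty string if nothing matches.
     */
    static String select(String packXml, String xpath) {
        try {
            XPath evaluator = XPathFactory.newInstance().newXPath();
            return evaluator.evaluate(xpath, new InputSource(new StringReader(packXml)));
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("XPath lookup failed: " + xpath, e);
        }
    }
}
```

Because evaluate returns the string value of the selected node, it yields the element's text directly, so there is no need to step down to the text child by hand.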

Using XPathAPI for a localization pack manager

You can define a commonly-used class for an embed localization pack manager named Embed_LPM.class which encapsulates the initialization, evaluation, and selection of the localization pack XML documents. The class structure would look like the following:


Listing 1. Class structure of Embed_LPM.class
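package com.ibm.gcl.lpm;

import java.io.InputStream;
import java.util.Locale;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerException;
import org.apache.xpath.XPathAPI;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.traversal.NodeIterator;
import org.xml.sax.InputSource;

public class Embed_LPM {
    private Document doc;

    public Embed_LPM(Locale cLocale) throws Exception {
        // Convert the XML file for the specific locale to an InputStream.
        InputStream xmlStream = FileConverter.toStream(cLocale);
        InputSource in = new InputSource(xmlStream);
        DocumentBuilderFactory dfactory = DocumentBuilderFactory.newInstance();
        doc = dfactory.newDocumentBuilder().parse(in);
    }

    public String selectNodeValue(String xpathstring) throws TransformerException {
        NodeIterator nl = XPathAPI.selectNodeIterator(doc, xpathstring);
        Node node = nl.nextNode();
        if (node == null) {
            return null; // the XPath matched nothing
        }
        // getNodeValue() is null for an element, so read its first text child instead.
        if (node.getNodeType() == Node.ELEMENT_NODE && node.getFirstChild() != null) {
            return node.getFirstChild().getNodeValue();
        }
        return node.getNodeValue();
    }
    … … … … }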

Embed implementation details

Here is a very simple example to illustrate the procedure. Suppose you have a page that contains the greeting message "hello" and need to show different translations of "hello" based on the specific locale (for example, zh_CN for China). The target HTML is:


Target HTML

The detailed process is shown in Figure 5. The presentation program gets the current locale, creates an Embed_LPM instance using the current locale, constructs an XML document tree from the localization pack XML for the current locale, calls the selectNodeValue method to obtain the nodes in the tree, and inserts them into the right part of the result HTML. Dynamic program data from back-end business logic can also be placed at the specified location in the HTML. Figure 5 only includes the localization pack manager's working scope.


Figure 5. Embed localization pack implementation

The following three code listings demonstrate the call procedure for a localization pack manager in JSPs, servlets, and portlets. Although all three are feasible, a JSP is the most suitable choice because the page layout must be written explicitly in the program and JSP is designed to make creating the layout convenient.

Listing 2 shows how the localization pack manager works in JSPs.


Listing 2. Call procedure for a localization pack manager in JSPs
<%@ page language="java" %>
<%@ page import="......" %>
<html>
<body>
<p>
<%
       Locale locale = (Locale)session.getAttribute("clocale");
       Embed_LPM eblpm = new Embed_LPM(locale);
       String greetingmsg = eblpm.selectNodeValue("/greetings/Msg1");
%>
<%=greetingmsg%>
</p>
</body>
</html>

Listing 3 shows how a servlet calls a localization pack manager.


Listing 3. Servlet calls a localization pack manager
public void doGet(HttpServletRequest request, HttpServletResponse response)
    throws ServletException, IOException, java.net.MalformedURLException {
    response.setContentType("text/html; charset=UTF-8");
    PrintWriter out = response.getWriter();
    Locale locale = request.getLocale();
    try {
        Embed_LPM eblpm = new Embed_LPM(locale);
        out.write("<html><body><p>");
        out.write(eblpm.selectNodeValue("/greetings/Msg1"));
        out.write("</p></body></html>");
    } catch (Exception e) {
        // The Embed_LPM constructor declares Exception; report it to the container.
        throw new ServletException(e);
    }
    out.close();
}

Listing 4 shows how a portlet calls a localization pack manager.


Listing 4. Portlet calls a localization pack manager
public void doView(
                PortletRequest request,
                PortletResponse response)
    throws PortletException, IOException {
    PrintWriter pw = response.getWriter();
    Locale locale = request.getLocale();
    try {
        Embed_LPM eblpm = new Embed_LPM(locale);
        pw.println("<html><body><p>");
        pw.println(eblpm.selectNodeValue("/greetings/Msg1"));
        pw.println("</p></body></html>");
    } catch (Exception e) {
        // The Embed_LPM constructor declares Exception; report it to the portal.
        throw new PortletException(e.getMessage());
    }
 }

How to handle messages that contain variables

The translatable strings contained in localization packs can contain variables -- for example, the variable XXX in the sentence "The account number XXX doesn't exist." You don't need to separate this into two strings in localization packs. To retain the original meaning of the sentence after it's translated into other languages with various grammars, you should regard the whole sentence as one string to be translated, while retaining the XXX as a way to notify translators that it is a variable. For formal use, the variable can be marked with a "#" symbol -- or "#1", "#2", and so forth -- if multiple variables exist in one sentence.

When the localization pack manager encounters this situation, it can use the string operations to combine the separate parts into a whole result string.


Listing 5. Combining separate parts into a complete string
<warning1>The account number # doesn't exist.</warning1>
		//Node in localization pack XML


public String stringReplace(
    String sourcestring,
    String symbol,
    String replacement) {
    // Locate the variable marker; return the string unchanged if it is absent.
    int pos = sourcestring.indexOf(symbol);
    if (pos < 0) {
        return sourcestring;
    }
    // Join the text before the marker, the replacement, and the text after it.
    String resultstring =
        sourcestring.substring(0, pos) +
        replacement + sourcestring.substring(pos + symbol.length());
    return resultstring;
}
// String operations to be added in localization pack manager


print(stringReplace(selectNodeValue("//warning1"), "#", accountnumber));
		//Caller in JSP/…
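For sentences with several variables marked "#1", "#2", and so forth, one possible approach is a single pass over the string that replaces each numbered marker with the matching value. The sketch below is an illustration only: the class name MessageFormatterSketch, the method name fill, and the sample sentences in the usage note are hypothetical and not part of the article's localization pack manager. Markers are limited to #1 through #99 to keep the pattern simple.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch: fill numbered variable markers (#1, #2, ...) in a translated string. */
class MessageFormatterSketch {
    // Matches "#" followed by a number from 1 to 99.
    private static final Pattern MARKER = Pattern.compile("#([1-9][0-9]?)");

    /** Replaces "#n" with values[n - 1]; markers without a value are left as-is. */
    static String fill(String template, String... values) {
        Matcher matcher = MARKER.matcher(template);
        StringBuffer result = new StringBuffer();
        while (matcher.find()) {
            int index = Integer.parseInt(matcher.group(1));
            String replacement =
                index <= values.length ? values[index - 1] : matcher.group();
            // quoteReplacement keeps "$" and "\" in values from being interpreted.
            matcher.appendReplacement(result, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(result);
        return result.toString();
    }
}
```

Because the markers carry numbers, a translator is free to reorder them: a call such as fill("#2 received the transfer from #1.", sender, receiver) still places each value where the target language's grammar needs it.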


Extend mode implementation

In addition to the Java XML parser, XSL can also parse XML using XPath expressions. In this mode, the resulting HTML isn't generated line by line in the programs (as in JSPs, servlets, or portlets), but is transformed from an XML file (containing the business logic data) and an external XSL file. In Figure 6, the localization pack manager residing in the XSL works outside the main presentation programs. I refer to this as extend mode to distinguish it from the embed mode because of the different location and implementation of localization pack managers.


Figure 6. Extend localization pack workflow

The XML business logic data may have the following content that is requested from the back end as a whole stream in XML format, or is generated temporarily within the presentation program for use in transformation:

<root>
	<userinfo>
		<accountnumber>37522109</accountnumber>
	</userinfo>
</root>
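When the presentation program generates this XML itself, one way to do it is to build a small DOM tree and serialize it. The following is a minimal sketch using only the JAXP classes in the JDK; the class name BusinessDataSketch and the method name userInfoXml are hypothetical, and the element names simply mirror the sample above.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

/** Sketch: generate the business logic XML inside the presentation program. */
class BusinessDataSketch {
    static String userInfoXml(String accountNumber) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
            Element root = doc.createElement("root");
            Element userinfo = doc.createElement("userinfo");
            Element account = doc.createElement("accountnumber");
            // setTextContent stores raw text; escaping happens on serialization.
            account.setTextContent(accountNumber);
            userinfo.appendChild(account);
            root.appendChild(userinfo);
            doc.appendChild(root);

            // A Transformer created without a stylesheet copies the tree unchanged.
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            identity.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (ParserConfigurationException | TransformerException e) {
            throw new IllegalStateException("Could not build business data XML", e);
        }
    }
}
```

Building the tree through the DOM API, rather than concatenating strings, means characters such as "&" or "<" in the data are escaped properly when the document is written out.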

Using XSL to access XML localization pack nodes

You can use the following XSL syntax to access the nodes from the localization pack XML files:


Listing 6. XSL for accessing localization pack nodes
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:param name="lpfile">localizationpack.xml</xsl:param>
  <xsl:template match="/">
    <html>
      <body bgcolor="#6a6dee" text="#000000">
        <p>
          <xsl:value-of select="document($lpfile)//greetings/Msg1"/>
        </p>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>

XSL transformations

The XSLTProcessor and XSLTProcessorFactory classes contained in Xalan-Java can transform an XML/XSL pair into the resulting XML document. Presentation programs can call the methods in XSLTProcessor and XSLTProcessorFactory for transformation.


Listing 7. Transforming with XSLTProcessor and XSLTProcessorFactory
XSLTProcessor processor = XSLTProcessorFactory.getProcessor();
processor.process(
    new XSLTInputSource(inputXMLStream),
    new XSLTInputSource(inputXSLStream),
    new XSLTResultTarget(resultStringWriter));
    
String result = resultStringWriter.toString();
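If the Xalan-Java XSLTProcessor classes are not available, the same transformation can be sketched with the standard javax.xml.transform API that ships with the JDK. The example below is an illustration under several assumptions: it swaps in the JAXP Transformer for XSLTProcessor, it uses a shortened inline copy of the Listing 6 stylesheet, it writes the localization pack to a temporary file so that document($lpfile) has a URL to load, and the class name ExtendModeSketch is hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Sketch: extend mode with the JDK's XSLT support instead of XSLTProcessor. */
class ExtendModeSketch {
    // Shortened version of the stylesheet in Listing 6.
    static final String XSL =
        "<xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform' version='1.0'>"
        + "<xsl:output method='html'/>"
        + "<xsl:param name='lpfile'/>"
        + "<xsl:template match='/'>"
        + "<html><body><p>"
        + "<xsl:value-of select='document($lpfile)//greetings/Msg1'/>"
        + "</p></body></html>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Transforms dataXml with XSL, reading UI strings from packXml. */
    static String render(String dataXml, String packXml) {
        try {
            // Store the pack in a temporary file so the stylesheet can load it by URL.
            Path pack = Files.createTempFile("lp", ".xml");
            try {
                Files.write(pack, packXml.getBytes(StandardCharsets.UTF_8));
                Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSL)));
                // Tell the stylesheet which localization pack to read.
                transformer.setParameter("lpfile", pack.toUri().toString());
                StringWriter out = new StringWriter();
                transformer.transform(
                    new StreamSource(new StringReader(dataXml)), new StreamResult(out));
                return out.toString();
            } finally {
                Files.deleteIfExists(pack);
            }
        } catch (IOException | TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }
}
```

Calling render with the same data XML but a different pack yields the same layout with different UI strings, which is the point of keeping the localization pack manager in the XSL.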

Extend implementation details

Figure 7 depicts the detailed working process of the extend localization pack manager. The presentation programs can get the current locale and send the URL of the corresponding localization pack XML as a parameter to the XSL. It's the XSL's responsibility to manage the localization packs and obtain the required locale data. For simplicity, the resulting HTML doesn't contain the content from the XML data, but it's fairly easy to add the code for that to the XSL.


Figure 7. Extend localization pack implementation

One program can create multiple XML business logic data files and match multiple XSL files, while one XSL file can be applied to multiple programs to transform their XML data files. So the number of XSL files depends on the design requirements of the presentation programs.

The following three code listings demonstrate how the transformation is performed in JSPs, servlets, and portlets. In contrast to embed mode, servlets and portlets are better suited than JSPs to extend mode because the page layout is implicitly contained in the XSL.

The transformation process for JSPs is shown in Listing 8.


Listing 8. Transforming in JSP
<%@ page language="java" %>
<%@ page import="......" %>

<%
  Locale clocale = (Locale)session.getAttribute("clocale");
  XSLTProcessor processor = XSLTProcessorFactory.getProcessor();
  // Send the locale to the XSL. The value is evaluated as an XPath expression,
  // so the locale string is wrapped in single quotes to make it a string literal.
  processor.setStylesheetParam("xsllocale", "'" + clocale + "'");
  processor.process(new XSLTInputSource(inputXMLStream), 
            new XSLTInputSource(inputXSLStream),
            new XSLTResultTarget(resultStringWriter));
%>

<%=resultStringWriter.toString()%>

The transformation process for servlets is shown in Listing 9.


Listing 9. Transforming in a servlet
public void doGet(HttpServletRequest request, HttpServletResponse response)
  throws ServletException, IOException, java.net.MalformedURLException {
  response.setContentType("text/html; charset=UTF-8");
  PrintWriter out = response.getWriter();
  Locale clocale = request.getLocale();
  XSLTProcessor processor = XSLTProcessorFactory.getProcessor();
  processor.setStylesheetParam("xsllocale", "'" + clocale + "'");//send the locale to XSL as a quoted XPath string literal
  processor.process(
        new XSLTInputSource(inputXMLStream),
        new XSLTInputSource(inputXSLStream),
        new XSLTResultTarget(resultStringWriter));
  out.write(resultStringWriter.toString());
  out.close();
}

The transformation process for portlets is shown in Listing 10.


Listing 10. Transforming in a portlet
public void doView(PortletRequest request, PortletResponse response)
  throws PortletException, IOException {
  PrintWriter pw = response.getWriter();
  Locale clocale = request.getLocale();
  XSLTProcessor processor = XSLTProcessorFactory.getProcessor();
  processor.setStylesheetParam("xsllocale", "'" + clocale + "'");//send the locale to XSL as a quoted XPath string literal
  processor.process(
        new XSLTInputSource(inputXMLStream),
        new XSLTInputSource(inputXSLStream),
        new XSLTResultTarget(resultStringWriter));
  pw.println(resultStringWriter.toString());
}

Handling messages that contain variables

A translatable sentence that contains variables is likewise stored as a single string in the localization pack, and the variable is substituted when the localization pack manager accesses the string. Because the extend mode localization pack manager is located in the XSL, performing the substitution is the XSL's responsibility.

Here I use the same sentence that was used in the embed section for demonstration:

<xsl:value-of select="substring-before(document($lpfile)//warning1,'#')" />
<xsl:value-of select="//accountnumber" />
<xsl:value-of select="substring-after(document($lpfile)//warning1,'#')" />

Similarly, it can process multiple variables using more complex code. For convenience, you can package the code in a common XSL template for string replacement, thus making it reusable.


Summary

The embed localization pack manager can enable a typical J2EE application to support multiple languages without making significant changes to the framework design. This is done by extracting translatable strings from presentation programs and replacing them with code to access the corresponding content from the localization packs, which also need to be organized during the extracting phase.

Embed mode is simple and classic, and includes no extra files beyond the localization packs. However, since the page layout and program data are merged in the program code, if the layout changes the program may need to be rebuilt.

As the use of XML as the interface between presentation modules and business logic modules becomes more and more common, modules in the front end often use XSL for transformation to the target UI (which could be HTML, VXML, WML, or something else). In such cases, the extend localization pack manager is the best fit for supporting multiple languages. The use of XSL separates the page layout from the program data, and you don't need to write or use an XML parser because XSL does the parsing for you. When the layout changes, you only need to modify the XSL files; the program doesn't have to be rebuilt. However, the number of files doubles because each program needs at least one corresponding XSL file.

The XML-based localization pack implementation approaches discussed in this article provide a great deal of flexibility for the design, development, and deployment of globalized software. The XML-formatted localization packs are organized into separate packages and stored outside the core module executable without being compiled into the executable. New languages can be easily added by creating localization packs with the same XML structure as the existing ones. These localization pack changes do not affect how the system runs. In addition to the two modes mentioned here, developers can choose from many other approaches according to their specific scenarios and requirements.


Resources
