Globalization Certification Laboratory
IBM China Development Lab
September 2002
© Copyright International Business Machines Corporation 2002. All rights reserved.
The Internet transcends national boundaries and geographical barriers. Today, over two-thirds of Internet users would prefer to conduct business with a Web site in their own language. Therefore, globalization is becoming a consideration for Web application developers.
Globalization is the proper design and execution of systems, software, services, and procedures so that one instance of software, executing on a single server or end user machine, can process multilingual data, and present culturally correct information (e.g. collation, date and number format). Globalization is more than translation.
This article provides programming tips to developers for globalization in Web services. It discusses:
- How to use the UDDI's multilingual support features to define, publish, and find multilingual Web services
- How to enable language-sensitive operations using SOAP APIs
- How to enable WebSphere Application Server Version 4.0 to support multilingual applications.
The information presented was gained from our real working practices in the Globalization Certification Laboratory.
Currently, there are many articles about Web services available from the IBM Web sites.
While many articles are focused on developing Web services in general, or are oriented toward an English-speaking audience, this article targets developers of applications intended for a global audience. Understanding globalization issues is the primary focus of our group. Therefore, we understand what is involved in providing globalized applications and can appreciate the difficulty Web developers may have in learning yet another set of skills. By writing this article, we hope to quickly and easily provide Web application developers with the information they need to make their applications available to global audiences. This article tells how to address the most important considerations in providing and using globalized Web services.
The software used in the article is:
- WebSphere Application Server Advanced Edition Version 4.0
- IBM WebSphere UDDI Registry Preview Version 1.0.5
This article provides programming tips to developers who need to globalize Web services. It also provides some entry-level information for implementing globalization. Finally, it provides a good starting point for users who are interested in making multilingual applications run on WebSphere Application Server.
Publishing a multilingual business using UDDI
Universal Description, Discovery, and Integration (UDDI) is a standards-based specification for describing and enabling discovery of Web services. The UDDI Business Registry provides a place for publishing and locating Web services over the Web.
Defining a multilingual business
The UDDI Registry provides three categories of Business Registry Information for service providers. The first is the White Pages, which includes basic information such as the business name, text description, and contact information. The second is the Yellow Pages, which describes the business categories. The third is the Green Pages, which specifies technical information, such as a description of the services offered and how to invoke them.
One concern a service provider has is how to present services in multiple languages. Global e-business requires that the business be tailored to fit the needs of international or culturally diverse customers. People want to see application interfaces and Web content in their preferred languages; therefore, this requires that the business service be offered in more than one language.
In UDDI Version 1.0, there are many description fields in data structure types, such as businessEntity, businessService, etc. Among them, businessEntity is very commonly used. Its structure represents all basic information about a business or entity that publishes descriptive information about the entity as well as the services that it offers. The description fields of businessEntity, as shown in Figure 1, contain one or more short business descriptions, each of which may have multilingual data.
Figure 1. The description fields of the businessEntity in UDDI Version 1.0

Let's look at an example. An airline company wants to publish its service dynamically in several languages to fulfill the needs of customers in various countries. The company would define the businessEntity as shown in Figure 2.
Figure 2. Multilingual business description using UDDI Version 1.0
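As a rough illustration of the kind of definition described above, the following Java sketch renders one description element per language, each tagged with an xml:lang attribute. The class name MultilingualDescriptions, its methods, and the sample texts are hypothetical and are not part of UDDI4J or the UDDI Registry Preview; a real businessEntity also carries a business key, name, and other fields that are omitted here.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: renders UDDI-style description elements, one per language. */
class MultilingualDescriptions {

    /** Escapes the characters that are significant in XML text and attribute values. */
    static String escapeXml(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    /** Builds one description element per map entry, in insertion order. */
    static String render(Map<String, String> descriptionsByLang) {
        StringBuilder xml = new StringBuilder();
        for (Map.Entry<String, String> e : descriptionsByLang.entrySet()) {
            xml.append("<description xml:lang=\"")
               .append(escapeXml(e.getKey()))
               .append("\">")
               .append(escapeXml(e.getValue()))
               .append("</description>\n");
        }
        return xml.toString();
    }

    public static void main(String[] args) {
        Map<String, String> descriptions = new LinkedHashMap<>();
        descriptions.put("en-US", "Flight booking service");
        // Simplified Chinese sample text, written with Unicode escapes so the
        // source file compiles regardless of its encoding
        descriptions.put("zh-CN", "\u822A\u73ED\u9884\u8BA2\u670D\u52A1");
        System.out.print(render(descriptions));
    }
}
```

Keeping the language tag as the map key makes it easy to add another language later without touching the rendering code.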

Then, in a real application, a multilingual business description can be provided based on the end user's preferences.
Figure 3 shows a multilingual business description for an airline company.
Figure 3. Multilingual business description in a real application

Publishing a multilingual business
UDDI data is published to and stored in a UDDI registry. There are two types of UDDI registry: a public registry, which is open to public access, and a private registry, which is used only for intra-enterprise access.
There are several public registries available. The IBM UDDI Business Test Registry can be accessed at:
http://www-3.ibm.com/services/uddi/testregistry/protect/registry.html
In our project, we used a private registry service provided by IBM WebSphere UDDI Registry Preview Version 1.0.5. When using this UDDI Registry Preview, there are two things to consider. These considerations are described below.
The first consideration is the database. The Registry Preview Version 1.0.5 uses a database to store the UDDI data. To enable multilingual support, the database must support multilingual data. In DB2, a UTF-8 encoded database is needed for this purpose. UTF-8 is a Unicode encoding that can represent the characters of every language that Unicode covers.
To enable the Registry Preview to create and use a UTF-8 encoded database, before installing the Preview, you must set the db2codepage system environment variable.
As shown in Figure 4, in the Command Line Processor, type:
db2set db2codepage=1208
Figure 4. Set the db2codepage system variable to UTF-8

The second consideration when using the Registry Preview is deciding how to set the locale.
The locale is a globalization term that identifies the supported language set. In the Registry Preview Version 1.0.5, the definition of the locale uses RFC 1766 style language tags. RFC 1766 is an Internet standard that defines tags for the identification of languages. For example, to support Simplified Chinese in an application, you would use the attribute xml:lang="zh-CN" in the businessService definition to register this language representation. The Registry Preview UI accepts the locale ID in this style.
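Java applications usually hold a locale either as a java.util.Locale or as an underscore-separated string such as zh_CN, so some conversion to the hyphenated tag style is needed. The sketch below shows one way to do it; the class name LanguageTags and its methods are hypothetical. Note that Locale.toLanguageTag() follows BCP 47, the successor of RFC 1766; for simple language-country pairs such as zh-CN the two forms coincide.

```java
import java.util.Locale;

/** Hypothetical helper for producing hyphenated language tags such as "zh-CN". */
class LanguageTags {

    /** Converts a Java-style locale string such as "zh_CN" to "zh-CN". */
    static String fromJavaLocaleString(String javaLocale) {
        return javaLocale.replace('_', '-');
    }

    /** Converts a Locale to its language tag, for example Locale.US to "en-US". */
    static String fromLocale(Locale locale) {
        return locale.toLanguageTag();
    }

    /** Converts a language tag back to a Locale. */
    static Locale toLocale(String tag) {
        return Locale.forLanguageTag(tag);
    }

    public static void main(String[] args) {
        System.out.println(fromJavaLocaleString("zh_CN"));
        System.out.println(fromLocale(Locale.SIMPLIFIED_CHINESE));
        System.out.println(toLocale("zh-CN").getCountry());
    }
}
```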
Finding a multilingual business
There are two ways in which a service requester can discover appropriate service providers and get the corresponding service descriptions from the UDDI Registry. One is the static method, which is to browse the UDDI Web site and specify a search term. The other is the dynamic method, which is to call the methods provided by the UDDI4J package.
When a requester uses the UDDI4J package to find a business, it can pass the user's preferred language identifier as a parameter to the finding program that the service provider supplies. The program looks through the UDDI Registry and returns the service description in the specified language to the user. Listing 1 shows sample code to achieve this.
Listing 1: Finding a multilingual business
Vector descriptionVector = businessInfo.getDescriptionVector();
String businessDescription = "";
// Default en-us description
String defaultDescription = "";
for (int i = 0; i < descriptionVector.size(); i++) {
    Description description = (Description) descriptionVector.elementAt(i);
    // Check whether there is a description for the requested locale
    if (description.getLang().compareToIgnoreCase(localeString) == 0) {
        businessDescription = description.getText();
        break;
    }
    // Remember the en-us description as the default
    if (description.getLang().compareToIgnoreCase("en-us") == 0) {
        defaultDescription = description.getText();
    }
}
// If there is no description for this locale, fall back to the default
// description. Compare string contents with equals(); the == operator
// only compares object references.
if (businessDescription.equals("")) {
    businessDescription = defaultDescription;
}
/**
 * Some limitation exists in the encoding conversion between the
 * Preview Version 1.0.5 and the UDDI implementation in uddi4j.jar of
 * WebSphere Application Server Version 4.0. Here is a workaround:
 *
 * 1. Convert the original string from the registry to bytes:
 *    In fact, this byte sequence represents a UTF-8 encoded string. But
 *    the original string is the result of converting this sequence to a
 *    string as ISO-8859-1, which is wrong. So, use ISO-8859-1 as
 *    the parameter to get the correct byte sequence.
 *    (ISO-8859-1 means ISO Latin Alphabet No. 1, or ISO-LATIN-1.)
 *
 * 2. Convert the bytes to a string:
 *    Process the byte sequence as a UTF-8 encoded sequence; the
 *    result is the correct string.
 */
businessDescription = new String(
    businessDescription.getBytes("ISO-8859-1"), "UTF-8");
// Output the description
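The encoding workaround in Listing 1 can be tried in isolation. The following sketch first reproduces the fault (UTF-8 bytes decoded as ISO-8859-1) and then applies the same two-step repair; the class name EncodingRepair and its methods are hypothetical and exist only for this demonstration.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical demonstration of the ISO-8859-1 / UTF-8 repair used in Listing 1. */
class EncodingRepair {

    /** Simulates the fault: UTF-8 bytes wrongly decoded as ISO-8859-1. */
    static String misdecode(String original) {
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    /** Applies the workaround: recover the raw bytes, then decode them as UTF-8. */
    static String repair(String garbled) {
        byte[] raw = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "\u4E2D\u6587"; // two Chinese characters
        String garbled = misdecode(original);
        System.out.println(garbled.length());                   // 6: one char per UTF-8 byte
        System.out.println(repair(garbled).equals(original));   // true
    }
}
```

The repair is lossless because ISO-8859-1 maps every byte value to exactly one character. It should only be applied to text that is known to have been mis-decoded this way; running it on a string that is already correct would replace characters outside ISO-8859-1 with question marks.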
Implementing a multilingual Web service using SOAP
SOAP is a lightweight protocol used to exchange information in a Web services environment. As an XML-based protocol, it uses UTF-8 as its default encoding; therefore, it can carry multilingual data without losing any of it. This section describes how to configure the environment under WebSphere Application Server for multilingual applications, and then how to implement multilingual Web services from the service provider side and the service requester side, respectively.
Preparing the multilingual environment
Before we go further, let's look at several tasks that you need to do to run multilingual applications on WebSphere Application Server. You need to specify the supported languages in a file and in a system property, in order to determine the character set of the request and the response.
encoding.properties
The encoding properties file, which is under \WAS_ROOT\properties, contains the list of language and character set pairs. WebSphere Application Server gets language information in context. When the charset attribute is not explicitly specified in the HTTP request or response, Application Server looks at the encoding properties file to select the character set. Because UTF-8 can represent the characters of all languages, it is a good idea to use UTF-8 for all the language settings of a multilingual application. Figure 5 shows how to set UTF-8 encoding in the encoding.properties file.
Figure 5. Set encoding.properties to UTF-8
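To make the idea of language and character set pairs concrete, here is a small Java sketch that loads such pairs and looks up the character set for a locale. It is only an illustration of the lookup concept, not WebSphere Application Server's own code; the class name EncodingTable, the language=charset entry format, and the sample entries are assumptions.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Hypothetical lookup table built from language=charset pairs. */
class EncodingTable {
    private final Properties table = new Properties();

    /** Parses text in java.util.Properties format, one language=charset pair per line. */
    static EncodingTable parse(String propertiesText) {
        EncodingTable result = new EncodingTable();
        try {
            result.table.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    /** Tries the full locale (zh_CN), then the language alone (zh), then the fallback. */
    String charsetFor(String locale, String fallback) {
        String value = table.getProperty(locale);
        if (value == null) {
            int sep = locale.indexOf('_');
            if (sep > 0) {
                value = table.getProperty(locale.substring(0, sep));
            }
        }
        return value != null ? value : fallback;
    }

    public static void main(String[] args) {
        // Assumed sample entries: every language mapped to UTF-8
        EncodingTable t = parse("en=UTF-8\nja=UTF-8\nzh=UTF-8\n");
        System.out.println(t.charsetFor("zh_CN", "ISO-8859-1"));
        System.out.println(t.charsetFor("fr_FR", "ISO-8859-1"));
    }
}
```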

default.client.encoding
This is a system property, which defines the client input code set for parsing the input values. It is used when the charset attribute is not explicitly specified in the HTTP request and the input code set cannot be determined by examining the encoding.properties. For multilingual applications, usually we pick UTF-8 as the character set.
In WebSphere Application Server Version 4.0.1, a new system property called client.encoding.override has been introduced which can be used to override any client preferences for parsing the input values. This new property should be specified instead of the system property default.client.encoding on the JVM settings of the application server to be absolutely sure that the correct encoding is used to parse the input values.
Figure 6 shows how to set the default.client.encoding system property to UTF-8.
Figure 6. Set default.client.encoding to UTF-8

If you do not use the settings described above, then WebSphere Application Server will pick ISO-8859-1 as the default character set (see Unicode and WebSphere, by Kentaro Noji and Debasish Banerjee). In this case, many non-English characters will be lost.
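The order in which these settings are consulted, as described in this section, can be summarized in a few lines of Java. This is a simplified model for reasoning about the configuration, not the application server's actual implementation; the class name CharsetResolution and its methods are hypothetical, and only the two property names come from the text above.

```java
/** Hypothetical model of the character set selection order described in the text. */
class CharsetResolution {

    /** Returns the first setting that is present, falling back to ISO-8859-1. */
    static String resolve(String override, String requestCharset,
                          String fromEncodingProperties, String defaultClientEncoding) {
        if (override != null) return override;                        // client.encoding.override
        if (requestCharset != null) return requestCharset;            // charset on the HTTP request
        if (fromEncodingProperties != null) return fromEncodingProperties; // encoding.properties
        if (defaultClientEncoding != null) return defaultClientEncoding;   // default.client.encoding
        return "ISO-8859-1";                                          // nothing configured
    }

    /** Same as resolve(), reading the two system properties from the running JVM. */
    static String resolveFromSystemProperties(String requestCharset,
                                              String fromEncodingProperties) {
        return resolve(System.getProperty("client.encoding.override"),
                       requestCharset,
                       fromEncodingProperties,
                       System.getProperty("default.client.encoding"));
    }

    public static void main(String[] args) {
        System.out.println(resolve(null, null, null, null));
        System.out.println(resolve(null, null, null, "UTF-8"));
        System.out.println(resolve("UTF-8", "Shift_JIS", null, null));
    }
}
```

Java system properties are normally supplied on the JVM command line in the form -Dname=value, for example -Ddefault.client.encoding=UTF-8.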
Implementing multilingual Web services
Now that you have prepared the environment, you are ready to implement multilingual Web services. Let's see how to make that happen using a service provider and service requester in a simple example.
The SOAP protocol itself does not define a standard way to carry locale information. To offer a multilingual business operation, the service provider must therefore provide an interface that accepts locale information. Typically, there are two ways to do this.
- Define locale information on the service interface directly.
Listing 2 illustrates this method.
Listing 2. The getMultilingualData function
import java.util.Locale;
import java.util.MissingResourceException;
import java.util.PropertyResourceBundle;
import java.util.StringTokenizer;

public String getMultilingualData(String key, String localeString) {
    String bundleName = "multilingualdata";
    PropertyResourceBundle multilingualDataBundle = null;
    Locale locale = new Locale("en", "US");
    // Construct the locale instance from the input string, for example "zh_CN"
    if (localeString != null) {
        StringTokenizer st = new StringTokenizer(localeString, "_");
        if (st.hasMoreTokens()) {
            String language = st.nextToken();
            // The country part is optional
            String country = st.hasMoreTokens() ? st.nextToken() : "";
            locale = new Locale(language, country);
        }
    }
    // Load the appropriate resource bundle file
    try {
        multilingualDataBundle = (PropertyResourceBundle)
            PropertyResourceBundle.getBundle(bundleName, locale);
    } catch (MissingResourceException e) {
        // No bundle could be loaded, so there is nothing to look up
        return null;
    }
    // Fetch the information and return the result
    return multilingualDataBundle.getString(key);
}
- Define the data structure, which includes a locale property, on the service interface.
The service requester needs to set this property value before invoking the service. Here is sample code to illustrate this method.
Listing 3 shows how to set the requested locale in the InputData data structure.
Listing 3. The demo requester function
public String demoRequester() {
    InputData inputdata = new InputData();
    inputdata.setKey("first");
    // Set the locale property of the inputdata data structure here
    inputdata.setLocale("en_US");
    ...
    // Call the demoProvider Web service with the inputdata data structure
    OutputData outputdata = service.demoProvider(inputdata);
    ...
    return outputdata.getMultilingualData();
}
Listing 4 shows how to set locale-sensitive output in the OutputData data structure.
Listing 4. The demo provider function
import java.util.Locale;
import java.util.MissingResourceException;
import java.util.PropertyResourceBundle;
import java.util.StringTokenizer;

public OutputData demoProvider(InputData inputdata) {
    String bundleName = "multilingualdata";
    PropertyResourceBundle multilingualDataBundle = null;
    Locale locale = new Locale("en", "US");
    // Get the locale information from the InputData data structure
    String localeString = inputdata.getLocale();
    OutputData outputdata = new OutputData();
    if (localeString != null) {
        StringTokenizer st = new StringTokenizer(localeString, "_");
        if (st.hasMoreTokens()) {
            String language = st.nextToken();
            // The country part is optional
            String country = st.hasMoreTokens() ? st.nextToken() : "";
            locale = new Locale(language, country);
        }
    }
    outputdata.setKey(inputdata.getKey());
    try {
        multilingualDataBundle = (PropertyResourceBundle)
            PropertyResourceBundle.getBundle(bundleName, locale);
        // Get the multilingual data in the user's preferred language
        outputdata.setMultilingualData(
            multilingualDataBundle.getString(outputdata.getKey()));
    } catch (MissingResourceException e) {
        // No bundle or key was found; multilingualData stays unset
    }
    ...
    return outputdata;
}
Listing 5 shows the locale attribute of the data structure used on the service interface.
Listing 5. The InputData class
public class InputData {
    private String key = "";
    private String locale = "en_US";

    public String getKey() {
        return key;
    }

    public void setKey(String keyString) {
        key = keyString;
    }

    public String getLocale() {
        return locale;
    }

    public void setLocale(String localeString) {
        locale = localeString;
    }
    ...
}
Listing 6 shows the return data structure of the service provider.
Listing 6. The OutputData class
public class OutputData {
    private String key;
    private String multilingualData;

    public String getKey() {
        return key;
    }

    public void setKey(String keyString) {
        key = keyString;
    }

    public String getMultilingualData() {
        return multilingualData;
    }

    public void setMultilingualData(String multilingualDataString) {
        multilingualData = multilingualDataString;
    }
    ...
}
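Listings 2 and 4 both turn a locale string such as en_US into a java.util.Locale. A small shared helper could do this in one place. The sketch below is one possible shape for it, assuming en_US as the fallback; the class name LocaleStrings and the decision to also accept hyphenated input such as zh-CN are assumptions, not part of the original sample.

```java
import java.util.IllformedLocaleException;
import java.util.Locale;

/** Hypothetical helper that converts strings such as "zh_CN" into Locale objects. */
class LocaleStrings {
    static final Locale DEFAULT = Locale.US;

    /** Accepts "zh_CN", "zh-CN", or "zh"; returns DEFAULT for null or malformed input. */
    static Locale parse(String localeString) {
        if (localeString == null || localeString.trim().isEmpty()) {
            return DEFAULT;
        }
        String[] parts = localeString.trim().replace('-', '_').split("_");
        String language = parts[0];
        String country = parts.length > 1 ? parts[1] : "";
        if (language.isEmpty()) {
            return DEFAULT;
        }
        try {
            // Locale.Builder validates the fields and rejects ill-formed values
            return new Locale.Builder().setLanguage(language).setRegion(country).build();
        } catch (IllformedLocaleException e) {
            return DEFAULT;
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("zh_CN"));
        System.out.println(parse("en"));
        System.out.println(parse(null));
    }
}
```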
The service requester typically:
- Determines the user's preferred language, using one of the methods described below.
- Calls a service provider that supports multiple languages and passes the user's language preference.
- Returns the result in the specified language to the user.
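The three steps above can be sketched with an in-memory stand-in for the provider. Everything in this example is hypothetical (the Provider interface, the InMemoryProvider class, the sample keys and texts); in a real application the call in step 2 would go through the SOAP client for the published service.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of the requester flow, with an in-memory stand-in provider. */
class RequesterFlowSketch {

    /** Stand-in for a locale-aware provider operation. */
    interface Provider {
        String getMultilingualData(String key, String localeString);
    }

    /** Returns the text for the requested locale, or the en_US text if none exists. */
    static class InMemoryProvider implements Provider {
        private final Map<String, String> data = new HashMap<>();

        InMemoryProvider() {
            data.put("first|en_US", "Hello");
            data.put("first|zh_CN", "\u4F60\u597D"); // sample Simplified Chinese text
        }

        public String getMultilingualData(String key, String localeString) {
            String value = data.get(key + "|" + localeString);
            return value != null ? value : data.get(key + "|en_US");
        }
    }

    static String handleRequest(Provider provider, String preferredLocale, String key) {
        // Step 1: determine the user's preferred language (en_US if unknown)
        String locale = (preferredLocale != null) ? preferredLocale : "en_US";
        // Step 2: call the provider and pass the language preference
        String result = provider.getMultilingualData(key, locale);
        // Step 3: return the result in the specified language
        return result;
    }

    public static void main(String[] args) {
        Provider provider = new InMemoryProvider();
        System.out.println(handleRequest(provider, "zh_CN", "first"));
        System.out.println(handleRequest(provider, "fr_FR", "first"));
    }
}
```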
There are two ways to get the user's preferred language.
- Get user preferred language from the browser language setting.
The sample JSP source in Listing 7 illustrates this method.
Listing 7. Sample JSP code to get a user's language preference from a browser
String acceptLanguage = request.getHeader("Accept-Language");
String localeStr = "en_US";
if (acceptLanguage != null) {
    StringTokenizer st = new StringTokenizer(acceptLanguage, ",");
    if (st.hasMoreElements()) {
        localeStr = (String) st.nextElement();
    }
}
Figure 7 shows how a user sets his or her language preferences in a browser.
Figure 7. Setting language preferences in Microsoft Internet Explorer 5.0
- Alternatively, the application can provide a user interface component to collect the user's preferred language, and locale information can be fetched from there directly. The following example illustrates this method.
Figure 8 shows a language-selection box for an end-user provided by an application.
Figure 8. Providing a language-selection box for an end-user
Figure 9 shows sample HTML code to create the language-selection box.
Figure 9. Sample HTML code for the language selection box shown in Figure 8
Figure 10 shows a language-selection box for an end-user provided by WebSphere Portal Registry.
Figure 10. Getting locale information directly from a WebSphere Portal registry profile
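For the browser-based approach, the Accept-Language header typically carries hyphenated tags with optional quality values, for example zh-CN,zh;q=0.9,en-US;q=0.8, so taking the first comma-separated entry as in Listing 7 is only a first approximation. On Java 8 or later, the standard Locale.LanguageRange and Locale.lookup APIs can do the matching against the locales an application supports. The following is a minimal sketch; the class name AcceptLanguage, the supported locale list, and the en_US default are assumptions.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

/** Hypothetical helper that picks a supported locale from an Accept-Language header. */
class AcceptLanguage {
    static final Locale DEFAULT = Locale.US;

    // Assumed set of locales for which the application has translations
    static final List<Locale> SUPPORTED =
            Arrays.asList(Locale.US, Locale.SIMPLIFIED_CHINESE, Locale.JAPAN);

    /** Returns the best supported match, or DEFAULT if nothing matches or the header is invalid. */
    static Locale pick(String header) {
        if (header == null || header.trim().isEmpty()) {
            return DEFAULT;
        }
        try {
            // parse() orders the ranges by descending quality value
            List<Locale.LanguageRange> ranges = Locale.LanguageRange.parse(header);
            Locale match = Locale.lookup(ranges, SUPPORTED);
            return match != null ? match : DEFAULT;
        } catch (IllegalArgumentException e) {
            // The header could not be parsed
            return DEFAULT;
        }
    }

    public static void main(String[] args) {
        System.out.println(pick("zh-CN,zh;q=0.9,en-US;q=0.8"));
        System.out.println(pick("fr-FR,fr;q=0.9"));
    }
}
```

In a JSP or servlet, the header value would come from request.getHeader("Accept-Language"), as in Listing 7.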
This article has provided you with the basic information you need to know to develop a simple Java program with multilingual support in Web services. You can also use these general tips to develop any multilingual application, not only Web services. Using the information presented here, you can begin to consider your globalization requirements and write your own programs. Enjoy.
The authors want to thank all the guys at Globalization Certification Laboratory for their help, especially Jin Bo Xu, for his contributions to the article.
- UDDI Organization Homepage
- UDDI Data Structure Reference Version 1.0
- UDDI Registry Preview UI
- Simple Object Access Protocol (SOAP) Version 1.1
- Apache SOAP Project
- Web Services Toolkit
- Unicode and WebSphere
- RFC 1766 Tags for the Identification of Languages

Xiao Hui Zhu is a software engineer at the IBM Development Lab in China. She has worked for IBM since November 1994, starting her career as a project manager in the Globalization organization and performing various roles, including tester, architect, coordinator, and consultant. Currently, she is the technical leader for the Globalization Certification Laboratory located in Shanghai. You can reach her at zhuxiaoh@cn.ibm.com.

Ting Yong Zhu is a software engineer at the IBM Development Lab in China, and has worked for IBM since August 2000. He loves to surf in the Linux world and explore object-oriented technologies. He is the key engineer for developing the multilingual Web service show box. You can reach him at zhuty@cn.ibm.com.

Yang Wang is a software engineer at the IBM Development Lab in China and has worked for IBM since November 2000. He has performed various roles, including tester, coordinator, and developer. He is the key engineer for developing the multilingual Web service show box. You can reach him at wangang@cn.ibm.com.