An old joke in the internationalization community goes like this:
"A person who speaks three languages is called trilingual. And a person who speaks two languages is called bilingual. So what do you call someone who only speaks one language?"
"American."
Today, providing a software product solely in English is no longer acceptable from a usability, quality, marketing, and, in some cases, legal standpoint. Enabling your product for the world market simply makes economic sense. And the enablement process is relatively straightforward, as this article will show.
A few notes before we begin. Because the Eclipse Platform adopts the internationalization implementation provided with the Java™ SDK, it's helpful to read the Java Tutorial: Internationalization before continuing. The tutorial presents a fine overview of the issues and steps involved in the process. We will assume that you've already read the tutorial so we can underscore the key points, surface other noteworthy items, and cover Eclipse-specific issues and steps in this article. And when you run into unfamiliar terminology or acronyms in this article, jump to our glossary.
Overview of internationalization
Internationalization is the process of creating software for the world market. Besides the economic benefits, some countries require products to pass certain localization requirements set by the government before they can be introduced to their markets.
The process of internationalizing the Eclipse Platform was accomplished in two steps:
- NLS-enabling the product -- This step covers coding techniques and UI design issues. Enabling a product for National Language Support (NLS) ensures that the product is designed for NL function and uses proper APIs to handle national language data. During this step, smart coding practices -- such as avoiding hardcoded strings, making input buffers large enough to hold translatable text, properly parsing strings that contain non-Latin characters, not localizing strings saved as part of a file format, and isolating the national language elements from program code -- must be weighed and considered so the translation can be completed with minimal expense and effort.
- Translating the product -- This step involves translating the domestic language elements into a foreign language. As with words and phrases, pictures and symbols may also be interpreted differently by various cultures. It is during the translation verification step that all translations are reviewed for contextual accuracy, icons or clip art are modified to ensure that there are no user misinterpretations, and page layouts are checked for inadvertent text truncation. While verifying the product's functional integrity after translation, this step also looks for hidden cultural impacts.
What does internationalization affect?
Let's begin with a distillation of the list of culturally dependent data presented in the Java Internationalization tutorial (see Resources), reordered by the likelihood that the typical developer will encounter it, and followed by details on each:
- Text
  - Messages, labels on GUI components
  - Online help (*.html), Plug-in manifest (plugin.xml)
- Data formatting
  - Numbers, dates, times, currencies
  - Phone numbers, postal addresses
- Regional and personal substitutions
  - Measurements
  - Honorifics and personal titles
- Multimedia considerations
Messages, labels on GUI components
Resource bundles nicely handle language-dependent texts. The strategy is either to load all strings at once into a ResourceBundle subclass or to retrieve them individually. The Eclipse Java Development Tooling (JDT) in V2.0 provides wizards to support the detection of translatable strings. We'll return to them shortly.
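The two strategies can be sketched as follows. This is a minimal illustration, not code from the Eclipse Platform: the DemoMessages bundle, its keys, and the helper names are hypothetical. A ListResourceBundle subclass holds all strings in memory, and callers either copy everything out once or fetch single keys on demand.

```java
import java.util.HashMap;
import java.util.ListResourceBundle;
import java.util.Map;
import java.util.ResourceBundle;

class BundleStrategies {
    // Strategy 1: copy every string out of the bundle in one pass.
    static Map<String, String> loadAll(ResourceBundle bundle) {
        Map<String, String> all = new HashMap<>();
        for (String key : bundle.keySet()) {
            all.put(key, bundle.getString(key));
        }
        return all;
    }

    // Strategy 2: retrieve one string at the point of use.
    static String fromSubclass(String key) {
        return new DemoMessages().getString(key);
    }

    public static void main(String[] args) {
        System.out.println(loadAll(new DemoMessages()));
        System.out.println(fromSubclass("greeting"));
    }
}

// Hypothetical ResourceBundle subclass that keeps its strings in code.
class DemoMessages extends ListResourceBundle {
    @Override
    protected Object[][] getContents() {
        return new Object[][] {
            { "greeting", "Hello." },
            { "farewell", "Goodbye." },
        };
    }
}
```

Loading everything at once trades memory for fewer lookups; fetching individually keeps the call site tied to a visible key, which is the style the Externalize Strings wizard generates.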
Loading translated strings into memory is only the first step. The next step is to pass them to the appropriate controls for display (such as a label, text field, menu choice, etc.). The page designer and programmer must work together to ensure that the chosen layout allows for appropriate resizing and reflowing of the dialog. The layout support in the Standard Widget Toolkit (SWT) library relies heavily on the programmer to "do the right thing" by specifying layout descriptions that react appropriately to changes in field sizes. The Eclipse.org article "Understanding Layout in SWT" (see Resources) covers the implementation issues in detail.
This is particularly important because text length increases during translation. English phrases are often shorter than their equivalent translations, usually on the order of 40 percent. Font sizes also may need to be modified to accommodate the local language.
Online help (*.html), Plug-in manifest (plugin.xml)
These forms of text content are more involved than simple key/value-oriented properties files, so the steps to their externalization are slightly more complex.
In the case of the manifest file, it is coupled with a similarly named property file, plugin.properties, containing only the externalized text. Special care must be taken with manifest files like plugin.xml and fragment.xml, since the attributes of the tags can contain both translated and untranslated text. Consider the benign example below.
Listing 1. Plug-in manifest file before translation
<plugin
name="Java Development Tools UI"
id="org.eclipse.jdt.ui"
version="2.0.0"
provider-name="Object Technology International, Inc."
class="org.eclipse.jdt.internal.ui.JavaPlugin">
Here we see a mix of translatable text, untranslatable text, and "gray area" translatable text, all as tag attributes. Clearly, the id and class attributes are not translatable, since they represent programming identifiers. It is equally certain that the name attribute should be translated.
You might be tempted to consider the version attribute (because of the locale-dependent decimal separator) or the provider-name attribute (because of the locale-dependent legal attribution "Inc.") as candidates for translation, since they will be displayed to the end user. However, version numbers are generally left untranslated for two reasons: End users attribute little meaning to their numeric value, and programmers sometimes write code that expects version numbers to be a composed string like "3.5.4". It is arguably a better design decision that the version information be stored as separate numbers like major, minor, and service update to avoid the need to parse a version string, but that discussion is beyond the scope of this article.
The provider name is left untranslated as well, since "Inc." has legal meaning that defies accurate translation. After identifying what text needs to be externalized, our example now looks like Listing 2.
Listing 2. Plug-in manifest file after translation
<plugin
name="%plugin.name"
id="org.eclipse.jdt.ui"
version="2.0.0"
provider-name="Object Technology International, Inc."
class="org.eclipse.jdt.internal.ui.JavaPlugin">
plugin.properties contains the externalized string, "Java Development Tools UI" associated with the key plugin.name.
This simple example demonstrates that translating isn't simply providing equivalent words or phrases for your text. It also involves an understanding of the local cultural considerations and potential legal impacts. This is why a translation professional is necessary, as well as translation verification testing.
Numbers, dates, times, currencies
The Java library includes classes that handle the necessary formatting for numbers (decimal separator, thousands separator, grouping), dates (MDY, DMY, first day of work week), times (12- or 24-hour format, separator), and currencies (local symbol, shown as suffix or prefix, leading separator or none).
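As a brief sketch of those classes in use, the following formats the same values for several locales with java.text.NumberFormat and java.text.DateFormat. The class and method names are illustrative only, and the exact currency and date output depends on the locale data shipped with your Java runtime.

```java
import java.text.DateFormat;
import java.text.NumberFormat;
import java.util.Date;
import java.util.Locale;

class LocaleFormattingDemo {
    // Decimal and grouping separators follow the locale.
    static String formatNumber(double value, Locale locale) {
        return NumberFormat.getNumberInstance(locale).format(value);
    }

    // Currency symbol and its position follow the locale.
    static String formatCurrency(double value, Locale locale) {
        return NumberFormat.getCurrencyInstance(locale).format(value);
    }

    // Field order (MDY, DMY) and month names follow the locale.
    static String formatDate(Date date, Locale locale) {
        return DateFormat.getDateInstance(DateFormat.LONG, locale).format(date);
    }

    public static void main(String[] args) {
        Date now = new Date();
        Locale[] locales = { Locale.US, Locale.GERMANY, Locale.FRANCE };
        for (Locale locale : locales) {
            System.out.println(locale
                + " | " + formatNumber(1234567.891, locale)
                + " | " + formatCurrency(1234.5, locale)
                + " | " + formatDate(now, locale));
        }
    }
}
```

The point of the sketch is that no separator, symbol, or field order appears in the calling code; only the Locale changes.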
Phone numbers, postal addresses
These are more subtle and less common text translation concerns, but still noteworthy. Many applications simply allow free-format entry of phone numbers since there are so many local variations. Postal addresses are straightforward: Adding a "State/Province" field and allowing for multiple address lines is generally sufficient.
Regional and personal substitutions
Honorifics and personal titles
Though less common in the United States, the proper enablement of honorifics (Mr., Mrs., Dr.) is considered absolutely necessary elsewhere to avoid a serious breach of etiquette.
Measurements
These are less frequently encountered. This involves substitution of measurement indications with corresponding conversion (for example, miles vs. kilometers). In many cases, users will need simultaneous display of a measure in different units, or an easy way of toggling between them.
Multimedia considerations
In general, products should select regionally neutral sounds, colors, graphics, and icons.
This means no Homer Simpson "D'oh!" sound associated with error messages. If you're thinking that no serious development organization would do such a thing, consider one icon that was proposed and rejected, typical of its kind.
The developer wanted to convey a metaphor for "IP router" by using a symbol harkening back to a national highway that traversed the United States from Chicago to Los Angeles, called Route 66. Most Americans would find this metaphor obscure; imagine the confusion of the hapless non-U.S. user.
Similarly, the image below may be intuitive to many North American users: 
But in recognition studies, others from outside the United States have guessed that this is a birdhouse. This is the more universally accepted image for mail:
To avoid confusing and potentially offensive visuals, the best course is to engage professional graphic artists who are aware of cultural issues.
Now let's turn to the actual steps for internationalizing your Eclipse plug-in:
- Move translatable strings into *.properties files
- Separate presentation-dependent parameters
- Use proper locale-sensitive data formatting, substitutions APIs
- Test in domestic language
- Create initial translated plug-in fragment
- Prepare and send domestic language materials for translation
- Repackage and validate translated materials
- Deploy fragments
We'll discuss each of these steps in detail.
Step 1. Move translatable strings into *.properties files
Fortunately, Eclipse's Java Development Tooling provides considerable help to properly separate translatable strings. The Find Strings to Externalize choice from a package context menu displays the Externalize Strings wizard. This wizard will lead you through the steps to locate hardcoded strings in your code, classify them as translatable or not, then modify the code to use a resource bundle where appropriate.
We'll introduce the Externalize Strings wizard with an example. Here is the canonical HelloWorld before using the wizard:
Listing 3. HelloWorld before translation
package com.jumpstart.nls.example.helloworld;

public class HelloWorld {
    public static void main(String[] args) {
        System.out.println("Copyright (c) IBM Corp. 2002.");
        System.out.println("Hello.");
        System.out.println("How are you?");
        System.out.println("Goodbye.");
    }
}
Selecting Find Strings to Externalize from the package context menu will display a selection dialog of all the compilation units in the package that may contain translatable text. Since we only have one Java file at this time, selecting Externalize Strings directly from the HelloWorld.java file is easier. It displays the Externalize Strings wizard:
Figure 1. Externalize Strings wizard

By selecting an entry from the table and one of the buttons to the right, you can mark the strings as belonging to one of three cases:
Translate
Action: An entry is added in the properties file; the autogenerated key and access code are substituted in the code for the original string. The string used to specify the key is marked as nontranslatable with a comment marker, such as "// $NON-NLS-1$"
Never Translate
Action: The string is marked as nontranslatable with a comment marker. The Externalize Strings wizard will not flag it as untranslated in subsequent executions.
Skip
Action: Nothing is modified. Subsequent executions of the Externalize Strings wizard will flag the string as potentially translatable.
Note: The trailing number in the // $NON-NLS-1$ comment marker indicates which string is not to be translated in the case where there are several strings on a single line. For example:
x.foo("Date", "TOD", "Time"); // $NON-NLS-2$
Here, the middle parameter is flagged as non-NLS. The other two are skipped.
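To show the numbering with more than one marker on a line, here is a small hypothetical class (the describe helper and its arguments are invented for illustration). The markers are written without a space after the slashes, the same form used in the generated accessor class later in this article.

```java
class NonNlsMarkerDemo {
    // Hypothetical helper: the first and third arguments are internal
    // identifiers, the second is text shown to the user.
    static String describe(String fieldId, String label, String formatId) {
        return fieldId + '/' + formatId + ": " + label; //$NON-NLS-1$
    }

    public static void main(String[] args) {
        // Three string literals on one line. Markers 1 and 3 flag "date" and
        // "short" as never translated; the second literal carries no marker,
        // so the wizard will still offer it for externalization.
        System.out.println(describe("date", "Date of birth", "short")); //$NON-NLS-1$ //$NON-NLS-3$
    }
}
```

The number in each marker counts string literals from the left on that line, so markers can appear in any order after the statement.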
Returning to our example, note that the total number of strings for each category is summarized below the list. The key names of the externalized strings are autogenerated based on the string value, but they can be renamed directly in the table. In addition, an optional prefix can be specified (S_ in the example below).
Figure 2. Externalize Strings wizard

Hint: Clicking the icon in the first column of a given row will advance to the next choice: Translate, Never Translate, or Skip.
Now that we've identified what strings are translatable, continue to the next step to choose how they will be externalized. Here's the page displayed after selecting Next; the Property file name and resource bundle accessor Class name were modified to more specific values than the defaults.
Figure 3. Externalize Strings wizard

The resource bundle accessor class will contain code to load the properties file and a static method to fetch strings from the file. The wizard will generate this class or you can specify your own existing alternative implementation. In the latter case, you may want to uncheck the Use default substitution choice and specify an alternative code pattern for retrieving externalized strings. If the accessor class is outside of the package (for example, a centralized resource bundle accessor class), you can optionally specify that you want to Add [an] import declaration to the underlying source.
The Externalize Strings wizard uses the JDT Refactoring framework, so the next two pages should look familiar. First, a list of warnings.
Figure 4. Externalize Strings wizard

And a side-by-side presentation of the proposed changes:
Figure 5. Externalize Strings wizard

Once you select Finish, the wizard performs the source code modifications, creates the resource bundle accessor class and generates the initial properties file. Here is the code for the standard resource bundle accessor class:
Listing 4. Standard resource bundle accessor class
package com.jumpstart.nls.example.helloworld;

import java.util.MissingResourceException;
import java.util.ResourceBundle;

public class HelloWorldMessages {
    private static final String BUNDLE_NAME =
        "com.jumpstart.nls.example.helloworld.HelloWorld"; //$NON-NLS-1$
    private static final ResourceBundle RESOURCE_BUNDLE =
        ResourceBundle.getBundle(BUNDLE_NAME);

    private HelloWorldMessages() {}

    public static String getString(String key) {
        try {
            return RESOURCE_BUNDLE.getString(key);
        } catch (MissingResourceException e) {
            return '!' + key + '!';
        }
    }
}
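For orientation, here is a rough sketch of how the HelloWorld code and its properties text might look after the wizard finishes. It is not the wizard's literal output: the key names are assumptions based on the S_ prefix chosen earlier, the copyright line is assumed to have been marked Never Translate, and the properties text is inlined through a PropertyResourceBundle so the sketch runs without a separate HelloWorld.properties file.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.MissingResourceException;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

class ExternalizedHelloWorld {
    // Stand-in for the contents of HelloWorld.properties (assumed key names).
    private static final String PROPERTIES_TEXT =
        "S_Hello=Hello.\n"
        + "S_How_are_you=How are you?\n"
        + "S_Goodbye=Goodbye.\n";

    private static final ResourceBundle RESOURCE_BUNDLE = load();

    private static ResourceBundle load() {
        try {
            return new PropertyResourceBundle(new StringReader(PROPERTIES_TEXT));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Same contract as the generated accessor: a missing key comes back
    // wrapped in exclamation marks instead of throwing.
    static String getString(String key) {
        try {
            return RESOURCE_BUNDLE.getString(key);
        } catch (MissingResourceException e) {
            return '!' + key + '!';
        }
    }

    public static void main(String[] args) {
        System.out.println("Copyright (c) IBM Corp. 2002."); //$NON-NLS-1$
        System.out.println(getString("S_Hello")); //$NON-NLS-1$
        System.out.println(getString("S_How_are_you")); //$NON-NLS-1$
        System.out.println(getString("S_Goodbye")); //$NON-NLS-1$
    }
}
```

Note that the keys themselves carry non-NLS markers: they are string literals too, and must not be offered for translation on the next wizard run.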
The only variation in this generated code is the value assigned to the static final BUNDLE_NAME. Before we continue, check out some noteworthy guidelines contributed by Erich Gamma and Thomas Mäder of the JDT team.
Guidelines for managing resource bundles and properties files
These guidelines are designed to:
- Reduce the number of NLS errors -- in other words, the values of externalized strings that are not found at runtime.
- Enable cross-referencing between the keys referenced in the code and the keys defined in the properties file.
- Simplify the management of the externalized strings. Using a centralized property file can result in frequent change conflicts. In addition, it requires the use of prefixes to make keys unique and complicates the management of the keys.
To achieve these goals, we propose the following guidelines:
- Use a properties file per package, and qualify the keys by class name
For example, all the strings for the JDT search component are in SearchMessages.properties, with key/value pairs like:
SearchPage.expression.pattern=(? = any character, * = any string)
ShowTypeHierarchyAction.selectionDialog.title=Show in Type Hierarchy
- Use a dedicated static resource bundle accessor class
Let the Externalize Strings wizard generate this class. It should be named like the properties file. So in our example, it would be called SearchMessages. When you need to create formatted strings, add the convenience methods to the bundle accessor class. For example:
Listing 5. Static resource bundle accessor class
public static String getFormattedString(String key, Object arg) {
    String format = null;
    try {
        format = RESOURCE_BUNDLE.getString(key);
    } catch (MissingResourceException e) {
        return "!" + key + "!"; //$NON-NLS-2$ //$NON-NLS-1$
    }
    if (arg == null)
        arg = ""; //$NON-NLS-1$
    return MessageFormat.format(format, new Object[] { arg });
}

public static String getFormattedString(String key, String[] args) {
    return MessageFormat.format(RESOURCE_BUNDLE.getString(key), args);
}
- Do not use computed keys
There is no easy way to correlate a computed key in the code with the key in the properties file. In particular, it is almost impossible to determine whether a key is no longer in use.
- The convention for the key name is <classname>.<qualifier>
Example: PackageExplorerPart.title
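The formatted-string convenience methods in these guidelines rest on java.text.MessageFormat. The short sketch below shows the substitution on its own; the patterns and the FormattedMessageDemo class are made up for illustration and stand in for values that would normally come from a properties file.

```java
import java.text.MessageFormat;

class FormattedMessageDemo {
    // Substitutes the arguments into the {0}, {1}, ... placeholders.
    static String format(String pattern, Object... args) {
        return MessageFormat.format(pattern, args);
    }

    public static void main(String[] args) {
        // Placeholders let translators reorder the arguments freely.
        System.out.println(format("Found {0} matches in {1}", "3", "Foo.java"));
        System.out.println(format("In {1}: {0} matches", "3", "Foo.java"));

        // A literal apostrophe must be doubled inside a MessageFormat pattern.
        System.out.println(format("Can''t open {0}", "Foo.java"));
    }
}
```

Because word order differs between languages, composing a sentence from placeholders is safer than concatenating translated fragments in code.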
Step 2. Separate presentation-dependent parameters
Not all externalized text is simply words and phrases that will be translated to a target language. Some are more specifically related to your plug-in's implementation. Examples include properties, preferences, and default dialog settings.
Here are a few specific examples that might find their way into a properties file:
- Size or layout constraints -- For example, the appropriate width of a nonresizable table column.
- Default fonts that are dependent on the language or operating system -- A good default font for Latin-1 languages is an invalid choice for DBCS languages.
For those plug-ins that subclass from AbstractUIPlugin, NL-related parameters can also be found in its default preference stores (pref_store.ini) and dialog settings (dialog_settings.xml). The Eclipse Workbench does not use default preference stores, opting instead to store defaults in properties files and initialize them via AbstractUIPlugin's initializeDefaultPreferences(IPreferenceStore) method.
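As one possible way to read such a presentation parameter, the sketch below pulls a column width out of a properties-backed bundle and falls back to a default when the key is missing or not a number. The key ResultsTable.nameColumn.width, the values, and the helper names are hypothetical, and the properties text is inlined so the example runs standalone.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.MissingResourceException;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

class PresentationParameters {
    // Reads an integer setting, returning the fallback if the key is absent
    // or its value cannot be parsed.
    static int getInt(ResourceBundle bundle, String key, int fallback) {
        try {
            return Integer.parseInt(bundle.getString(key).trim());
        } catch (MissingResourceException | NumberFormatException e) {
            return fallback;
        }
    }

    // Builds a bundle from inline properties text (stand-in for a file).
    static ResourceBundle parse(String propertiesText) {
        try {
            return new PropertyResourceBundle(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ResourceBundle english = parse("ResultsTable.nameColumn.width=120\n");
        ResourceBundle german = parse("ResultsTable.nameColumn.width=180\n");
        System.out.println(getInt(english, "ResultsTable.nameColumn.width", 100));
        System.out.println(getInt(german, "ResultsTable.nameColumn.width", 100));
    }
}
```

Keeping the width next to the translated label lets the translator widen the column when the translated heading needs more room.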
Step 3. Use proper locale-sensitive data formatting, substitutions APIs
Please refer to the detailed coverage in the Java Tutorial: Internationalization.
Step 4. Test in domestic language
Testing the readiness of a product for translation is nontrivial and beyond the scope of this article. However, Part 2 presents strategies for validating the NL-sensitive aspects of your product.
Step 5. Create initial translated plug-in fragment
At this point, we could simply copy our domestic language property files to similarly named files with locale-specific suffixes (for example, MyMessages_xx.properties, where xx is the language), and move to Step 6, Prepare and send domestic language materials for translation. In this case, the product is delivered with its code and whatever languages it supports as a single install.
However, this approach has a few drawbacks. First, the code and its NL resources are intermingled in the same directory or JAR file. If the translation lags the code delivery, the plug-in JAR file must be updated, despite the fact that the underlying code is unchanged. Second, files other than property files are not inherently locale-sensitive, so they must be segregated to separate directories for each language (for example, HTML, XML, images).
To address these issues, the Eclipse Platform introduces the notion of another reusable component that complements plug-ins: a plug-in fragment. A plug-in fragment provides additional functionality to its target plug-in. At runtime, these plug-in contributions are merged along with all dependent fragments. These contributions can include code contributions and contributions of resources associated with a plug-in, like property and HTML files. In other words, the plug-in has access to the fragment's contents via the plug-in's classloader.
How and why to use fragments to provide the translatable information
A plug-in fragment is an ideal way to distribute Eclipse-translated information, including HTML, XML, INI, and bitmap files. Delivering translations in a nonintrusive way, the Eclipse Platform translations are packaged in fragment JAR files and are added to existing Eclipse installations without changing or modifying any of the original runtime elements. This leads to the notion of a language pack.
The Eclipse Platform merges plug-in fragments in a way that the runtime elements in the fragment augment the original targeted plug-in. The target plug-in is not moved, removed, or modified in any way. Since the fragment's resources are located by the classloader, the plug-in developer has no need to know whether resources are loaded from the plug-in's JAR file or one of its fragments' JAR files.
The Java programming language supports the notion of a language pack with the resource bundle facility. The Java resource bundles do not require modification of the application code to support another language. The *.properties file namespace avoids collisions through the following naming convention: basename_lang_region_variant. At runtime, the ResourceBundle facility finds the appropriate properties file for the current locale.
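To make the naming convention concrete, the following sketch asks the standard ResourceBundle.Control for the file names it would consider for a requested locale, from most to least specific. The HelloWorld base name is taken from this article's example; the class and method names are illustrative. A full ResourceBundle.getBundle lookup may also try the JVM's default locale before settling on the base file, which this sketch leaves out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.ResourceBundle;

class BundleLookupOrder {
    // Lists the properties file names tried for the given locale, most
    // specific first, ending with the base (domestic language) file.
    static List<String> candidateFiles(String baseName, Locale locale) {
        ResourceBundle.Control control =
            ResourceBundle.Control.getControl(ResourceBundle.Control.FORMAT_PROPERTIES);
        List<String> names = new ArrayList<>();
        for (Locale candidate : control.getCandidateLocales(baseName, locale)) {
            names.add(control.toBundleName(baseName, candidate) + ".properties");
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(candidateFiles("HelloWorld", Locale.GERMANY));
        System.out.println(candidateFiles("HelloWorld", Locale.FRENCH));
    }
}
```

For Locale.GERMANY the candidates are HelloWorld_de_DE.properties, HelloWorld_de.properties, and HelloWorld.properties, so a single German file named with the _de suffix serves every German-speaking region.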
The approach to deploying files, such as HTML and XML files in fragments, is slightly different from Java resource bundles in that the Eclipse fragment uses a directory structure to sort out the different language versions.
Example fragment contents
The plug-ins and the plug-in fragments reside in separate subdirectories found immediately under the eclipse subdirectory. Looking at our example fragment, as deployed on a German system, we see an \nl folder, fragment.xml and an nl1.jar file.
Figure 6. Fragments subdirectories

Typically, translated *.properties files are suffixed according to the resource bundle rules and deployed in JAR files. In contrast, when a view needs an input file type whose name is not locale-sensitive like resource bundles (such as *.xml), we define a subdirectory structure for each language version of the file. The de subdirectory above is one such example, where de = German.
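The directory-based scheme can be pictured with a small path calculation. This sketch does not call any Eclipse API; it only assumes that a locale-specific file is searched for under nl/<language>/<country>, then nl/<language>, then the fragment or plug-in root, and the NlFolderLookup class is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class NlFolderLookup {
    // Returns the relative paths to try for a file, most specific first.
    // Assumed search order: nl/<lang>/<country>/, nl/<lang>/, then the root.
    static List<String> candidatePaths(String fileName, Locale locale) {
        List<String> paths = new ArrayList<>();
        String language = locale.getLanguage();
        String country = locale.getCountry();
        if (!language.isEmpty() && !country.isEmpty()) {
            paths.add("nl/" + language + "/" + country + "/" + fileName);
        }
        if (!language.isEmpty()) {
            paths.add("nl/" + language + "/" + fileName);
        }
        paths.add(fileName);
        return paths;
    }

    public static void main(String[] args) {
        System.out.println(candidatePaths("hello.xml", Locale.GERMANY));
    }
}
```

Under that assumption, a German hello.xml placed in nl/de is found for any German locale, while the untranslated file at the root remains the fallback.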
Fragment manifest
Each plug-in folder can optionally contain a fragment manifest file, fragment.xml. The manifest file describes the plug-in fragment, and is almost identical to the plug-in manifest (plugin.xml), with the following two exceptions:
- The class attribute is gone since fragments do not have a plug-in class. They just follow their target's specification.
- There are no dependencies because the fragments have the same dependencies as their target plug-in.
Manifests used to describe a national language fragment are typically quite simple, specifying only the <fragment> and <runtime>/<library> tags. Here's the example fragment manifest file in its entirety:
Listing 6. Fragment manifest file
<?xml version="1.0" encoding="UTF-8"?>
<fragment
   id="com.jumpstart.nls.example.helloworld.nl1"
   name="NLS Example Plugin NL Support"
   version="1.0.0"
   provider-name="IBM"
   plugin-id="com.jumpstart.nls.example"
   plugin-version="1.0.0">
   <runtime>
      <library name="nl1.jar"/>
      <library name="$nl$/"/>
   </runtime>
</fragment>
The <fragment> tag attributes are:
- name -- User-displayable name for the fragment.
- id -- Identifier for this fragment configuration; used to uniquely identify this fragment instance.
- plugin-id -- Identifier of the target plug-in. This plug-in fragment merges with this target plug-in.
- plugin-version -- Version of the target plug-in.
- version -- Version of the fragment, in major.minor.service format.
The <runtime> section contains a definition of one or more libraries that make up the plug-in fragment runtime. The referenced libraries are used by the platform execution mechanisms where the plug-in loads, merges, and executes the correct code required by the plug-in. Each <library> subtag has a name attribute. It specifies the library file or directory export mask. To include the contents of a folder as well as its subfolders in a library, use the mask $foldername$/ where foldername is the directory that is to be added to the library search path. We will see later how we use this mask to include the \nl folder's contents plus its subfolders' contents.
Building a fragment
The Eclipse Workbench comes with a tool used in plug-in development: the Plug-in Development Environment (PDE). The PDE contains support for developing plug-in fragments.
Let's examine how to build a fragment for NL translations using the PDE. There is no practical limit to the number of languages in a given fragment. The fragment then is the basis of our "Language Pack" containing one or more language translations. However, in this example, we'll confine our language pack to the German translation.
To build a plug-in fragment, start the New Project wizard (File > New > Project...), select the Plug-in Development category, then Fragment Project type. On the first page of the New Fragment Wizard, type the project name. Keep in mind that the project name will also become the fragment ID. For example, starting a project adding NL support to the HelloWorld example, we would name our project "com.jumpstart.nls.example.helloworld.nl1". The trailing ".nl1" is not required, but does help distinguish fragments that represent "language packs" from fragments that add code and functionality.
Figure 7. Starting a fragment project

Press Next. We see the default values for the project's source folder and runtime library on the second page:
Figure 8. Defining fragment folders

These values seem reasonable, so pressing Next again, we arrive at the Fragment Code Generators page. Select the second radio button to indicate we want to create the fragment manifest file from a template, then select the Default Fragment Generator wizard from the list.
Figure 9. Selecting the default wizard

After pressing Next, we see the Simple Fragment Content page. This page has two entries used to target our fragment on an existing plug-in. We must supply the plug-in target ID and version. We can use the Browse button to select the plug-in we want to extend.
Figure 10. Targeting the fragment

Now let's proceed to the fragment manifest editor, which is similar to the plug-in manifest editor in that it is a multipage editor with Overview, Runtime, Extensions, Extension Points, and Source pages.
Figure 11. Fragment manifest editor

Notice the tabbed pages corresponding to sections of the fragment.xml file. We will use the Runtime page to point the fragment classpath at the libraries containing our translations.
We specified the nl1.jar in the new fragment wizard, so that library is already included in the classpath of this fragment. What's missing is the inclusion of the \nl folder, plus its subfolders' contents. You can add new runtime libraries by selecting More from the Runtime Libraries section of the Overview page or by turning to the Runtime page, selecting New..., then entering the folder mask $foldername$/.
Figure 12. Fragment runtime information

Taking a look at the source page of the fragment manifest editor, we see that the PDE generates all the XML necessary to describe our plug-in fragment.
Figure 13. Fragment source

Step 6. Prepare and send domestic language materials for translation
Producing correct translations requires specific skills, which you must purchase. (Unfortunately, your four years of high school German classes do not qualify you.) There are many companies that will gladly produce professional-quality translations.
For the Eclipse Platform, this step was accomplished in two phases. The first phase involved sending all the externalized text to a translation center. This first-pass translation is done out of context. The translator does not see the running product, nor do they have product-specific experience. They have tools at their disposal to help speed the translations and ensure consistency. But ultimately, they rely on translation testers to validate the running product in the target language (the second phase).
The risk and consequences of performing an out-of-context translation, the results of which are sometimes quite amusing, are discussed in Part 2.
Step 7. Repackage and validate translated materials
Now having the translated files, we reassemble them in their appropriate directories/JAR files as described in Step 5. The NL1 Fragment folder contains language versions of the plugin.properties file. After translating the HelloWorld.properties file to German, we rename it to HelloWorld_de.properties and store it in the NL1 Fragment source folder. Note that the nl\de (German) folder is new and is created manually, not by the PDE. These language-specific folders segregate the versions of nonproperties files (such as hello.xml shown below) as we add translations over time.
Figure 14. Reassembled fragment project

Be aware that the translated properties files will very likely contain accented characters that are codepage-dependent, so properties files must be converted to the ISO 8859-1 codepage expected by the PropertyResourceBundle class. The native2ascii utility (see Resources) will handle codepage conversions and insert any necessary Unicode escapes.
The term Unicode escape deserves a bit more explanation. The native2ascii conversion utility, included with the Java SDK, accepts a source encoding and produces plain ASCII output, which is also valid ISO 8859-1. It transforms every character outside the ASCII range to the notation known as Unicode escapes. This notation is \udddd, where dddd is the four-digit hexadecimal codepoint of the desired character in Unicode.
Here's an example. Consider the French phrase "Son père est allé à l'hôtel" (his father went to the hotel). This contains four accented characters that are not part of the 7-bit ASCII character set. Transforming this phrase with the native2ascii utility yields:
Son p\u00e8re est all\u00e9 \u00e0 l'h\u00f4tel
There are no longer any accented characters, and the resulting string is composed entirely of characters having codepoints that are found in ISO 8859-1. But what are the \u00e8, \u00e9, \u00e0, and \u00f4 that were substituted? They are the Unicode codepoints of the accented characters in \udddd notation.
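The transformation itself is simple enough to sketch in a few lines. The code below is not the native2ascii implementation; it is a minimal stand-in that assumes every character above the 7-bit ASCII range is replaced by a lowercase four-digit escape, which matches the escapes shown for the French phrase. The class and method names are illustrative.

```java
class UnicodeEscapeDemo {
    // Replaces each character above 0x7F with a backslash, the letter u,
    // and the character's codepoint as four lowercase hex digits.
    static String escapeNonAscii(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c > 0x7F) {
                out.append(String.format("\\u%04x", (int) c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The literal is written with escapes so that this source file
        // stays pure ASCII; at runtime it holds the accented characters.
        String phrase = "Son p\u00e8re est all\u00e9 \u00e0 l'h\u00f4tel";
        System.out.println(escapeNonAscii(phrase));
    }
}
```

Running main prints the phrase with each accented character replaced by its escape, while the apostrophe and other ASCII characters pass through unchanged.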
A little caveat when using the native2ascii utility: It assumes that the source encoding is the same as the active codepage of the machine that executes it. However, translators typically save the translations in their default country codepage, and this codepage is different in each country and each operating system. So the person responsible for integrating the translations will need to either know which codepage the translators used to save their files, or ask that they save them in a common codepage. You can specify the source encoding when invoking native2ascii with the -encoding parameter.
Tip: If you are uncertain of the source codepage, you can spot-check the output of native2ascii against Table 3. Unicode codepoints of common accented Latin characters, later in this article. If you find \udddd notations in your converted files that are not in this table (such as \u0205), it is likely that you specified the incorrect source encoding. There is no equivalent spot-check for DBCS languages, where practically all the characters in the converted files are Unicode escapes. You simply have to be careful and validate against the running product.
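The effect of choosing the wrong source encoding can be reproduced in a short experiment. The sketch below is an illustration only: it takes the bytes of a word saved as UTF-8, decodes them once with the right charset and once with ISO 8859-1, and escapes the result using the same simplified rule as native2ascii (every character above 7-bit ASCII becomes an escape). The SourceEncodingCheck class and its method are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class SourceEncodingCheck {
    // Decodes the file bytes with the stated source encoding, then escapes
    // every character above 0x7F as a lowercase four-digit Unicode escape.
    static String decodeAndEscape(byte[] fileBytes, Charset sourceEncoding) {
        String text = new String(fileBytes, sourceEncoding);
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c > 0x7F) {
                out.append(String.format("\\u%04x", (int) c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Bytes as a translator's editor would write them when saving UTF-8.
        byte[] saved = "p\u00e8re".getBytes(StandardCharsets.UTF_8);

        // Correct source encoding: one escape for the accented letter.
        System.out.println(decodeAndEscape(saved, StandardCharsets.UTF_8));

        // Wrong source encoding: the two UTF-8 bytes are read as two
        // separate characters, so two unexpected escapes appear.
        System.out.println(decodeAndEscape(saved, StandardCharsets.ISO_8859_1));
    }
}
```

With the correct encoding the accented letter yields a single escape ending in 00e8; with the wrong one it yields two escapes, 00c3 and 00a8, neither of which is the intended letter. Unexpected escapes of that kind are the symptom the spot-check is meant to catch.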
Testing the translation merits its own article. Part 2 describes the process and lessons learned during the recent translation verification of the Eclipse Platform, and includes a view (an Eclipse plug-in, of course) for performing a quick check of property file translations.
Fragment sources, similar to plug-in sources, may be packaged in a JAR file. Using the PDE to generate the JAR package, select the "fragment.xml" file and choose Create Fragment JARs... from the pop-up menu. A wizard will guide you in creating a build script to produce all the required JARs for your fragment.
Figure 15. Selecting the fragment.xml file

To deploy this example fragment, copy the fragment.xml, the \nl directory, and JAR to the com.jumpstart.example.helloworld.nl1 subdirectory in the plugins directory. This completes our example and the steps for internationalization.
Internationalization of the Eclipse Platform (V1.0)
The focus of this section is not a how-to for Eclipse internationalization, but rather a review of the V1.0 implementation and its translation-related build processes. Some of these topics are quite specific to the platform itself and describe processes that are not relevant to those simply wishing to NL-enable their plug-ins to the platform. This section is aimed at those who wish to understand how the platform was NL-enabled and want to navigate the considerable number of files, subdirectories, and JARs that encompass the translated product.
The Eclipse Platform comes internationalization-ready, including the ability to enter double-byte characters and to support bidirectional data. It is available in the following languages:
- Brazilian Portuguese
- French
- German
- Italian
- Japanese
- Korean
- Simplified Chinese
- Spanish
- Traditional Chinese
These are collectively referred to within IBM® as the group-1 languages.
What and where are the various elements that were translated?
The Eclipse Platform relies on the internationalization framework provided in the Java SDK. Therefore, the bulk of the translatable text is located in *.properties files. But there are other file formats that include translatable information, such as HTML, XML, and bitmaps (summarized in Table 2. Eclipse-specific (non-Java) translatable resources).
Translation is an art, a balance between time, effort, and costs. It takes into consideration how quickly a product evolves and how quickly it needs to reach the market. There are always analyses and give-and-take decisions about what must be translated and what would be nice to translate.
In a nutshell, Eclipse concentrates its translation in three major areas:
- Plug-ins
- Online help
- Welcome view
The files involved are:
- *.properties files
- *.html files and their associated "wiring" files
- welcome.xml file
Eclipse plug-ins: .properties files
In Eclipse, each product plug-in contains properties files. Translating the plug-in means translating its properties files. There are two main steps in handling the properties files:
Font-related .properties files
Each operating system uses a different set of fonts, and some languages have their own set of fonts (most notably the DBCS languages). The operating system suggests a default font, but the page designer's optimal layout is tied to a specific font, so using the operating system default may result in poorly formatted pages. To ensure that translated text is displayed correctly, the Eclipse Workbench uses a particular set of fonts consistent with each language/operating system combination.
Note: Only those adding support for a new operating system or language will need to create new jfacefonts* property files. Never modify the contents of another plug-in's subdirectory, including the fonts specified in the JFace properties file that come with the Eclipse Platform.
The various Eclipse fonts are defined in the JFace plug-in. You can find the default Eclipse fonts.properties files in:
Location of the default Eclipse fonts.properties files
eclipse\plugins\org.eclipse.ui\workbench.jar
org\eclipse\jface\text\JFaceTextMessages.properties
org\eclipse\jface\resources\jfacefonts.properties (OS default)
org\eclipse\jface\resources\jfacefonts_linux.properties
org\eclipse\jface\resources\jfacefonts_windows2000.properties
org\eclipse\jface\resources\jfacefonts_windowsnt.properties
There are four org.eclipse.ui jfacefonts properties files for each language, all stored in the same NL fragment JAR, <eclipse_root>\fragments\org.eclipse.ui.nl1\nl1.jar.
For example, Italian has jfacefonts_it.properties, jfacefonts_linux_it.properties, jfacefonts_windowsnt_it.properties, and jfacefonts_windows2000_it.properties. Spanish has jfacefonts_es.properties, jfacefonts_linux_es.properties, jfacefonts_windowsnt_es.properties, and jfacefonts_windows2000_es.properties, and so on.
Add support for a new language by copying the base jfacefonts files above, adding a language suffix (such as jfacefonts_linux_xx.properties for language code "xx"), and inserting the new jfacefonts files into the NL fragment JAR.
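The naming rule can be sketched in a few lines of Java. This is only an illustration of the file-name pattern described above; the class FontFileNames and its method are hypothetical and are not part of JFace.

```java
import java.util.ArrayList;
import java.util.List;

/** Derives the names of the translated jfacefonts files for one language code. */
class FontFileNames {

    // The four base files listed in the location table above
    static final String[] BASE_FILES = {
        "jfacefonts.properties",
        "jfacefonts_linux.properties",
        "jfacefonts_windows2000.properties",
        "jfacefonts_windowsnt.properties"
    };

    /** Inserts "_" plus the language code in front of the .properties extension. */
    static String localizedName(String baseFile, String language) {
        String suffix = ".properties";
        if (!baseFile.endsWith(suffix)) {
            throw new IllegalArgumentException("Not a properties file: " + baseFile);
        }
        String stem = baseFile.substring(0, baseFile.length() - suffix.length());
        return stem + "_" + language + suffix;
    }

    /** Returns the full set of file names to create for a new language. */
    static List<String> filesFor(String language) {
        List<String> names = new ArrayList<>();
        for (String base : BASE_FILES) {
            names.add(localizedName(base, language));
        }
        return names;
    }

    public static void main(String[] args) {
        // Lists the four file names needed for Italian
        for (String name : filesFor("it")) {
            System.out.println(name);
        }
    }
}
```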
End-user messages.properties files
To translate the plug-ins' end-user messages:
- Translate the .properties files.
- Convert the properties files from the locally encoded codepage to the encoding expected by PropertyResourceBundle.
- Follow the Java resource naming convention to rename your files with the correct locale (for example, en = English, de = German, fr = French, it = Italian).
- Insert the properties file back into the source directory of the NL fragment JAR you created with the PDE.
The Java resource file naming conventions allow for a language, region, and variant (basename_language_region_variant). However, Eclipse specifies only the language in all but three cases (Portuguese, Chinese/China, and Chinese/Taiwan), which also carry a region:
<name>_<language>_<region>.properties
PropertyResourceBundle will automatically find the resource file corresponding to the current locale at runtime. For example, an English plugin.properties would initially be copied to plugin_de.properties for German in preparation for translation.
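To see which file names the Java SDK considers for a given locale, the following sketch lists the candidates in lookup order. It relies on ResourceBundle.Control, which was added in Java 6 and so is newer than the SDK the V1.0 platform was built on; the class name BundleCandidates is hypothetical. The real ResourceBundle.getBundle lookup can also fall back to the default locale, which this sketch leaves out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.ResourceBundle;

/** Lists the properties file names the SDK would try for one requested locale. */
class BundleCandidates {

    static List<String> candidateFiles(String baseName, Locale locale) {
        ResourceBundle.Control control =
            ResourceBundle.Control.getControl(ResourceBundle.Control.FORMAT_PROPERTIES);
        List<String> files = new ArrayList<>();
        // Candidates run from most specific to least specific, ending with the root bundle
        for (Locale candidate : control.getCandidateLocales(baseName, locale)) {
            files.add(control.toBundleName(baseName, candidate) + ".properties");
        }
        return files;
    }

    public static void main(String[] args) {
        // A language-only locale and a language-plus-region locale
        System.out.println(candidateFiles("plugin", Locale.GERMAN));
        System.out.println(candidateFiles("plugin", Locale.forLanguageTag("pt-BR")));
    }
}
```

For German, the candidates are plugin_de.properties followed by plugin.properties. For Brazilian Portuguese, plugin_pt_BR.properties is tried first, then plugin_pt.properties, then plugin.properties.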
Eclipse online help: .html and .properties files
Eclipse has five online help options. The base language documentation is delivered as plug-ins in their own subdirectories, and the translated versions are delivered as plug-in fragments. The "documentation" plug-ins are located as follows:
Table 1. Location of Eclipse documentation plug-ins
| Online help | Location |
| Workbench User Guide | eclipse\plugins\org.eclipse.platform.doc.user |
| Java Development User Guide | eclipse\plugins\org.eclipse.jdt.doc.user |
| Platform Plug-in Developer Guide | eclipse\plugins\org.eclipse.platform.doc.isv |
| JDT Plug-in Developer Guide | eclipse\plugins\org.eclipse.jdt.doc.isv |
| PDE Guide | eclipse\plugins\org.eclipse.pde.doc.user |
The plug-in manifest file, plugin.xml in each of the above directories, defines the help contributions. The necessary help infoset and wiring files are defined in the same directory. To translate the online help:
- Translate the doc.properties file of each online help.
- Convert the translated doc.properties from the locally encoded codepage and rename it according to the locale-specific resource file naming convention.
- Insert the properties file back into the source directory of the NL fragment JAR you created with the PDE.
- Translate the *.html files of the online help.
- Put the translated HTML into the corresponding plug-in fragment directory.
If you translate the online help into more than one language, you must create a language directory structure to hold the translated HTML for each language. Unlike property files, these files are not locale-sensitive and, thus, must keep their original file names and extensions. To differentiate, for example, the set of Japanese HTML files from the German set, you must save each set under a different directory name, one per language.
For example, the German set of Workbench User Guide *.html files is stored in eclipse\fragments\org.eclipse.platform.doc.user.nl1\nl\de. The Japanese set is in eclipse\fragments\org.eclipse.platform.doc.user.nl1\nl\ja.
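The directory convention can be expressed as a small path computation. The sketch below only illustrates the layout described above; it is not the platform's lookup code, and the class name NlDirectories and the file name toc.html are placeholders chosen for the example.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Computes where a translated, non-locale-sensitive file lives inside an NL fragment. */
class NlDirectories {

    /** The file keeps its original name; only the nl/<language> directory changes. */
    static Path translatedFile(Path fragmentRoot, String language, String fileName) {
        return fragmentRoot.resolve("nl").resolve(language).resolve(fileName);
    }

    public static void main(String[] args) {
        Path fragment = Paths.get("eclipse", "fragments", "org.eclipse.platform.doc.user.nl1");
        // Same file name, one directory per language
        System.out.println(translatedFile(fragment, "de", "toc.html"));
        System.out.println(translatedFile(fragment, "ja", "toc.html"));
    }
}
```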
Eclipse's Help > Welcome menu choice: welcome.xml file
The welcome.xml is the introductory view displayed when the Workbench first opens or when the user selects Help > Welcome. It is located in eclipse\plugins\org.eclipse.sdk_1.0.0.
Here are the steps to translate the welcome.xml file:
- Translate the welcome.xml file.
- Save the translation in UTF-8 format (using Notepad, for example). If you forget to save it in UTF-8, the XML parsing code will display an error when it encounters an accented character (that is, a byte sequence that is not valid UTF-8).
- Just like the .html files, if you translate the welcome.xml into many languages, you will need to save each welcome.xml under a locale-specific directory name. For example, the German welcome.xml is in eclipse\fragments\org.eclipse.sdk_1.0.0\nl\de, and the Japanese welcome.xml is in eclipse\fragments\org.eclipse.sdk_1.0.0\nl\ja.
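Because a wrongly encoded welcome.xml only fails when the Workbench parses it, it can be worth checking the encoding beforehand. The following sketch shows one way to test whether a byte array is valid UTF-8 using the standard CharsetDecoder API. The class name Utf8Check is hypothetical, and this check is not something the Eclipse Platform itself provides.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

/** Checks whether file contents were really saved as UTF-8. */
class Utf8Check {

    static boolean isValidUtf8(byte[] data) {
        // REPORT makes the decoder throw instead of silently substituting characters
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
            .onMalformedInput(CodingErrorAction.REPORT)
            .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(data));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // German text containing a u with dieresis
        String text = "<title>Einf\u00fchrung</title>";
        System.out.println(isValidUtf8(text.getBytes(StandardCharsets.UTF_8)));
        System.out.println(isValidUtf8(text.getBytes(StandardCharsets.ISO_8859_1)));
    }
}
```

The first call reports true and the second false, because the single byte that ISO 8859-1 uses for the accented character is not a legal UTF-8 sequence. In practice, the bytes would come from the translated file, for example through Files.readAllBytes.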
Enabling your product for the world market simply makes economic sense. And the steps above show that the process is relatively straightforward. Here's a short quiz:
The majority of IBM's worldwide software sales revenue is within the United States. True or False?
False. Indeed, more than 50 percent of IBM software revenue comes from outside the United States. Fortunately, those with products based on the Eclipse Platform benefit from having ready translations of the base product. All that is left is to follow the clear steps outlined in this article to open your Eclipse-based product to a worldwide market.
Eclipse-specific (non-Java) translatable resources
Here is a summary of the previously presented list of translatable resources, along with a brief explanation of how they are processed.
Table 2. Eclipse-specific (non-Java) translatable resources
| Translated items | Required or optional | High-level steps |
| Plug-in files | Required | |
| Plug-in "About" file | Optional | |
| Online help | Required | |
| Splash* | Optional |
To localize the splash screen, you will need to create locale subdirectories under eclipse/splash. The names of these directories follow the standard Java locale-naming conventions. For example, the platform looks up the splash screen for the U.S. English locale (en_US) as follows:
<image file> is the name of the splash file (for example, splash_full.bmp or splash_full.xpm). |
| Product configuration* | Optional | |
| Plug-in product files* | Required | |
| License* | Optional | |
For more information about translatable resources, see Resources.
Unicode codepoints of common accented Latin characters
Table 3. Unicode codepoints of common accented Latin characters
| Characters | |
| \u00e0 | a grave |
| \u00e1 | a acute |
| \u00c0 | A grave |
| \u00c1 | A acute |
| \u00c2 | A circumflex |
| \u00e2 | a circumflex |
| \u00c3 | A tilde |
| \u00e4 | a dieresis |
| \u00c4 | A dieresis |
| \u00e8 | e grave |
| \u00c8 | E grave |
| \u00e9 | e acute |
| \u00c9 | E acute |
| \u00ea | e circumflex |
| \u00eb | e dieresis |
| \u00cb | E dieresis |
| \u00ca | E circumflex |
| \u00ef | i dieresis |
| \u00ec | i grave |
| \u00ed | i acute |
| \u00cc | I grave |
| \u00cd | I acute |
| \u00ee | i circumflex |
| \u00ce | I circumflex |
| \u00f6 | o dieresis |
| \u00d6 | O dieresis |
| \u00e3 | a tilde |
| \u00f4 | o circumflex |
| \u00d4 | O circumflex |
| \u00f2 | o grave |
| \u00d2 | O grave |
| \u00f3 | o acute |
| \u00d3 | O acute |
| \u00f5 | o tilde |
| \u00d5 | O tilde |
| \u00f1 | n tilde |
| \u00d1 | N tilde |
| \u00f9 | u grave |
| \u00d9 | U grave |
| \u00fa | u acute |
| \u00da | U acute |
| \u00fb | u circumflex |
| \u00db | U circumflex |
| \u00fc | u dieresis |
| \u00dc | U dieresis |
| \u00df | s sharp |
| Special symbols | |
| \u00ba | masculine ordinal indicator |
| \u00a7 | section sign |
| \u00aa | feminine ordinal indicator |
| \u00ac | not sign |
| \u00b9 | 1 superscript |
| \u00b2 | 2 superscript |
| \u00b3 | 3 superscript |
| \u00a3 | pound sign |
| \u00a2 | cents sign |
| \u00b0 | degree sign |
Glossary
Codepoint
Characters can be represented by one or more bytes of information. Codepoints are the hexadecimal values assigned to each graphic character.
Codepage
A codepage is a specification of codepoints for each graphic character in a set or in a collection of graphic character sets. Within a given codepage, a codepoint can have only one specific meaning. You can display the active codepage on the Windows® operating system with the CHCP command (only one codepage is active at any given moment).
Encoding
The codepage associated with a given piece of data. A file is said to be encoded in a given codepage. For example, Notepad will encode its data in codepage 1252 (the Windows ANSI codepage) on a U.S. English machine by default. The Save As dialog allows the user to select several other possible encodings, most notably Unicode and UTF-8.
Internationalization
Internationalization refers to the process of developing programs without prior knowledge of the language, cultural data, or character encoding schemes they're expected to handle. In system terms, it refers to the provision of interfaces that enable internationalized programs to change their behavior at runtime for specific language operation.
Single-Byte Coded Character Set (SBCS)
In an SBCS, a one-byte codepoint represents each character in the set. Typically, SBCS is used to represent the characters of the European languages, the Cyrillic languages, English, Arabic, and Hebrew, to name a few.
Double-Byte Coded Character Set (DBCS)
In a DBCS, a two-byte codepoint represents each character in the set. Languages that are ideographic in nature, such as Japanese, Chinese, and Korean, have more characters than can be represented internally by 256 codepoints and, thus, require double-byte coded character sets.
Localization (sometimes abbreviated "L10N")
Localization refers to the process of establishing information within a computer system specific to each supported language, cultural data, and coded character set combination.
Mixed-Byte Character Set (MBCS)
This is a set of characters containing both single-byte and double-byte characters. In an MBCS, each byte of data must be examined to see whether it is a single-byte character or the first byte of a double-byte character. If the byte is in a certain range (greater than X'80', for example), then it is the first byte of a double-byte character.
NLS
National Language Support.
Unicode
Directly from Unicode.org: "Unicode provides a unique number for every character, no matter what the platform, no matter what the program, no matter what the language."
Note: While it is true that Java text manipulation classes are Unicode-centric, this is often not the case for data stored outside of your program's auspices. Java programmers must take into consideration the data encoding by performing local codepage-to-Unicode transformations where necessary.
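As a minimal illustration of the codepage-to-Unicode transformation mentioned in the note, the sketch below decodes bytes from a known source codepage into a Java (Unicode) string and re-encodes them in a target codepage. The class name CodepageTranscoder is hypothetical, and the sketch assumes the source codepage is known; it cannot be detected reliably from the bytes alone.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Converts data from one codepage to another by way of Unicode. */
class CodepageTranscoder {

    static byte[] transcode(byte[] data, Charset sourceCodepage, Charset targetCodepage) {
        // Step 1: local codepage to Unicode (a Java String is Unicode internally)
        String unicodeText = new String(data, sourceCodepage);
        // Step 2: Unicode to the target codepage
        return unicodeText.getBytes(targetCodepage);
    }

    public static void main(String[] args) {
        // The single byte 0xE9 is "e acute" in ISO 8859-1
        byte[] latin1 = {(byte) 0xE9};
        byte[] utf8 = transcode(latin1, StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8);
        // UTF-8 needs two bytes for the same character
        System.out.println(Arrays.toString(utf8));
    }
}
```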
Resources
Learn
- Read Part 2 of this series.
- The Eclipse Platform adopts the internationalization implementation provided with the Java SDK, so it's helpful to read the Java Tutorial: Internationalization.
- "Understanding Layouts in SWT" covers implementation issues in detail.
- Also read about "Creating product branding" for more information on translatable resources.
- For an introduction to the Eclipse Platform and how it operates, read "Working the Eclipse Platform."
- Visit Eclipse.org for more information about Eclipse.
- Expand your Eclipse skills by visiting IBM developerWorks' Eclipse project resources.
- Browse all of the Eclipse content on developerWorks.
- Visit the developerWorks Open source zone for extensive how-to information, tools, and project updates to help you develop with open source technologies and use them with IBM's products.
- Stay current with developerWorks technical events and webcasts.
Get products and technologies
- Check out the native2ascii utility (included with the Java SDK), which handles codepage conversions and inserts any necessary Unicode escapes.
- Download the RBManager, which automates the tedious tasks associated with creating, updating, and managing resource bundle files.
- Check out the latest Eclipse technology downloads at IBM alphaWorks.
- Innovate your next open source development project with IBM trial software, available for download or on DVD.
Discuss
- Get involved in the developerWorks community by participating in developerWorks blogs.
Dan Kehn is a senior software engineer at IBM in Research Triangle Park, N.C. His interests in object-oriented programming go back to 1985, long before it enjoyed the acceptance it has today. He has a broad range of software experience, having worked on development tools like VisualAge for Smalltalk, operating system performance and memory analysis, and UI design. He worked as a consultant for object-oriented development projects throughout the United States, as well as four years in Europe. His recent interests include object-oriented analysis/design, programming tools, and Web programming with the WebSphere Application Server. Then he joined the Eclipse Jumpstart team, whose primary goal is to help ISVs to create commercial offerings based on the Eclipse Platform. In another life, he authored several articles about diverse Smalltalk topics, like meta-programming, team development, and memory analysis.
Scott Fairbrother works on the Eclipse Jumpstart team at IBM in Research Triangle Park, N.C. He is a software developer with more than 20 years of experience. He has developed object-oriented application frameworks for business process management, written specifications for IBM middleware on Windows 2000, and has authored on the subject of Microsoft Visual Studio .NET.
Cam-Thu Le joined IBM in 1983, with experience spanning many aspects of software creation: development, testing, and National Language Support (NLS) planning and coordination. Le has led the National Language versions of IBM products to worldwide concurrent general availability, including the 4690 Point of Sales product and VisualAge for Smalltalk. Le joined the Eclipse Project as the NLS focal point, and coordinated the building and testing of the NL versions of the Eclipse Workbench and WebSphere Studio Workbench.





