Many projects are unable to fulfill globalization (G11N) requirements for language, cultural, or multilingual support, and their product messages are not translatable at delivery time. 1 When the global market needs the latest product for on demand business, it is difficult to change the existing product infrastructure and make the code changes that would address globalization issues. This happens in traditional waterfall development projects, and it is even riskier for agile development projects if the globalization requirements are not properly planned and addressed during the Inception stage.
Globalization requirements are usually not considered high priority during the planning and implementation phases of the software lifecycle. Late globalization changes in code make it difficult to correct the programming model and cause problems in updating the code. This also leads to high cost due to schedule delays, development workload, and extra test effort, and can result in lower quality delivery for globalization support.
The following recommendations can help teams deliver globalization requirements for software more effectively and efficiently:
- Prioritize globalization requirements for iterative and incremental development projects
- Provide Unicode, translatability, and multi-cultural support in the Inception phase
- Globalize products by reusing components and examples
- Identify major globalization problems and assist developers to resolve problems for early defect removal
- Use Web 2.0 technology to speed up collaborative communication for stakeholders
This article details each of these recommendations.
While placing the highest value on functioning, proven, stakeholder-valued capabilities for agile development, IT organizations typically assign globalization requirements a lower priority. This is usually because many projects are initiated for English-speaking customers and the development teams think globalization is just a matter of translating an English version of a product into another language. They believe it is more efficient to deliver the first product release to English-speaking countries rapidly and then internationalize the product in the next release or when they have international customers. They are not aware that thinking internationally and winning global business requires globalization to be planned and designed into the product from the very beginning of iterative and incremental development projects (see Figure 1). Global thinking affects software architecture.
Figure 1. Earlier involvement of globalization efforts is critical
Learning about globalization before planning and designing for agile development projects is the key to project success. Globalization requires more than simply "translating" the English products. According to the IBM glossary of globalization terms:
Globalization is the process of developing, manufacturing, and marketing software products that are intended for worldwide distribution. It is the design and execution of systems, software, services, and procedures so that one instance of software, executing on a single server or user machine, can process multilingual data that is culturally correct in a multicultural environment, such as the Internet. Presentation of data should include these capabilities: allowing each individual user to select a language for the user interface that may differ from the language of the data that is being processed and presenting information, such as dates and numbers, that is culturally correct for each user even if they are from different regions.
For example, a date of 01/02/03 may be interpreted as 2001 February 3 in Japan, January 2, 2003 in the US, or 1 February 2003 in France. The language support, cultural support, and multilingual support of the applications are even more critical for e-business applications, because most Internet users are from non-US countries (see Figure 2). 2
Figure 2. Most Internet users are from non-US regions. Source: www.internetworldstats.com
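The date ambiguity described above can be reproduced in a few lines. The following is a minimal sketch, assuming Python's standard datetime module; the FIELD_ORDER table and the interpret function are hypothetical names introduced here for illustration, and a real product would take its date patterns from locale data rather than from a hand-written table.

```python
from datetime import datetime

# Hypothetical mapping from a region label to the field order assumed there.
# Real products should take these patterns from locale data (for example,
# CLDR data exposed through ICU), not from a hand-written table like this.
FIELD_ORDER = {
    "Japan (year/month/day)": "%y/%m/%d",
    "US (month/day/year)": "%m/%d/%y",
    "France (day/month/year)": "%d/%m/%y",
}


def interpret(text):
    """Parse the same string under each regional field order."""
    return {
        region: datetime.strptime(text, pattern).date()
        for region, pattern in FIELD_ORDER.items()
    }


# The same eight characters yield three different calendar dates.
for region, value in interpret("01/02/03").items():
    print(f"{region}: {value.isoformat()}")
```

The point of the sketch is that the string alone carries no locale information, so the application has to know each user's cultural conventions before it can read or display a date correctly.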
IBM has provided customers with globally enabled solutions and constructed architectural principles to meet the growing demand for global application design. As shown in Figure 3, the scope of globalization is much wider than only translating English products to other languages. Although late changes in requirements can be accommodated in iterative and incremental development projects, it is crucial to have a good design for globalization architecture from the very beginning. While synthesizing a candidate architecture and preparing the environment for the project in the Inception phase, it is necessary to plan and design key globalization ingredients like the use of Unicode, translatability, and multi-cultural support. Reusable components/APIs such as International Components for Unicode (ICU) should be considered in order to improve product quality and to estimate the cost, schedule, and resources. During the Elaboration phase, globalization experts and testers should be involved to validate the critical globalization architecture and evaluate whether the executable prototypes have addressed the major architecture issues. This is recommended even if the current customers are in the US only. If this is not done at the beginning of the project, you run the risk of high translation cost, extra test and development effort, and difficulty in supporting Unicode at a later date. In the Construction phase, the remaining minor globalization requirements, such as address or name format, can be continuously and incrementally enhanced. Last-minute surprises are minimized when globalization requirements are planned, designed, implemented, and tested appropriately.
Figure 3. Scope of globalization
During the Inception phase of iterative and incremental development planning, it is important to address complex globalization architectural requirements. Unicode support is a globalization requirement that should be given high priority from the beginning. Unicode is the universal character encoding scheme for written characters and text. Unicode-based programs are required for internationalizing software, especially when characters from different character sets need to be displayed on the same Web page simultaneously (see Figure 4) or when application users are in different countries. A Unicode database allows users to store multilingual data in a single database instead of creating one database for each language, which helps reduce database maintenance cost. Unicode support is also recommended by IBM's Global Architectural Imperative (GAI) for its better character handling and its support for normalization, collation, and conversion among the Unicode Transformation Formats. Furthermore, support for Unicode can extend application life and broaden integration possibilities for future needs. Text in any language can be exchanged worldwide, and legacy code pages can be easily converted.
Figure 4. Display characters from different scripts on the same Web page simultaneously
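Two of the Unicode capabilities mentioned above, conversion among the Unicode Transformation Formats and normalization, can be illustrated with a short sketch. It assumes only Python's standard library (the unicodedata module) rather than ICU, and the sample strings and function names are invented for illustration.

```python
import unicodedata

# One string mixing Latin, Japanese, and Greek text (sample data only).
MIXED = "Hello こんにちは Γειά σου"


def round_trips(text, encoding):
    """Return True if text survives an encode/decode cycle unchanged."""
    return text.encode(encoding).decode(encoding) == text


# Conversions among the Unicode Transformation Formats are lossless.
for utf in ("utf-8", "utf-16", "utf-32"):
    print(f"{utf}: round trip ok = {round_trips(MIXED, utf)}")

# Normalization: "é" can be stored as one code point, or as "e" followed by
# a combining accent. The two look identical but compare unequal until they
# are brought to the same normalization form.
composed = "\u00e9"      # LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # "e" + COMBINING ACUTE ACCENT

print(composed == decomposed)                                # False
print(unicodedata.normalize("NFC", decomposed) == composed)  # True
```

Normalizing text to one form before storing or comparing it is one reason a Unicode-aware library is worth adopting early, because retrofitting the step later means touching every place where strings are compared.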
The key to efficiency and correctness in a globalized system is to have a single executable that provides total support for all locales, as shown in Figure 5. The benefits are reduced cost, ease in distributing updates and fixes, faster time to market, and simplified support for new languages and cultures. Planning for this ahead of time can greatly reduce the overall development, testing, and maintenance costs in later stages. To keep developers from hard-coding strings and from truncating messages or dialogs in the user interface, it is essential to fulfill translatability requirements in the Inception phase. Translation is typically expensive, so translatability issues such as text rendering, room for text expansion, and the ability to select the language in a localization pack installer are important to settle at an early stage.
Figure 5. Single Source, Single Executable for all locales
The IBM globalization organization proposes using a localization pack manager to manage the location, loading, and accessing of localization pack resources. The localization pack manager acts as the intermediary between core program code and localization packs, and existing platform services provide basic support for many localization pack functions (see Figure 6). The file structures of localization packs need to be designed around locale names (e.g., en_US, fr_FR, ja_JP) so that the packs are accessible to the Localization Pack Manager Bean within the single executable.
Figure 6. Localization pack manager workflow
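The idea of organizing resources by locale name and resolving them at run time can be sketched as follows. This is not IBM's Localization Pack Manager; the PACKS dictionary, DEFAULT_LOCALE, candidate_locales, and get_message are hypothetical names, and the in-memory dictionary stands in for pack files that a real manager would locate and load by locale name.

```python
# Hypothetical in-memory "localization packs", keyed by locale name.
# In a real product these would be files laid out by locale name, and the
# manager would locate and load them on demand.
PACKS = {
    "en": {"greeting": "Hello", "farewell": "Goodbye"},
    "fr": {"greeting": "Bonjour", "farewell": "Au revoir"},
    "fr_CA": {"greeting": "Salut"},
    "ja_JP": {"greeting": "こんにちは"},
}
DEFAULT_LOCALE = "en"


def candidate_locales(locale):
    """Return the lookup chain, most specific locale first."""
    parts = locale.split("_")
    chain = ["_".join(parts[:n]) for n in range(len(parts), 0, -1)]
    if DEFAULT_LOCALE not in chain:
        chain.append(DEFAULT_LOCALE)
    return chain


def get_message(key, locale):
    """Look up a message, falling back from specific to general locales."""
    for name in candidate_locales(locale):
        pack = PACKS.get(name, {})
        if key in pack:
            return pack[key]
    raise KeyError(f"no message for key {key!r}")


print(candidate_locales("fr_CA"))        # ['fr_CA', 'fr', 'en']
print(get_message("greeting", "fr_CA"))  # Salut
print(get_message("farewell", "fr_CA"))  # Au revoir (falls back to fr)
print(get_message("farewell", "ja_JP"))  # Goodbye (falls back to en)
```

Because the core code only ever calls get_message with a key and a locale, adding a new language in this sketch means adding a pack, not changing or rebuilding the executable.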
Applying the right artifacts for agile modeling is critical to project success. Reusable APIs for multi-cultural support -- like International Components for Unicode (ICU) -- are available to developers on the ICU website 3 for rapid application development and better code quality. For example, there are API references for ICU4C, ICU4J, and ICU4JNI. ICU is a mature, portable set of C/C++ and Java libraries for Unicode support, software internationalization (I18N), and globalization (G11N). It provides cultural support such as multiple calendars, date/time formats, and complex text layout for bi-directional (BIDI), Indic, and Thai languages. These reusable components and APIs greatly help in reducing risk and strengthening solution quality and consumability.
Reusability is an important element in agile development. It saves cost and speeds code delivery. As mentioned in the previous section, ICU provides a mature, portable set of C/C++ and Java libraries for Unicode and globalization support in software application development. The C and C++ languages do not by themselves provide full support for Unicode and standards-compliant text handling services. To remedy this, the ICU4C libraries keep track of the industry standards (e.g., Unicode and the Common Locale Data Repository) and provide flexible and portable APIs to fulfill software globalization requirements. ICU4J is commonly used because Sun Java has a long release schedule and is infrequently updated for evolving standards. IBM and the ICU team contribute globalization technology to Sun's Java so that high-performance, richer APIs are available for customizable I18N/G11N and Unicode support. Using ICU saves considerable implementation effort and improves software product quality, which is why more and more companies are using ICU for their software development.
Figure 7: Dojo calendar datepicker widget
As you can see in Figure 7, you do not have to worry about the complex calendar algorithm anymore. All you need to know is how to specify the reusable component and customize it for your application use. Although those reusable components can still have some globalization issues, developers can enhance them based on the reported globalization defects and contribute the solutions to the Dojo toolkit.
Reusable APIs or components are not restricted to ICU or the Dojo toolkit. There are many other libraries or toolkits available. Organizations can leverage development resources across departments and product teams and share the artifacts with each other by using online communication such as wikis, forums, blogs, or RSS (Really Simple Syndication 2.0) subscriptions for more efficient coding and better quality of the globalized components. Usually, globalized APIs or components do not need to be implemented by individual teams; doing so simply duplicates development effort. Best coding practices, FAQs, or guidelines can be published and referenced among product development teams. One example of this kind of aid to application developers coding globally is the G11N Cookbook published by the IBM Globalization Center of Competency. It contains a collection of best coding practices, Unicode support for IBM WebSphere®, XML and database encoding configuration, FAQs, and so on (see Figure 8 for an XML sample). All these available resources and reusable components can lead to rapid software delivery and improve product quality.
Figure 8: XML sample in G11N Cookbook
The IBM Rational Unified Process (RUP®) proposes an iterative approach in which testing is performed throughout the whole development project. However, Globalization Verification Test (GVT) typically starts late in the Construction phase in order to reduce the cost of test resources. It is therefore very difficult and expensive to modify globalization architectures -- such as database Unicode encoding and file structures of resource bundles for localization packs -- at this late stage. This usually causes unsatisfactory delivery quality for globalization support or even results in failure to ship. Thus, globalization experts and testers must begin their collaboration early in the development cycle, and continuous involvement should be a requirement to help analyze and validate the globalization architecture, including Unicode and translatability, for early defect removal (see Figure 9).
Figure 9: Continuous globalization enhancement and quality improvement should be addressed as early as possible in the development lifecycle, ideally during the requirements gathering phase.
Projects initiated in the United States commonly use ISO 8859-1 (Latin-1 encoding) as the database encoding in their products. As a result, the database cannot support input or output of the double-byte characters used by the Japanese, Korean, and Chinese character sets. If globalization testers can address this oversight at the beginning of the project, the team can easily change the database encoding to Unicode, and programs will interact better with the database. By contrast, using the wrong database encoding and reporting the problem late in the Construction phase causes great complexity, because many programs need to be modified to change the database encoding at that point. Under these circumstances, the only remaining way for the applications to go global is to provide a database for each language so that each database supports a single character encoding. However, this solution makes the implementation and maintenance of the databases complex and costly.
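The encoding limitation described above is easy to demonstrate without a database. The following minimal sketch assumes that a column's character set behaves like the corresponding Python codec; the can_store helper and the sample strings are hypothetical and only illustrate which text fits in ISO 8859-1 and which needs Unicode.

```python
def can_store(text, encoding):
    """Return True if every character of text fits the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# Sample customer-name labels in several languages (illustrative data only).
SAMPLES = {
    "English": "Customer name",
    "French": "Nom du client : Hélène",
    "Japanese": "顧客名",
    "Korean": "고객 이름",
    "Chinese": "客户名称",
}

for language, text in SAMPLES.items():
    print(
        f"{language}: "
        f"iso-8859-1={can_store(text, 'iso-8859-1')} "
        f"utf-8={can_store(text, 'utf-8')}"
    )
```

Western European text fits in Latin-1, which is why the problem tends to stay hidden until the first Japanese, Korean, or Chinese data arrives. A check like this can run against representative data in the very first iteration.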
Another basic architectural concept for globalization is the use of a single executable for all languages. If the translatable messages are not isolated from programming logic, developers may make the mistake of hardcoding strings or implementing their own message files without following a standard convention. This will require a huge amount of code change for translatability support or a greatly increased translation cost later. IBM globalization verification test teams use pseudo translation to examine software without even translating messages.
An internal tool shows any occurrence of a hardcoded string, which would be impossible to translate. Hence, having globalization experts or globalization testers involved early to validate the Unicode encoding and translatability support is crucial to project success.
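Pseudo translation can be sketched in a few lines. This is not the internal IBM tool mentioned above; it is a minimal illustration of the general technique, and the accent map, the bracket markers, the 30 percent padding, and all function names are assumptions chosen for the example.

```python
# Hypothetical accent map; any visibly distinct substitution would work.
ACCENTED = str.maketrans("aeiouAEIOU", "àéîõüÀÉÎÕÜ")


def pseudo_translate(message, expansion=0.3):
    """Return a marked, accented, padded stand-in for a real translation."""
    padding = "~" * max(1, round(len(message) * expansion))
    return "[" + message.translate(ACCENTED) + padding + "]"


# Externalized messages pass through the pseudo-translation step.
RESOURCES = {"save": "Save file", "quit": "Quit"}
PSEUDO = {key: pseudo_translate(text) for key, text in RESOURCES.items()}


def render_screen():
    """Simulated UI: two externalized labels and one hardcoded string."""
    return [PSEUDO["save"], PSEUDO["quit"], "Are you sure?"]


def find_hardcoded(lines):
    """Flag text that was not routed through the message resources."""
    return [
        line for line in lines
        if not (line.startswith("[") and line.endswith("]"))
    ]


print(render_screen())
print(find_hardcoded(render_screen()))  # ['Are you sure?']
```

Any text that appears on screen without the brackets bypassed the message resources and is therefore hardcoded, while the padding exposes layouts that would truncate longer translations. A production tool would also need to protect named format placeholders, which this simple character substitution would corrupt.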
IBM's Globalization Vision is that "a user can access a server from anywhere in the world, using a client in the language of his or her choice, work with applications, and interact with other users in the language and cultural conventions of their choice" (see Figure 10). Globalization tasks involve not just programming and deployment skills, but also cultural, translation, and language expertise. Involving testers early, in a test-driven manner, allows them to give the developers rapid feedback to improve code quality and remove software defects as early as possible. Globalization testers must have strong technical skills and work closely with developers to address globalization issues and provide solutions to developers if possible. For example, globalization testers can remind developers to check examples in the GCoC Cookbook or National Language Design Guide for how to use Unicode encoding in different programming technologies and to follow ICU rules for cultural formats. If the globalization testers can validate the major GVT guidelines and provide feedback to the developers at an early stage, the cost of defects is reduced. Efficient globalization use cases and effective communication between developers and testers help continuously improve product quality and deliver satisfactory software to customers more rapidly.
Figure 10. IBM's globalization vision: Users can access a server from anywhere in the world, using a client in the language of his or her choice.
Iterative and incremental agile software development requires an environment that facilitates rapid communication and extensive collaboration between team members and stakeholders. The collaboration includes (but is not limited to) globalization requirements, project plans, and test results. Stakeholders across teams need to work closely and communicate continuously to understand requirements and feedback. With the growing demand for globally distributed development and testing, effective collaboration is sometimes difficult to achieve. Time zone differences may lead to information delays, and globalization requirements or feedback may not be addressed in real time. One way to improve this is to use modern Web 2.0 technologies such as wikis, RSS feeds, and blogging.
Web 2.0 tools are lightweight, often open source, Web-based applications that help team members speed up collaborative communication and interact frequently, which moves the whole product team toward its goals. As shown in Figure 11, a wiki can be used as a project control site. It contains project scope, project schedule, and project status. Project-related documents can also be attached, such as the stakeholders' document, design documents, globalization requirements, or agile daily scrum meeting minutes. One good practice for the wiki is to allow everyone editing privileges so information can be updated frequently. A wiki makes communication and collaboration simpler for globally distributed teams and also reinforces the values of responding to change and working software stated in the Agile Manifesto.
Figure 11: Sample Wiki site for project control purpose
As noted above, another useful Web 2.0 technology is RSS (Really Simple Syndication 2.0). RSS is a family of Web-feed formats used to publish frequently updated content, which speeds up communication flow. It allows people to subscribe to a feed of updated messages to gather real-time information and knowledge. Information exchange becomes two-way (see Figure 12). As mentioned earlier, it is beneficial for developers to use reusable globalization components and APIs to reduce repeated work and common errors. Previously developed code and experience can be shared with other developers via a blog. They can pick up that code and then modify it for a better fit, or they can provide feedback to the originator. By continuously enhancing the components and code, teams can significantly improve the quality of globalization support.
These Web 2.0 technologies provide a platform for better collaborative communication among globally distributed teams, especially in an agile development environment.
Figure 12: Sample Web 2.0 technologies
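To make the RSS subscription model concrete, the following minimal sketch reads a small RSS 2.0 feed with Python's standard xml.etree.ElementTree module. The feed content, titles, and example.com links are invented for illustration; a real subscriber would fetch the XML from the team's feed URL instead of using an embedded string.

```python
import xml.etree.ElementTree as ET

# A hand-written RSS 2.0 document standing in for a team feed.
# The channel, titles, and links are invented for illustration.
FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>G11N team updates</title>
    <link>http://example.com/g11n</link>
    <description>Globalization requirements and test results</description>
    <item>
      <title>GVT results for iteration 3</title>
      <link>http://example.com/g11n/gvt-3</link>
    </item>
    <item>
      <title>Unicode database migration checklist</title>
      <link>http://example.com/g11n/unicode-db</link>
    </item>
  </channel>
</rss>"""


def list_items(feed_xml):
    """Return (title, link) pairs for every item in an RSS 2.0 feed."""
    channel = ET.fromstring(feed_xml).find("channel")
    return [
        (item.findtext("title"), item.findtext("link"))
        for item in channel.findall("item")
    ]


for title, link in list_items(FEED):
    print(f"{title} -> {link}")
```

A team member in another time zone could poll such a feed at the start of the day and see new test results or requirement changes without waiting for a meeting.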
Early and efficient globalization involvement is the key to success in globally emerging, on demand businesses. The international economy brings investment opportunities for emerging markets such as China, India, Central Europe, Russia, and Brazil. It is critical to fulfill globalization requirements earlier to continuously reduce risks such as late major code changes or schedule delays for iterative and incremental development. This article proposes an effective way to prioritize globalization requirements, to implement correct globalization architectures, to speed up code delivery, to improve product quality, and finally to achieve efficient collaboration across teams.
Globalization is not just a matter of translating an English version of a product. In fact, globalization needs to extend into the areas of architecture and requirements gathering. Although late changes are possible in iterative development, the critical globalization architecture issues have to be addressed as early as possible so that overall cost can be saved, software quality to global customers can be assured, and software can be delivered more rapidly and effectively.
1 For more information from IBM on globalization and G11N requirements, see http://www.ibm.com/software/globalization/index.jsp
2 The information in Figure 2 comes from InternetWorldStats.com, which describes itself as "a free directory for Internet Market Research. Its aim is to make the Internet Usage Statistics available to the business community, the academic community, and to the general public."
3 Some good API references are available on the ICU website at http://www.icu-project.org/. There are some other good documents to refer to there.
- IBM Globalization Strategy, Global Architecture Imperatives (Globalization Architecture Imperatives, March 2003)
- International Components for Unicode (ICU) (ICU Home Page)
- Globalization (IBM Unicode: Globalizing your e-business)
- What is Unicode? (Unicode Home Page 1991-2007)
- Agile Software Development (Wikipedia)
- Iterative and incremental development (Wikipedia)
- Introduction to Agile at IBM: Overview (Being Agile in IBM)
- IRUP 3.0 (2006)
- Manifesto for Agile Software Development (2001)
- IBM Globalization: Globalization Cookbook v1.0 (June 2004)
- e-Business Globalization Solution Design Guide (IBM Redbooks)
- Web 2.0 (Wikipedia)
- Dale Schultz, "Second 2007 G11N Conference" (GCoC July 2007)
- Dale Schultz, "Globalization Verification Test" (GCoC July 2007)
- Ibrahim Meru, "Globalization Rules & Guidelines for Developing Internationalized Software" (April 2007)
- Bei Shu, "A globalization technique for supporting multiple languages" (developerWorks June 2003)
Joyce Hsieh is a Staff Software Engineer at IBM China Software Development Lab in Taipei, Taiwan. She obtained her Master of Science degree in Computer Science from University of Southern California, USA. She has nine years' experience in software development and testing, including Electronic Data Interchange (EDI) application implementation for banking, Web application development for e-Learning, product development for IBM WebSphere Host Access Transformation Services (HATS), Function/Globalization Verification Test for Tivoli, DB2, WECM, and ECM products. Joyce is an IBM Certified Solutions Expert for DB2 UDB V5 Database Administration, Sun Certified Java Programmer (SCJP) for Java 2, and Sun Certified Web Component Developer (SCWCD) for J2EE.
Wendy Wang is a Software Engineer at IBM China Software Development Lab (CSDL) in Taipei, Taiwan. She has worked as a test lead on globalization tests for Enterprise Content Management (ECM) products since 2004. Her GVT product experience includes DB2 Content Manager, Document Manager, Records Manager, and Web Interface for Content Management (WEBi). She has great interest in the latest innovations, particularly Web 2.0 technologies.