Determining application compatibility: Code page support
Many code pages that were provided with IBM® SDK, Java™ Technology Edition, Version 8 to support character encoding are no longer available in this release. If your application relies on one of these code pages, switch to an ICU4J Unicode-based charset instead.
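Before migrating, it can help to find out which of the code pages your application depends on can still be resolved by the runtime. The following is a minimal sketch that uses only the standard `java.nio.charset.Charset` API. The class name `CodePageCheck` and the list of required names are hypothetical placeholders, so substitute the code pages that your application actually uses.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;

class CodePageCheck {
    /** Returns true if the running JVM can resolve the given charset name. */
    static boolean isAvailable(String name) {
        try {
            return Charset.isSupported(name);
        } catch (IllegalCharsetNameException e) {
            // Syntactically invalid names are treated as unavailable.
            return false;
        }
    }

    public static void main(String[] args) {
        // Hypothetical list of code pages that an application depends on.
        String[] required = {"UTF-8", "IBM-5123", "x-IBM856"};
        for (String name : required) {
            System.out.println(name + " available: " + isAvailable(name));
        }
    }
}
```

The result depends on the runtime and on which charset providers are on the class path, so run the check in the same environment as the application.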
To use the ICU4J charset, download and install the ICU4J and ICU4J charset jars. For more information, see How to Install and Build. Then, create an instance of the charset as shown in the following example:
    Charset cs = Charset.forName("IBM-5123");
For more information, see ICU API documentation.
Major differences between available character sets are detailed in the following list:
- GBK character sets: The IBM GBK character set in Java 8 is based on GB18030. However, the OpenJDK Java 11 GBK character set is based on a Unicode 2.0 encoding, which is most similar to the IBM x-mswin-936A character set.
- Bidirectional support: IBM SDK, Java Technology Edition V8 contained bidirectional support, as described in the IBM Documentation. The bidirectional layout transformations that are enabled by the -DJAVABIDI command line system property in Java 8 are no longer available. The affected code pages are IBM420, IBM424, x-IBM856, IBM862, IBM864, IBM-867, x-IBM1046, windows-1255, windows-1256, ISO-8859-6, ISO-8859-8, x-MacArabic, x-MacHebrew, and UTF-8.
- Arabic character sets: Although some bidirectional support is available in the Java 11 OpenJDK character sets, the Arabic character sets do not support Arabic deshaping.
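Because character sets that share a name can cover different characters across releases, as noted for GBK in the preceding list, one way to spot a difference is to ask each charset's encoder whether it can encode representative text from your data. The following is a rough sketch that uses the standard `CharsetEncoder.canEncode` method. The class name `EncodingCoverage` and the sample string are illustrative assumptions, not part of the product documentation, and the output depends on the runtime.

```java
import java.nio.charset.Charset;

class EncodingCoverage {
    /** Returns true if every character in text can be encoded by the named charset. */
    static boolean canEncode(String charsetName, String text) {
        return Charset.forName(charsetName).newEncoder().canEncode(text);
    }

    public static void main(String[] args) {
        // Hypothetical sample; replace with text that is typical of your data.
        String sample = "\u20AC\u4E2D\u6587";
        for (String name : new String[] {"GBK", "GB18030"}) {
            if (Charset.isSupported(name)) {
                System.out.println(name + " can encode sample: " + canEncode(name, sample));
            } else {
                System.out.println(name + " is not available in this runtime");
            }
        }
    }
}
```

Running the same check on the old and the new runtime and comparing the results can highlight characters that need attention. It does not reveal differences in the byte values that are produced.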