Special Arabic characters

Some Arabic characters have different representations in different code pages, and so need special handling during code page conversion.

Because not every code page contains these characters, a normal conversion replaces them with the substitute control character (SUB), and the original data is lost.
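As a rough illustration of this kind of loss, the following Python sketch checks which characters a target code page cannot encode. It is an assumption that Python's built-in cp864 and cp1256 codecs are adequate stand-ins for the code pages discussed here, and the helper name unrepresentable is hypothetical. Note also that Python's "replace" error handler writes a question mark rather than the SUB control character, so the sketch shows only that the character is lost, not the exact substitute byte.

```python
def unrepresentable(text: str, codec: str) -> list[str]:
    """Return the characters of `text` that `codec` cannot encode.

    Hypothetical helper for illustration; not part of any conversion product.
    """
    lost = []
    for ch in text:
        try:
            ch.encode(codec)
        except UnicodeEncodeError:
            lost.append(ch)
    return lost


FATHA = "\u064e"  # ARABIC FATHA, one of the Tashkeel (diacritic) characters

# Code page 1256 has a code point for Fatha, so nothing is lost.
lost_in_1256 = unrepresentable(FATHA, "cp1256")

# Code page 864 has no code point for Fatha, so a plain conversion
# must substitute it (Python substitutes "?", not SUB).
lost_in_864 = unrepresentable(FATHA, "cp864")
substituted = FATHA.encode("cp864", errors="replace")
```

Running such a check before conversion is one way to find out whether special handling is needed for a given text and target code page.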

Lam-Alef
This character is represented as a single character in the visual presentation code pages 420, 864, and 1046, and in the Unicode Arabic Presentation Forms-B (uFExx range). It is represented as two characters, Lam and Alef, in the implicit representation code pages 425, 1089, and 1256, and in the Unicode Arabic u06xx range.
Tail of Seen family of characters
The visual code pages 420, 864, and 1046 represent the final form of the Seen family of characters as two adjacent characters: the three-quarters shape and the Tail. The implicit code pages 425, 1089, and 1256, and the Unicode Arabic u06xx range, do not represent the Tail character. In Unicode Arabic Presentation Forms-B (uFExx range), the final form for characters in the Seen family is represented as one character.
Tashkeel or diacritic characters except for Shadda
These characters are not represented in code pages 420 and 864. Conversion of Tashkeel characters from code pages 425, 1046, 1089, 1256, and Unicode to 420 or 864 results in SUB characters.
Yeh-Hamza final form
Code pages 420 and 864 have no unique character for the Yeh-Hamza final form; it is represented as two characters: Yeh final form and Hamza. In code pages 425, 1046, 1089, and 1256, and in Unicode, the Yeh-Hamza final form is represented as one character or two characters, depending on whether the user entered it with one keystroke (Yeh-Hamza key) or two keystrokes (Yeh key + Hamza key). A normal conversion from these code pages to 420 or 864 converts the Yeh-Hamza final form character to the Yeh-Hamza initial form; special handling is needed to convert it to the Yeh final form and Hamza instead.

To avoid losing these characters during conversion, various Arabic shaping options are available to handle them properly.
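The kind of special handling described above can be sketched in Python for two of the cases, Lam-Alef and the Yeh-Hamza final form. This is a minimal sketch under stated assumptions, not the behavior of any particular shaping option: the function names are hypothetical, the Lam-Alef split relies on the Unicode compatibility decompositions of the ligatures U+FEF5 through U+FEFC, and the choice of U+FEF2 (Yeh final form) followed by U+FE80 (Hamza isolated form) for the Yeh-Hamza split is an assumption based on the description in this topic.

```python
import unicodedata

# Lam-Alef ligatures in Arabic Presentation Forms-B: U+FEF5 through U+FEFC.
LAM_ALEF_LIGATURES = {chr(cp) for cp in range(0xFEF5, 0xFEFD)}

# Presentation-form code points used for the Yeh-Hamza split (assumed choice).
YEH_HAMZA_FINAL = "\ufe8a"  # ARABIC LETTER YEH WITH HAMZA ABOVE FINAL FORM
YEH_FINAL = "\ufef2"        # ARABIC LETTER YEH FINAL FORM
HAMZA_ISOLATED = "\ufe80"   # ARABIC LETTER HAMZA ISOLATED FORM


def split_lam_alef(text: str) -> str:
    """Replace each Lam-Alef ligature with Lam plus the matching Alef.

    Only the ligatures are normalized; all other characters pass through
    unchanged, so presentation forms elsewhere in the text are preserved.
    """
    return "".join(
        unicodedata.normalize("NFKC", ch) if ch in LAM_ALEF_LIGATURES else ch
        for ch in text
    )


def split_yeh_hamza_final(text: str) -> str:
    """Replace the Yeh-Hamza final form with Yeh final form plus Hamza."""
    return text.replace(YEH_HAMZA_FINAL, YEH_FINAL + HAMZA_ISOLATED)


# Prepare a visual-form Lam-Alef for an implicit code page such as 1256.
two_chars = split_lam_alef("\ufefb")

# Prepare a Yeh-Hamza final form for a code page that lacks it.
yeh_then_hamza = split_yeh_hamza_final("\ufe8a")
```

Applying a step like split_lam_alef before converting to an implicit code page, or split_yeh_hamza_final before converting to code page 420 or 864, keeps the text representable in the target instead of letting the conversion substitute it. The Seen Tail and Tashkeel cases would need analogous handling, which this sketch does not cover.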