Configuring bidirectional text support

IBM® z/OS® Connect can support CICS services whose data is formatted as bidirectional (bidi) text. This capability allows REST clients to exchange data in logical order while the underlying application continues to operate on data that is in visual order.

Before you begin

Bidi system of record (SOR) application support allows bidi text in all single-byte, EBCDIC, and character field types to retain its visual order. How bidi text is processed and rendered is configured with attributes of a zosconnect_bidiConfig element in server.xml.

To enable support for bidi text, you must install at least the minimum versions of the z/OS Connect EE API toolkit and the z/OS Connect server that this capability requires. To learn more, see Capabilities compatibility.

Note: Bidi support in CICS applications comes with the following limitations:
  • Bidi text transformations do not apply to service fields that are overridden as type Boolean or Date. The only supported field types are non-numeric, SBCS, EBCDIC, and Unicode fields.
  • This enhancement uses the open source ICU4J Bidi Transformation Engine and inherits its limitations and restrictions. To learn more, see the Unicode ICU Bidi Transform documentation on GitHub.
  • Bidi text conversions can only be used with CICS COMMAREA services and CICS channel services.

In server.xml, a bidi service configuration statically defines how the service transforms the left-to-right (LTR) or right-to-left (RTL) direction of incoming and outgoing text.
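The difference between logical and visual order can be illustrated with a minimal Python sketch. It is not part of z/OS Connect or ICU4J, and the function name logical_to_visual_ltr is hypothetical. It assumes the simplest possible case: a run that contains only strong right-to-left letters, for which storing the text in left-to-right visual order is the reverse of its logical (typing) order. Mixed-direction text needs the full Unicode Bidirectional Algorithm, which this sketch does not attempt.

```python
import unicodedata


def logical_to_visual_ltr(text: str) -> str:
    """Reorder a run of purely right-to-left letters from logical order
    to left-to-right visual storage order (hypothetical helper)."""
    for ch in text:
        # "R" and "AL" are the strong right-to-left bidi classes.
        if unicodedata.bidirectional(ch) not in ("R", "AL"):
            raise ValueError("this sketch handles only strong right-to-left letters")
    # For a pure right-to-left run, visual LTR order is the reverse of logical order.
    return text[::-1]


# Hebrew "shalom" in logical order: shin, lamed, vav, final mem.
logical = "\u05e9\u05dc\u05d5\u05dd"
visual = logical_to_visual_ltr(logical)
```

Applying the same function to the visual string returns the logical string, which is why a single configuration can describe both the inbound and the outbound transformation.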

Selecting bidi configurations in the API Toolkit

To select the bidi configuration that a CICS service uses, complete the following steps:
  1. Open the service with the Service Project Editor.
  2. Select the Configuration tab.
  3. Within the Optional Configuration section, specify the unique ID of the bidi configuration to use for this service.

    This ID is the id attribute of the zosconnect_bidiConfig element in server.xml.

Configuration elements

The zosconnect_bidiConfig element attributes are listed in the Reference section.

The following table lists the valid Arabic digit shaping options for the inArabicShapingOptions and outArabicShapingOptions attributes. All options that begin with the DIGITS_ prefix are mutually exclusive.

Table 1. Arabic digit shaping options
Option Effect
DIGITS_AN2EN Replace Arabic-Indic digits with European digits (U+0030 … U+0039).
DIGITS_EN2AN Replace European digits (U+0030 … U+0039) with Arabic-Indic digits.
DIGITS_EN2AN_INIT_AL Replace European digits (U+0030 … U+0039) with Arabic-Indic digits if the most recent strongly directional character is an Arabic letter, meaning that its bidi direction value is RIGHT_TO_LEFT_ARABIC. The initial state at the start of the text is assumed to be an Arabic letter, so European digits at the start of the text will change.
DIGITS_EN2AN_INIT_LR Replace European digits (U+0030 … U+0039) with Arabic-Indic digits if the most recent strongly directional character is an Arabic letter, meaning that its bidi direction value is RIGHT_TO_LEFT_ARABIC. The initial state at the start of the text is assumed to not be an Arabic letter, so European digits at the start of the text will not change.
Option modifiers
DIGIT_TYPE_AN_EXTENDED Use extended Eastern Arabic-Indic digits (U+06F0 … U+06F9).

This option modifies the behavior of the preceding four options DIGITS_AN2EN, DIGITS_EN2AN, DIGITS_EN2AN_INIT_AL, and DIGITS_EN2AN_INIT_LR.
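The digit options in Table 1 can be approximated with a short Python sketch. This is an illustration of the behavior that the table describes, not the ICU4J implementation, and all function names are hypothetical. It assumes that the standard Arabic-Indic digits are U+0660 … U+0669 and that the contextual options scan the text from start to end.

```python
import unicodedata

ARABIC_INDIC_ZERO = 0x0660  # Arabic-Indic digits U+0660 ... U+0669
EXTENDED_ZERO = 0x06F0      # Extended digits U+06F0 ... U+06F9 (DIGIT_TYPE_AN_EXTENDED)


def _base(extended: bool) -> int:
    return EXTENDED_ZERO if extended else ARABIC_INDIC_ZERO


def digits_en2an(text: str, extended: bool = False) -> str:
    """Sketch of DIGITS_EN2AN: European digits become Arabic-Indic digits."""
    return text.translate({ord("0") + i: _base(extended) + i for i in range(10)})


def digits_an2en(text: str, extended: bool = False) -> str:
    """Sketch of DIGITS_AN2EN: Arabic-Indic digits become European digits."""
    return text.translate({_base(extended) + i: ord("0") + i for i in range(10)})


def digits_en2an_contextual(text: str, init_al: bool, extended: bool = False) -> str:
    """Sketch of DIGITS_EN2AN_INIT_AL (init_al=True) and
    DIGITS_EN2AN_INIT_LR (init_al=False)."""
    last_strong_was_al = init_al
    out = []
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class in ("L", "R"):
            last_strong_was_al = False   # strong, but not an Arabic letter
        elif bidi_class == "AL":
            last_strong_was_al = True    # RIGHT_TO_LEFT_ARABIC
        elif "0" <= ch <= "9" and last_strong_was_al:
            ch = chr(_base(extended) + ord(ch) - ord("0"))
        out.append(ch)
    return "".join(out)
```

For the input "12", an Arabic letter, "34", a Latin letter, "56", the INIT_AL variant converts "12" and "34", while the INIT_LR variant converts only "34". In both cases "56" is unchanged because the most recent strong character is a Latin letter.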

The following table lists the valid Arabic letter shaping options for the inArabicShapingOptions and outArabicShapingOptions attributes. All options are mutually exclusive.

Table 2. Arabic letter shaping options
Option Effect
LETTERS_SHAPE Replace normative letter characters in the U+0600 Arabic block with shaped ones in the U+FE70, Presentation Forms B, block. Performs Lam-Alef ligature substitution.
LETTERS_SHAPE_TASHKEEL_ISOLATED Replace normative letter characters in the U+0600 Arabic block, except for the TASHKEEL characters at U+064B … U+0652, with shaped characters in the U+FE70, Presentation Forms B, block. The TASHKEEL characters will always be converted to the isolated forms rather than to their correct shape.
LETTERS_UNSHAPE Replace shaped letter characters in the U+FE70, Presentation Forms B, block with normative characters in the U+0600 Arabic block. Converts Lam-Alef ligatures to pairs of Lam and Alef characters, consuming spaces if required.

The following table lists the Arabic memory shaping options for the inArabicShapingOptions and outArabicShapingOptions attributes. All options are mutually exclusive.

Table 3. Arabic memory shaping options
Option Effect
LENGTH_FIXED_SPACES_AT_BEGINNING The result must have the same length as the source. If more room is necessary, then try to consume spaces at the beginning of the text.
LENGTH_FIXED_SPACES_AT_END The result must have the same length as the source. If more room is necessary, then try to consume spaces at the end of the text.
LENGTH_FIXED_SPACES_NEAR The result must have the same length as the source. If more room is necessary, then try to consume spaces next to modified characters.
LENGTH_GROW_SHRINK Allow the result to have a different length than the source.
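The reason that the memory options exist can be seen in a small Python sketch. It is not the ICU4J implementation: it assumes that Unicode NFKC normalization is an acceptable stand-in for LETTERS_UNSHAPE on letter characters, and the function names are hypothetical. Unshaping a Lam-Alef ligature produces two characters, so the text grows by one character unless a space is consumed.

```python
import unicodedata


def unshape_letters(text: str) -> str:
    """Stand-in for LETTERS_UNSHAPE with LENGTH_GROW_SHRINK: NFKC maps
    Presentation Forms B letters to their base letters, and a Lam-Alef
    ligature to the pair Lam + Alef, so the result may be longer."""
    return unicodedata.normalize("NFKC", text)


def unshape_fixed_spaces_at_end(text: str) -> str:
    """Stand-in for LETTERS_UNSHAPE with LENGTH_FIXED_SPACES_AT_END:
    keep the original length by consuming trailing spaces."""
    result = unshape_letters(text)
    extra = len(result) - len(text)
    if extra == 0:
        return result
    if not result.endswith(" " * extra):
        raise ValueError("not enough trailing spaces to keep the length fixed")
    return result[:-extra]


# Seen (initial form), Lam-Alef ligature (final form), Meem (isolated form), one space.
shaped = "\ufeb3\ufefc\ufee1 "
grown = unshape_letters(shaped)              # five characters
fixed = unshape_fixed_spaces_at_end(shaped)  # four characters, space consumed
```

A fixed-length host field, such as a COMMAREA character field, is the typical case for the LENGTH_FIXED options, because the transformed text must still fit the field.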

The following table lists the valid Arabic Lam-Alef memory shaping options for the inArabicShapingOptions and outArabicShapingOptions attributes. All options that begin with the LAMALEF_ prefix are mutually exclusive.

Table 4. Lam-Alef memory shaping options
Option Effect
LAMALEF_AUTO The result must have the same length as the source. Shaping mode: for each LAMALEF character found, expand LAMALEF using space at the end. If there is no space at the end, use spaces at the beginning of the buffer. If there is no space at the beginning of the buffer, use space near the LAMALEF character. De-shaping mode: perform the same function as LAMALEF_END.
LAMALEF_BEGIN The result must have the same length as the source. If more room is necessary, then try to consume spaces at the beginning of the text.
LAMALEF_END The result must have the same length as the source. If more room is necessary, then try to consume spaces at the end of the text.
LAMALEF_NEAR The result must have the same length as the source. If more room is necessary, then try to consume spaces next to modified characters.
LAMALEF_RESIZE Allow the result to have a different length than the source.
Option modifiers
SPACES_RELATIVE_TO_TEXT_BEGIN_END This option affects the meaning of the BEGIN and END options. If this option is not used, the default for BEGIN and END is as follows:
Default for visual LTR, visual RTL, and logical text:
  1. BEGIN always refers to the start address of physical memory.
  2. END always refers to the end address of physical memory.
If this option is used, it swaps the preceding meanings of BEGIN and END for visual LTR text only. The effect on the BEGIN and END memory options is as follows:
  1. BEGIN for visual LTR text will begin on the right side of the visual text. This is the same as END in default behavior.
  2. BEGIN for logical text will be the same as BEGIN in default behavior.
  3. END for visual LTR text will be the end, or left side, of the visual text.
  4. END for logical text is the same as END in default behavior.

This option modifies the behavior of the preceding five options LAMALEF_AUTO, LAMALEF_BEGIN, LAMALEF_END, LAMALEF_NEAR, and LAMALEF_RESIZE.

The following table lists the valid Arabic Seen memory shaping options for the inArabicShapingOptions and outArabicShapingOptions attributes. All options are mutually exclusive.

Table 5. Arabic Seen memory shaping options
Option Effect
SEEN_TWOCELL_NEAR The results have the same length as the source.

Shaping mode: The SEEN family character will expand into two characters using space after the SEEN family character. If there are no spaces found, an ArabicShapingException will occur.

De-shaping mode: Any SEEN character followed by a TAIL character will be replaced by a one-cell SEEN character, and a space will replace the TAIL.

SHAPE_TAIL_NEW_UNICODE Shaping uses the new Unicode code point for TAIL, U+FE73. If this option is not specified, the old unofficial Unicode TAIL code point, U+200B, is used by default.

This option applies in shaping mode only. De-shaping does not use this option.

The following table lists the valid Arabic Tashkeel memory shaping options for the inArabicShapingOptions and outArabicShapingOptions attributes. All options are mutually exclusive.

Table 6. Tashkeel memory shaping options
Option Effect
TASHKEEL_BEGIN The results have the same length as the source.

Shaping mode: Tashkeel characters will be replaced by spaces, which are placed at the beginning of the buffer.

TASHKEEL_END The results have the same length as the source.

Shaping mode: Tashkeel characters will be replaced by spaces, which are placed at the end of the buffer.

TASHKEEL_RESIZE Allow the result to have a different length than the source.

Shaping mode: Tashkeel characters will be removed, and the buffer length will shrink.

TASHKEEL_REPLACE_BY_TATWEEL The results have the same length as the source.

Shaping mode: Each Tashkeel character will be replaced by Tatweel if it is connected to adjacent characters, or by a space if it is not connected.
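The first three Tashkeel options in Table 6 can be sketched in Python as follows. This is an illustration of the table descriptions, not the ICU4J implementation. It assumes that the Tashkeel characters are the range U+064B … U+0652 given in Table 2 and that the beginning of the buffer is the start of the string. The function name strip_tashkeel is hypothetical, and the Tatweel replacement option is not modelled because it depends on letter connectivity.

```python
# Tashkeel (diacritic) characters U+064B ... U+0652, as listed in Table 2.
TASHKEEL = {chr(cp) for cp in range(0x064B, 0x0653)}


def strip_tashkeel(text: str, option: str) -> str:
    """Remove Tashkeel characters according to a Table 6 option (sketch)."""
    kept = "".join(ch for ch in text if ch not in TASHKEEL)
    removed = len(text) - len(kept)
    if option == "TASHKEEL_RESIZE":
        return kept                     # buffer shrinks
    if option == "TASHKEEL_BEGIN":
        return " " * removed + kept     # same length, spaces at the beginning
    if option == "TASHKEEL_END":
        return kept + " " * removed     # same length, spaces at the end
    raise ValueError("option not covered by this sketch: " + option)


# Kaf, Teh, Beh, each followed by a Fatha.
vocalized = "\u0643\u064e\u062a\u064e\u0628\u064e"
at_end = strip_tashkeel(vocalized, "TASHKEEL_END")
```

TASHKEEL_BEGIN and TASHKEEL_END preserve the length of a fixed-size field, whereas TASHKEEL_RESIZE returns a shorter string.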

The following table lists the valid Arabic YehHamza memory shaping options for the inArabicShapingOptions and outArabicShapingOptions attributes. All options are mutually exclusive.

Table 7. YehHamza memory shaping options
Option Effect
YEHHAMZA_TWOCELL_NEAR The results have the same length as the source.

Shaping mode: The YehHamza character will expand into two characters using the space after the character. If there are no spaces found, an ArabicShapingException will occur.

De-shaping mode: Any final or isolated Yeh character followed by a Hamza character will be replaced by a one-cell YehHamza character, and a space will replace the Hamza.

Example

The following example shows a zosconnect_bidiConfig element in server.xml that specifies multiple bidi rules:
<zosconnect_bidiConfig id="bidiConfig1"
    inOrder="LOGICAL" inDirection="LTR"
    inHostOrder="VISUAL" inHostDirection="RTL"
    inSymmetricSwapping="true"
    inArabicShapingOptions="DIGITS_EN2AN, DIGIT_TYPE_AN_EXTENDED"

    outHostOrder="VISUAL" outHostDirection="RTL"
    outOrder="LOGICAL" outDirection="LTR"
    outSymmetricSwapping="true"
    outArabicShapingOptions="DIGITS_AN2EN, DIGIT_TYPE_AN_EXTENDED"
/>

In this example, the service receives data in logical order and transforms it into visual order for use by the host application. The DIGIT_TYPE_AN_EXTENDED option modifier is used with DIGITS_EN2AN to convert European digits to Eastern Arabic-Indic digits for use by the host application.

Using this same bidi configuration, the host application sends data in visual order, and the IBM z/OS Connect service transforms it to logical order. The same DIGIT_TYPE_AN_EXTENDED option modifier is used with DIGITS_AN2EN to convert Eastern Arabic-Indic digits to European digits before the service sends the data.
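Because several option groups are mutually exclusive, it can be useful to check a configuration before deploying it. The following Python sketch is a hypothetical helper, not a z/OS Connect tool. It assumes that options are separated by commas, as in the example, and that the mutually exclusive groups are the ones identified by the DIGITS_, LETTERS_, LENGTH_, LAMALEF_, and TASHKEEL_ prefixes in the preceding tables.

```python
import xml.etree.ElementTree as ET

# Prefixes of the option groups that the tables describe as mutually exclusive.
EXCLUSIVE_PREFIXES = ("DIGITS_", "LETTERS_", "LENGTH_", "LAMALEF_", "TASHKEEL_")
SHAPING_ATTRIBUTES = ("inArabicShapingOptions", "outArabicShapingOptions")


def find_conflicts(server_xml: str) -> list:
    """Return (id, attribute, options) for every group with more than one option."""
    conflicts = []
    root = ET.fromstring(server_xml)
    for element in root.iter("zosconnect_bidiConfig"):
        for attribute in SHAPING_ATTRIBUTES:
            raw = element.get(attribute, "")
            options = [opt.strip() for opt in raw.split(",") if opt.strip()]
            for prefix in EXCLUSIVE_PREFIXES:
                group = [opt for opt in options if opt.startswith(prefix)]
                if len(group) > 1:
                    conflicts.append((element.get("id"), attribute, group))
    return conflicts


# Illustrative input: the outbound attribute deliberately combines two DIGITS_ options.
SAMPLE = """
<server>
  <zosconnect_bidiConfig id="bidiConfig1"
      inArabicShapingOptions="DIGITS_EN2AN, DIGIT_TYPE_AN_EXTENDED"
      outArabicShapingOptions="DIGITS_AN2EN, DIGITS_EN2AN"/>
</server>
"""
conflicts = find_conflicts(SAMPLE)
```

The DIGIT_TYPE_AN_EXTENDED modifier does not start with the DIGITS_ prefix, so combining it with one DIGITS_ option is not reported as a conflict.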

Troubleshooting

If there is an issue with your bidi service while the z/OS Connect server is running, you might receive BAQR7119E, BAQR7120E, BAQR7121E, BAQR7122E, or BAQR7123E messages from the z/OS Connect server. To learn more about these messages, see IBM z/OS Connect Runtime Messages.

For more information on how these errors relate to the ICU4J Bidi Transformation Engine, see the Unicode ICU Bidi Transform documentation on GitHub.