SOSI

The SOSI option affects the treatment of the values X'1E' and X'1F' in comments; in alphanumeric, national, and DBCS literals; and in DBCS user-defined words.

SOSI option syntax

NOSOSI | SOSI

Default is: NOSOSI

Abbreviations are: None

NOSOSI
With NOSOSI, character positions that have values X'1E' and X'1F' are treated as data characters.

NOSOSI conforms to the 85 COBOL Standard.

SOSI
With SOSI, shift-out (SO) and shift-in (SI) control characters delimit ASCII DBCS character strings in COBOL source programs. The SO and SI characters have the encoded values of X'1E' and X'1F', respectively.
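As an illustration of the delimiting described above, the following Python sketch splits one source line into single-byte and double-byte segments using the X'1E' and X'1F' values. It is a toy model, not the compiler's scanner: the function name split_segments is hypothetical, and raising an error for unbalanced SO or SI characters is an assumption of the sketch.

```python
from typing import List, Tuple

SO = 0x1E  # shift-out value under SOSI, as stated in the text
SI = 0x1F  # shift-in value under SOSI, as stated in the text


def split_segments(line: bytes) -> List[Tuple[str, bytes]]:
    """Split a source line into ('SBCS', bytes) and ('DBCS', bytes) pairs.

    Hypothetical helper: the SO and SI bytes themselves are dropped, and
    unbalanced delimiters raise ValueError (an assumption of this sketch).
    """
    segments: List[Tuple[str, bytes]] = []
    buf = bytearray()
    in_dbcs = False
    for b in line:
        if b == SO:
            if in_dbcs:
                raise ValueError("SO found inside a DBCS string")
            if buf:
                segments.append(("SBCS", bytes(buf)))
            buf.clear()
            in_dbcs = True
        elif b == SI:
            if not in_dbcs:
                raise ValueError("SI found without a preceding SO")
            segments.append(("DBCS", bytes(buf)))
            buf.clear()
            in_dbcs = False
        else:
            buf.append(b)
    if in_dbcs:
        raise ValueError("SO without a matching SI")
    if buf:
        segments.append(("SBCS", bytes(buf)))
    return segments
```

For example, split_segments(b"AB\x1e\x82\xa0\x1fCD") yields an SBCS segment, a DBCS segment holding the two bytes between the delimiters, and a trailing SBCS segment.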

SO and SI characters have no effect on COBOL for Linux® source code, except to act as placeholders for host DBCS SO and SI characters to ensure proper data handling when remote files are converted from EBCDIC to ASCII.

When the SOSI option is in effect, in addition to existing rules for COBOL for Linux, the following rules apply:

  • All DBCS character strings (in user-defined words, DBCS literals, alphanumeric literals, national literals, and in comments) must be delimited by the SO and SI characters.
  • User-defined words cannot contain both DBCS and SBCS characters.
  • The maximum length of a DBCS user-defined word is 14 DBCS characters.
  • Double-byte uppercase alphabetic letters are not equivalent to the corresponding double-byte lowercase letters when used in user-defined words.
  • A DBCS user-defined word must contain at least one letter that does not have its counterpart in a single-byte representation.
  • Double-byte representations of single-byte characters for A-Z, a-z, 0-9, the hyphen (-), and the underscore (_) can be included within a DBCS user-defined word. Rules applicable to these characters in single-byte representation apply to these characters in double-byte representation. For example, in a user-defined word, the hyphen cannot appear as the first or the last character, and the underscore cannot appear as the first character.
  • For DBCS and national literals that contain X'1E' or X'1F' values, the following rules apply when the SOSI compiler option is in effect:
    • Character positions with X'1E' and X'1F' are treated as SO and SI characters.
    • Character positions with X'1E' and X'1F' are included in the character string in national hexadecimal notation and removed in basic notation.
  • For alphanumeric literals that contain X'1E' or X'1F' values, the following rules apply when the SOSI compiler option is in effect:
    • Character positions with X'1E' and X'1F' are treated as SO and SI characters.
    • Character positions with X'1E' and X'1F' are included in the character string in hexadecimal notation and removed in basic and null-terminated notation.
  • To embed DBCS quotation marks within an N-literal delimited by quotation marks, use two consecutive DBCS quotation marks to represent a single DBCS quotation mark. Do not include a single DBCS quotation mark in an N-literal if the literal is delimited by quotation marks. The same rule applies to single quotation marks.
  • The SHIFT-OUT and SHIFT-IN special registers are defined with X'0E' and X'0F' regardless of whether the SOSI option is in effect.
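Some of the rules above can be sketched in Python. The sketch is illustrative only and does not reproduce compiler diagnostics. It assumes that a DBCS user-defined word has already been decoded to a Python string, and that Unicode fullwidth forms (U+FF01 to U+FF5E) stand in for the double-byte representations of single-byte characters. The function names and messages are hypothetical.

```python
from typing import List, Optional

SO, SI = 0x1E, 0x1F        # SO and SI values under SOSI, per the rules above
MAX_DBCS_WORD_LEN = 14     # maximum DBCS characters in a user-defined word


def to_single_byte(ch: str) -> Optional[str]:
    """Return the single-byte counterpart of a fullwidth form, else None."""
    code = ord(ch)
    if 0xFF01 <= code <= 0xFF5E:       # fullwidth '!' through '~'
        return chr(code - 0xFEE0)
    return None


def check_dbcs_word(word: str) -> List[str]:
    """List the rule violations for a decoded DBCS user-defined word."""
    if not word:
        return ["empty word"]
    problems = []
    if any(ord(ch) < 0x80 for ch in word):
        problems.append("mixes SBCS and DBCS characters")
    if len(word) > MAX_DBCS_WORD_LEN:
        problems.append("longer than 14 DBCS characters")
    counterparts = [to_single_byte(ch) for ch in word]
    if all(c is not None for c in counterparts):
        problems.append("no character without a single-byte counterpart")
    if counterparts[0] in ("-", "_"):
        problems.append("starts with a hyphen or underscore")
    if counterparts[-1] == "-":
        problems.append("ends with a hyphen")
    return problems


def alnum_literal_value(raw: bytes, notation: str) -> bytes:
    """Model SO/SI handling for an alphanumeric literal's content.

    Hexadecimal notation keeps X'1E' and X'1F'; basic and null-terminated
    notation remove them. The trailing null of a null-terminated literal
    is not modeled here.
    """
    if notation == "hex":
        return raw
    if notation in ("basic", "null-terminated"):
        return bytes(b for b in raw if b not in (SO, SI))
    raise ValueError("unknown notation: " + notation)
```

In this model, a word such as "\u65e5\u672c\uff0d\uff21" (two kanji, a fullwidth hyphen, and a fullwidth A) passes every check, while "\uff21\uff22" (fullwidth AB) fails because every character has a single-byte counterpart.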

In general, host COBOL programs that depend on the encoded values of the SO and SI characters do not behave the same way on the Linux workstation.

Related references  
Character-strings (COBOL for Linux on x86 Language Reference)