Message Sets: Importing from COBOL: supported features

The COBOL importer uses a set of default values and behaviors when mapping COBOL data types to message model elements.

The following table shows how COBOL definitions influence the XML Schema settings in the message model.

COBOL clause | XML Schema data type | Notes
PIC A | xsd:string |
PIC G | xsd:string | Set the compile-time locale name to ja_JP in Window > Preferences > Importer > COBOL to process this.
PIC N | xsd:string | Set the compile-time locale name to ja_JP in Window > Preferences > Importer > COBOL to process this.
PIC X | xsd:string |
PIC 9(n), n = 1-4 | xsd:short | DISPLAY, COMP, or COMP-3
PIC 9(n), n = 5-9 | xsd:int | DISPLAY, COMP, or COMP-3
PIC 9(n), n = 10-18 | xsd:long | DISPLAY, COMP, or COMP-3
PIC 9(n), n = 19-31 | xsd:integer | DISPLAY, COMP, or COMP-3
PIC 9(n)V9(m) | xsd:decimal | DISPLAY, COMP, or COMP-3; any virtual decimal point value
COMP-1 | xsd:float |
COMP-2 | xsd:double |
Any edited string | xsd:string |
Any edited number | xsd:string | For example, a COBOL PICTURE clause that contains any of the characters 'Z', '+', '-', '.', ',', 'B', '0', or a currency symbol. If you want your integration node logical type to be a numeric one, make sure that the COBOL PICTURE clause does not contain any of these characters.
VALUE | | All non-level-88 VALUE clauses can be imported as schema default values (an option in the import wizard).
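
The digit ranges in the table above can be expressed as a small lookup. The following Python sketch is illustrative only: the function name and parameters are hypothetical and are not part of the importer.

```python
def xsd_type_for_pic9(integer_digits: int, fraction_digits: int = 0) -> str:
    """Return the XML Schema type that the table lists for PIC 9(n) or PIC 9(n)V9(m)."""
    if fraction_digits > 0:
        # Any virtual decimal point maps to xsd:decimal.
        return "xsd:decimal"
    if not 1 <= integer_digits <= 31:
        raise ValueError("the table covers 1 to 31 digits")
    if integer_digits <= 4:
        return "xsd:short"
    if integer_digits <= 9:
        return "xsd:int"
    if integer_digits <= 18:
        return "xsd:long"
    return "xsd:integer"


print(xsd_type_for_pic9(4))      # PIC 9(4)
print(xsd_type_for_pic9(10))     # PIC 9(10)
print(xsd_type_for_pic9(5, 2))   # PIC 9(5)V9(2)
```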

The following table shows how COBOL definitions influence the physical MRM CWF characteristics of the elements that are generated in the message model.

COBOL clause | CWF physical type | CWF length characteristics | Other CWF characteristics
PIC X(n), PIC A(n) | Fixed Length String | Length = n; Length Units = Bytes | Justification = Left Justify; Padding Character = SPACE
PIC G(n), PIC N(n) | Fixed Length String | Length = n; Length Units = Characters | Justification = Left Justify; Padding Character = SPACE
PIC 9(n) DISPLAY, n = 1-31 | External Decimal | Length = n; Length Units = Bytes | Justification = Right Justify; Padding Character = '0'; Signed = Unticked; Sign Orientation = Trailing
PIC 9(n) COMP, COMP-4, COMP-5, or BINARY | Integer | Length = 2, 4, or 8 based on n; Length Units = Bytes | Signed = Unticked; Sign Orientation = Blank
PIC 9(n) COMP-3, n = 1-18 | Packed Decimal | Length = CEILING((n+1)/2); Length Units = Bytes | Signed = Unticked; Sign Orientation = Blank
PIC S9(n) DISPLAY, n = 1-31 | External Decimal | Length = n; Length Units = Bytes | Signed = Ticked; Sign Orientation = Trailing (see Note 1)
PIC S9(n) COMP or COMP-3, n = 1-18 | Integer or Packed Decimal | Length = see the COMP and COMP-3 definitions above; Length Units = Bytes | Signed = Ticked; Sign Orientation = Blank
PIC 9(m)V9(n) DISPLAY, n = 1-31 | External Decimal | Length = n+m; Length Units = Bytes | Signed = Unticked; Sign Orientation = Trailing; Virtual Decimal Point = n
PIC 9(m)V9(n) COMP or COMP-3 | Integer or Packed Decimal | Length = CEILING((n+m+1)/2) for COMP-3; Length = 2, 4, or 8 for COMP; Length Units = Bytes | Signed = Unticked; Sign Orientation = Blank; Virtual Decimal Point = n
COMP-1 | Float | Length = 4; Length Units = Bytes | Signed = Ticked; Sign Orientation = Blank
COMP-2 | Float | Length = 8; Length Units = Bytes | Signed = Ticked; Sign Orientation = Blank
SYNC | Float, Integer, or Packed Decimal | | Leading Skip Count as appropriate; Trailing Skip Count as appropriate; Byte alignment as appropriate (see Note 2)
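
As a worked illustration of the length rules in the table, the following Python sketch computes the byte length of packed decimal and binary fields. The COMP-3 formula is taken directly from the table. The digit thresholds for COMP (1-4, 5-9, 10-18) are an assumption based on the usual COBOL binary storage sizes, because the table states only that the length is 2, 4, or 8 "based on n". The function names are hypothetical.

```python
import math


def comp3_length(total_digits: int) -> int:
    """Packed decimal: two digits per byte plus one sign nibble, rounded up."""
    return math.ceil((total_digits + 1) / 2)


def comp_length(total_digits: int) -> int:
    """Binary (COMP) length in bytes; thresholds are an assumed convention."""
    if 1 <= total_digits <= 4:
        return 2
    if 5 <= total_digits <= 9:
        return 4
    if 10 <= total_digits <= 18:
        return 8
    raise ValueError("binary fields hold 1 to 18 digits")


# PIC 9(5) COMP-3 needs 5 digit nibbles plus 1 sign nibble.
print(comp3_length(5))       # 3
# PIC 9(3)V9(2) COMP-3 has n+m = 5 digits in total.
print(comp3_length(3 + 2))   # 3
print(comp_length(9))        # 4
```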

Notes:
  1. Sign Orientation can take one of the following values, based on the SEPARATE, LEADING, or TRAILING keywords in the COBOL definition:
    • Leading
    • Leading Separate
    • Trailing
    • Trailing Separate
  2. The SYNC keyword causes the field to be aligned on a 1, 2, 4, or 8-byte boundary. This might cause 'slack bytes' to be added either before or after a field. Leading Skip Count is the number of such bytes that are added before a field; Trailing Skip Count is the number of such bytes that are added after a field.

    The importer calculates Leading Skip Count and Trailing Skip Count for each imported element, irrespective of the SYNC clause. They have non-zero values when the SYNC clause is present.

    Where there is a repeating element, Leading Skip Count and Trailing Skip Count are used for the first occurrence of the repeating element; for subsequent occurrences, only the Trailing Skip Count is used.

    Refer to COBOL reference material for details of fields that require byte alignment.

  3. All files that you import must be syntactically correct. Results are unpredictable if the file being imported is not syntactically correct.
  4. COBOL data types that have keywords POINTER, COMP-X, INDEX, or PROCEDURE-POINTER, are not supported.
  5. COBOL clauses that contain the keyword NATIVE cause an error, and are not imported.
  6. COBOL level 66 and level 77 data items are not imported.
  7. Hexadecimal binary values cannot be attributed to non-numeric literals. They cannot reside in the LINKAGE SECTIONs that are imported by the COBOL importer. They can reside elsewhere in the COBOL file. Alternatively, you can convert the hexadecimal value to a character string for PIC X, or to a decimal number for PIC 9.
  8. If element names clash with Java™ language keywords, the element names are modified by prefixing the element name with a single underscore character.
  9. Object-oriented extensions to COBOL 85 are not supported. For example, OBJECT-REFERENCE is not supported.
  10. For structures that contain a COBOL OCCURS DEPENDING ON clause, the Byte Alignment, Leading Skip Count, and Trailing Skip Count CWF properties of the elements within the structure are not set up properly. You must correct these properties by using the message editor.
  11. When the imported COBOL source file contains QUOTE or QUOTES in the value clause of a picture string, the default behavior is to enter the data with double quotation marks, unless you set the COBOL QUOTE compile option to SINGLE on the Import Options page of the COBOL importer wizard.
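
Note 2 describes the slack bytes that SYNC alignment can introduce before a field. The following Python sketch shows the arithmetic for the leading slack, assuming that the field's unaligned byte offset within the structure is already known. It illustrates the alignment rule only; it is not the importer's own algorithm, and the function name is hypothetical.

```python
def leading_skip_count(offset: int, alignment: int) -> int:
    """Number of slack bytes needed before a field so that it starts on a boundary."""
    if alignment not in (1, 2, 4, 8):
        raise ValueError("alignment must be 1, 2, 4, or 8 bytes")
    # Distance from the current offset up to the next multiple of the alignment.
    return (-offset) % alignment


# A 4-byte binary field that would otherwise start at offset 5
# is pushed to offset 8, so 3 slack bytes precede it.
print(leading_skip_count(5, 4))   # 3
# A field that is already on its boundary needs no slack bytes.
print(leading_skip_count(8, 4))   # 0
```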

Signed external decimal numbers

The MRM Custom Wire Format (CWF) and TDS components of IBM® App Connect Enterprise support the External Decimal (also known as Zoned Decimal) data format for numeric data. Numeric data in this format is stored internally as decimal character data. For example, in a system that uses the EBCDIC code, the number 1234 stored in a 4-byte external decimal field is stored as the character string '1234', and its actual internal hexadecimal representation is 'F1F2F3F4'.
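
The byte values in this example can be reproduced with Python's built-in codecs. In this sketch, cp500 is Python's name for EBCDIC code page 500 and is chosen only for illustration; the helper name is hypothetical.

```python
def hex_dump(text: str, codec: str) -> str:
    """Encode text with the given codec and show each byte as two hex digits."""
    return " ".join(f"{byte:02X}" for byte in text.encode(codec))


print(hex_dump("1234", "cp500"))  # F1 F2 F3 F4
print(hex_dump("1234", "ascii"))  # 31 32 33 34
```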

With signed external decimal numbers, the sign can be incorporated into the actual data by modifying the first half of the first or last byte (depending on whether you are using a sign-leading or sign-trailing representation). Typically, '0xC' is used to represent a positive number, '0xD' is used to represent a negative number and '0xF' is used to represent an unsigned number.

Note: In general, any of '0xA', '0xC', '0xE' or '0xF' can be used to indicate a positive value, and '0xB' or '0xD' can be used to indicate a negative value. The preferred representation depends on the hardware architecture.

On ASCII machines, there are a number of mechanisms for the internal representation of external decimal data. One representation ('Sign ASCII') that is employed by IBM's pSeries machines, uses the normal ASCII codes ('0' [hex 30] to '9' [hex 39]) for the first or last digit of both unsigned and positive numbers, and the characters 'p' [hex 70] to 'y' [hex 79] for negative numbers.
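
A minimal Python sketch of the 'Sign ASCII' representation follows. It assumes a fixed-length field with an embedded leading or trailing sign; the function name and parameters are hypothetical. The negative sign characters 'p' to 'y' are obtained by adding hex 40 to the ASCII digit codes, which matches the code points given above.

```python
def encode_sign_ascii(value: int, length: int, sign_leading: bool = False) -> bytes:
    """Encode an integer as a Sign ASCII external decimal field."""
    digits = str(abs(value)).zfill(length)
    if len(digits) > length:
        raise ValueError("value does not fit in the field")
    chars = list(digits)
    if value < 0:
        index = 0 if sign_leading else -1
        # '0'..'9' (hex 30-39) become 'p'..'y' (hex 70-79) for negative values.
        chars[index] = chr(ord(chars[index]) + 0x40)
    return "".join(chars).encode("ascii")


print(encode_sign_ascii(-1234, 4))                     # b'123t'
print(encode_sign_ascii(-1234, 4, sign_leading=True))  # b'q234'
print(encode_sign_ascii(1234, 4))                      # b'1234'
```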

An alternative method (Sign EBCDIC Custom) is used on some other ASCII based machines. This method uses the same characters as an EBCDIC based machine, even though the actual internal hexadecimal representations of them are different. If you use this technique, the character string for both EBCDIC and ASCII platforms is identical. You could potentially receive a message from an EBCDIC platform (created from a COBOL copybook that contains such entries as PIC XXX and PIC S999), and convert the whole message to ASCII, or the other way around. The character string that represents the external decimal field in the message (after the ASCII to EBCDIC, or EBCDIC to ASCII, conversion) maps to the code point that represents the correct sign for the decimal.

This method includes the limitation that curly brace characters are variant (they have different code points in different EBCDIC code pages). This mechanism works only for those EBCDIC code pages where the curly brace characters '{' and '}' (which are used to represent signed 0) have exactly the code points X'C0' and X'D0'. For example, it works for code page 500 but not for code page 871, where the curly braces have code points X'8E' and X'9C'.
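
The following Python sketch illustrates why the 'Sign EBCDIC Custom' character string survives a code page conversion. It builds the character form of a signed field, then encodes it with Python's cp500 codec (EBCDIC code page 500, where '{' and '}' are X'C0' and X'D0') and converts it to ASCII. The function name and parameters are hypothetical.

```python
POSITIVE_SIGN_DIGITS = "{ABCDEFGHI"  # digits 0-9 carrying a positive sign
NEGATIVE_SIGN_DIGITS = "}JKLMNOPQR"  # digits 0-9 carrying a negative sign


def encode_sign_ebcdic_custom(value: int, length: int, sign_leading: bool = False) -> str:
    """Return the character form of a signed external decimal field."""
    digits = str(abs(value)).zfill(length)
    if len(digits) > length:
        raise ValueError("value does not fit in the field")
    table = NEGATIVE_SIGN_DIGITS if value < 0 else POSITIVE_SIGN_DIGITS
    index = 0 if sign_leading else length - 1
    return digits[:index] + table[int(digits[index])] + digits[index + 1:]


field = encode_sign_ebcdic_custom(-1234, 4)
ebcdic_bytes = field.encode("cp500")
# Converting the whole message from EBCDIC to ASCII keeps the same characters.
ascii_bytes = ebcdic_bytes.decode("cp500").encode("ascii")

print(field)                       # 123M
print(ebcdic_bytes.hex().upper())  # F1F2F3D4
print(ascii_bytes.hex().upper())   # 3132334D
```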

In an ASCII environment (determined by the CCSID property at run time), the default for both input and output is the 'Sign ASCII' representation. You can specify the applicable representation in the CWF physical layer for local attributes and local elements of types decimal, float, and integer.

Note: This option is appropriate only for those elements or attributes that have an external decimal physical representation, and that have an embedded ('Leading' or 'Trailing') sign (determined by the Sign Orientation property).
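
To make the representations described in this section concrete, the following Python sketch decodes an external decimal field with an embedded sign. It is a simplification: it accepts sign characters from either representation instead of enforcing the one that is configured for the element, and it is not the parser that the product uses. The names are hypothetical, and cp500 is used as the example EBCDIC code page.

```python
# Map every sign-carrying character to (digit, sign multiplier).
SIGN_CHARS = {}
for digit in range(10):
    SIGN_CHARS[str(digit)] = (digit, 1)            # unsigned, or Sign ASCII positive
    SIGN_CHARS["pqrstuvwxy"[digit]] = (digit, -1)  # Sign ASCII negative
    SIGN_CHARS["{ABCDEFGHI"[digit]] = (digit, 1)   # EBCDIC-style positive
    SIGN_CHARS["}JKLMNOPQR"[digit]] = (digit, -1)  # EBCDIC-style negative


def decode_external_decimal(data: bytes, codec: str = "ascii",
                            sign_leading: bool = False) -> int:
    """Decode an external decimal field whose sign is embedded in one digit."""
    text = data.decode(codec)
    index = 0 if sign_leading else len(text) - 1
    digit, sign = SIGN_CHARS[text[index]]
    digits = text[:index] + str(digit) + text[index + 1:]
    return sign * int(digits)


print(decode_external_decimal(b"123t"))                              # -1234
print(decode_external_decimal(b"J234", sign_leading=True))           # -1234
print(decode_external_decimal(bytes.fromhex("F7F8F9C0"), "cp500"))   # 7890
```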

The following table shows the internal representation (both character and actual hexadecimal value) of the first or last digit for external decimal numbers with an included (embedded) leading or trailing sign respectively. (The table does not specify the representation for unsigned values, which are 0x30-0x39 for ASCII and 0xF0-0xF9 for EBCDIC.)

Digit | Positive, ASCII environment, Sign ASCII | Positive, ASCII environment, Sign EBCDIC Custom | Positive, EBCDIC environment | Negative, ASCII environment, Sign ASCII | Negative, ASCII environment, Sign EBCDIC Custom | Negative, EBCDIC environment
0 | 0(30) | {(7B) | {(C0) | p(70) | }(7D) | }(D0)
1 | 1(31) | A(41) | A(C1) | q(71) | J(4A) | J(D1)
2 | 2(32) | B(42) | B(C2) | r(72) | K(4B) | K(D2)
3 | 3(33) | C(43) | C(C3) | s(73) | L(4C) | L(D3)
4 | 4(34) | D(44) | D(C4) | t(74) | M(4D) | M(D4)
5 | 5(35) | E(45) | E(C5) | u(75) | N(4E) | N(D5)
6 | 6(36) | F(46) | F(C6) | v(76) | O(4F) | O(D6)
7 | 7(37) | G(47) | G(C7) | w(77) | P(50) | P(D7)
8 | 8(38) | H(48) | H(C8) | x(78) | Q(51) | Q(D8)
9 | 9(39) | I(49) | I(C9) | y(79) | R(52) | R(D9)

The next table gives some examples for a range of simple numbers that are representative of what can be transmitted or received using these approaches.

Decimal value | Sign leading, ASCII environment, Sign ASCII | Sign leading, ASCII environment, Sign EBCDIC Custom | Sign leading, EBCDIC environment | Sign trailing, ASCII environment, Sign ASCII | Sign trailing, ASCII environment, Sign EBCDIC Custom | Sign trailing, EBCDIC environment
1234 | 31 32 33 34 "1234" | 31 32 33 34 "1234" | F1 F2 F3 F4 "1234" | 31 32 33 34 "1234" | 31 32 33 34 "1234" | F1 F2 F3 F4 "1234"
+1234 | 31 32 33 34 "1234" | 41 32 33 34 "A234" | C1 F2 F3 F4 "A234" | 31 32 33 34 "1234" | 31 32 33 44 "123D" | F1 F2 F3 C4 "123D"
-1234 | 71 32 33 34 "q234" | 4A 32 33 34 "J234" | D1 F2 F3 F4 "J234" | 31 32 33 74 "123t" | 31 32 33 4D "123M" | F1 F2 F3 D4 "123M"
7890 | 37 38 39 30 "7890" | 37 38 39 30 "7890" | F7 F8 F9 F0 "7890" | 37 38 39 30 "7890" | 37 38 39 30 "7890" | F7 F8 F9 F0 "7890"
+7890 | 37 38 39 30 "7890" | 47 38 39 30 "G890" | C7 F8 F9 F0 "G890" | 37 38 39 30 "7890" | 37 38 39 7B "789{" | F7 F8 F9 C0 "789{"
-7890 | 77 38 39 30 "w890" | 50 38 39 30 "P890" | D7 F8 F9 F0 "P890" | 37 38 39 70 "789p" | 37 38 39 7D "789}" | F7 F8 F9 D0 "789}"