Message Sets: Importing from COBOL: supported features
The COBOL importer uses a set of default values and behaviors when mapping COBOL data types to message model elements.
The following table shows how COBOL definitions influence the XML Schema settings in the message model.
COBOL Clause | XML Schema data type | Notes |
---|---|---|
PIC A | xsd:string | |
PIC G | xsd:string | Set the compile-time locale name to ja_JP to process this clause. |
PIC N | xsd:string | Set the compile-time locale name to ja_JP to process this clause. |
PIC X | xsd:string | |
PIC 9(n) n = 1-4 | xsd:short | DISPLAY, COMP, or COMP-3 |
PIC 9(n) n = 5-9 | xsd:int | DISPLAY, COMP, or COMP-3 |
PIC 9(n) n = 10-18 | xsd:long | DISPLAY, COMP, or COMP-3 |
PIC 9(n) n = 19-31 | xsd:integer | DISPLAY, COMP, or COMP-3 |
PIC 9(n)V9(m) | xsd:decimal | DISPLAY, COMP, or COMP-3; any virtual decimal point value |
COMP-1 | xsd:float | |
COMP-2 | xsd:double | |
Any edited string | xsd:string | |
Any edited number | xsd:string | For example, a COBOL PICTURE clause that contains editing characters. If you want the logical type in the integration node to be numeric, make sure that the COBOL PICTURE clause does not contain any of these characters. |
VALUE | All | VALUE clauses on items other than level 88 can be imported as schema default values (an option in the import wizard). |
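The digit-count ranges in the table can be expressed as a small lookup. The following Python sketch is illustrative only: the function name `xsd_type_for_pic9` and its parameters are hypothetical and are not part of the importer. It covers unedited `PIC 9` items only.

```python
def xsd_type_for_pic9(integer_digits: int, fraction_digits: int = 0) -> str:
    """Return the XML Schema type that the table lists for an unedited PIC 9 item.

    integer_digits  -- n in PIC 9(n), or the digits before V in PIC 9(n)V9(m)
    fraction_digits -- m in PIC 9(n)V9(m); 0 when there is no virtual decimal point
    """
    if fraction_digits > 0:
        # Any virtual decimal point maps to xsd:decimal.
        return "xsd:decimal"
    if not 1 <= integer_digits <= 31:
        raise ValueError("the table covers PIC 9(n) for n = 1-31 only")
    if integer_digits <= 4:
        return "xsd:short"
    if integer_digits <= 9:
        return "xsd:int"
    if integer_digits <= 18:
        return "xsd:long"
    return "xsd:integer"


print(xsd_type_for_pic9(4))      # xsd:short
print(xsd_type_for_pic9(10))     # xsd:long
print(xsd_type_for_pic9(7, 2))   # xsd:decimal
```

The usage (DISPLAY, COMP, or COMP-3) does not appear as a parameter because, according to the table, it does not change the schema type.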
The following table shows how COBOL definitions influence the physical MRM CWF characteristics of the elements that are generated in the message model.
COBOL Clause | CWF Physical Type | CWF Length Characteristics | Other CWF characteristics |
---|---|---|---|
PIC X(n), PIC A(n) | Fixed Length String | Length = n; Length Units = Bytes | Justification = Left Justify; Padding Character = SPACE |
PIC G(n), PIC N(n) | Fixed Length String | Length = n; Length Units = Characters | Justification = Left Justify; Padding Character = SPACE |
PIC 9(n) DISPLAY n=1-31 | External Decimal | Length = n; Length Units = Bytes | Justification = Right Justify; Padding Character = '0'; Signed = Unticked; Sign Orientation = Trailing |
PIC 9(n) COMP, COMP-4, COMP-5 or BINARY | Integer | Length = 2, 4 or 8 based on n; Length Units = Bytes | Signed = Unticked; Sign Orientation = Blank |
PIC 9(n) COMP-3 n=1-18 | Packed Decimal | Length = CEILING((n+1)/2); Length Units = Bytes | Signed = Unticked; Sign Orientation = Blank |
PIC S9(n) DISPLAY n=1-31 | External Decimal | Length = n; Length Units = Bytes | Signed = Ticked; Sign Orientation = Trailing (see Note 1) |
PIC S9(n) COMP or COMP-3 n=1-18 | Integer or Packed Decimal | Length = see the COMP and COMP-3 rows above; Length Units = Bytes | Signed = Ticked; Sign Orientation = Blank |
PIC 9(m)V9(n) DISPLAY n=1-31 | External Decimal | Length = n+m; Length Units = Bytes | Signed = Unticked; Sign Orientation = Trailing; Virtual Decimal Point = n |
PIC 9(m)V9(n) COMP or COMP-3 | Integer or Packed Decimal | Length = CEILING((n+m+1)/2) for COMP-3; Length = 2, 4 or 8 for COMP; Length Units = Bytes | Signed = Unticked; Sign Orientation = Blank; Virtual Decimal Point = n |
COMP-1 | Float | Length = 4; Length Units = Bytes | Signed = Ticked; Sign Orientation = Blank |
COMP-2 | Float | Length = 8; Length Units = Bytes | Signed = Ticked; Sign Orientation = Blank |
SYNC | Float, Integer or Packed Decimal | Leading Skip Count as appropriate; Trailing Skip Count as appropriate; Byte alignment as appropriate (see Note 2) | |
Notes:
1. Sign Orientation can take one of the following values, based on the SEPARATE, LEADING, or TRAILING keywords in the COBOL definition:
   - Leading
   - Leading Separate
   - Trailing
   - Trailing Separate
2. The SYNC keyword causes the field to be aligned on a 1, 2, 4, or 8-byte boundary. This might cause 'slack bytes' to be added either before or after a field. Leading Skip Count is the number of such bytes that are added before a field; Trailing Skip Count is the number of such bytes that are added after a field.
   The importer calculates Leading Skip Count and Trailing Skip Count for each imported element, irrespective of the SYNC clause. They have non-zero values when the SYNC clause is present.
   Where there is a repeating element, Leading Skip Count and Trailing Skip Count are used for the first occurrence of the repeating element; for subsequent occurrences, only the Trailing Skip Count is used.
   Refer to COBOL reference material for details of fields that require byte alignment.
- All files that you import must be syntactically correct. Results are unpredictable if the file being imported is not syntactically correct.
- COBOL data types that have the keywords POINTER, COMP-X, INDEX, or PROCEDURE-POINTER are not supported.
- COBOL clauses that contain the keyword NATIVE cause an error, and are not imported.
- COBOL level 66 and level 77 data items are not imported.
- Hexadecimal binary values cannot be attributed to non-numeric literals. They cannot reside in the LINKAGE SECTIONs that are imported by the COBOL importer. They can reside elsewhere in the COBOL file. Alternatively, you can convert the hexadecimal value to a character string for PIC X, or to a decimal number for PIC 9.
- If element names clash with Java™ language keywords, the element names are modified by prefixing the element name with a single underscore character.
- Object-oriented extensions to COBOL 85 are not supported. For example, OBJECT-REFERENCE is not supported.
- For structures that contain a COBOL OCCURS DEPENDING ON clause, the Byte Alignment, Leading Skip Count, and Trailing Skip Count CWF properties of the elements within the structure are not set correctly. You must correct these properties in the message editor.
- When the imported COBOL source file contains QUOTE or QUOTES in the value clause of a picture string, the default behavior is to enter the data with double quotation marks, unless you set the COBOL QUOTE compile option to SINGLE on the Import Options page of the COBOL importer wizard.
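The length and alignment rules from the CWF table and its notes can be sketched in Python as follows. This is a rough illustration, not the importer's implementation. The function names are hypothetical. The digit ranges in `binary_length` (1-4, 5-9, 10-18) are an assumption based on the usual COBOL binary storage sizes, because the table says only "2, 4 or 8 based on n". The alignment formula is the generic padding-to-boundary calculation.

```python
import math


def packed_decimal_length(total_digits: int) -> int:
    """Byte length of a COMP-3 item: CEILING((n+1)/2), as given in the table.

    For PIC 9(m)V9(n), pass the total number of digits (n+m).
    """
    return math.ceil((total_digits + 1) / 2)


def binary_length(total_digits: int) -> int:
    """Byte length of a COMP item: 2, 4 or 8 depending on the digit count.

    Assumption: the split points are 1-4, 5-9 and 10-18 digits.
    """
    if not 1 <= total_digits <= 18:
        raise ValueError("expected 1-18 digits")
    if total_digits <= 4:
        return 2
    if total_digits <= 9:
        return 4
    return 8


def leading_skip_count(offset: int, alignment: int) -> int:
    """Slack bytes needed before a field that starts at `offset` so that it
    falls on an `alignment`-byte boundary (1, 2, 4 or 8)."""
    return (-offset) % alignment


print(packed_decimal_length(5))   # 3
print(binary_length(9))           # 4
print(leading_skip_count(3, 4))   # 1
```

A calculation like this can help when you check or correct the Byte Alignment and skip count properties by hand, for example inside an OCCURS DEPENDING ON structure.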
Signed external decimal numbers
The MRM Custom Wire Format (CWF) and TDS components of IBM® App Connect Enterprise support the External Decimal (also known as Zoned Decimal) data format for numeric data. Numeric data in this format is stored internally as decimal character data. For example, in a system that uses the EBCDIC code, the number 1234 stored in a 4-byte external decimal field is stored as the character string '1234', and its actual internal hexadecimal representation is 'F1F2F3F4'.
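The byte values in this example can be reproduced with Python's built-in codecs. The sketch below assumes EBCDIC code page 500 (`cp500`); the digits 0-9 occupy X'F0' to X'F9' in that code page.

```python
digits = "1234"

# Unsigned external decimal is simply the digit characters in the target code page.
ebcdic_bytes = digits.encode("cp500")
ascii_bytes = digits.encode("ascii")

print(ebcdic_bytes.hex().upper())  # F1F2F3F4
print(ascii_bytes.hex().upper())   # 31323334
```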
With signed external decimal numbers, the sign can be incorporated into the actual data by modifying the first half of the first or last byte (depending on whether you are using a sign-leading or sign-trailing representation). Typically, '0xC' is used to represent a positive number, '0xD' is used to represent a negative number and '0xF' is used to represent an unsigned number.
On ASCII machines, there are a number of mechanisms for the internal representation of external decimal data. One representation ('Sign ASCII') that is employed by IBM's pSeries machines, uses the normal ASCII codes ('0' [hex 30] to '9' [hex 39]) for the first or last digit of both unsigned and positive numbers, and the characters 'p' [hex 70] to 'y' [hex 79] for negative numbers.
An alternative method (Sign EBCDIC Custom) is used on some other ASCII based machines. This method uses the same characters as an EBCDIC based machine, even though the actual internal hexadecimal representations of them are different. If you use this technique, the character string for both EBCDIC and ASCII platforms is identical. You could potentially receive a message from an EBCDIC platform (created from a COBOL copybook that contains such entries as PIC XXX and PIC S999), and convert the whole message to ASCII, or the other way around. The character string that represents the external decimal field in the message (after the ASCII to EBCDIC, or EBCDIC to ASCII, conversion) maps to the code point that represents the correct sign for the decimal.
This method has a limitation: the curly brace characters are variant characters (they have different code points in different EBCDIC code pages). The mechanism works only for those EBCDIC code pages where the curly brace characters '{' and '}' (which are used to represent signed 0) have exactly the code points X'C0' and X'D0'. For example, it works for code page 500 but not for code page 871, where the curly braces have code points X'8E' and X'9C'.
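One way to check whether a particular EBCDIC code page meets this condition is to encode the two brace characters and compare the results with X'C0' and X'D0'. The following Python sketch uses a hypothetical helper, `braces_match_signed_zero`, and is limited to code pages for which Python ships a codec (for example `cp500` and `cp037`).

```python
def braces_match_signed_zero(codec_name: str) -> bool:
    """Return True if '{' and '}' encode to X'C0' and X'D0' in the given code page."""
    return (
        "{".encode(codec_name) == b"\xc0"
        and "}".encode(codec_name) == b"\xd0"
    )


for name in ("cp500", "cp037"):
    print(name, braces_match_signed_zero(name))
```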
In an ASCII environment (determined by the CCSID property at run time), the default for both input and output is the 'Sign ASCII' representation. You can specify the applicable representation in the CWF physical layer for local attributes and local elements of types decimal, float, and integer.
The following table shows the internal representation (both character and actual hexadecimal value) of the first or last digit for external decimal numbers with an included (embedded) leading or trailing sign respectively. (The table does not specify the representation for unsigned values, which are 0x30-0x39 for ASCII and 0xF0-0xF9 for EBCDIC.)
Digit | Positive: Sign ASCII (ASCII environment) | Positive: Sign EBCDIC Custom (ASCII environment) | Positive (EBCDIC environment) | Negative: Sign ASCII (ASCII environment) | Negative: Sign EBCDIC Custom (ASCII environment) | Negative (EBCDIC environment) |
---|---|---|---|---|---|---|
0 | 0(30) | {(7B) | {(C0) | p(70) | }(7D) | }(D0) |
1 | 1(31) | A(41) | A(C1) | q(71) | J(4A) | J(D1) |
2 | 2(32) | B(42) | B(C2) | r(72) | K(4B) | K(D2) |
3 | 3(33) | C(43) | C(C3) | s(73) | L(4C) | L(D3) |
4 | 4(34) | D(44) | D(C4) | t(74) | M(4D) | M(D4) |
5 | 5(35) | E(45) | E(C5) | u(75) | N(4E) | N(D5) |
6 | 6(36) | F(46) | F(C6) | v(76) | O(4F) | O(D6) |
7 | 7(37) | G(47) | G(C7) | w(77) | P(50) | P(D7) |
8 | 8(38) | H(48) | H(C8) | x(78) | Q(51) | Q(D8) |
9 | 9(39) | I(49) | I(C9) | y(79) | R(52) | R(D9) |
The next table gives some examples for a range of simple numbers that are representative of what can be transmitted or received using these approaches.
Decimal value | Sign leading: Sign ASCII (ASCII environment) | Sign leading: Sign EBCDIC Custom (ASCII environment) | Sign leading (EBCDIC environment) | Sign trailing: Sign ASCII (ASCII environment) | Sign trailing: Sign EBCDIC Custom (ASCII environment) | Sign trailing (EBCDIC environment) |
---|---|---|---|---|---|---|
1234 | 31 32 33 34 "1234" | 31 32 33 34 "1234" | F1 F2 F3 F4 "1234" | 31 32 33 34 "1234" | 31 32 33 34 "1234" | F1 F2 F3 F4 "1234" |
+1234 | 31 32 33 34 "1234" | 41 32 33 34 "A234" | C1 F2 F3 F4 "A234" | 31 32 33 34 "1234" | 31 32 33 44 "123D" | F1 F2 F3 C4 "123D" |
-1234 | 71 32 33 34 "q234" | 4A 32 33 34 "J234" | D1 F2 F3 F4 "J234" | 31 32 33 74 "123t" | 31 32 33 4D "123M" | F1 F2 F3 D4 "123M" |
7890 | 37 38 39 30 "7890" | 37 38 39 30 "7890" | F7 F8 F9 F0 "7890" | 37 38 39 30 "7890" | 37 38 39 30 "7890" | F7 F8 F9 F0 "7890" |
+7890 | 37 38 39 30 "7890" | 47 38 39 30 "G890" | C7 F8 F9 F0 "G890" | 37 38 39 30 "7890" | 37 38 39 7B "789{" | F7 F8 F9 C0 "789{" |
-7890 | 77 38 39 30 "w890" | 50 38 39 30 "P890" | D7 F8 F9 F0 "P890" | 37 38 39 70 "789p" | 37 38 39 7D "789}" | F7 F8 F9 D0 "789}" |
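The signed rows of the examples table can be reproduced with a short Python sketch. It is an illustration under stated assumptions, not the product's implementation: the function name `encode_signed_external_decimal` and the representation labels are hypothetical, the EBCDIC environment is assumed to use code page 500 (`cp500`), and only signed values with an embedded (not separate) sign are handled.

```python
# Characters that carry both a digit (index 0-9) and a sign.
POSITIVE_EBCDIC_CHARS = "{ABCDEFGHI"   # X'C0'-X'C9' in EBCDIC
NEGATIVE_EBCDIC_CHARS = "}JKLMNOPQR"   # X'D0'-X'D9' in EBCDIC
NEGATIVE_ASCII_CHARS = "pqrstuvwxy"    # hex 70-79 in ASCII


def encode_signed_external_decimal(value, width,
                                   representation="sign_ascii",
                                   sign_leading=False):
    """Encode a signed integer as external decimal bytes with an embedded sign.

    representation: "sign_ascii", "sign_ebcdic_custom" (both produce ASCII
    bytes) or "ebcdic" (produces cp500 bytes).
    """
    digits = str(abs(value)).zfill(width)
    if len(digits) > width:
        raise ValueError("value does not fit in the field")

    # The sign is carried by the first or the last digit position.
    position = 0 if sign_leading else width - 1
    digit = int(digits[position])

    if representation == "sign_ascii":
        # Positive values keep the normal digit; negative values use p-y.
        sign_char = NEGATIVE_ASCII_CHARS[digit] if value < 0 else digits[position]
        codec = "ascii"
    elif representation in ("sign_ebcdic_custom", "ebcdic"):
        table = NEGATIVE_EBCDIC_CHARS if value < 0 else POSITIVE_EBCDIC_CHARS
        sign_char = table[digit]
        codec = "ascii" if representation == "sign_ebcdic_custom" else "cp500"
    else:
        raise ValueError("unknown representation: " + representation)

    text = digits[:position] + sign_char + digits[position + 1:]
    return text.encode(codec)


print(encode_signed_external_decimal(-1234, 4, "sign_ascii", sign_leading=True).hex(" "))
# 71 32 33 34
print(encode_signed_external_decimal(7890, 4, "sign_ebcdic_custom").hex(" "))
# 37 38 39 7b
print(encode_signed_external_decimal(-7890, 4, "ebcdic").hex(" "))
# f7 f8 f9 d0
```

The Sign EBCDIC Custom and EBCDIC cases share the same character string and differ only in the codec that is used for the final encoding, which is why a message can be converted between ASCII and EBCDIC as a whole without corrupting the sign.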