Architecture exploitation
COBOL V6 continues to support the ARCH option (short for architecture) introduced in COBOL V5. This option exploits new hardware instructions and enables you to get the most out of your hardware investment.
The default setting for ARCH is 8, and other supported values are 9, 10, 11, 12, and 13.
For more information on the facilities available at each level, and the mapping of these ARCH levels to specific hardware models, see ARCH in the Enterprise COBOL for z/OS® Programming Guide.
Each successive ARCH level allows the compiler to exploit more facilities in your hardware, which can lead to increased performance. To illustrate the benefits from a COBOL application perspective, each ARCH level is examined in greater detail below.
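As a minimal sketch of how the option might be specified, the following program sets ARCH and OPTIMIZE on a CBL (PROCESS) statement at the top of the source. The program name ARCHDEMO and the chosen level 11 are illustrative assumptions only. The same options can also be passed through your usual compile JCL or installation defaults.

```cobol
       CBL ARCH(11),OPTIMIZE(2)
       IDENTIFICATION DIVISION.
       PROGRAM-ID. ARCHDEMO.
       PROCEDURE DIVISION.
      * The source itself does not depend on ARCH. The option only
      * changes which instructions the compiler is allowed to emit.
           DISPLAY 'ARCHDEMO COMPILED WITH ARCH(11)'
           GOBACK.
```

Choose an ARCH level that is no higher than the oldest hardware on which the program must run, because code that uses newer instructions cannot execute on earlier machines.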
ARCH(8)
Hardware Feature: Decimal Floating Point (DFP)
Why This Matters For COBOL Performance: Decimal Floating Point is a natural fit for the packed decimal (COMP-3) and external decimal (DISPLAY) types that are ubiquitous in most COBOL applications. Using ARCH(8) and some OPTIMIZE setting above 0 enables the compiler to convert larger multiply and divide operations on any type of decimal operands to DFP, in order to avoid an expensive callout to a library routine.
This is possible as the hardware precision limit for DFP is much greater than is allowed in the packed decimal multiply and divide instructions.
Because converting to DFP carries some overhead, DFP is not beneficial for every decimal operation that could otherwise be performed without a library call. However, the ARCH(10) option described later in this section enables much greater use of DFP to improve performance.
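The following sketch shows the kind of large packed decimal multiplication that the DFP discussion above has in mind. The program name, field names, and values are hypothetical, and the sketch does not show which instructions the compiler actually chooses. Compiling with the LIST option and inspecting the generated code is the way to confirm that.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. DFPMULT.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical 17-digit and 18-digit packed decimal operands.
       01  WS-PRINCIPAL      PIC S9(13)V9(4) COMP-3
                             VALUE 1234567890123.4567.
       01  WS-RATE           PIC S9(9)V9(9)  COMP-3
                             VALUE 1.000123456.
       01  WS-RESULT         PIC S9(14)V9(4) COMP-3.
       PROCEDURE DIVISION.
      * Multiplying two operands this wide needs a large
      * intermediate result. With ARCH(8) or higher and OPTIMIZE
      * above 0, the compiler may evaluate it in DFP.
           COMPUTE WS-RESULT ROUNDED = WS-PRINCIPAL * WS-RATE
           DISPLAY 'RESULT: ' WS-RESULT
           GOBACK.
```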
Hardware Feature: Larger Move Immediate Instructions
Why This Matters For COBOL Performance: MOVEs of literal data and VALUE clause statements are common in many COBOL applications. Lower ARCH settings and all earlier compiler releases only contained support for moving a single byte of literal data in a single instruction, for example, by using the MVI - Move Immediate Instruction.
Any larger literal data required storing the constant value in the literal pool and using a memory move instruction to initialize the data item. This was less efficient in time and space than being able to embed larger immediate values directly in the instruction text.
With ARCH(8), several new move immediate instruction variants are available, so that up to 16 bytes of sign-extended data can be moved with one or two of these instructions.
Also, these instructions are exploited regardless of the data type, so binary, internal/external decimal, alphanumeric, and even floating point literals take advantage of these more efficient instructions.
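As an illustration, the statements below are typical source-level candidates for move immediate instructions: short literals moved to binary, packed decimal, zoned decimal, and alphanumeric items. All names and values are hypothetical, and whether a given MOVE uses an immediate form or the literal pool remains the compiler's decision.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. MOVELIT.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-COUNT          PIC S9(9)    BINARY.
       01  WS-AMOUNT         PIC S9(5)V99 COMP-3.
       01  WS-QUANTITY       PIC 9(6).
       01  WS-STATUS         PIC X(8)     VALUE 'PENDING'.
       PROCEDURE DIVISION.
      * Each literal below fits in 16 bytes or fewer.
           MOVE 100       TO WS-COUNT
           MOVE 12345.67  TO WS-AMOUNT
           MOVE 123456    TO WS-QUANTITY
           MOVE 'ACTIVE'  TO WS-STATUS
           DISPLAY WS-COUNT ' ' WS-AMOUNT
           DISPLAY WS-QUANTITY ' ' WS-STATUS
           GOBACK.
```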
ARCH(9)
Hardware Feature: Distinct Operands Instructions
Why This Matters For COBOL Performance: Updating a data item or index to a new value while retaining the original value occurs frequently in many contexts in a typical COBOL application. One instance is table processing, where a base value for the table is updated to access the various elements within the table. Under lower ARCH settings or in all earlier compiler releases, almost all available instructions that took two operands to produce a result would also overwrite the first input operand with the result.
For example, a conceptual operation such as:
C = A + B
Implemented with a pre-ARCH(9) instruction variant would conceptually have to perform the operation as:
A = A + B
C = A
This means if the original value of A is required in another context, it must first be saved:
T = A
T = T + B
C = T
With ARCH(9), the distinct-operands facility is exploited to take advantage of the new variants of many arithmetic, shift, and logical instructions that will not destructively overwrite the first operand.
So the operation can be implemented in a more straightforward way:
C = A + B
This removes the need for extra instructions to save the original value, as it is naturally preserved with the distinct-operands instruction form. This feature reduces path length, leading to better performance.
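At the source level, the pattern looks like the following sketch, in which WS-A is added to WS-B and is then needed again, unchanged, by a later statement. The names and values are hypothetical. The mapping to particular machine instructions is the compiler's choice and is not visible in the source.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. DISTOPS.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-A              PIC S9(9) BINARY VALUE 10.
       01  WS-B              PIC S9(9) BINARY VALUE 32.
       01  WS-C              PIC S9(9) BINARY.
       01  WS-D              PIC S9(9) BINARY.
       PROCEDURE DIVISION.
      * C = A + B, where the original value of A must survive.
           COMPUTE WS-C = WS-A + WS-B
      * A is used again here, so it cannot have been overwritten.
           COMPUTE WS-D = WS-A * 2
           DISPLAY 'C=' WS-C ' D=' WS-D
           GOBACK.
```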
ARCH(10)
Hardware Feature: Improved Decimal Floating Point (DFP) Performance
Why This Matters For COBOL Performance: Using ARCH(8) and an OPTIMIZE setting greater than 0 already enables the compiler to make use of DFP to improve performance of packed and external decimal arithmetic in some particular instances. ARCH(10) goes further by adding efficient instructions to convert between DISPLAY (in particular unsigned and trailing signed overpunch zoned decimal) types and DFP.
These ARCH(10) instructions lower the overhead for using DFP for arithmetic on zoned decimal data items and enable the compiler to make much greater use of DFP to improve performance when the surrounding conditions are optimal and the optimization level is greater than 0.
Instead of converting zoned decimal data items to packed decimal format to perform arithmetic, the compiler will convert zoned decimal data directly to DFP format and then back again to zoned decimal format after the computations are complete. This generally results in better performance, as the DFP instructions operate on in-register (compared to in-memory) data that is more efficiently handled by the hardware in many cases.
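The sketch below shows arithmetic on zoned decimal (USAGE DISPLAY) items, both unsigned and with the default trailing overpunch sign, which is the case that the ARCH(10) conversion instructions target. Field names and values are hypothetical, and whether the compiler evaluates this particular expression in DFP depends on its own heuristics.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. ZONEDDFP.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * USAGE DISPLAY is the default, so these are zoned decimal.
       01  WS-UNIT-PRICE     PIC S9(7)V99 VALUE 1234.56.
       01  WS-QUANTITY       PIC 9(5)     VALUE 250.
       01  WS-SHIPPING       PIC S9(5)V99 VALUE 19.99.
       01  WS-TOTAL          PIC S9(11)V99.
       PROCEDURE DIVISION.
      * With ARCH(10) or higher and OPTIMIZE above 0, the operands
      * may be converted directly to DFP and the result converted
      * back to zoned decimal.
           COMPUTE WS-TOTAL =
               WS-UNIT-PRICE * WS-QUANTITY + WS-SHIPPING
           DISPLAY 'TOTAL: ' WS-TOTAL
           GOBACK.
```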
ARCH(11)
Hardware Feature: Improved conversion between packed decimal and Decimal Floating Point (DFP)
Why This Matters For COBOL Performance: At ARCH(10), the compiler is able to convert more efficiently between DISPLAY types and DFP, enabling the compiler to make significant use of DFP to improve performance of packed and external decimal arithmetic. While instructions to convert between packed decimal and DFP existed at ARCH(10), they were inefficient, and the benefit of performing packed arithmetic in DFP was outweighed by the cost of converting packed decimal values to and from DFP.
With ARCH(11), there are new instructions that convert between packed decimal and DFP more efficiently. They lower the overhead for using DFP arithmetic on packed decimal data items, enabling the compiler to make further use of DFP when the surrounding conditions are optimal and the optimization level is greater than 0.
Instead of performing arithmetic on packed decimal items, the compiler will convert packed decimal data to DFP format and then back again to packed decimal format after the computations are complete. This generally results in better performance, as the DFP instructions operate on in-register (compared to in-memory) data that is more efficiently handled by the hardware in many cases. Due to the more efficient conversion instructions, the benefit of performing arithmetic in DFP outweighs the added cost of converting between packed decimal and DFP instead of performing packed arithmetic directly.
Hardware Feature: Vector Registers
Why This Matters For COBOL Performance: The new vector facility is able to operate on up to 16 byte-sized elements in parallel. With ARCH(11), COBOL V6 is able to take advantage of the new vector instructions to accelerate some forms of INSPECT statements by working with 16 bytes at a time. This can be much faster than operating on 1 byte at a time.
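For illustration, the following sketch uses two common INSPECT forms on a 64-byte field. The names and data are hypothetical. The text above says only that some forms of INSPECT are accelerated, so it is an assumption that these particular statements qualify.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. INSPVEC.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-RECORD         PIC X(64)
                             VALUE 'ALPHA,BETA,GAMMA,DELTA'.
       01  WS-COMMAS         PIC S9(4) BINARY VALUE 0.
       PROCEDURE DIVISION.
      * Count the commas. WS-COMMAS holds 3 afterwards.
           INSPECT WS-RECORD TALLYING WS-COMMAS FOR ALL ','
      * Replace every comma with a semicolon.
           INSPECT WS-RECORD REPLACING ALL ',' BY ';'
           DISPLAY 'COMMAS: ' WS-COMMAS
           DISPLAY WS-RECORD
           GOBACK.
```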
ARCH(12)
Hardware Feature: Vector packed decimal instructions
Why This Matters For COBOL Performance: In ARCH(11) and below, packed decimal arithmetic can only be performed using in-memory data, or by converting the data to Decimal Floating Point (DFP). In ARCH(12), the new vector packed decimal facility enables the compiler to perform native packed decimal arithmetic on data in registers. This provides the performance advantages of using registers instead of memory, while eliminating the overhead of converting data back and forth between packed decimal and DFP.
ARCH(13)
Hardware Feature: Ability to suppress hardware overflow exceptions on individual vector packed decimal instructions
Why This Matters For COBOL Performance: When a packed decimal overflow occurs, the hardware can either suppress the overflow and take no further action, which is the default COBOL behavior, or raise an exception. This is controlled by an application-wide hardware setting. As the correct behavior for COBOL programs is to have the overflow exception suppressed, Enterprise COBOL programs do not change this setting. In a pure COBOL application, all overflows are therefore suppressed at the hardware level.
In a mixed-language application, other languages can turn this setting on, causing exceptions. As the setting is application-wide, this affects COBOL programs as well. The exceptions are handled by LE, which suppresses them if they are generated from a COBOL program, but there is a performance penalty when LE gets involved. COBOL programs also do not turn the setting on and off themselves, because in programs with few or no overflows, that would incur a performance penalty of its own.
At ARCH(13), the vector packed decimal instructions introduced at ARCH(12) can indicate, per instruction, whether the overflow should be suppressed or not. This allows the hardware to suppress the overflows for COBOL programs without getting LE involved, and without the overhead of changing the setting.
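As a small sketch of where such an overflow can arise, the ADD below produces a result that no longer fits in its three-digit packed decimal receiver, and no ON SIZE ERROR phrase is coded. The names and values are hypothetical. The sketch assumes that the compiler implements this statement with a packed decimal instruction that can overflow, and it deliberately shows no expected output.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. PDOVFL.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-COUNTER        PIC S9(3) COMP-3 VALUE 999.
       PROCEDURE DIVISION.
      * 999 + 1 needs four digits, but WS-COUNTER holds three.
      * Without ON SIZE ERROR, the program continues and the
      * overflow is expected to be suppressed, not reported.
           ADD 1 TO WS-COUNTER
           DISPLAY 'COUNTER: ' WS-COUNTER
           GOBACK.
```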