Architecture exploitation

COBOL 6 continues to support the ARCH option (short for architecture) introduced in COBOL 5, which can be used to fully exploit new hardware instructions. Starting with COBOL 6.3 with service applied, the TUNE option is also available; it specifies the architecture for which the executable program will be optimized. Use both options to get the most out of your hardware investment.

ARCH tells the compiler which of the available instructions it can choose from. For example, ARCH(10) specifies that only instructions available on a zEC12 or zBC12 can be used. Instructions added on a z13®, z13s®, or later hardware will not be used, ensuring that your program will not contain instructions that don't exist on a zEC12 or zBC12.

While ARCH determines which instructions the compiler can choose from, the compiler still must make decisions about which instructions to use for a given COBOL statement. These choices are governed by TUNE, which instructs the compiler to make optimal decisions for a particular hardware model, limited by the available instructions as determined by ARCH.

The default setting for ARCH is 10, and the default setting for TUNE is to match ARCH. Other supported values are 11, 12, 13, and 14. The TUNE level must always be greater than or equal to the ARCH level. Set the ARCH value to match the architecture of the oldest machine where your application will run, including any disaster recovery (DR) machines. Set the TUNE value to match the architecture where your application will run most often.
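As an illustration of these rules, the following sketch assumes a hypothetical shop whose oldest machine (including DR) is a z14 and whose production workload runs mostly on a z15. ARCH is therefore set to 12 and TUNE to 13, which satisfies the requirement that TUNE be greater than or equal to ARCH. The program name and the choice of OPTIMIZE(2) are assumptions made for the example, and TUNE requires COBOL 6.3 with service applied.

```cobol
       CBL ARCH(12),TUNE(13),OPTIMIZE(2)
       IDENTIFICATION DIVISION.
       PROGRAM-ID. ARCHDEMO.
       PROCEDURE DIVISION.
      * Only instructions available at ARCH(12) are generated, but
      * instruction selection is tuned for the TUNE(13) hardware.
           DISPLAY 'Compiled with ARCH(12) and TUNE(13)'
           GOBACK.
```

The same options can be supplied through the compiler invocation instead of a CBL statement; the CBL form is shown here only because it keeps the setting visible next to the source.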

For more information on the facilities available at each level, and the mapping of these ARCH levels to specific hardware models, see ARCH in the Enterprise COBOL for z/OS® Programming Guide.

For more information about the mapping of TUNE levels to specific hardware models, see TUNE in the Enterprise COBOL for z/OS Programming Guide.

Each successive ARCH level allows the compiler to exploit more facilities in your hardware, leading to the potential for increased performance. To illustrate the benefits from a COBOL application perspective, each ARCH level is examined in greater detail below.

ARCH(10)

Hardware Feature: Improved Decimal Floating Point (DFP) Performance

Why This Matters For COBOL Performance: Older hardware models provided DFP instructions that the compiler could use to improve the performance of packed and external decimal arithmetic. ARCH(10) goes further by adding efficient instructions to convert between DISPLAY (in particular unsigned and trailing signed overpunch zoned decimal) types and DFP.

These ARCH(10) instructions lower the overhead for using DFP for arithmetic on zoned decimal data items and enable the compiler to make much greater use of DFP to improve performance when the surrounding conditions are optimal and the optimization level is greater than 0.

Instead of converting zoned decimal data items to packed decimal format to perform arithmetic, the compiler will convert zoned decimal data directly to DFP format and then back again to zoned decimal format after the computations are complete. This generally results in better performance, as the DFP instructions operate on in-register (compared to in-memory) data that is more efficiently handled by the hardware in many cases.
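The following minimal sketch shows the kind of zoned decimal arithmetic that this description applies to. The program and data names are hypothetical. Whether the compiler actually performs the computation in DFP depends on the surrounding conditions and on the optimization level, so this shows a candidate statement rather than a guarantee of the generated code.

```cobol
       CBL ARCH(10),OPTIMIZE(2)
       IDENTIFICATION DIVISION.
       PROGRAM-ID. ZONEDADD.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Zoned decimal (USAGE DISPLAY) items. WS-QTY is unsigned;
      * the signed items use the default trailing overpunch sign.
       01  WS-UNIT-PRICE  PIC S9(7)V99  VALUE 125.50.
       01  WS-QTY         PIC 9(5)      VALUE 12.
       01  WS-LINE-TOTAL  PIC S9(13)V99.
       PROCEDURE DIVISION.
      * Arithmetic on zoned operands with a zoned receiver.
           COMPUTE WS-LINE-TOTAL = WS-UNIT-PRICE * WS-QTY
           DISPLAY 'WS-LINE-TOTAL = ' WS-LINE-TOTAL
           GOBACK.
```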

ARCH(10) gives the best performance on zEnterprise® EC12 and zEnterprise BC12.

ARCH(11)

Hardware Feature: Improved conversion between packed decimal and Decimal Floating Point (DFP)

Why This Matters For COBOL Performance: At ARCH(10), the compiler is able to convert more efficiently between DISPLAY types and DFP, enabling the compiler to make significant use of DFP to improve performance of packed and external decimal arithmetic. While instructions to convert between packed decimal and DFP existed at ARCH(10), they were inefficient, and the benefit of performing packed arithmetic in DFP was outweighed by the cost of converting packed decimal values to and from DFP.

With ARCH(11), there are new instructions that convert between packed decimal and DFP more efficiently. They lower the overhead for using DFP arithmetic on packed decimal data items, enabling the compiler to make further use of DFP when the surrounding conditions are optimal and the optimization level is greater than 0.

Instead of performing arithmetic on packed decimal items, the compiler will convert packed decimal data to DFP format and then back again to packed decimal format after the computations are complete. This generally results in better performance, as the DFP instructions operate on in-register (compared to in-memory) data that is more efficiently handled by the hardware in many cases. Due to the more efficient conversion instructions, the benefit of performing arithmetic in DFP outweighs the added cost of converting between packed decimal and DFP instead of performing packed arithmetic directly.

Hardware Feature: Vector Registers

Why This Matters For COBOL Performance: The new vector facility is able to operate on up to 16 byte-sized elements in parallel. With ARCH(11), COBOL 6 is able to take advantage of the new vector instructions to accelerate some forms of INSPECT statements by working with 16 bytes at a time. This can be much faster than operating on 1 byte at a time.
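The following sketch shows two simple INSPECT statements that scan a data item byte by byte at the source level. The text above does not list which forms of INSPECT are accelerated, so the single-character TALLYING and REPLACING forms used here are assumptions chosen for illustration, as are the program and data names.

```cobol
       CBL ARCH(11),OPTIMIZE(2)
       IDENTIFICATION DIVISION.
       PROGRAM-ID. INSPDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-RECORD      PIC X(80) VALUE 'A,B,,C'.
       01  WS-COMMAS      PIC 9(4) COMP VALUE 0.
       PROCEDURE DIVISION.
      * Count every comma in the 80-byte record.
           INSPECT WS-RECORD TALLYING WS-COMMAS FOR ALL ','
      * Replace every comma with a semicolon.
           INSPECT WS-RECORD REPLACING ALL ',' BY ';'
           DISPLAY 'Commas found: ' WS-COMMAS
           GOBACK.
```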

ARCH(11) gives the best performance on z13 and z13s.

ARCH(12)

Hardware Feature: Vector packed decimal instructions

Why This Matters For COBOL Performance: In ARCH(11) and below, packed decimal arithmetic can only be performed using in-memory data, or by converting the data to Decimal Floating Point (DFP). In ARCH(12), the new vector packed decimal facility enables the compiler to perform native packed decimal arithmetic on data in registers. This provides the performance advantages of using registers instead of memory, while eliminating the overhead of converting data back and forth between packed decimal and DFP.
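The following sketch shows packed decimal (COMP-3) arithmetic of the kind described above. The source is identical whichever ARCH level is used; only the generated instructions differ. The program and data names are hypothetical, and the instructions actually chosen for these statements depend on the compiler, the optimization level, and the surrounding code.

```cobol
       CBL ARCH(12),OPTIMIZE(2)
       IDENTIFICATION DIVISION.
       PROGRAM-ID. PACKDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Packed decimal operands and receivers.
       01  WS-BALANCE     PIC S9(11)V99 COMP-3 VALUE 2500.00.
       01  WS-RATE        PIC S9V9(4)   COMP-3 VALUE 0.0375.
       01  WS-INTEREST    PIC S9(11)V99 COMP-3.
       PROCEDURE DIVISION.
           COMPUTE WS-INTEREST ROUNDED = WS-BALANCE * WS-RATE
           ADD WS-INTEREST TO WS-BALANCE
           DISPLAY 'WS-BALANCE = ' WS-BALANCE
           GOBACK.
```

Recompiling unchanged source such as this with a higher ARCH level is the only step needed to make the newer instructions available to the compiler.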

ARCH(12) gives the best performance on z14 and z14 ZR1.

ARCH(13)

Hardware Feature: Ability to suppress hardware overflow exceptions on individual vector packed decimal instructions

Why This Matters For COBOL Performance: When a packed decimal overflow occurs, the hardware can either suppress the overflow and continue, or raise an exception. This is controlled by an application-wide hardware setting. Because the correct behavior for COBOL programs is to have the overflow exception suppressed, Enterprise COBOL programs do not change this setting, and in a pure COBOL application all overflows are suppressed at the hardware level. In a mixed-language application, other languages can turn this setting on, which causes exceptions to be raised. Because the setting is application-wide, this affects COBOL programs as well. The exceptions are handled by Language Environment (LE), which suppresses them if they are generated from a COBOL program, but there is a performance penalty for getting LE involved. COBOL programs also do not turn the setting off and back on themselves, because in programs with few or no overflows that would incur a performance penalty of its own.

At ARCH(13), the vector packed decimal instructions introduced at ARCH(12) can indicate, per instruction, whether the overflow should be suppressed or not. This allows the hardware to suppress the overflows for COBOL programs without getting LE involved, and without the overhead of changing the setting.
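The following sketch shows, at the source level, a packed decimal result that does not fit in its receiver. The program and data names are hypothetical. Whether this particular statement produces a hardware decimal overflow depends on the instructions that the compiler generates, so it illustrates the COBOL-level situation only, not a specific instruction sequence.

```cobol
       CBL ARCH(13),OPTIMIZE(2)
       IDENTIFICATION DIVISION.
       PROGRAM-ID. OVFLDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-COUNTER     PIC S9(3) COMP-3 VALUE 999.
       PROCEDURE DIVISION.
      * 999 + 1 needs four digits, but WS-COUNTER holds only three.
      * No ON SIZE ERROR phrase is coded, so the program continues
      * and the high-order digit is truncated.
           ADD 1 TO WS-COUNTER
           DISPLAY 'WS-COUNTER = ' WS-COUNTER
           GOBACK.
```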

ARCH(13) gives the best performance on z15® and z15 T02.

ARCH(14)

Hardware Feature: Vector packed-decimal enhancement facility 2

Why This Matters For COBOL Performance: This new facility adds performance improvements for COBOL programs that contain one or more of the following types of statements:
  • Exponentiation operations on packed or zoned decimal data items where the exponent is declared with one or more fractional digits
  • Arithmetic statements involving mixed decimal and floating-point data items
  • Statements using numeric-edited data items
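The three statement types listed above can be sketched as follows. The program name, data names, and values are hypothetical, and the example only shows statements that fall into each category; it makes no claim about how much any individual statement improves.

```cobol
       CBL ARCH(14),OPTIMIZE(2)
       IDENTIFICATION DIVISION.
       PROGRAM-ID. ARCH14EX.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-BASE        PIC S9(5)V99   COMP-3 VALUE 1.05.
       01  WS-EXPONENT    PIC S9(2)V9    COMP-3 VALUE 2.5.
       01  WS-GROWTH      PIC S9(7)V9(4) COMP-3.
       01  WS-FACTOR      COMP-2 VALUE 1.5.
       01  WS-AMOUNT      PIC S9(7)V99   COMP-3 VALUE 1000.00.
       01  WS-RESULT      PIC S9(9)V99   COMP-3.
       01  WS-EDITED      PIC ZZZ,ZZZ,ZZ9.99-.
       PROCEDURE DIVISION.
      * Exponent declared with a fractional digit.
           COMPUTE WS-GROWTH = WS-BASE ** WS-EXPONENT
      * Mixed packed decimal and floating-point operands.
           COMPUTE WS-RESULT ROUNDED = WS-AMOUNT * WS-FACTOR
      * Numeric-edited receiving item.
           MOVE WS-RESULT TO WS-EDITED
           DISPLAY 'WS-EDITED = ' WS-EDITED
           GOBACK.
```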

ARCH(14) gives the best performance on z16™.