Programming with OpenMP device constructs

IBM® XL C/C++ for Linux, V13.1.6 partially supports the OpenMP Application Program Interface Version 4.5 specification. You can offload compute-intensive parts of an application, together with the associated data, to NVIDIA GPUs by using the supported device constructs that are listed in this topic.

Supported device constructs

omp target data
omp target enter data
omp target exit data
omp target
omp target update
omp declare target
omp teams
omp distribute
omp distribute parallel for
omp distribute parallel for simd
omp distribute simd

For example, you can use the omp target directive to define a target region, which is a block of computation that operates within a distinct data environment and is intended to be offloaded onto a parallel computation device during execution. For more information about the OpenMP directives, see Pragma directives for parallel processing.
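As an illustration of a target region, the following is a minimal sketch rather than an excerpt from the product documentation. The function name `saxpy` and its parameters are hypothetical; only device constructs from the list above are used, together with `map` clauses that describe the data environment of the region.

```c
/* Hypothetical example: computes y[i] = a * x[i] + y[i] for i in [0, n).
 * The loop is offloaded only if the file is compiled with offloading
 * enabled (per this topic: -qoffload together with -qsmp). Otherwise
 * the pragmas are ignored or the region runs on the host. */
void saxpy(int n, float a, const float *x, float *y)
{
    /* map(to:) copies x to the device before the region starts.
     * map(tofrom:) copies y to the device and back to the host. */
    #pragma omp target map(to: x[0:n]) map(tofrom: y[0:n])
    #pragma omp teams
    #pragma omp distribute parallel for
    for (int i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

A caller passes ordinary host arrays, for example `saxpy(4, 2.0f, x, y)`. The result is the same whether the region runs on a GPU or falls back to the host.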

You can also use other OpenMP constructs together with these device constructs to exert finer control over parallelization, such as the combined constructs that are listed in Combined constructs.
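The device constructs can also be composed with one another. The following sketch, which assumes a hypothetical function `scale_then_shift`, uses `omp target data` to keep an array on the device across two target regions and `omp target update` to refresh the host copy in between. It is one possible arrangement, not a recommended pattern from the compiler documentation.

```c
/* Hypothetical example: multiplies every element of v by factor, reports
 * the intermediate value of v[0] to the host, then adds offset. */
void scale_then_shift(int n, double *v, double factor, double offset,
                      double *first_after_scale)
{
    /* v stays mapped on the device for the whole structured block. */
    #pragma omp target data map(tofrom: v[0:n])
    {
        /* v is already present, so this map clause does not copy it again. */
        #pragma omp target map(tofrom: v[0:n])
        #pragma omp teams
        #pragma omp distribute parallel for
        for (int i = 0; i < n; ++i) {
            v[i] *= factor;
        }

        /* Copy the device values back so that the host can read them. */
        #pragma omp target update from(v[0:n])
        *first_after_scale = v[0];

        #pragma omp target map(tofrom: v[0:n])
        #pragma omp teams
        #pragma omp distribute parallel for
        for (int i = 0; i < n; ++i) {
            v[i] += offset;
        }
    }   /* The final values of v are copied back to the host here. */
}
```

Without the `omp target update` directive, a host read of `v[0]` inside the data region could see the stale pre-scaling value when the regions run on a device.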

GPU-related compiler options

To offload OpenMP target regions to NVIDIA GPUs, you must specify the -qoffload option. For -qoffload to take effect, you must also specify the -qsmp option, which enables support for OpenMP target regions. For more information, see -qoffload.

You can also use the -qtgtarch option to specify the real or virtual GPU architectures on which the code can run. For more information, see -qtgtarch.

GPU-related environment variables

You can control the computations offloaded to target devices by using the following environment variables:

XLSMPOPTS=target={mandatory | default | disabled} controls which device to execute target regions on.
XLSMPOPTS=cudamemcheckfriendly={off | on} controls whether to disable the check for pinned memory in the runtime and allow the program to be executed under the cuda-memcheck tool from the NVIDIA CUDA Toolkit.

For more information, see XLSMPOPTS in the XL C/C++ Compiler Reference.

GPU-related runtime functions

You can use the supported runtime functions, for example, to query the target environment or to manage device memory.

Table 1. Some useful OpenMP runtime functions for offloading computations to NVIDIA GPUs
To query the target environment	To manage device memory
`omp_get_default_device`	`omp_target_alloc`
`omp_get_initial_device`	`omp_target_associate_ptr`
`omp_get_num_devices`	`omp_target_disassociate_ptr`
`omp_get_num_teams`	`omp_target_free`
`omp_get_team_num`	`omp_target_is_present`
`omp_is_initial_device`	`omp_target_memcpy`
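
As a sketch of the query functions, the following hypothetical helpers report how many target devices are visible and whether a target region actually ran on the host. The helper names `count_devices` and `target_ran_on_host` are assumptions for illustration. The `_OPENMP` guard is there only so that the code still builds when OpenMP support is not enabled.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: number of target devices the runtime can see. */
int count_devices(void)
{
#ifdef _OPENMP
    return omp_get_num_devices();
#else
    return 0;   /* No OpenMP runtime, so no target devices. */
#endif
}

/* Hypothetical helper: returns 1 if a target region executed on the
 * host (initial device), or 0 if it executed on a target device. */
int target_ran_on_host(void)
{
    int on_host = 1;
#ifdef _OPENMP
    /* map(from:) brings the value written inside the region back. */
    #pragma omp target map(from: on_host)
    {
        on_host = omp_is_initial_device();
    }
#endif
    return on_host;
}
```

A check of this kind can help confirm where target regions execute, for example after changing the `XLSMPOPTS=target` setting.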

For more information about OpenMP runtime functions, see OpenMP runtime functions for parallel processing in the XL C/C++ Compiler Reference.
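
The memory management functions can be sketched as follows. The function `round_trip` is hypothetical: it allocates a device buffer with `omp_target_alloc`, copies data to it and back with `omp_target_memcpy`, and releases it with `omp_target_free`. The signatures are those of the OpenMP 4.5 specification. Falling back to the initial device when no target device is visible is an assumption made so that the sketch also runs on a machine without a GPU.

```c
#include <stddef.h>
#include <string.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical example: copies n doubles from src into a device buffer
 * and back into dst. Returns 0 on success, nonzero on failure. */
int round_trip(const double *src, double *dst, size_t n)
{
    size_t bytes = n * sizeof(double);
    if (n == 0) {
        return 0;
    }
#ifdef _OPENMP
    int host = omp_get_initial_device();
    int dev = omp_get_default_device();
    if (omp_get_num_devices() == 0) {
        dev = host;   /* Assumption: use host memory if no device exists. */
    }

    double *buf = omp_target_alloc(bytes, dev);
    if (buf == NULL) {
        return -1;
    }

    /* Arguments: dst, src, length, dst_offset, src_offset,
     * dst_device_num, src_device_num. Returns 0 on success. */
    int rc = omp_target_memcpy(buf, (void *)src, bytes, 0, 0, dev, host);
    if (rc == 0) {
        rc = omp_target_memcpy(dst, buf, bytes, 0, 0, host, dev);
    }

    omp_target_free(buf, dev);
    return rc;
#else
    memcpy(dst, src, bytes);   /* Build without OpenMP: plain host copy. */
    return 0;
#endif
}
```

Memory obtained from `omp_target_alloc` is addressed through a device pointer, so the host code here never dereferences `buf` directly.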
