Programming with OpenMP device constructs
IBM® XL C/C++ for Linux® 16.1.1 fully supports the OpenMP Application Program Interface Version 4.5 specification. You can offload compute-intensive parts of an application, together with their associated data, to NVIDIA GPUs by using the supported device constructs.
Supported device constructs
- omp target data
- omp target enter data
- omp target exit data
- omp target
- omp target update
- omp declare target
- omp teams
- omp distribute
- omp distribute parallel for
- omp distribute parallel for simd
- omp distribute simd
For example, you can use the omp target directive to define a target region, which is a block of computation that operates within a distinct data environment and is intended to be offloaded onto a parallel computation device during execution. For more information about the OpenMP directives, see Pragma directives for OpenMP parallelization.
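As a minimal sketch of a target region, the following function offloads a SAXPY-style loop. The function name saxpy_target and the array-section bounds are illustrative and are not taken from the XL documentation; the map clauses set up the data environment of the region. If no device is available, or if the file is compiled without OpenMP support, the loop is expected to run on the host instead.

```c
/* Hypothetical helper: y[i] = a * x[i] + y[i], computed in a target region. */
void saxpy_target(int n, float a, const float *x, float *y)
{
    /* map(to:) copies x to the device; map(tofrom:) copies y in and back out. */
    #pragma omp target map(to: x[0:n]) map(tofrom: y[0:n])
    {
        /* Share the loop iterations among the threads of the target region. */
        #pragma omp parallel for
        for (int i = 0; i < n; ++i) {
            y[i] = a * x[i] + y[i];
        }
    }
}
```

A caller passes ordinary host arrays, for example saxpy_target(n, 2.0f, x, y); the data transfers are implied by the map clauses.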
You can also use other OpenMP constructs together with these device constructs to exert finer control over parallelization, such as the combined constructs that are listed in Combined constructs.
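The following sketch shows one such combination, assuming the OpenMP 4.5 combined construct target teams distribute parallel for. The function name dot_offload is hypothetical. The reduction variable is mapped tofrom explicitly because, under OpenMP 4.5 rules, a scalar that is not mapped is firstprivate in a target region and its final value would not be copied back.

```c
/* Hypothetical helper: dot product of a and b, offloaded with a combined construct. */
double dot_offload(int n, const double *a, const double *b)
{
    double sum = 0.0;

    /* Iterations are distributed across teams, then across threads in each team. */
    #pragma omp target teams distribute parallel for \
        map(to: a[0:n], b[0:n]) map(tofrom: sum) reduction(+: sum)
    for (int i = 0; i < n; ++i) {
        sum += a[i] * b[i];
    }

    return sum;
}
```

The single directive replaces a nest of target, teams, distribute, and parallel for directives, which keeps the loop readable at the cost of less control over each level.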
GPU-related compiler options
You must specify the -qoffload option to enable offloading of OpenMP target regions to NVIDIA GPUs. For -qoffload to take effect, you must also specify the -qsmp option, which enables support for OpenMP target regions. For more information, see -qoffload.
You can also use the -qtgtarch option to specify the real or virtual GPU architectures on which the code can run. For more information, see -qtgtarch.
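One way to confirm that a build actually offloads is to ask, from inside a target region, whether the code is running on the host. The sketch below uses the standard OpenMP routine omp_is_initial_device; the function name ran_on_device, the file name, and the build line in the comment are illustrative assumptions. The _OPENMP guard keeps the file compilable when OpenMP support is not enabled.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/* Illustrative build line (file name is hypothetical): xlc -qsmp -qoffload check.c */

/* Returns 1 if a target region executed on a device, 0 if it ran on the host. */
int ran_on_device(void)
{
    int on_host = 1;  /* assume host execution unless the region reports otherwise */
#ifdef _OPENMP
    #pragma omp target map(tofrom: on_host)
    {
        /* Nonzero when the calling code runs on the host (initial) device. */
        on_host = omp_is_initial_device();
    }
#endif
    return !on_host;
}
```

A result of 0 from a build that was meant to offload suggests that -qoffload or -qsmp was omitted, or that no usable GPU was found at run time.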
GPU-related environment variables
- XLSMPOPTS=target={mandatory | default | disabled} controls which device to execute target regions on.
- XLSMPOPTS=cudamemcheckfriendly={off | on} controls whether the runtime check for pinned memory is disabled so that the program can be executed under the cuda-memcheck tool from the NVIDIA CUDA Toolkit.
For more information, see XLSMPOPTS.
GPU-related runtime functions
| To query the target environment | To manage device memory |
|---|---|
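As a hedged illustration of both categories, the sketch below queries the target environment with omp_get_num_devices, omp_get_default_device, and omp_get_initial_device, and manages device memory with omp_target_alloc, omp_target_memcpy, and omp_target_free. These are standard OpenMP 4.5 routines; the selection here is an assumption about what is useful to show and is not a reproduction of the function lists for this topic. The helper name device_round_trip is hypothetical.

```c
#include <string.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: copies n bytes host -> device -> host.
   Returns 0 on success, nonzero on failure. */
int device_round_trip(const void *src, void *dst, size_t n)
{
#ifdef _OPENMP
    int host = omp_get_initial_device();
    /* Use the host device as a fallback when no target device is present. */
    int dev = (omp_get_num_devices() > 0) ? omp_get_default_device() : host;

    void *dbuf = omp_target_alloc(n, dev);
    if (dbuf == NULL) {
        return 1;
    }

    /* Arguments: dst, src, length, dst_offset, src_offset, dst_device, src_device. */
    int rc = omp_target_memcpy(dbuf, (void *)src, n, 0, 0, dev, host);
    if (rc == 0) {
        rc = omp_target_memcpy(dst, dbuf, n, 0, 0, host, dev);
    }

    omp_target_free(dbuf, dev);
    return rc;
#else
    /* Without OpenMP support there is no device; copy directly. */
    memcpy(dst, src, n);
    return 0;
#endif
}
```

Memory obtained from omp_target_alloc is device memory, so it is passed to omp_target_memcpy or used through an is_device_ptr clause rather than dereferenced on the host.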