Offloading computations to the NVIDIA GPUs
System prerequisites
To compile and link programs that contain code to be offloaded to NVIDIA GPUs with IBM XL C/C++ for Linux, you must ensure the following operating system, hardware, and software requirements are met.
- Use any IBM Power Systems™ server that has one or more NVIDIA GPUs installed and is supported by your Linux operating system distribution and the NVIDIA CUDA Toolkit.
- Use a supported little-endian Linux operating system.
- Install NVIDIA CUDA Toolkit 8.0.
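Programming with supported OpenMP 4.5 device constructs
You can offload computations to NVIDIA GPUs by using the following supported OpenMP 4.5 device constructs: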
- omp target data
- omp target enter data
- omp target exit data
- omp target
- omp target update
- omp declare target
- omp teams
- omp distribute
- omp distribute parallel for
For example, you can use the omp target directive to define a target region, which is a block of computation that operates within a distinct data environment and is intended to be offloaded onto a parallel computation device during execution. For more information about the OpenMP directives, see Pragma directives for parallel processing in the XL C/C++ Compiler Reference.
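As a minimal sketch of such a target region, the following C function offloads a simple loop and uses map clauses to describe the region's data environment. The function name saxpy_offload and the build line in the comment are illustrative assumptions, not taken from the product documentation. If OpenMP is not enabled at compile time, the pragma is ignored and the loop runs on the host.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Computes y[i] = a * x[i] + y[i] inside an OpenMP target region.
 * Assumed build line (invocation and file name are illustrative):
 *   xlc -qsmp -qoffload saxpy.c
 */
void saxpy_offload(int n, float a, const float *x, float *y)
{
    assert(n >= 0 && x != NULL && y != NULL);

    /* x is only read on the device; y is copied in and copied back out. */
    #pragma omp target map(to: x[0:n]) map(tofrom: y[0:n])
    {
        for (int i = 0; i < n; i++) {
            y[i] = a * x[i] + y[i];
        }
    }
}
```

The loop inside this region is not parallelized. It only shows how the region and its mapped data are declared.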
You can also use other OpenMP constructs with these OpenMP device constructs to exert finer control over parallelization, such as the combined constructs that are listed in Combined constructs in the XL C/C++ Compiler Reference.
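The following sketch shows one way the device constructs might be combined. A target data region keeps the arrays on the device across two offloaded loops, and each loop uses the combined target teams distribute parallel for construct. The function name scaled_dot is hypothetical, and whether this exact combination is supported by your compiler level is an assumption to verify against the Compiler Reference.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Scales x in place by `scale`, then returns the dot product of x and y.
 * Both loops run inside one target data region, so x and y are
 * transferred once rather than once per loop.
 */
double scaled_dot(int n, double scale, double *x, const double *y)
{
    assert(n >= 0 && x != NULL && y != NULL);
    double sum = 0.0;

    #pragma omp target data map(tofrom: x[0:n]) map(to: y[0:n])
    {
        /* The arrays are already present, so these maps reuse the device copies. */
        #pragma omp target teams distribute parallel for map(tofrom: x[0:n])
        for (int i = 0; i < n; i++) {
            x[i] *= scale;
        }

        /* In OpenMP 4.5 a scalar reduction variable must be mapped explicitly. */
        #pragma omp target teams distribute parallel for \
            map(to: x[0:n], y[0:n]) map(tofrom: sum) reduction(+: sum)
        for (int i = 0; i < n; i++) {
            sum += x[i] * y[i];
        }
    }
    return sum;
}
```

Without the enclosing target data region, each loop would map and unmap the arrays on its own, which usually means extra host-to-device transfers.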
You must specify the -qoffload option to enable offloading of OpenMP target regions to NVIDIA GPUs. For -qoffload to take effect, you must also specify the -qsmp option, which enables support for OpenMP target regions. For more information, see -qoffload in the XL C/C++ Compiler Reference.
You can also use the XLSMPOPTS=target={mandatory | optional | disable} environment variable to control which device target regions execute on. For more information, see XLSMPOPTS in the XL C/C++ Compiler Reference.
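Depending on the target setting and on whether a GPU is available, a target region may end up running on the host. The following sketch shows one way a program might check where a region actually ran, using the standard OpenMP routine omp_is_initial_device. The function name target_region_ran_on_host is hypothetical, and the _OPENMP guard is an assumption that lets the file also compile without OpenMP support.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/*
 * Returns 1 if a target region executed on the host (initial device),
 * or 0 if it executed on an offload device such as a GPU.
 */
int target_region_ran_on_host(void)
{
    int on_host = 1;  /* default when OpenMP is not enabled */

#ifdef _OPENMP
    #pragma omp target map(tofrom: on_host)
    {
        on_host = omp_is_initial_device() ? 1 : 0;
    }
#endif

    assert(on_host == 0 || on_host == 1);
    return on_host;
}
```

Calling this function once at startup and logging the result is a simple way to confirm that offloading is taking place in a given environment.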
| To query the target environment | To manage device memory |
|---|---|
| | |
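The OpenMP 4.5 specification defines runtime routines for both purposes named in the table headings, such as omp_get_num_devices and omp_get_default_device for querying the environment, and omp_target_alloc, omp_target_memcpy, and omp_target_free for managing device memory. The following sketch uses them to copy a buffer to a device and back. Which of these routines your compiler level supports is an assumption to check in the Compiler Reference, and the function names visible_devices and roundtrip are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Number of offload devices visible to the OpenMP runtime (0 without OpenMP). */
int visible_devices(void)
{
#ifdef _OPENMP
    return omp_get_num_devices();
#else
    return 0;
#endif
}

/*
 * Copies n doubles from src into a device buffer and back into dst.
 * Uses the default device if one exists, otherwise the host itself.
 * Returns 0 on success and nonzero on failure.
 */
int roundtrip(double *src, double *dst, size_t n)
{
    assert(src != NULL && dst != NULL);
    size_t bytes = n * sizeof(double);

#ifdef _OPENMP
    int host = omp_get_initial_device();
    int dev = omp_get_num_devices() > 0 ? omp_get_default_device() : host;

    void *buf = omp_target_alloc(bytes, dev);
    if (buf == NULL) {
        return -1;
    }

    /* Argument order: dst, src, length, dst_offset, src_offset, dst_dev, src_dev. */
    int rc = omp_target_memcpy(buf, src, bytes, 0, 0, dev, host);
    if (rc == 0) {
        rc = omp_target_memcpy(dst, buf, bytes, 0, 0, host, dev);
    }
    omp_target_free(buf, dev);
    return rc;
#else
    /* Without OpenMP there is no device, so this is a plain host copy. */
    memcpy(dst, src, bytes);
    return 0;
#endif
}
```

Memory obtained from omp_target_alloc is a device address. It should only be dereferenced inside target regions (for example through an is_device_ptr clause), not on the host.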
Using IBM XL C/C++ for Linux with NVCC
The NVIDIA CUDA C++ compiler (NVCC) from the NVIDIA CUDA Toolkit partitions C/C++ source code into host and device portions. You can use IBM XL C/C++ for Linux as the host compiler for the POWER processor with NVCC 7.5 or 8.0. For more information, see NVIDIA CUDA on IBM POWER8®: Technical overview, software installation, and application development, available at http://www.redbooks.ibm.com/redpapers/pdfs/redp5169.pdf.


