-qtgtarch

Pragma equivalent

None.

Purpose

Specifies real or virtual GPU architectures where the code may run. This allows the compiler to take maximum advantage of the capabilities and machine instructions that are specific to a GPU architecture, or common to a virtual architecture.

The compiler detects the GPU architecture automatically at compiler configuration time and records it in the compiler configuration file. To override this default, use the -qtgtarch option.

Syntax

                .-:----------------------------.   
                V .-default------------------. |   
>>- -qtgtarch-=---+-auto---------------------+-+---------------><
                  +-virtual_GPU_architecture-+     
                  '-real_GPU_architecture----'     

Default

-qtgtarch=default

Parameters

auto
The architecture of device 0 of the system on which the compiler is running.
default
The default architecture, which is the first of the following that applies:
  1. The architecture specified by the cuda_cc_major and cuda_cc_minor properties in the configuration file.
  2. If those properties are not set, the architecture of device 0 of the system on which the compiler is running.
  3. If there is no device 0, sm_35.
real_GPU_architecture
A real GPU architecture, such as sm_35, sm_60, or sm_70, as defined by the CUDA Toolkit.
virtual_GPU_architecture
A virtual GPU architecture, such as compute_35, compute_60, or compute_70, as defined by the CUDA Toolkit. A virtual GPU architecture specifies the features that are supported in the high-level PTX code.
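
The fallback order described for the default parameter can be sketched in Python as follows. This is an illustrative model only, not the compiler's implementation. The function name and its two arguments are hypothetical stand-ins for the configuration file properties and the detected device, and the mapping from a (major, minor) pair to a name such as sm_60 assumes the usual CUDA compute capability naming.

```python
from typing import Optional, Tuple


def default_architecture(
    config_cc: Optional[Tuple[int, int]],
    device0_arch: Optional[str],
) -> str:
    """Model the architecture that -qtgtarch=default would select.

    config_cc: hypothetical (cuda_cc_major, cuda_cc_minor) pair taken from
        the configuration file, or None if the properties are not set.
    device0_arch: hypothetical architecture name of device 0 (for example
        "sm_37"), or None if the system has no device 0.
    """
    # 1. Properties in the configuration file take precedence.
    if config_cc is not None:
        major, minor = config_cc
        # Assumption: compute capability 6.0 corresponds to sm_60, and so on.
        return f"sm_{major}{minor}"
    # 2. Otherwise use the architecture of device 0, if there is one.
    if device0_arch is not None:
        return device0_arch
    # 3. With no device 0, fall back to sm_35.
    return "sm_35"
```

For instance, default_architecture(None, None) models a build machine with no GPU and no configured properties, and yields sm_35.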

Rules

The PTX intermediate code is generated based on the specified virtual GPU architectures and then embedded in the resulting object file or executable. To generate and embed the compiled code images, specify real GPU architectures. The compiled code images for the real GPU architectures are generated from the PTX code.

Each -qtgtarch option is used to generate PTX code for exactly one virtual GPU architecture and optionally compiled code images for one or more compatible real GPU architectures. If you need to generate PTX code for multiple virtual GPU architectures, specify the -qtgtarch option multiple times, once for each virtual GPU architecture.

The compiler converts between virtual and real GPU architectures when needed, for example, when no virtual architecture is specified, or when multiple virtual GPU architectures are specified.

You can specify the -qtgtarch option multiple times, even for the same virtual GPU architecture. The resulting effect is cumulative.

Detailed rules for specifying the -qtgtarch option are listed as follows:

Table 1. Detailed rules for specifying one -qtgtarch option

| Number of virtual GPU architectures specified | Number of real GPU architectures specified | The virtual GPU architectures for which the PTX code is generated | The real GPU architectures for which the compiled code images are generated |
|---|---|---|---|
| 0 | At least one | The virtual GPU architecture corresponding to the lowest level real GPU architecture specified | The real GPU architectures specified |
| 1 | 0 | The virtual GPU architecture specified | N/A (see Note) |
| 1 | At least one | The virtual GPU architecture specified | The real GPU architectures specified |
| More than one | 0 | The lowest level virtual GPU architecture specified | The real GPU architectures corresponding to all but the lowest virtual GPU architecture specified |
| More than one | At least one | The lowest level virtual GPU architecture specified | The real GPU architectures specified and the real GPU architectures corresponding to all but the lowest virtual GPU architecture specified |

Note: When no compiled code image is embedded in the resulting object file or executable, a compiled code image will be generated from the PTX code using just-in-time compilation at link or execution time, if needed.
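
The rules in Table 1, together with the cumulative effect of repeated options, can be modeled with a short Python sketch. It is a reading aid under stated assumptions, not the compiler's actual logic: the function names are hypothetical, "lowest level" is assumed to mean the smallest numeric suffix, a virtual architecture compute_NN is assumed to correspond to the real architecture sm_NN, and the auto keyword is resolved through a caller-supplied auto_arch argument. The default keyword is not modeled.

```python
from typing import List, Optional, Tuple


def level(arch: str) -> int:
    """Numeric suffix of an architecture name, e.g. 60 for sm_60."""
    return int(arch.split("_", 1)[1])


def resolve_option(value: str, auto_arch: Optional[str] = None) -> Tuple[str, List[str]]:
    """Model one -qtgtarch=<value> option.

    Returns (virtual architecture for PTX, real architectures for code images).
    """
    virtual: List[str] = []
    real: List[str] = []
    for name in value.split(":"):
        if name == "auto":
            if auto_arch is None:
                raise ValueError("auto requires a known device 0 architecture")
            name = auto_arch
        if name.startswith("compute_"):
            virtual.append(name)
        elif name.startswith("sm_"):
            real.append(name)
        else:
            raise ValueError(f"unsupported architecture name: {name}")

    if not virtual:
        if not real:
            raise ValueError("no architecture specified")
        # No virtual architecture: derive it from the lowest real one.
        ptx = "compute_" + str(level(min(real, key=level)))
    else:
        # PTX is generated for the lowest virtual architecture only.
        ptx = min(virtual, key=level)
        # Every other virtual architecture contributes its real counterpart.
        real += ["sm_" + str(level(v)) for v in virtual if v != ptx]

    return ptx, sorted(set(real), key=level)


def resolve_options(values: List[str], auto_arch: Optional[str] = None) -> Tuple[List[str], List[str]]:
    """Model several -qtgtarch options; their effects accumulate."""
    ptx_all: List[str] = []
    real_all: List[str] = []
    for value in values:
        ptx, real = resolve_option(value, auto_arch)
        if ptx not in ptx_all:
            ptx_all.append(ptx)
        real_all += [r for r in real if r not in real_all]
    return sorted(ptx_all, key=level), sorted(real_all, key=level)
```

Under these assumptions, resolve_option("compute_35:compute_37:sm_37:sm_60") selects compute_35 for the PTX code and sm_37 and sm_60 for the compiled code images, and resolve_options(["sm_35", "sm_60"]) models specifying the option twice.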

Predefined macros

None.

Examples

Examples for specifying the -qtgtarch option are listed as follows:

Table 2. Examples for specifying the -qtgtarch option

| Command examples | The virtual GPU architectures for which the PTX code is generated | The real GPU architectures for which the compiled code images are generated |
|---|---|---|
| -qtgtarch=sm_60 | compute_60 | sm_60 |
| -qtgtarch=auto (assuming the compiler is running on a machine with a GPU with architecture sm_37) | compute_37 | sm_37 |
| -qtgtarch=compute_35:compute_37:sm_37:sm_60 | compute_35 | sm_37 and sm_60 |
| -qtgtarch=sm_37:sm_60 | compute_37 | sm_37 and sm_60 |
| -qtgtarch=auto:sm_60 (assuming the compiler is running on a machine with a GPU with architecture sm_37) | compute_37 | sm_37 and sm_60 |
| -qtgtarch=sm_35 -qtgtarch=sm_60 | compute_35 and compute_60 | sm_35 and sm_60 |

Note: In every example, the compiled code images are generated from the PTX code. In the last example, the sm_35 and sm_60 compiled code images are generated from the PTX code for compute_35 and compute_60 respectively.
