-qpdf1, -qpdf2

Pragma equivalent

None.

Purpose

Tunes optimizations through profile-directed feedback (PDF), where results from sample program execution are used to improve optimization near conditional branches and in frequently executed code sections.

Optimizes an application for a typical usage scenario based on an analysis of how often branches are taken and blocks of code are run.

Syntax

        .-nopdf2---------------------------------.   
        +-nopdf1---------------------------------+   
>>- -q--+-pdf1--+------------------------------+-+-------------><
        |       |    .-exename---------------. | |   
        |       '-=--+-pdfname--=--file_path-+-' |   
        |            +-defname---------------+   |   
        |            +-level--=--+-0-+-------+   |   
        |            |           +-1-+       |   |   
        |            |           '-2-'       |   |   
        |            +-unique----------------+   |   
        |            '-nounique--------------'   |   
        '-pdf2--+------------------------------+-'   
                |    .-exename---------------. |     
                '-=--+-pdfname--=--file_path-+-'     
                     '-defname---------------'       

Defaults

-qnopdf1 when -qpdf1 is not specified; -qnopdf2 when -qpdf2 is not specified.

-qpdf1=exename when -qpdf1 is specified without a suboption; -qpdf2=exename when -qpdf2 is specified without a suboption.

Parameters

exename
Names the generated PDF file as .<output_name>_pdf, where <output_name> is the name of the output file that is generated when you compile your program with -qpdf1.
pdfname=file_path
Specifies the directories and names for the PDF files and any existing PDF map files. If the PDFDIR environment variable is set, the compiler places the PDF and PDF map files in the directory that is specified by PDFDIR; otherwise, the compiler places these files in the current working directory. If the PDFDIR environment variable is set but the specified directory does not exist, the compiler issues a warning message. The name of the PDF map file follows the name of the PDF file if the -qpdf1=unique option is not specified. For example, if you specify the -qpdf1=pdfname=/home/joe/func option, the generated PDF file is called func, and the PDF map file is called func_map. Both of the files are placed in the /home/joe directory. You can use the pdfname suboption to do simultaneous runs of multiple executable applications by using the same directory. This approach is especially useful when you are tuning dynamic libraries with PDF.
defname
Names the generated PDF file as ._pdf.
level=0 | 1 | 2
Specifies different levels of profiling information to be generated by the resulting application. The following table shows the type of profiling information supported on each level. The plus sign (+) indicates that the profiling type is supported.
Table 1. Profiling type supported on each -qpdf1 level
Profiling type            Level 0   Level 1   Level 2
Block-counter profiling   +         +         +
Call-counter profiling    +         +         +
Value profiling                     +         +
Cache-miss profiling                          +

-qpdf1=level=1 is the default level. It is equivalent to -qpdf1. Higher PDF levels profile more optimization opportunities but have a larger overhead.

Notes:
  • Only one application that is compiled with the -qpdf1=level=2 option can be run at a time on a particular processor.
  • Cache-miss profiling information has several levels. Accordingly, if you want to gather different levels of cache-miss profiling information, set the PDF_PM_EVENT environment variable to L1MISS, L2MISS, or L3MISS (if applicable). Only one level of cache-miss profiling information can be instrumented at a time. L2 cache-miss profiling is the default level.
  • If you want to bind your application to a specified processor for cache-miss profiling, set the PDF_BIND_PROCESSOR environment variable equal to the processor number.
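The cache-miss level selection described in the notes above can be sketched as a small shell helper. The function name set_pdf_pm_event is hypothetical; only the PDF_PM_EVENT variable and its three values come from this topic. The helper rejects any other value so that a typo does not go unnoticed.

```shell
# Hypothetical helper: validate and export the cache-miss profiling level.
# PDF_PM_EVENT and the values L1MISS, L2MISS, L3MISS are taken from the notes;
# the function itself is an illustration, not part of the compiler.
set_pdf_pm_event() {
    case "$1" in
        L1MISS|L2MISS|L3MISS)
            # Only one cache-miss level can be instrumented at a time.
            export PDF_PM_EVENT="$1"
            ;;
        *)
            echo "unsupported cache-miss level: $1" >&2
            return 1
            ;;
    esac
}
```

Call the helper before the PDF training run of an application that was compiled with -qpdf1=level=2, for example set_pdf_pm_event L1MISS.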
unique | nounique
Specifies whether a unique PDF file is created for each process at run time. You can use the -qpdf1=unique option to avoid locking a single PDF file when multiple processes write to the same PDF file in the PDF training step. Each PDF file is named <pdf_file_name>.<pid>, where <pdf_file_name> is one of the following names:
  • .<output_name>_pdf by default.
  • The name that is specified by pdfname when this suboption is in effect.
  • ._pdf when the defname suboption takes effect.
<pid> is the ID of the running process in the PDF training step. For example, if you specify the -qpdf1=unique:pdfname=abc option, and there are two processes for PDF training with the IDs 12345678 and 87654321, two PDF files abc.12345678 and abc.87654321 are generated.
Note: When -qpdf1=unique is specified, multiple PDF files with process IDs as suffixes are generated. You must use the mergepdf program to merge all these PDF files into one after the PDF training step.
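The naming rules for the exename, pdfname, defname, and unique suboptions can be summarized in a short shell sketch. The function pdf_file_name is hypothetical and only restates the rules in this section; it does not query the compiler.

```shell
# Hypothetical helper: print the PDF file name implied by the naming rules.
# usage: pdf_file_name <exename|pdfname|defname> <name> [pid]
#   exename: <name> is the output file name of the -qpdf1 compilation
#   pdfname: <name> is the value that is given to pdfname=
#   defname: <name> is ignored
#   pid:     optional process ID, to model -qpdf1=unique
pdf_file_name() {
    mode=${1:-} name=${2:-} pid=${3:-}
    case "$mode" in
        exename) base=".${name}_pdf" ;;
        pdfname) base="$name" ;;
        defname) base="._pdf" ;;
        *) echo "unknown suboption: $mode" >&2; return 1 ;;
    esac
    if [ -n "$pid" ]; then
        # With unique, the process ID is appended as a suffix.
        printf '%s.%s\n' "$base" "$pid"
    else
        printf '%s\n' "$base"
    fi
}
```

For example, pdf_file_name pdfname abc 12345678 prints abc.12345678, which matches the file name in the unique example above.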

Usage

The PDF process consists of the following three steps:

  1. Compile your program with the -qpdf1 option and a minimum optimization level of -O2. By default, a PDF map file that is named .<output_name>_pdf_map and a resulting application are generated.
  2. Run the resulting application with a typical data set. Profiling information is written to a PDF file named .<output_name>_pdf by default. This step is called the PDF training step.
  3. Recompile and link, or relink, the program with the -qpdf2 option and the same optimization level that was used with the -qpdf1 option. The -qpdf2 process fine-tunes the optimizations according to the profiling information that was collected during the PDF training step.
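The three steps can be sketched as a shell function that prints the command for each step. The names pdf_steps, app, main.c, and train.data are hypothetical, and the commands are printed rather than executed, so the sketch can be read without an XL compiler installed. The point it illustrates is that a single optimization level is shared by the -qpdf1 and -qpdf2 compilations.

```shell
# Sketch of the PDF process as a dry run: each step is printed, not executed.
# OPT is shared so that steps 1 and 3 use the same optimization level.
OPT=-O2

# usage: pdf_steps <output_name> <training_input> <source files...>
pdf_steps() {
    exe=$1 data=$2
    shift 2
    echo "xlc -qpdf1 $OPT -o $exe $*"   # step 1: compile with instrumentation
    echo "./$exe < $data"               # step 2: PDF training run
    echo "xlc -qpdf2 $OPT -o $exe $*"   # step 3: recompile with the profile
}

pdf_steps app train.data main.c
```

To run the steps instead of printing them, each echoed line can be pasted into the shell or passed to eval, provided that the file names contain no characters that need quoting.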

Predefined macros

None.

Examples

Example 1
The following example uses the -qpdf1=level=0 option to reduce possible runtime instrumentation overhead.
  1. Compile all the files with -qpdf1=level=0.
    xlc -qpdf1=level=0 -O3 file1.c file2.c file3.c
  2. Run with one set of input data.
    ./a.out < sample.data 
  3. Recompile all the files with -qpdf2.
    xlc -qpdf2 -O3 file1.c file2.c file3.c
If the sample data is typical, the program can run faster than without the PDF process.

Example 2

The following example uses the -qpdf1=level=1 option.
  1. Compile all the files with -qpdf1.
    xlc -qpdf1 -O3 file1.c file2.c file3.c
  2. Run with one set of input data.
    ./a.out < sample.data 
  3. Recompile all the files with -qpdf2.
    xlc -qpdf2 -O3 file1.c file2.c file3.c
If the sample data is typical, the program can now run faster than without the PDF process.

Example 3

The following example uses the -qpdf1=level=2 option to gather cache-miss profiling information.
  1. Compile all the files with -qpdf1=level=2.
    xlc -qpdf1=level=2 -O3 file1.c file2.c file3.c
  2. Set PDF_PM_EVENT=L2MISS to gather L2 cache-miss profiling information.
    export PDF_PM_EVENT=L2MISS
  3. Run with one set of input data.
    ./a.out < sample.data 
  4. Recompile all the files with -qpdf2.
    xlc -qpdf2 -O3 file1.c file2.c file3.c
If the sample data is typical, the program can now run faster than without the PDF process.

Example 4

This example demonstrates the usage of the -qpdf[1|2]=exename option.
  1. Compile all the files with -qpdf1=exename.
    xlc -qpdf1=exename -O3 -o final file1.c file2.c file3.c
  2. Run the executable file with sample input data.
    ./final < typical.data
  3. List the content of the directory.
    >ls -lrta
    -rw-r--r-- 1 user staff 50 Dec 05 13:18 file1.c
    -rw-r--r-- 1 user staff 50 Dec 05 13:18 file2.c
    -rw-r--r-- 1 user staff 50 Dec 05 13:18 file3.c
    -rwxr-xr-x 1 user staff 12243 Dec 05 17:00 final
    -rwxr-Sr-- 1 user staff 762 Dec 05 17:03 .final_pdf
  4. Recompile all the files with -qpdf2=exename.
    xlc -qpdf2=exename -O3 -o final file1.c file2.c file3.c
The program is now optimized by using PDF information.

Example 5

The following example demonstrates the usage of the -qpdf[1|2]=pdfname option.
  1. Compile all the files with -qpdf1=pdfname. The static profiling information is recorded in a file that is named final_map.
    xlc -qpdf1=pdfname=final -O3 file1.c file2.c file3.c
  2. Run the executable file with sample input data. The profiling information is recorded in a file that is named final.
    ./a.out < typical.data 
  3. List the content of the directory.
    >ls -lrta
    -rw-r--r-- 1 user staff 50 Dec 05 13:18 file1.c
    -rw-r--r-- 1 user staff 50 Dec 05 13:18 file2.c
    -rw-r--r-- 1 user staff 50 Dec 05 13:18 file3.c
    -rwxr-xr-x 1 user staff 12243 Dec 05 18:30 a.out
    -rwxr-Sr-- 1 user staff 762 Dec 05 18:32 final
  4. Recompile all the files with -qpdf2=pdfname.
    xlc -qpdf2=pdfname=final -O3 file1.c file2.c file3.c
The program is now optimized by using PDF information.

Example 6

The following example demonstrates the use of the PDF_BIND_PROCESSOR environment variable.
  1. Compile all the files with -qpdf1=level=1.
    xlc -qpdf1=level=1 -O3 file1.c file2.c file3.c
  2. Set the PDF_BIND_PROCESSOR environment variable so that all processes for this executable file are run on processor 1.
    export PDF_BIND_PROCESSOR=1
  3. Run the executable file with sample input data.
    ./a.out < sample.data
  4. Recompile all the files with -qpdf2.
    xlc -qpdf2 -O3 file1.c file2.c file3.c
If the sample data is typical, the program can now run faster than without the PDF process.

