IBM Support

LI81498: INCORRECT RESULTS FOR OPENMP GPU CODE WHEN COMPILED WITH -QSMP=OMP:NOOPT


APAR status

  • Closed as program error.

Error description

  • INCORRECT RESULTS FOR OPENMP GPU CODE WHEN COMPILED WITH
    -QSMP=OMP:NOOPT USING XL C/C++ FOR LINUX, V16.1
    
    ----------------------------------------------------------------
    
    
    Actual results:
    
    mpirun has exited due to process rank 0 with PID 0 on
    node XXXX exiting improperly. There are three reasons this
    could occur:
    
    1. this process did not call "init" before exiting, but others in
    the job did. This can cause a job to hang indefinitely while it waits
    for all processes to call "init". By rule, if one process calls "init",
    then ALL processes must call "init" prior to termination.
    
    2. this process called "init", but exited without calling "finalize".
    By rule, all processes that call "init" MUST call "finalize" prior to
    exiting or it will be considered an "abnormal termination"
    
    3. this process called "MPI_Abort" or "orte_abort" and the mca parameter
    orte_create_session_dirs is set to false. In this case, the run-time cannot
    detect that the abort call was an abnormal termination. Hence, the only
    error message you will receive is this one.
    
    This may have caused other processes in the application to be
    terminated by signals sent by mpirun (as reported here).
    
    You can avoid this message by specifying -quiet on the mpirun
    command line.
    ----------------------------------------------------------------
    
    + grep 'out t1(0,0,0)' run.out
    out t1(0,0,0)= (5.340379e-05, -3.677514e-04) rotth_z_merge
    out t1(0,0,0)= (5.340379e-05, -3.677514e-04) rotth_omp45
    out t1(0,0,0)= (5.340379e-05, -3.677514e-04) rotth_omp45_pre3D
    out t1(0,0,0)= (1.911613e-04, -3.186692e-04) rotth_omp45_pre3D_noopt
    out t1(0,0,0)= (3.687190e-04, -4.624840e-05) rotth_omp45_pre3D_noopt
    out t1(0,0,0)= (5.340379e-05, -3.677514e-04) rotth_omp45_pre3D
    out t1(0,0,0)= (3.043526e-04, 2.132187e-04) rotth_omp45_pre3D_noopt
    out t1(0,0,0)= (2.639426e-04, 2.615854e-04) rotth_omp45_pre3D_noopt
    out t1(0,0,0)= (5.340379e-05, -3.677514e-04) rotth_omp45_pre3D
    out t1(0,0,0)= (-1.261941e-04, -3.495250e-04) rotth_omp45_pre3D_noopt
    out t1(0,0,0)= (1.971447e-04, -3.150026e-04) rotth_omp45_pre3D_noopt
    out t1(0,0,0)= (5.340379e-05, -3.677514e-04) rotth_omp45_pre3D
    out t1(0,0,0)= (-6.531194e-05, -3.658238e-04) rotth_omp45_pre3D_noopt
    out t1(0,0,0)= (3.615046e-04, 8.606419e-05) rotth_omp45_pre3D_noopt
    + grep 'out t1(13,0,0)' run.out
    out t1(13,0,0)= (-3.919083e-04, 2.874045e-04) rotth_z_merge
    out t1(13,0,0)= (-3.919083e-04, 2.874045e-04) rotth_omp45
    out t1(13,0,0)= (-3.919083e-04, 2.874045e-04) rotth_omp45_pre3D
    out t1(13,0,0)= (-4.859035e-04, 9.440590e-06) rotth_omp45_pre3D_noopt
    out t1(13,0,0)= (2.845411e-04, 3.939896e-04) rotth_omp45_pre3D_noopt
    out t1(13,0,0)= (-3.919083e-04, 2.874045e-04) rotth_omp45_pre3D
    out t1(13,0,0)= (-3.710392e-04, 3.138816e-04) rotth_omp45_pre3D_noopt
    out t1(13,0,0)= (1.617414e-04, -4.582917e-04) rotth_omp45_pre3D_noopt
    out t1(13,0,0)= (-3.919083e-04, 2.874045e-04) rotth_omp45_pre3D
    out t1(13,0,0)= (4.743125e-04, 1.059203e-04) rotth_omp45_pre3D_noopt
    out t1(13,0,0)= (-1.831228e-04, -4.501752e-04) rotth_omp45_pre3D_noopt
    out t1(13,0,0)= (-3.919083e-04, 2.874045e-04) rotth_omp45_pre3D
    out t1(13,0,0)= (-4.628781e-04, 1.481053e-04) rotth_omp45_pre3D_noopt
    out t1(13,0,0)= (2.245736e-04, 4.309967e-04) rotth_omp45_pre3D_noopt
    + grep 'out t1(117,20,17)' run.out
    out t1(117,20,17)= (-2.708045e-04, 3.429134e-04) rotth_z_merge
    out t1(117,20,17)= (-2.708045e-04, 3.429134e-04) rotth_omp45
    out t1(117,20,17)= (-2.708045e-04, 3.429134e-04) rotth_omp45_pre3D
    out t1(117,20,17)= (-2.669854e-04, -2.804822e-04) rotth_omp45_pre3D_noopt
    out t1(117,20,17)= (2.585692e-04, 3.522311e-04) rotth_omp45_pre3D_noopt
    out t1(117,20,17)= (-2.708045e-04, 3.429134e-04) rotth_omp45_pre3D
    out t1(117,20,17)= (4.367254e-04, -1.399024e-05) rotth_omp45_pre3D_noopt
    out t1(117,20,17)= (-3.091341e-04, -3.088056e-04) rotth_omp45_pre3D_noopt
    out t1(117,20,17)= (-2.708045e-04, 3.429134e-04) rotth_omp45_pre3D
    out t1(117,20,17)= (-3.876753e-04, -2.015757e-04) rotth_omp45_pre3D_noopt
    out t1(117,20,17)= (4.367254e-04, -1.399024e-05) rotth_omp45_pre3D_noopt
    out t1(117,20,17)= (-2.708045e-04, 3.429134e-04) rotth_omp45_pre3D
    out t1(117,20,17)= (-2.887358e-04, 3.279581e-04) rotth_omp45_pre3D_noopt
    out t1(117,20,17)= (2.490658e-04, 3.590140e-04) rotth_omp45_pre3D_noopt
    ----------------------------------------------------------------
    
    The expected results:
    
    ++ grep 'out t1(0,0,0)' run.out
    out t1(0,0,0)= (5.340379e-05, -3.677514e-04) rotth_z_merge
    out t1(0,0,0)= (5.340379e-05, -3.677514e-04) rotth_omp45
    out t1(0,0,0)= (5.340379e-05, -3.677514e-04) rotth_omp45_pre3D
    out t1(0,0,0)= (5.340375e-05, -3.677514e-04) rotth_omp45_pre3D_noopt
    out t1(0,0,0)= (5.340379e-05, -3.677514e-04) rotth_omp45_pre3D
    out t1(0,0,0)= (5.340375e-05, -3.677514e-04) rotth_omp45_pre3D_noopt
    out t1(0,0,0)= (5.340379e-05, -3.677514e-04) rotth_omp45_pre3D
    out t1(0,0,0)= (5.340375e-05, -3.677514e-04) rotth_omp45_pre3D_noopt
    ++ grep 'out t1(13,0,0)' run.out
    out t1(13,0,0)= (-3.919083e-04, 2.874045e-04) rotth_z_merge
    out t1(13,0,0)= (-3.919083e-04, 2.874045e-04) rotth_omp45
    out t1(13,0,0)= (-3.919083e-04, 2.874045e-04) rotth_omp45_pre3D
    out t1(13,0,0)= (-3.919085e-04, 2.874044e-04) rotth_omp45_pre3D_noopt
    out t1(13,0,0)= (-3.919083e-04, 2.874045e-04) rotth_omp45_pre3D
    out t1(13,0,0)= (-3.919085e-04, 2.874044e-04) rotth_omp45_pre3D_noopt
    out t1(13,0,0)= (-3.919083e-04, 2.874045e-04) rotth_omp45_pre3D
    out t1(13,0,0)= (-3.919085e-04, 2.874044e-04) rotth_omp45_pre3D_noopt
    ++ grep 'out t1(117,20,17)' run.out
    out t1(117,20,17)= (-2.708045e-04, 3.429134e-04) rotth_z_merge
    out t1(117,20,17)= (-2.708045e-04, 3.429134e-04) rotth_omp45
    out t1(117,20,17)= (-2.708045e-04, 3.429134e-04) rotth_omp45_pre3D
    out t1(117,20,17)= (-2.708045e-04, 3.429135e-04) rotth_omp45_pre3D_noopt
    out t1(117,20,17)= (-2.708045e-04, 3.429134e-04) rotth_omp45_pre3D
    out t1(117,20,17)= (-2.708045e-04, 3.429135e-04) rotth_omp45_pre3D_noopt
    out t1(117,20,17)= (-2.708045e-04, 3.429134e-04) rotth_omp45_pre3D
    out t1(117,20,17)= (-2.708045e-04, 3.429135e-04) rotth_omp45_pre3D_noopt
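
In the expected results, the rotth_omp45_pre3D_noopt values differ from the optimized values only in the last printed digit, while in the actual results they differ in the leading digits and change from run to run. A minimal sketch of a check that separates the two cases is shown here. The helper names and the relative tolerance of 1e-5 are assumptions made for illustration and do not come from the APAR or its test case.

```c
#include <stdbool.h>

/* Hypothetical helper: absolute value without pulling in the math library. */
double abs_val(double x)
{
    return x < 0.0 ? -x : x;
}

/* Hypothetical helper: true when a and b agree to within rel_tol,
   measured relative to the larger of the two magnitudes. */
bool close_rel(double a, double b, double rel_tol)
{
    double ma = abs_val(a);
    double mb = abs_val(b);
    double scale = ma > mb ? ma : mb;
    return abs_val(a - b) <= rel_tol * scale;
}

/* Compare one complex sample (re, im) against a reference (ref_re, ref_im).
   Both components must agree for the sample to count as a match. */
bool sample_matches(double re, double im,
                    double ref_re, double ref_im, double rel_tol)
{
    return close_rel(re, ref_re, rel_tol) && close_rel(im, ref_im, rel_tol);
}
```

With the t1(0,0,0) reference (5.340379e-05, -3.677514e-04), the expected noopt value (5.340375e-05, -3.677514e-04) passes this check, and the first noopt value from the actual results, (1.911613e-04, -3.186692e-04), fails it.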
    

Local fix

  • N/A
    

Problem summary

  • USERS AFFECTED:
    Customers compiling OpenMP GPU programs are affected by this
    issue.
    
    PROBLEM DESCRIPTION:
    A target region inside a declare target function is miscompiled.
    

Problem conclusion

  • The compiler now correctly handles target regions inside
    declare target functions. However, executing a target region
    while running the device copy of the declare target procedure
    that contains it is still undefined behavior in OpenMP 4.5 and
    5.0. With this fix, a program that attempts this encounters a
    trap.
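
The pattern described above can be sketched as follows. This is an illustrative reduction, not the APAR's test case: the function names scale_in_place and scaled_sum, the array, and the map clause are all assumptions. The function is marked declare target and itself contains a target region. Calling it from host code runs the host copy, which then offloads the loop. Calling it from inside another target region would run the device copy and hit the nested target construct, which is the undefined case.

```c
/* Hypothetical example; names are not taken from the APAR. */
#pragma omp declare target
void scale_in_place(double *v, int n, double factor)
{
    /* A target region nested inside a declare target function:
       the construct the APAR reports as miscompiled. */
    #pragma omp target teams distribute parallel for map(tofrom: v[0:n])
    for (int i = 0; i < n; ++i)
        v[i] *= factor;
}
#pragma omp end declare target

/* Host-side caller. It invokes the host copy of scale_in_place,
   so the target region is encountered on the host, which is allowed. */
double scaled_sum(double *v, int n, double factor)
{
    scale_in_place(v, n, factor);

    double sum = 0.0;
    for (int i = 0; i < n; ++i)
        sum += v[i];
    return sum;
}
```

When built without OpenMP support, the pragmas are ignored and the loop runs serially on the host, so the result is the same. The sketch deliberately never calls scale_in_place from within a target region.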
    

Temporary fix

Comments

APAR Information

  • APAR number

    LI81498

  • Reported component name

    XL C/C++ LINUX

  • Reported component ID

    5725C7310

  • Reported release

    G11

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2020-05-11

  • Closed date

    2020-06-23

  • Last modified date

    2020-06-23

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    XL C/C++ LINUX

  • Fixed component ID

    5725C7310

Applicable component levels

  • Product: XL C/C++ for Linux
  • Version: G11
  • Platform: Platform Independent
  • Line of Business: Power
  • Business Unit: IBM Infrastructure w/TPS

Document Information

Modified date:
24 June 2020