IBM Support

IV62255: MISSED RLDIMI OPTIMIZATION FOR __BPERMD AND __POPCNT8

APAR status

  • Closed as program error.

Error description

  • When bitwise operations follow the __bpermd and __popcnt8
    built-ins, the compiler does not recognize the shift-and-OR
    pattern and fails to replace the rldicr/or pair with a single
    rldimi instruction:
    
    
       === TEST CASE ===
    
    unsigned long long mm(unsigned long long *p)
    {
       unsigned long long a = __bpermd(0x0001, p[0]);
       unsigned long long b = __bpermd(0x0001, p[1]);
       return (a << 8) | b;
    }
    
    
    The generated assembly uses an rldicr followed by an or instead
    of a single combined rldimi:
    
      12| 00002C bpermd   7C8001F8   1     BPERMD   gr0=gr0,gr4
      13| 000030 bpermd   7C8319F8   1     BPERMD   gr3=gr3,gr4
      14| 000034 rldicr   780045E4   1     SLL8     gr0=gr0,8
      14| 000038 or       7C031B78   1     O        gr3=gr0,gr3
      15| 00003C bclr     4E800020   1     BA        lr
    
    For other built-ins, such as __cntlz8, the rldimi is generated.
    For example:
    
    unsigned long long mm3(unsigned long long *p)
    {
       unsigned long long a = __cntlz8(p[0]);
       unsigned long long b = __cntlz8(p[1]);
       return (a << 8) | b;
    }
    
      26| 000088 cntlzd 7C000074 1 CNTLZ8  gr0=gr0
      27| 00008C cntlzd 7C630074 1 CNTLZ8  gr3=gr3
      28| 000090 rldimi 7803402C 1 RI8     gr3=gr0,8,gr3,0xFFFFFF00
      29| 000094 bclr   4E800020 1 BA      lr
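
    The two instruction sequences above compute the same value only
    when the second operand fits in the low 8 bits. The following is
    a minimal portable sketch of that equivalence, not the compiler's
    implementation. The helper names (rotl64, insert_under_mask,
    shift_or, forms_agree, self_check) are hypothetical, and the full
    64-bit mask is an assumption; the listing prints the mask as
    0xFFFFFF00. The APAR does not state why the pattern was missed,
    so the value-range condition is shown only as the property that
    makes the combined form valid.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 64-bit value left by n bits (n taken modulo 64). */
static uint64_t rotl64(uint64_t x, unsigned n)
{
    n &= 63u;
    return n ? (x << n) | (x >> (64u - n)) : x;
}

/* Hypothetical model of a rotate-and-insert-under-mask operation:
   bits of the rotated source selected by `mask` replace the same
   bits in `target`; all other bits of `target` are kept. */
static uint64_t insert_under_mask(uint64_t target, uint64_t source,
                                  unsigned shift, uint64_t mask)
{
    return (rotl64(source, shift) & mask) | (target & ~mask);
}

/* The source-level pattern from the test case. */
static uint64_t shift_or(uint64_t a, uint64_t b)
{
    return (a << 8) | b;
}

/* Returns 1 when both forms produce the same value for (a, b). */
static int forms_agree(uint64_t a, uint64_t b)
{
    const uint64_t mask = ~(uint64_t)0xFF;  /* assumed: all bits except the low 8 */
    return insert_under_mask(b, a, 8, mask) == shift_or(a, b);
}

/* Small self-check of the claims made in the surrounding text. */
static void self_check(void)
{
    assert(forms_agree(0x12, 0x34));        /* b fits in 8 bits */
    assert(forms_agree(UINT64_MAX, 0xFF));  /* any a is fine */
    assert(!forms_agree(0, 0x100));         /* b needs a 9th bit: forms differ */
}
```

    Calling self_check() passes all three assertions. The last one
    shows that the single-instruction form is a safe replacement only
    when the value being OR-ed in is known to have no bits set above
    bit 7.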
    

Local fix

Problem summary

  • PROBLEM DESCRIPTION:
    When bitwise operations follow the __bpermd and __popcnt8
    built-ins, the compiler emits a separate rldicr and or
    instead of a single rldimi instruction.
    
    USERS AFFECTED:
    Users who use bitwise operations after the __bpermd and
    __popcnt8 built-ins.
    

Problem conclusion

  • The fix allows the compiler to generate efficient code when
    bitwise operations follow the __bpermd and __popcnt8
    built-ins.
    

Temporary fix

Comments

APAR Information

  • APAR number

    IV62255

  • Reported component name

    XL C/C++ FOR AI

  • Reported component ID

    5725C7200

  • Reported release

    D10

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2014-07-07

  • Closed date

    2014-12-17

  • Last modified date

    2014-12-17

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

    LI78452

Fix information

  • Fixed component name

    XL C FOR AIX

  • Fixed component ID

    5725C7100

Applicable component levels

  • Product: XL C for AIX
  • Version: 13.1
  • Platform: Platform Independent
  • Business Unit: IBM Software
  • Line of Business: Power TPS

Document Information

Modified date:
21 August 2024