A fix is available
APAR status
Closed as program error.
Error description
The compiler does not generate efficient code for the following test case:

======COMPILE COMMAND:
xlC -q64 -O2 -qarch=pwr7 -qlist test.cpp -qaltivec

======TESTCASE:
$ cat test.cpp
extern "C" int vanyeq(vector unsigned long long a, vector unsigned long long b)
{
  return vec_any_eq(a,b);
}
$

=======ACTUAL OUTPUT:
Listing output:
     | 000000                         PDEF     vanyeq
    7|                                PROC     a,b,vs34,vs35
    9| 000040 vcmpequw  10221886   1  VCMPEQUW   vs33=vs34,vs35
    9| 000044 xxlxor    F00004D7   1  VXOR       vs32=vs32,vs32
    9| 000048 addi      38600001   1  LI         gr3=1
    9| 00004C xxsldwi   F0010916   1  VSLDWI     vs0=vs33,vs33,1
    9| 000050 xxland    F0000C12   1  VAND       vs0=vs0,vs33
    9| 000054 xxspltw   F0200290   1  VSPLTW     vs1=vs0,0
    9| 000058 xxspltw   F0020290   1  VSPLTW     vs0=vs0,2
    9| 00005C xxpermdi  F0210051   1  VMRGHD     vs33=vs1,vs0
    9| 000060 vcmpgtuw  10010686   1  VCMPGTUW_  vs32,cr6=vs33,vs32
    9| 000064 bclr      4C9A0020   1  BF         CL.4,cr6,0x4/eq,taken=50%(0,0)
    9| 000068 addi      38600000   1  LI         gr3=0
   10|                              CL.4:
   10| 00006C bclr      4E800020   1  BA         lr

=======EXPECTED OUTPUT:
Optimal output:
     | 000000                         PDEF     vanyeq_opt
   12|                                PROC     a,b,vs34,vs35
   14| 000080 vcmpequw  10021886   1  VCMPEQUW   vs32=vs34,vs35
   16| 000084 vspltisw  103F038C   1  VSPLTISW   vs33=-1
   16| 000088 addi      38600001   1  LI         gr3=1
   15| 00008C vpkuwum   1000004E   1  VPKUWUM    vs32=vs32,vs32
   16| 000090 vcmpequw  10000C86   1  VCMPEQUW_  vs32,cr6=vs32,vs33
   16| 000094 bclr      4C9A0020   1  BF         CL.5,cr6,0x4/eq,taken=50%(0,0)
   16| 000098 addi      38600000   1  LI         gr3=0
   17|                              CL.5:
   17| 00009C bclr      4E800020   1  BA         lr
Local fix
The provided test case can be coded as follows instead:

extern "C" int vanyeq_opt(vector unsigned long long a, vector unsigned long long b)
{
  vector unsigned int cmp = (vector unsigned int)vec_cmpeq((vector unsigned int)a, (vector unsigned int)b);
  cmp = (vector unsigned int)vec_pack(cmp,cmp);
  return vec_any_eq(cmp,vec_splats((unsigned int)(-1)));
}
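The reasoning behind the rewritten test case can be checked with a scalar model. The sketch below is illustrative only and is not code from this APAR: the `Words` type and all three function names are hypothetical, and it assumes that `vec_cmpeq` on words yields an all-ones or all-zeros mask per word and that `vec_pack` on unsigned words keeps the low-order 16 bits of each word (the modulo behaviour of `vpkuwum`). Under those assumptions, a packed word is all ones exactly when both 32-bit halves of a doubleword compared equal.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar stand-in for a 128-bit vector viewed as four 32-bit words.
using Words = std::array<std::uint32_t, 4>;

// Reference semantics: true if either 64-bit lane of a equals the same lane of b.
// Each doubleword is modelled as a pair of adjacent 32-bit words.
bool any_eq_doubleword(const Words& a, const Words& b) {
    for (std::size_t dw = 0; dw < 2; ++dw) {
        if (a[2 * dw] == b[2 * dw] && a[2 * dw + 1] == b[2 * dw + 1]) {
            return true;
        }
    }
    return false;
}

// Scalar model of the three steps in the rewritten test case.
bool any_eq_via_pack(const Words& a, const Words& b) {
    // Step 1 (models vec_cmpeq on words): all-ones mask where words match.
    Words cmp{};
    for (std::size_t i = 0; i < 4; ++i) {
        cmp[i] = (a[i] == b[i]) ? 0xFFFFFFFFu : 0u;
    }
    // Step 2 (models vec_pack(cmp, cmp)): keep the low 16 bits of each word,
    // giving eight halfwords; the second four duplicate the first four.
    std::array<std::uint16_t, 8> half{};
    for (std::size_t i = 0; i < 4; ++i) {
        half[i] = static_cast<std::uint16_t>(cmp[i]);
        half[i + 4] = half[i];
    }
    // Reinterpret adjacent halfword pairs as words. The pairing order does not
    // matter here because only the all-ones pattern is tested.
    Words packed{};
    for (std::size_t i = 0; i < 4; ++i) {
        packed[i] = (static_cast<std::uint32_t>(half[2 * i]) << 16) | half[2 * i + 1];
    }
    // Step 3 (models vec_any_eq against a splat of -1).
    for (std::uint32_t w : packed) {
        if (w == 0xFFFFFFFFu) {
            return true;
        }
    }
    return false;
}

// Compare both models over all 16 per-word equal/unequal patterns.
void check_models_agree() {
    const Words a = {1u, 2u, 3u, 4u};
    for (unsigned mask = 0; mask < 16; ++mask) {
        Words b = a;
        for (std::size_t i = 0; i < 4; ++i) {
            if ((mask & (1u << i)) == 0) {
                b[i] = a[i] + 100u;  // force this word to differ
            }
        }
        assert(any_eq_doubleword(a, b) == any_eq_via_pack(a, b));
    }
}
```

Calling `check_models_agree()` exercises every combination of matching and non-matching words. The model says nothing about the generated instructions or their cost; it only illustrates why the shorter sequence can return the same result as the doubleword comparison.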
Problem summary
USERS AFFECTED: Users of the vector built-in vec_any_eq or similar routines may be affected by this issue.
PROBLEM DESCRIPTION: The compiler generates inefficient code for the vec_any_eq built-in.
Problem conclusion
The compiler has been fixed to generate more efficient code for vector built-ins such as vec_any_eq.
Temporary fix
Comments
APAR Information
APAR number
LI78438
Reported component name
XL C/C++ FOR LI
Reported component ID
5725C7300
Reported release
D10
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt
Submitted date
2015-02-25
Closed date
2015-02-25
Last modified date
2015-02-25
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
Fix information
Fixed component name
XL C/C++ FOR LI
Fixed component ID
5725C7300
Applicable component levels
RD10 PSN IV62060
UP06/09/13
Business Unit: IBM Infrastructure w/TPS
Product: XL C/C++ for Linux
Platform: Platform Independent
Version: 13.1
Line of Business: Power
Document Information
Modified date:
17 October 2021