vec_permx

Purpose

Performs a partial permute of the first two arguments, which form an aligned 32-byte section of an emulated vector up to 256 bytes wide, using the partial permute control vector in the third argument. The fourth argument identifies which 32-byte section of the emulated vector is contained in the first two arguments.

Note: This built-in function is valid only when the -mcpu option is set to target Power10 processors.

Syntax

d=vec_permx(a,b,c,e)

Result and argument types

The following table describes the types of the returned value and the function arguments.

Table 1. Result and argument types
d | a | b | c | e
vector signed char | vector signed char | vector signed char | vector unsigned char | const int
vector unsigned char | vector unsigned char | vector unsigned char | vector unsigned char | const int
vector signed short | vector signed short | vector signed short | vector unsigned char | const int
vector unsigned short | vector unsigned short | vector unsigned short | vector unsigned char | const int
vector signed int | vector signed int | vector signed int | vector unsigned char | const int
vector unsigned int | vector unsigned int | vector unsigned int | vector unsigned char | const int
vector signed long long | vector signed long long | vector signed long long | vector unsigned char | const int
vector unsigned long long | vector unsigned long long | vector unsigned long long | vector unsigned char | const int
vector float | vector float | vector float | vector unsigned char | const int
vector double | vector double | vector double | vector unsigned char | const int
Note: e is constrained to values of 0 to 7, inclusive.

Result value

Let s be the 32-byte concatenation of a and b. For each integer value i from 0 to 15, do the following: let j be the contents of bits 3 through 7 of byte element i of c (the low-order 5 bits, where bit 0 is the most significant bit). If e is equal to the contents of bits 0 through 2 of byte element i of c (the high-order 3 bits), the contents of byte element j of s are placed into byte element i of d. Otherwise, byte element i of d is set to zero.
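
The rule above can be sketched as a portable scalar model. This is an illustration only, not the compiler's implementation: the function name permx_model is hypothetical, the vectors are represented as plain 16-byte arrays indexed by byte element number, and bit 0 is assumed to be the most significant bit of each byte.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar model of vec_permx; not part of any compiler header.
   a, b, c, and d are 16-byte arrays indexed by byte element number. */
static void permx_model(const uint8_t a[16], const uint8_t b[16],
                        const uint8_t c[16], int e, uint8_t d[16])
{
    uint8_t s[32];

    assert(e >= 0 && e <= 7);       /* e is constrained to 0 through 7 */

    /* s is the concatenation of a and b. */
    memcpy(s, a, 16);
    memcpy(s + 16, b, 16);

    for (int i = 0; i < 16; i++) {
        int section = c[i] >> 5;    /* bits 0 through 2: section selector */
        int j = c[i] & 0x1F;        /* bits 3 through 7: index into s */

        /* Copy the selected byte only when the selector matches e. */
        d[i] = (section == e) ? s[j] : 0;
    }
}
```

For example, with e equal to 0, a control byte of 0x05 selects byte element 5 of s, whereas a control byte of 0x25 carries section selector 1 and therefore produces zero.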

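This built-in function can be used to emulate permutes on vectors of up to 256 bytes in length (eight 32-byte sections), and can also be used to perform a parallel table lookup on tables of up to 256 bytes. Because byte elements that do not select section e are set to zero, the results for the individual sections can be combined with a bitwise OR.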
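
A parallel table lookup of this kind can be sketched in portable C as follows. This is a minimal model under stated assumptions, not code that uses the built-in itself: permx_section and table_lookup are hypothetical names, the table is a plain byte array made of consecutive 32-byte sections, and the per-section results are merged with a bitwise OR.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar stand-in for one vec_permx call on a 32-byte section. */
static void permx_section(const uint8_t *s32, const uint8_t c[16],
                          int e, uint8_t d[16])
{
    for (int i = 0; i < 16; i++)
        d[i] = ((c[i] >> 5) == e) ? s32[c[i] & 0x1F] : 0;
}

/* Looks up 16 byte indices in a table of `sections` 32-byte sections.
   Indices that fall outside the table yield zero. */
static void table_lookup(const uint8_t *table, int sections,
                         const uint8_t idx[16], uint8_t out[16])
{
    assert(sections >= 1 && sections <= 8);

    for (int i = 0; i < 16; i++)
        out[i] = 0;

    for (int e = 0; e < sections; e++) {
        uint8_t part[16];

        /* Only the indices whose high-order 3 bits equal e contribute. */
        permx_section(table + 32 * e, idx, e, part);

        for (int i = 0; i < 16; i++)
            out[i] |= part[i];
    }
}
```

With a full 256-byte table (sections equal to 8), each output byte equals the table entry at the corresponding index byte. On Power10, each permx_section call would correspond to one vec_permx call with the matching constant e.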