Using the SIMD libraries
The MASS SIMD library contains a set of frequently used math intrinsic functions that provide improved performance over the corresponding standard system library functions.
The SIMD libraries shipped with IBM® Open XL C/C++ are listed below:
- libmass_simd.a: The generic SIMD library that runs on any supported POWER® processor. It provides balanced performance tuning across the range of processors while favoring Power9 and Power10. Unless your application requires this portability, use the appropriate architecture-specific library below for maximum performance.
- libmass_simdp7.a: Contains functions that are tuned for the POWER7 architecture.
- libmass_simdp8.a: Contains functions that are tuned for the POWER8 architecture.
- libmass_simdp9.a: Contains functions that are tuned for the POWER9 architecture.
- libmass_simdp10.a: Contains functions that are tuned for the Power10 architecture.
- libmass_simdp11.a: Contains functions that are tuned for the Power11 architecture.
To use the MASS SIMD functions, follow these steps:
- Provide the prototypes for the functions by including mass_simd.h in your source files.
- Choose the appropriate MASS SIMD library above and link it with your application. For instructions, see Compiling and linking a program with MASS.
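The two steps above might look like the following in a source file. This is a minimal sketch rather than IBM's reference usage: the `USE_MASS_SIMD` macro and the helper name `recip_array` are hypothetical, and the scalar branch exists only so that the file also builds on systems where MASS is not installed. `recipd2` and its prototype are taken from Table 1; `vec_xl` and `vec_xst` are the standard vector load and store built-ins from `altivec.h`.

```c
#include <assert.h>
#include <stddef.h>

/* USE_MASS_SIMD is a hypothetical macro for this sketch. Define it only when
   building on POWER with a MASS SIMD library on the link line. */
#ifdef USE_MASS_SIMD
#include <altivec.h>
#include <mass_simd.h>   /* step 1: prototypes such as recipd2 */
#endif

/* Hypothetical helper: out[i] = 1 / in[i] for i in [0, n). */
void recip_array(const double *in, double *out, size_t n)
{
    size_t i = 0;
    assert(in != NULL && out != NULL);
#ifdef USE_MASS_SIMD
    /* A vector double holds two elements, so advance two at a time. */
    for (; i + 2 <= n; i += 2) {
        vector double vx = vec_xl(0, in + i);
        vec_xst(recipd2(vx), 0, out + i);
    }
#endif
    /* Scalar loop: covers an odd trailing element, or the whole array
       when the MASS path is compiled out. */
    for (; i < n; ++i)
        out[i] = 1.0 / in[i];
}
```

The link step itself is not shown here; see Compiling and linking a program with MASS. Results from the MASS path may differ from the scalar path in the last few bits.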
The single-precision MASS SIMD functions accept single-precision arguments and return single-precision results. Likewise, the double-precision MASS SIMD functions accept double-precision arguments and return double-precision results. They are summarized in Table 1.
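As an illustration of the single-precision variants, the sketch below divides two float arrays with `divf4`, which is listed in Table 1. It assumes that a `vector float` carries four elements (as it does for 128-bit POWER vectors), which is why the loop advances four elements at a time; a double-precision version would use `divd2` and advance two. The `USE_MASS_SIMD` macro and the helper name `div_array_f` are hypothetical, and the scalar branch is only a portable fallback.

```c
#include <assert.h>
#include <stddef.h>

/* USE_MASS_SIMD is a hypothetical macro for this sketch. Define it only when
   building on POWER with a MASS SIMD library on the link line. */
#ifdef USE_MASS_SIMD
#include <altivec.h>
#include <mass_simd.h>
#endif

/* Hypothetical helper: q[i] = a[i] / b[i] in single precision. */
void div_array_f(const float *a, const float *b, float *q, size_t n)
{
    size_t i = 0;
    assert(a != NULL && b != NULL && q != NULL);
#ifdef USE_MASS_SIMD
    /* Single-precision arguments in, single-precision results out:
       four floats per call. */
    for (; i + 4 <= n; i += 4) {
        vector float vq = divf4(vec_xl(0, a + i), vec_xl(0, b + i));
        vec_xst(vq, 0, q + i);
    }
#endif
    /* Scalar loop: covers up to three trailing elements, or the whole
       array when the MASS path is compiled out. */
    for (; i < n; ++i)
        q[i] = a[i] / b[i];
}
```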
Double-precision function | Single-precision function | Description | Double-precision function prototype | Single-precision function prototype |
---|---|---|---|---|
acosd2 | acosf4 | Computes the arc cosine of each element of vx. | vector double acosd2 (vector double vx); | vector float acosf4 (vector float vx); |
acoshd2 | acoshf4 | Computes the arc hyperbolic cosine of each element of vx. | vector double acoshd2 (vector double vx); | vector float acoshf4 (vector float vx); |
asind2 | asinf4 | Computes the arc sine of each element of vx. | vector double asind2 (vector double vx); | vector float asinf4 (vector float vx); |
asinhd2 | asinhf4 | Computes the arc hyperbolic sine of each element of vx. | vector double asinhd2 (vector double vx); | vector float asinhf4 (vector float vx); |
atand2 | atanf4 | Computes the arc tangent of each element of vx. | vector double atand2 (vector double vx); | vector float atanf4 (vector float vx); |
atan2d2 | atan2f4 | Computes the arc tangent of each element of vx/vy. | vector double atan2d2 (vector double vx, vector double vy); | vector float atan2f4 (vector float vx, vector float vy); |
atanhd2 | atanhf4 | Computes the arc hyperbolic tangent of each element of vx. | vector double atanhd2 (vector double vx); | vector float atanhf4 (vector float vx); |
cbrtd2 | cbrtf4 | Computes the cube root of each element of vx. | vector double cbrtd2 (vector double vx); | vector float cbrtf4 (vector float vx); |
cosd2 | cosf4 | Computes the cosine of each element of vx. | vector double cosd2 (vector double vx); | vector float cosf4 (vector float vx); |
coshd2 | coshf4 | Computes the hyperbolic cosine of each element of vx. | vector double coshd2 (vector double vx); | vector float coshf4 (vector float vx); |
cosisind2 | cosisinf4 | Computes the cosine and sine of each element of x, and stores the results in y and z as follows: | void cosisind2 (vector double x, vector double *y, vector double *z); | void cosisinf4 (vector float x, vector float *y, vector float *z); |
divd2 | divf4 | Computes the quotient vx/vy. | vector double divd2 (vector double vx, vector double vy); | vector float divf4 (vector float vx, vector float vy); |
erfcd2 | erfcf4 | Computes the complementary error function of each element of vx. | vector double erfcd2 (vector double vx); | vector float erfcf4 (vector float vx); |
erfd2 | erff4 | Computes the error function of each element of vx. | vector double erfd2 (vector double vx); | vector float erff4 (vector float vx); |
expd2 | expf4 | Computes the exponential function of each element of vx. | vector double expd2 (vector double vx); | vector float expf4 (vector float vx); |
exp2d2 | exp2f4 | Computes 2 raised to the power of each element of vx. | vector double exp2d2 (vector double vx); | vector float exp2f4 (vector float vx); |
expm1d2 | expm1f4 | Computes (the exponential function of each element of vx) - 1. | vector double expm1d2 (vector double vx); | vector float expm1f4 (vector float vx); |
exp2m1d2 | exp2m1f4 | Computes (2 raised to the power of each element of vx) - 1. | vector double exp2m1d2 (vector double vx); | vector float exp2m1f4 (vector float vx); |
hypotd2 | hypotf4 | For each element of vx and the corresponding element of vy, computes sqrt(x*x+y*y). | vector double hypotd2 (vector double vx, vector double vy); | vector float hypotf4 (vector float vx, vector float vy); |
lgammad2 | lgammaf4 | Computes the natural logarithm of the absolute value of the Gamma function of each element of vx. | vector double lgammad2 (vector double vx); | vector float lgammaf4 (vector float vx); |
logd2 | logf4 | Computes the natural logarithm of each element of vx. | vector double logd2 (vector double vx); | vector float logf4 (vector float vx); |
log2d2 | log2f4 | Computes the base-2 logarithm of each element of vx. | vector double log2d2 (vector double vx); | vector float log2f4 (vector float vx); |
log10d2 | log10f4 | Computes the base-10 logarithm of each element of vx. | vector double log10d2 (vector double vx); | vector float log10f4 (vector float vx); |
log1pd2 | log1pf4 | Computes the natural logarithm of each element of (vx + 1). | vector double log1pd2 (vector double vx); | vector float log1pf4 (vector float vx); |
log21pd2 | log21pf4 | Computes the base-2 logarithm of each element of (vx + 1). | vector double log21pd2 (vector double vx); | vector float log21pf4 (vector float vx); |
powd2 | powf4 | Computes each element of vx raised to the power of the corresponding element of vy. | vector double powd2 (vector double vx, vector double vy); | vector float powf4 (vector float vx, vector float vy); |
qdrtd2 | qdrtf4 | Computes the quad root of each element of vx. | vector double qdrtd2 (vector double vx); | vector float qdrtf4 (vector float vx); |
rcbrtd2 | rcbrtf4 | Computes the reciprocal of the cube root of each element of vx. | vector double rcbrtd2 (vector double vx); | vector float rcbrtf4 (vector float vx); |
recipd2 | recipf4 | Computes the reciprocal of each element of vx. | vector double recipd2 (vector double vx); | vector float recipf4 (vector float vx); |
rqdrtd2 | rqdrtf4 | Computes the reciprocal of the quad root of each element of vx. | vector double rqdrtd2 (vector double vx); | vector float rqdrtf4 (vector float vx); |
rsqrtd2 | rsqrtf4 | Computes the reciprocal of the square root of each element of vx. | vector double rsqrtd2 (vector double vx); | vector float rsqrtf4 (vector float vx); |
sincosd2 | sincosf4 | Computes the sine and cosine of each element of vx. | void sincosd2 (vector double vx, vector double *vs, vector double *vc); | void sincosf4 (vector float vx, vector float *vs, vector float *vc); |
sind2 | sinf4 | Computes the sine of each element of vx. | vector double sind2 (vector double vx); | vector float sinf4 (vector float vx); |
sinhd2 | sinhf4 | Computes the hyperbolic sine of each element of vx. | vector double sinhd2 (vector double vx); | vector float sinhf4 (vector float vx); |
sqrtd2 | sqrtf4 | Computes the square root of each element of vx. | vector double sqrtd2 (vector double vx); | vector float sqrtf4 (vector float vx); |
tand2 | tanf4 | Computes the tangent of each element of vx. | vector double tand2 (vector double vx); | vector float tanf4 (vector float vx); |
tanhd2 | tanhf4 | Computes the hyperbolic tangent of each element of vx. | vector double tanhd2 (vector double vx); | vector float tanhf4 (vector float vx); |