Using the vector libraries
If you want to explicitly call any of the MASS vector functions, you can do so
by including massv.h in your source files and linking your application
with the appropriate vector library.
Information about linking is provided in Compiling and linking a program with MASS.
- libmassv.a: The generic vector library that runs on any supported POWER® processor. It provides balanced performance tuning across the range of processors while favoring Power9 and Power10. Unless your application requires this portability, use the appropriate architecture-specific library below for maximum performance.
- libmassvp7.a: Contains functions that are tuned for the POWER7 architecture.
- libmassvp8.a: Contains functions that are tuned for the POWER8 architecture.
- libmassvp9.a: Contains functions that are tuned for the POWER9 architecture.
- libmassvp10.a: Contains functions that are tuned for the Power10 architecture.
All libraries can be used in either 32-bit or 64-bit mode.
The single-precision and double-precision floating-point functions contained in the vector libraries are summarized in Table 1. The integer functions contained in the vector libraries are summarized in Table 2. Note that in C and C++ applications, only call by reference is supported, even for scalar arguments.
With the exception of a few functions (described below), the floating-point functions in the vector libraries accept three parameters:
- A double-precision (for double-precision functions) or single-precision (for single-precision functions) vector output parameter
- A double-precision (for double-precision functions) or single-precision (for single-precision functions) vector input parameter
- An integer vector-length parameter

The functions are of the form function_name (y,x,n), where y is the target vector, x is the source vector, and n is the vector length. The parameters y and x are assumed to be double-precision for functions with the prefix v, and single-precision for functions with the prefix vs. As an example, the following code outputs a vector y of length 500 whose elements are exp(x[i]), where i=0,...,499:
#include <massv.h>
double x[500], y[500];
int n;
n = 500;
...
vexp (y, x, &n);
The functions vdiv, vsincos, vpow, and vatan2 (and their single-precision versions, vsdiv, vssincos, vspow, and vsatan2) take four arguments. The functions vdiv, vpow, and vatan2 take the arguments (z,x,y,n). The function vdiv outputs a vector z whose elements are x[i]/y[i], where i=0,..,*n-1. The function vpow outputs a vector z whose elements are x[i] raised to the power y[i], where i=0,..,*n-1. The function vatan2 outputs a vector z whose elements are atan(x[i]/y[i]), where i=0,..,*n-1. The function vsincos takes the arguments (y,z,x,n), and outputs two vectors, y and z, whose elements are sin(x[i]) and cos(x[i]), respectively.
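The argument order of the four-argument functions is easy to get wrong, so the following sketch mimics the vdiv calling convention in portable C. Note that ref_vdiv and ref_vdiv_demo are hypothetical stand-ins written for this illustration, not MASS functions; they only mirror the (z,x,y,n) order and the by-reference length described above.

```c
#include <assert.h>

/* Hypothetical stand-in that mirrors the vdiv calling convention:
   output vector first, then the two inputs, then the length by reference.
   This is a plain loop, not the tuned MASS implementation. */
void ref_vdiv(double z[], double x[], double y[], int *n)
{
    assert(n != 0 && *n >= 0);
    for (int i = 0; i < *n; i++) {
        z[i] = x[i] / y[i];   /* x is the numerator, y the denominator */
    }
}

/* Example use: divides three pairs and returns z[2], which is 9.0 / 3.0. */
double ref_vdiv_demo(void)
{
    double x[3] = {1.0, 4.0, 9.0};
    double y[3] = {2.0, 2.0, 3.0};
    double z[3];
    int n = 3;                /* the length must live in a variable */
    ref_vdiv(z, x, y, &n);    /* pass its address, not the value */
    return z[2];
}
```

With the real library, the same call shape would presumably apply to vdiv after including massv.h; only the stand-in name would change.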
In vcosisin(y,x,n) and vscosisin(y,x,n), x is a vector of n elements and the function outputs a vector y of n _Complex elements of the form (cos(x[i]),sin(x[i])). If -D__nocomplex is used (see note in Table 1), the output vector holds y[0][i] = cos(x[i]) and y[1][i] = sin(x[i]), where i=0,..,*n-1.
Table 1. Floating-point functions in the MASS vector libraries

| Double-precision function | Single-precision function | Description | Double-precision function prototype | Single-precision function prototype |
|---|---|---|---|---|
| vacos | vsacos | Sets y[i] to the arc cosine of x[i], for i=0,..,*n-1 | void vacos (double y[], double x[], int *n); | void vsacos (float y[], float x[], int *n); |
| vacosh | vsacosh | Sets y[i] to the hyperbolic arc cosine of x[i], for i=0,..,*n-1 | void vacosh (double y[], double x[], int *n); | void vsacosh (float y[], float x[], int *n); |
| vasin | vsasin | Sets y[i] to the arc sine of x[i], for i=0,..,*n-1 | void vasin (double y[], double x[], int *n); | void vsasin (float y[], float x[], int *n); |
| vasinh | vsasinh | Sets y[i] to the hyperbolic arc sine of x[i], for i=0,..,*n-1 | void vasinh (double y[], double x[], int *n); | void vsasinh (float y[], float x[], int *n); |
| vatan2 | vsatan2 | Sets z[i] to the arc tangent of x[i]/y[i], for i=0,..,*n-1 | void vatan2 (double z[], double x[], double y[], int *n); | void vsatan2 (float z[], float x[], float y[], int *n); |
| vatanh | vsatanh | Sets y[i] to the hyperbolic arc tangent of x[i], for i=0,..,*n-1 | void vatanh (double y[], double x[], int *n); | void vsatanh (float y[], float x[], int *n); |
| vcbrt | vscbrt | Sets y[i] to the cube root of x[i], for i=0,..,*n-1 | void vcbrt (double y[], double x[], int *n); | void vscbrt (float y[], float x[], int *n); |
| vcos | vscos | Sets y[i] to the cosine of x[i], for i=0,..,*n-1 | void vcos (double y[], double x[], int *n); | void vscos (float y[], float x[], int *n); |
| vcosh | vscosh | Sets y[i] to the hyperbolic cosine of x[i], for i=0,..,*n-1 | void vcosh (double y[], double x[], int *n); | void vscosh (float y[], float x[], int *n); |
| vcosisin (see note) | vscosisin (see note) | Sets the real part of y[i] to the cosine of x[i] and the imaginary part of y[i] to the sine of x[i], for i=0,..,*n-1 | void vcosisin (double _Complex y[], double x[], int *n); | void vscosisin (float _Complex y[], float x[], int *n); |
| vdint | | Sets y[i] to the integer truncation of x[i], for i=0,..,*n-1 | void vdint (double y[], double x[], int *n); | |
| vdiv | vsdiv | Sets z[i] to x[i]/y[i], for i=0,..,*n-1 | void vdiv (double z[], double x[], double y[], int *n); | void vsdiv (float z[], float x[], float y[], int *n); |
| vdnint | | Sets y[i] to the nearest integer to x[i], for i=0,..,*n-1 | void vdnint (double y[], double x[], int *n); | |
| verf | vserf | Sets y[i] to the error function of x[i], for i=0,..,*n-1 | void verf (double y[], double x[], int *n); | void vserf (float y[], float x[], int *n); |
| verfc | vserfc | Sets y[i] to the complementary error function of x[i], for i=0,..,*n-1 | void verfc (double y[], double x[], int *n); | void vserfc (float y[], float x[], int *n); |
| vexp | vsexp | Sets y[i] to the exponential function of x[i], for i=0,..,*n-1 | void vexp (double y[], double x[], int *n); | void vsexp (float y[], float x[], int *n); |
| vexp2 | vsexp2 | Sets y[i] to 2 raised to the power of x[i], for i=0,..,*n-1 | void vexp2 (double y[], double x[], int *n); | void vsexp2 (float y[], float x[], int *n); |
| vexpm1 | vsexpm1 | Sets y[i] to (the exponential function of x[i])-1, for i=0,..,*n-1 | void vexpm1 (double y[], double x[], int *n); | void vsexpm1 (float y[], float x[], int *n); |
| vexp2m1 | vsexp2m1 | Sets y[i] to (2 raised to the power of x[i])-1, for i=0,..,*n-1 | void vexp2m1 (double y[], double x[], int *n); | void vsexp2m1 (float y[], float x[], int *n); |
| vhypot | vshypot | Sets z[i] to the square root of the sum of the squares of x[i] and y[i], for i=0,..,*n-1 | void vhypot (double z[], double x[], double y[], int *n); | void vshypot (float z[], float x[], float y[], int *n); |
| vlog | vslog | Sets y[i] to the natural logarithm of x[i], for i=0,..,*n-1 | void vlog (double y[], double x[], int *n); | void vslog (float y[], float x[], int *n); |
| vlog2 | vslog2 | Sets y[i] to the base-2 logarithm of x[i], for i=0,..,*n-1 | void vlog2 (double y[], double x[], int *n); | void vslog2 (float y[], float x[], int *n); |
| vlog10 | vslog10 | Sets y[i] to the base-10 logarithm of x[i], for i=0,..,*n-1 | void vlog10 (double y[], double x[], int *n); | void vslog10 (float y[], float x[], int *n); |
| vlog1p | vslog1p | Sets y[i] to the natural logarithm of (x[i]+1), for i=0,..,*n-1 | void vlog1p (double y[], double x[], int *n); | void vslog1p (float y[], float x[], int *n); |
| vlog21p | vslog21p | Sets y[i] to the base-2 logarithm of (x[i]+1), for i=0,..,*n-1 | void vlog21p (double y[], double x[], int *n); | void vslog21p (float y[], float x[], int *n); |
| vpow | vspow | Sets z[i] to x[i] raised to the power y[i], for i=0,..,*n-1 | void vpow (double z[], double x[], double y[], int *n); | void vspow (float z[], float x[], float y[], int *n); |
| vqdrt | vsqdrt | Sets y[i] to the fourth root of x[i], for i=0,..,*n-1 | void vqdrt (double y[], double x[], int *n); | void vsqdrt (float y[], float x[], int *n); |
| vrcbrt | vsrcbrt | Sets y[i] to the reciprocal of the cube root of x[i], for i=0,..,*n-1 | void vrcbrt (double y[], double x[], int *n); | void vsrcbrt (float y[], float x[], int *n); |
| vrec | vsrec | Sets y[i] to the reciprocal of x[i], for i=0,..,*n-1 | void vrec (double y[], double x[], int *n); | void vsrec (float y[], float x[], int *n); |
| vrqdrt | vsrqdrt | Sets y[i] to the reciprocal of the fourth root of x[i], for i=0,..,*n-1 | void vrqdrt (double y[], double x[], int *n); | void vsrqdrt (float y[], float x[], int *n); |
| vrsqrt | vsrsqrt | Sets y[i] to the reciprocal of the square root of x[i], for i=0,..,*n-1 | void vrsqrt (double y[], double x[], int *n); | void vsrsqrt (float y[], float x[], int *n); |
| vsin | vssin | Sets y[i] to the sine of x[i], for i=0,..,*n-1 | void vsin (double y[], double x[], int *n); | void vssin (float y[], float x[], int *n); |
| vsincos | vssincos | Sets y[i] to the sine of x[i] and z[i] to the cosine of x[i], for i=0,..,*n-1 | void vsincos (double y[], double z[], double x[], int *n); | void vssincos (float y[], float z[], float x[], int *n); |
| vsinh | vssinh | Sets y[i] to the hyperbolic sine of x[i], for i=0,..,*n-1 | void vsinh (double y[], double x[], int *n); | void vssinh (float y[], float x[], int *n); |
| vsqrt | vssqrt | Sets y[i] to the square root of x[i], for i=0,..,*n-1 | void vsqrt (double y[], double x[], int *n); | void vssqrt (float y[], float x[], int *n); |
| vtan | vstan | Sets y[i] to the tangent of x[i], for i=0,..,*n-1 | void vtan (double y[], double x[], int *n); | void vstan (float y[], float x[], int *n); |
| vtanh | vstanh | Sets y[i] to the hyperbolic tangent of x[i], for i=0,..,*n-1 | void vtanh (double y[], double x[], int *n); | void vstanh (float y[], float x[], int *n); |

Note:
Integer functions are of the form function_name (x[], *n), where x[] is a vector of 4-byte (for vpopcnt4) or 8-byte (for vpopcnt8) numeric objects (integral or floating-point), and *n is the vector length.
Table 2. Integer functions in the MASS vector libraries

| Function | Description | Prototype |
|---|---|---|
| vpopcnt4 | Returns the total number of 1 bits in the concatenation of the binary representation of x[i], for i=0,..,*n-1, where x is a vector of 32-bit objects. | unsigned int vpopcnt4 (void *x, int *n); |
| vpopcnt8 | Returns the total number of 1 bits in the concatenation of the binary representation of x[i], for i=0,..,*n-1, where x is a vector of 64-bit objects. | unsigned int vpopcnt8 (void *x, int *n); |
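To make the "concatenation of the binary representation" wording concrete, here is a minimal portable sketch of what a vpopcnt4-style function computes. Note that ref_vpopcnt4 is a hypothetical reference written for this illustration, not the MASS routine; it assumes 8-bit bytes and simply counts the set bits across all 4 * n bytes of the vector.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical reference for the vpopcnt4 behaviour described in the table:
   counts the 1 bits across n objects that are each 4 bytes wide.
   This is not the MASS implementation. */
unsigned int ref_vpopcnt4(void *x, int *n)
{
    assert(x != 0 && n != 0 && *n >= 0);
    const unsigned char *bytes = (const unsigned char *)x;
    unsigned int total = 0;
    /* Each object is 4 bytes wide, so scan 4 * n bytes in all. */
    for (long i = 0; i < 4L * *n; i++) {
        unsigned char b = bytes[i];
        while (b != 0) {
            b &= (unsigned char)(b - 1);  /* clear the lowest set bit */
            total++;
        }
    }
    return total;
}
```

Because the argument is a void pointer, the same function accepts a vector of 32-bit integers or of single-precision floats; the count depends only on the stored bit patterns.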
Overlap of input and output vectors
The vector functions can be called with the same vector as both the input and the output parameter (for example, vsin (y, y, &n)).
For other kinds of overlap, be sure to observe the following restrictions, to ensure correct operation of your application:
- For calls to vector functions that take one input and one output vector (for example, vsin (y, x, &n)): The vectors x[0:n-1] and y[0:n-1] must be either disjoint or identical, or the address of x[0] must be greater than the address of y[0]. That is, if x and y are not the same vector, the address of y[0] must not fall within the range of addresses spanned by x[0:n-1], or unexpected results might be obtained.
- For calls to vector functions that take two input vectors (for example, vatan2 (y, x1, x2, &n)): The previous restriction applies to both pairs of vectors y,x1 and y,x2. That is, if y is not the same vector as x1, the address of y[0] must not fall within the range of addresses spanned by x1[0:n-1]; if y is not the same vector as x2, the address of y[0] must not fall within the range of addresses spanned by x2[0:n-1].
- For calls to vector functions that take two output vectors (for example, vsincos (y1, y2, x, &n)): The above restriction applies to both pairs of vectors y1,x and y2,x. That is, if y1 and x are not the same vector, the address of y1[0] must not fall within the range of addresses spanned by x[0:n-1]; if y2 and x are not the same vector, the address of y2[0] must not fall within the range of addresses spanned by x[0:n-1]. Also, the vectors y1[0:n-1] and y2[0:n-1] must be disjoint.
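These restrictions can be checked at run time before a call, for example in a debug build. The sketch below is one possible way to express them for double-precision vectors. Both helpers, overlap_ok and overlap_ok_two_outputs, are hypothetical names that are not part of MASS, and comparing addresses as uintptr_t values assumes a flat address space.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of MASS: returns 1 when an output vector
   y[0:n-1] and an input vector x[0:n-1] satisfy the rule stated above,
   that is, y is the same vector as x, or the address of y[0] lies outside
   the addresses spanned by x[0:n-1]. Returns 0 otherwise. */
int overlap_ok(const double *y, const double *x, int n)
{
    assert(n >= 0);
    uintptr_t y0 = (uintptr_t)y;
    uintptr_t x0 = (uintptr_t)x;
    uintptr_t x_end = x0 + (uintptr_t)n * sizeof(double); /* one past x[n-1] */

    if (y0 == x0) {
        return 1;                      /* identical vectors are allowed */
    }
    return (y0 < x0) || (y0 >= x_end); /* y[0] must not fall inside x */
}

/* The same check for a function with two output vectors, such as vsincos:
   both outputs must pass against x, and y1 and y2 must be disjoint. */
int overlap_ok_two_outputs(const double *y1, const double *y2,
                           const double *x, int n)
{
    assert(n >= 0);
    uintptr_t a = (uintptr_t)y1;
    uintptr_t b = (uintptr_t)y2;
    uintptr_t len = (uintptr_t)n * sizeof(double);
    int disjoint = (a + len <= b) || (b + len <= a);
    return overlap_ok(y1, x, n) && overlap_ok(y2, x, n) && disjoint;
}
```

A two-input call can reuse overlap_ok once per input vector, since the rule applies to each (output, input) pair separately.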
Alignment of input and output vectors
To get the best performance from the vector libraries, align the input and output vectors on 8-byte (or better, 16-byte) boundaries.
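One way to obtain 16-byte-aligned vectors is to allocate them with the C11 function aligned_alloc, as in the sketch below. The helper names is_aligned and alloc_vector16 are hypothetical and chosen for this illustration; the sketch assumes a C11 (or later) toolchain, and other approaches (static arrays with an alignment specifier, or posix_memalign) would serve equally well.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Returns 1 when p sits on a boundary that is a multiple of `alignment`. */
int is_aligned(const void *p, size_t alignment)
{
    assert(alignment != 0);
    return ((uintptr_t)p % alignment) == 0;
}

/* Hypothetical helper: allocates room for n doubles on a 16-byte boundary
   using C11 aligned_alloc. The size is rounded up to a multiple of 16
   because C11 requires the size to be a multiple of the alignment.
   Release the buffer with free(). Returns NULL on failure. */
double *alloc_vector16(size_t n)
{
    size_t bytes = n * sizeof(double);
    size_t rounded = (bytes + 15u) & ~(size_t)15u;
    if (rounded == 0) {
        rounded = 16;   /* avoid a zero-size request */
    }
    return (double *)aligned_alloc(16, rounded);
}
```

A buffer obtained this way also satisfies the weaker 8-byte recommendation, so is_aligned(p, 8) holds whenever is_aligned(p, 16) does.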
Consistency of MASS vector functions
The accuracy of the vector functions is comparable to that of the corresponding scalar functions
in libmass.a, though results might not be bitwise-identical.
In the interest of speed, the MASS libraries make certain trade-offs. One of these involves the consistency of certain MASS vector functions. For certain functions, it is possible that the result computed for a particular input value varies slightly (usually only in the least significant bit) depending on its position in the vector, the vector length, and nearby elements of the input vector. Also, the results produced by the different MASS libraries are not necessarily bit-wise identical.
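Because results can differ in the least significant bit, tests that compare vector results against scalar results, or against another MASS library, generally need a tolerance rather than an exact comparison. The following sketch measures the difference in units in the last place (ULPs). The names ordered_bits, ulp_distance, and count_beyond_ulps are hypothetical and not part of MASS, and the code assumes 64-bit IEEE 754 doubles.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Maps a double to an integer whose ordering matches the numeric ordering
   of finite doubles, so adjacent doubles map to adjacent integers. */
static int64_t ordered_bits(double v)
{
    int64_t b;
    memcpy(&b, &v, sizeof b);            /* reinterpret the IEEE 754 bits */
    return (b < 0) ? (INT64_MIN - b) : b;
}

/* Distance between a and b in units in the last place (ULPs).
   Assumes 64-bit IEEE 754 doubles; NaN inputs are rejected. */
uint64_t ulp_distance(double a, double b)
{
    assert(a == a && b == b);            /* NaN compares unequal to itself */
    int64_t oa = ordered_bits(a);
    int64_t ob = ordered_bits(b);
    return (oa >= ob) ? (uint64_t)oa - (uint64_t)ob
                      : (uint64_t)ob - (uint64_t)oa;
}

/* Counts elements of two result vectors that differ by more than max_ulps. */
int count_beyond_ulps(const double *r1, const double *r2, int n,
                      uint64_t max_ulps)
{
    int count = 0;
    for (int i = 0; i < n; i++) {
        if (ulp_distance(r1[i], r2[i]) > max_ulps) {
            count++;
        }
    }
    return count;
}
```

A test could then accept two result vectors when count_beyond_ulps returns 0 for a small tolerance; the appropriate tolerance depends on the function and is not specified here.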
All the functions in libmassvp7.a and libmassvp8.a are consistent.
The following functions are consistent in every library in which they appear:
- double-precision functions: vacos, vacosh, vasin, vasinh, vatan2, vatanh, vcbrt, vcos, vcosh, vcosisin, vdint, vdnint, vexp2, vexpm1, vexp2m1, vlog, vlog2, vlog10, vlog1p, vlog21p, vpow, vqdrt, vrcbrt, vrqdrt, vsin, vsincos, vsinh, vtan, vtanh
- single-precision functions: vsacos, vsacosh, vsasin, vsasinh, vsatan2, vsatanh, vscbrt, vscos, vscosh, vscosisin, vsexp, vsexp2, vsexpm1, vsexp2m1, vslog, vslog2, vslog10, vslog1p, vslog21p, vspow, vsqdrt, vsrcbrt, vsrqdrt, vssin, vssincos, vssinh, vssqrt, vstan, vstanh
Older, inconsistent versions of some of these functions are available on the Mathematical Acceleration Subsystem for AIX website. If consistency is not required, there might be a performance advantage to using the older versions. For more information on consistency and avoiding inconsistency with the vector libraries, as well as performance and accuracy data, see the Mathematical Acceleration Subsystem website.