Using the vector libraries
If you want to explicitly call any of the MASS vector
functions, you can do so by including massv.include in your source
files and linking your application with the appropriate vector library.
Information about linking is provided in Compiling and linking a program with MASS.
- libmassv.a: The generic vector library that runs on any supported POWER® processor. Unless your application requires this portability, use the appropriate architecture-specific library below for maximum performance.
- libmassvp8.a: Contains functions that are tuned for the POWER8® architecture.
- libmassvp9.a: Contains functions that are tuned for the POWER9™ architecture.
The single-precision and double-precision floating-point functions contained in the vector libraries are summarized in Table 1. The integer functions contained in the vector libraries are summarized in Table 2.
Except for the functions described later in this section, the floating-point functions in the vector libraries take three arguments:
- A double-precision (for double-precision functions) or single-precision (for single-precision functions) vector output argument.
- A double-precision (for double-precision functions) or single-precision (for single-precision functions) vector input argument.
- An integer vector-length argument.
The functions are of the form function_name (y,x,n), where y is the target vector, x is the source vector, and n is the vector length. The arguments y and x are assumed to be double-precision for functions with the prefix v, and single-precision for functions with the prefix vs. As an example, the following code outputs a vector y of length 500 whose elements are exp(x(i)), where i=1,...,500:
include 'massv.include'
real(8) x(500), y(500)
integer n
n = 500
...
call vexp (y, x, n)
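The fragment above elides the initialization of x. A minimal complete sketch might look like the following. It assumes free source form, that massv.include is on the include path, and that the program is linked with one of the MASS vector libraries; the program name and the sample data are illustrative only.

```fortran
! Sketch only: a complete program around the vexp call.
! Assumes massv.include can be included in free source form and that
! the program is linked with a MASS vector library (for example, libmassv.a).
program vexp_demo
  implicit none
  include 'massv.include'
  integer, parameter :: n = 500
  real(8) :: x(n), y(n)
  integer :: i

  ! Fill the source vector with sample values in [0, 1).
  do i = 1, n
    x(i) = real(i - 1, 8) / real(n, 8)
  end do

  ! y(i) = exp(x(i)) for i = 1,...,n
  call vexp(y, x, n)

  ! Compare against the scalar intrinsic; small differences are possible.
  print *, 'max abs difference from exp intrinsic:', maxval(abs(y - exp(x)))
end program vexp_demo
```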
The functions vdiv, vsincos, vpow, and vatan2 (and their single-precision versions, vsdiv, vssincos, vspow, and vsatan2) take four arguments. The functions vdiv, vpow, and vatan2 take the arguments (z,x,y,n). The function vdiv outputs a vector z whose elements are x(i)/y(i), where i=1,...,n. The function vpow outputs a vector z whose elements are x(i)**y(i), where i=1,...,n. The function vatan2 outputs a vector z whose elements are atan(x(i)/y(i)), where i=1,...,n. The function vsincos takes the arguments (y,z,x,n), and outputs two vectors, y and z, whose elements are sin(x(i)) and cos(x(i)), respectively.
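The argument order of the four-argument functions is easy to get wrong, so a short sketch may help. It assumes free source form and linking with a MASS vector library; the program name, the array names q, p, a, s, and c, and the sample values are hypothetical.

```fortran
! Sketch only: calling the four-argument MASS vector functions.
! Requires massv.include and linking with a MASS vector library.
program four_arg_demo
  implicit none
  include 'massv.include'
  integer, parameter :: n = 4
  real(8) :: x(n), y(n)
  real(8) :: q(n), p(n), a(n), s(n), c(n)

  x = (/ 1.0d0, 2.0d0, 3.0d0, 4.0d0 /)
  y = (/ 2.0d0, 2.0d0, 2.0d0, 2.0d0 /)

  call vdiv(q, x, y, n)      ! q(i) = x(i) / y(i)
  call vpow(p, x, y, n)      ! p(i) = x(i) ** y(i)
  call vatan2(a, x, y, n)    ! a(i) = arc tangent of x(i)/y(i)
  call vsincos(s, c, x, n)   ! s(i) = sin(x(i)), c(i) = cos(x(i))

  print *, 'quotients: ', q
  print *, 'powers:    ', p
  print *, 'atan2:     ', a
  print *, 'sines:     ', s
  print *, 'cosines:   ', c
end program four_arg_demo
```

Note that the output vector comes first in every call, and that vsincos places both output vectors before the input vector.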
In vcosisin(y,x,n) and vscosisin(y,x,n), x is a vector of n elements and the function outputs a vector y of n complex(8) (for vcosisin) or complex(4) (for vscosisin) elements of the form (cos(x(i)),sin(x(i))).
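As a minimal sketch of the complex-valued output, the following program calls vcosisin with a real(8) input and a complex(8) output. It assumes free source form and linking with a MASS vector library; the program name and sample values are illustrative.

```fortran
! Sketch only: vcosisin writes complex results of the form
! (cos(x(i)), sin(x(i))). Requires massv.include and a MASS vector library.
program cosisin_demo
  implicit none
  include 'massv.include'
  integer, parameter :: n = 3
  real(8)    :: x(n)
  complex(8) :: y(n)
  integer    :: i

  x = (/ 0.0d0, 0.5d0, 1.0d0 /)

  call vcosisin(y, x, n)

  ! The real part holds the cosine and the imaginary part holds the sine.
  do i = 1, n
    print *, x(i), real(y(i), 8), aimag(y(i))
  end do
end program cosisin_demo
```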
| Double-precision function | Single-precision function | Arguments | Description |
|---|---|---|---|
| vacos | vsacos | (y,x,n) | Sets y(i) to the arc cosine of x(i), for i=1,...,n |
| vacosh | vsacosh | (y,x,n) | Sets y(i) to the hyperbolic arc cosine of x(i), for i=1,...,n |
| vasin | vsasin | (y,x,n) | Sets y(i) to the arc sine of x(i), for i=1,...,n |
| vasinh | vsasinh | (y,x,n) | Sets y(i) to the arc hyperbolic sine of x(i), for i=1,...,n |
| vatan2 | vsatan2 | (z,x,y,n) | Sets z(i) to the arc tangent of x(i)/y(i), for i=1,...,n |
| vatanh | vsatanh | (y,x,n) | Sets y(i) to the arc hyperbolic tangent of x(i), for i=1,...,n |
| vcbrt | vscbrt | (y,x,n) | Sets y(i) to the cube root of x(i), for i=1,...,n |
| vcos | vscos | (y,x,n) | Sets y(i) to the cosine of x(i), for i=1,...,n |
| vcosh | vscosh | (y,x,n) | Sets y(i) to the hyperbolic cosine of x(i), for i=1,...,n |
| vcosisin | vscosisin | (y,x,n) | Sets the real part of y(i) to the cosine of x(i) and the imaginary part of y(i) to the sine of x(i), for i=1,...,n |
| vdint | | (y,x,n) | Sets y(i) to the integer truncation of x(i), for i=1,...,n |
| vdiv | vsdiv | (z,x,y,n) | Sets z(i) to x(i)/y(i), for i=1,...,n |
| vdnint | | (y,x,n) | Sets y(i) to the nearest integer to x(i), for i=1,...,n |
| verf | vserf | (y,x,n) | Sets y(i) to the error function of x(i), for i=1,...,n |
| verfc | vserfc | (y,x,n) | Sets y(i) to the complementary error function of x(i), for i=1,...,n |
| vexp | vsexp | (y,x,n) | Sets y(i) to the exponential function of x(i), for i=1,...,n |
| vexp2 | vsexp2 | (y,x,n) | Sets y(i) to 2 raised to the power of x(i), for i=1,...,n |
| vexpm1 | vsexpm1 | (y,x,n) | Sets y(i) to (the exponential function of x(i)) -1, for i=1,...,n |
| vexp2m1 | vsexp2m1 | (y,x,n) | Sets y(i) to (2 raised to the power of x(i)) -1, for i=1,...,n |
| vhypot | vshypot | (z,x,y,n) | Sets z(i) to the square root of the sum of the squares of x(i) and y(i), for i=1,...,n |
| vlog | vslog | (y,x,n) | Sets y(i) to the natural logarithm of x(i), for i=1,...,n |
| vlog2 | vslog2 | (y,x,n) | Sets y(i) to the base-2 logarithm of x(i), for i=1,...,n |
| vlog10 | vslog10 | (y,x,n) | Sets y(i) to the base-10 logarithm of x(i), for i=1,...,n |
| vlog1p | vslog1p | (y,x,n) | Sets y(i) to the natural logarithm of (x(i)+1), for i=1,...,n |
| vlog21p | vslog21p | (y,x,n) | Sets y(i) to the base-2 logarithm of (x(i)+1), for i=1,...,n |
| vpow | vspow | (z,x,y,n) | Sets z(i) to x(i) raised to the power y(i), for i=1,...,n |
| vqdrt | vsqdrt | (y,x,n) | Sets y(i) to the 4th root of x(i), for i=1,...,n |
| vrcbrt | vsrcbrt | (y,x,n) | Sets y(i) to the reciprocal of the cube root of x(i), for i=1,...,n |
| vrec | vsrec | (y,x,n) | Sets y(i) to the reciprocal of x(i), for i=1,...,n |
| vrqdrt | vsrqdrt | (y,x,n) | Sets y(i) to the reciprocal of the 4th root of x(i), for i=1,...,n |
| vrsqrt | vsrsqrt | (y,x,n) | Sets y(i) to the reciprocal of the square root of x(i), for i=1,...,n |
| vsin | vssin | (y,x,n) | Sets y(i) to the sine of x(i), for i=1,...,n |
| vsincos | vssincos | (y,z,x,n) | Sets y(i) to the sine of x(i) and z(i) to the cosine of x(i), for i=1,...,n |
| vsinh | vssinh | (y,x,n) | Sets y(i) to the hyperbolic sine of x(i), for i=1,...,n |
| vsqrt | vssqrt | (y,x,n) | Sets y(i) to the square root of x(i), for i=1,...,n |
| vtan | vstan | (y,x,n) | Sets y(i) to the tangent of x(i), for i=1,...,n |
| vtanh | vstanh | (y,x,n) | Sets y(i) to the hyperbolic tangent of x(i), for i=1,...,n |
Integer functions are of the form function_name (x, n),
where x is a vector of 4-byte (for vpopcnt4)
or 8-byte (for vpopcnt8) numeric objects (integer
or floating-point), and n is the vector
length.
| Function | Description | Interface |
|---|---|---|
| vpopcnt4 | Returns the total number of 1 bits in the concatenation of the binary representation of x(i), for i=1,...,n, where x is a vector of 32-bit objects | |
| vpopcnt8 | Returns the total number of 1 bits in the concatenation of the binary representation of x(i), for i=1,...,n, where x is a vector of 64-bit objects | |
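The integer functions return a value rather than filling an output vector. The sketch below calls vpopcnt4 and cross-checks the result against the Fortran 2008 POPCNT intrinsic. It assumes that massv.include declares vpopcnt4 as a function with an integer result (the interface is not reproduced in the table), that the source is in free form, and that the program is linked with a MASS vector library; the program and variable names are illustrative.

```fortran
! Sketch only: counting 1 bits across a vector of 32-bit integers.
! Assumption: massv.include declares vpopcnt4 as an integer function.
program popcnt_demo
  implicit none
  include 'massv.include'
  integer, parameter :: n = 3
  integer(4) :: x(n)
  integer    :: total, expected, i

  x = (/ 1, 3, 255 /)   ! 1 + 2 + 8 = 11 one-bits in total

  ! Total number of 1 bits in x(1), ..., x(n).
  total = vpopcnt4(x, n)

  ! Reference value computed element by element with the POPCNT intrinsic.
  expected = 0
  do i = 1, n
    expected = expected + popcnt(x(i))
  end do

  print *, 'vpopcnt4:', total, ' reference:', expected
end program popcnt_demo
```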
The following example shows the interface declarations for vsqrt and vssqrt:
INTERFACE vsqrt
  ! Sets y(i) to the square root of x(i), for i=1,...,n
  PURE SUBROUTINE vsqrt (y, x, n)
    REAL*8, INTENT(OUT) :: y(*)
    REAL*8, INTENT(IN) :: x(*)
    INTEGER*4, INTENT(IN) :: n
  END SUBROUTINE
  PURE SUBROUTINE vssqrt (y, x, n)
    REAL*4, INTENT(OUT) :: y(*)
    REAL*4, INTENT(IN) :: x(*)
    INTEGER*4, INTENT(IN) :: n
  END SUBROUTINE
END INTERFACE
Overlap of input and output vectors
The input and output vectors passed to a MASS vector function can be disjoint (that is, they do not overlap in memory) or identical (for example, vsin (y, y, n)). Other kinds of overlap (where input and output vectors are neither disjoint nor identical) should be avoided, since they might produce unexpected results:
- For calls to vector functions that take one input and one output vector (for example, vsin (y, x, n)): The vectors x(1:n) and y(1:n) must be either disjoint or identical, or unexpected results might be obtained.
- For calls to vector functions that take two input vectors (for example, vatan2 (y, x1, x2, n)): The previous restriction applies to both pairs of vectors y,x1 and y,x2. That is, y(1:n) and x1(1:n) must be either disjoint or identical; and y(1:n) and x2(1:n) must be either disjoint or identical.
- For calls to vector functions that take two output vectors (for example, vsincos (y1, y2, x, n)): The above restriction applies to both pairs of vectors y1,x and y2,x. That is, y1(1:n) and x(1:n) must be either disjoint or identical; and y2(1:n) and x(1:n) must be either disjoint or identical. Also, the vectors y1(1:n) and y2(1:n) must be disjoint.
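The overlap rules can be illustrated with a short sketch. It shows an in-place call, which is allowed, and one possible way to avoid a partial overlap by copying the source into a temporary array first. The temporary-copy approach is a suggestion rather than something this section prescribes, and the program name, array names, and sizes are hypothetical. The sketch assumes free source form and linking with a MASS vector library.

```fortran
! Sketch only: allowed and disallowed overlap patterns for vsin.
! Requires massv.include and linking with a MASS vector library.
program overlap_demo
  implicit none
  include 'massv.include'
  integer, parameter :: n = 8, m = 6
  real(8) :: y(n), tmp(m)
  integer :: i

  do i = 1, n
    y(i) = 0.1d0 * real(i, 8)
  end do

  ! Identical input and output vectors: allowed.
  call vsin(y, y, n)

  ! Partial overlap, to be avoided: y(1:m) and y(3:m+2) share elements
  ! but are neither disjoint nor identical.
  !   call vsin(y(1:m), y(3:m+2), m)

  ! One way around it: copy the source into a disjoint temporary first.
  tmp = y(3:m+2)
  call vsin(y(1:m), tmp, m)

  print *, y
end program overlap_demo
```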
Alignment of input and output vectors
To get the best performance from the vector libraries, align the input and output vectors on 8-byte (or better, 16-byte) boundaries.
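To see how a given array is aligned, its address can be inspected with the standard ISO_C_BINDING module. The sketch below only reports the alignment; it does not change it, and how to request a particular alignment depends on the compiler. The program and variable names are illustrative, and real(c_double) is used on the assumption that it matches real(8) on the target platform.

```fortran
! Sketch only: report how far the first element of each array lies
! from the nearest lower 16-byte boundary (0 means 16-byte aligned).
program alignment_check
  use, intrinsic :: iso_c_binding, only: c_loc, c_intptr_t, c_double
  implicit none
  integer, parameter :: n = 500
  real(c_double), target :: x(n), y(n)
  integer(c_intptr_t) :: addr_x, addr_y

  ! Reinterpret the C addresses of the first elements as integers.
  addr_x = transfer(c_loc(x(1)), addr_x)
  addr_y = transfer(c_loc(y(1)), addr_y)

  print *, 'x offset from 16-byte boundary:', modulo(addr_x, 16_c_intptr_t)
  print *, 'y offset from 16-byte boundary:', modulo(addr_y, 16_c_intptr_t)
end program alignment_check
```

An offset of 0 means the array starts on a 16-byte boundary; an offset of 8 means it is 8-byte aligned only.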
Consistency of MASS vector functions
All the functions in the MASS vector libraries are consistent, in the sense that a given input value will always produce the same result, regardless of its position in the vector, and regardless of the vector length.
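A minimal sketch of what this consistency property means in practice: the same input value is evaluated once in a vector of length 1 and once at an arbitrary position in a vector of length 100, and the two results are compared for exact equality. The program name, vector lengths, and values are illustrative. The sketch assumes free source form and linking with a MASS vector library.

```fortran
! Sketch only: the same input value at different positions and vector
! lengths is expected to give the same result, per the statement above.
program consistency_demo
  implicit none
  include 'massv.include'
  integer, parameter :: n1 = 1, n2 = 100
  real(8) :: x1(n1), y1(n1)
  real(8) :: x2(n2), y2(n2)

  x1 = 0.7d0        ! the value under test, alone in a short vector
  x2 = 0.3d0        ! filler values
  x2(57) = 0.7d0    ! the same value at an arbitrary position

  call vexp(y1, x1, n1)
  call vexp(y2, x2, n2)

  ! Exact floating-point comparison is intentional here.
  print *, 'results identical:', y1(1) == y2(57)
end program consistency_demo
```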