SGBMV, DGBMV, CGBMV, and ZGBMV (Matrix-Vector Product for a General Band Matrix, Its Transpose, or Its Conjugate Transpose)

Purpose

SGBMV and DGBMV compute the matrix-vector product for either a real general band matrix or its transpose, where the general band matrix is stored in BLAS-general-band storage mode. They use the scalars α and β, vectors x and y, and general band matrix A or its transpose:

y ← βy+αAx

y ← βy+αA^Tx

CGBMV and ZGBMV compute the matrix-vector product for a complex general band matrix, its transpose, or its conjugate transpose, where the general band matrix is stored in BLAS-general-band storage mode. They use the scalars α and β, vectors x and y, and general band matrix A, its transpose, or its conjugate transpose:

y ← βy+αAx
y ← βy+αA^Tx
y ← βy+αA^Hx

Table 1. Data Types
α, β, x, y, A	Subprogram
Short-precision real	SGBMV
Long-precision real	DGBMV
Short-precision complex	CGBMV
Long-precision complex	ZGBMV

Syntax

Language	Syntax
Fortran	CALL SGBMV | DGBMV | CGBMV | ZGBMV (`transa`, `m`, `n`, `ml`, `mu`, `alpha`, `a`, `lda`, `x`, `incx`, `beta`, `y`, `incy`)
C and C++	sgbmv | dgbmv | cgbmv | zgbmv (`transa`, `m`, `n`, `ml`, `mu`, `alpha`, `a`, `lda`, `x`, `incx`, `beta`, `y`, `incy`);
CBLAS	cblas_sgbmv | cblas_dgbmv | cblas_cgbmv | cblas_zgbmv (`cblas_layout`, `cblas_transa`, `m`, `n`, `ml`, `mu`, `alpha`, `a`, `lda`, `x`, `incx`, `beta`, `y`, `incy`);

On Entry

cblas_layout

indicates whether the input matrices are stored in row major order or column major order, where:

If cblas_layout = CblasRowMajor, the matrices are stored in row major order.
If cblas_layout = CblasColMajor, the matrices are stored in column major order.

Specified as: an object of enumerated type CBLAS_LAYOUT. It must be CblasRowMajor or CblasColMajor.

transa

indicates the form of matrix A to use in the computation, where:

If transa = 'N', A is used in the computation.

If transa = 'T', A^T is used in the computation.

If transa = 'C', A^H is used in the computation.

Specified as: a single character. It must be 'N', 'T', or 'C'.

cblas_transa

indicates the form of matrix A to use in the computation, where:

If cblas_transa = CblasNoTrans, A is used in the computation.

If cblas_transa = CblasTrans, A^T is used in the computation.

If cblas_transa = CblasConjTrans, A^H is used in the computation.

Specified as: an object of enumerated type CBLAS_TRANSPOSE. It must be CblasNoTrans, CblasTrans, or CblasConjTrans.

m

is the number of rows in matrix A, and:

If transa = 'N', it is the length of vector y.

If transa = 'T' or 'C', it is the length of vector x.

Specified as: an integer; m ≥ 0.

n

is the number of columns in matrix A, and:

If transa = 'N', it is the length of vector x.

If transa = 'T' or 'C', it is the length of vector y.

Specified as: an integer; n ≥ 0.

ml

is the lower band width ml of the matrix A.

Specified as: an integer; ml ≥ 0.

mu

is the upper band width mu of the matrix A.

Specified as: an integer; mu ≥ 0.

alpha

is the scaling constant α.

Specified as: a number of the data type indicated in Table 1.

a

is the m by n general band matrix A, stored in BLAS-general-band storage mode. It has an upper band width mu and a lower band width ml. Also:

If transa = 'N', A is used in the computation.

If transa = 'T', A^T is used in the computation.

If transa = 'C', A^H is used in the computation.

Note: No data should be moved to form A^T or A^H; that is, the matrix A should always be stored in its untransposed form in BLAS-general-band storage mode.

Specified as: an lda by (at least) n array, containing numbers of the data type indicated in Table 1, where lda ≥ ml+mu+1.

lda

is the leading dimension of the array specified for a.

Specified as: an integer; lda > 0 and lda ≥ ml+mu+1.

x

is the vector x, where:

If transa = 'N', it has length n.

If transa = 'T' or 'C', it has length m.

Specified as: a one-dimensional array, containing numbers of the data type indicated in Table 1, where:

If transa = 'N', it must have at least 1+(n-1)|incx| elements.

If transa = 'T' or 'C', it must have at least 1+(m-1)|incx| elements.

incx

is the stride for vector x.

Specified as: an integer; incx > 0 or incx < 0.

beta

is the scaling constant β.

Specified as: a number of the data type indicated in Table 1.

y

is the vector y, where:

If transa = 'N', it has length m.

If transa = 'T' or 'C', it has length n.

Specified as: a one-dimensional array, containing numbers of the data type indicated in Table 1, where:

If transa = 'N', it must have at least 1+(m-1)|incy| elements.

If transa = 'T' or 'C', it must have at least 1+(n-1)|incy| elements.

incy

is the stride for vector y.

Specified as: an integer; incy > 0 or incy < 0.
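The array-length requirement 1+(n-1)|inc| and the position of each vector element can be sketched as follows. The helper names min_length and element_index are hypothetical and are not part of ESSL. The treatment of negative strides assumes the usual BLAS convention, in which the vector is traversed backward through the array; this section does not spell that convention out.

```python
def min_length(n, inc):
    """Smallest array length that holds n elements with stride inc."""
    if inc == 0:
        raise ValueError("stride must be nonzero")
    return 0 if n == 0 else 1 + (n - 1) * abs(inc)


def element_index(i, n, inc):
    """0-based array position of logical element i (0-based) of an n-vector.

    Assumption: for a negative stride the vector runs backward through the
    array, as in the reference BLAS.
    """
    return i * inc if inc > 0 else (n - 1 - i) * (-inc)


# Vector y of Example 1: m = 5 elements with INCY = 2.
positions = [element_index(i, 5, 2) for i in range(5)]
print(min_length(5, 2), positions)
```

With INCY = 2, only every second array position holds an element of y, which is why the examples show "." between the values.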

On Return

y

is the vector y, containing the result of the computation, where:

If transa = 'N', it has length m.

If transa = 'T' or 'C', it has length n.

Returned as: a one-dimensional array, containing numbers of the data type indicated in Table 1.

Notes

For SGBMV and DGBMV, if you specify 'C' for the transa argument, it is interpreted as though you specified 'T'.
All subroutines accept lowercase letters for the transa argument.
Vector y must have no common elements with matrix A or vector x; otherwise, results are unpredictable. See Vector concepts.
To achieve optimal performance, use lda = mu+ml+1.
For general band matrices, if you specify ml ≥ m or mu ≥ n, ESSL assumes, only for purposes of the computation, that the lower band width is m-1 or the upper band width is n-1, respectively. However, ESSL uses the original values for ml and mu for the purposes of finding the locations of element a₁₁ and all other elements in the array specified for A, as described in General Band Matrix. For an illustration of this technique, see Example 4.
For a description of how a general band matrix is stored in BLAS-general-band storage mode in an array, see General Band Matrix.
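As a rough illustration of BLAS-general-band storage mode, the following sketch packs a dense matrix into an lda by n array. It assumes the standard BLAS layout, in which element a(i, j) is placed in row mu+1+i-j of column j (1-based); this agrees with the arrays shown in the examples. The function name to_band is hypothetical, and None marks array positions that are never referenced.

```python
def to_band(dense, ml, mu, lda):
    """Pack a dense m-by-n matrix into an lda-by-n band array (lists of rows).

    Element a(i, j) (1-based) goes to row mu + 1 + i - j of column j.
    Positions that are never referenced are left as None.
    """
    if lda < ml + mu + 1:
        raise ValueError("lda must be at least ml + mu + 1")
    m, n = len(dense), len(dense[0])
    band = [[None] * n for _ in range(lda)]
    for j in range(n):
        # Rows of column j that lie inside the band.
        for i in range(max(0, j - mu), min(m, j + ml + 1)):
            band[mu + i - j][j] = dense[i][j]  # 0-based form of mu+1+i-j
    return band


# Matrix A of Example 1 (m = 5, n = 4, ml = 3, mu = 2, lda = 8).
A = [[1.0, 1.0, 1.0, 0.0],
     [2.0, 2.0, 2.0, 2.0],
     [3.0, 3.0, 3.0, 3.0],
     [4.0, 4.0, 4.0, 4.0],
     [0.0, 5.0, 5.0, 5.0]]
band = to_band(A, ml=3, mu=2, lda=8)
for row in band:
    print(row)
```

Because lda = 8 exceeds ml+mu+1 = 6, the last two rows of the array stay unused.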

Function

The possible computations that can be performed by these subroutines are described in the following sections. Varying implementation techniques are used for this computation to improve performance. As a result, the accuracy of the computational result may vary for different computations.

In all the computations, general band matrix A is stored in its untransposed form in an array, using BLAS-general-band storage mode.

For SGBMV and CGBMV, intermediate results are accumulated in long precision. Occasionally, for performance reasons, these intermediate results are truncated to short precision and stored.

See references [44], [45], [48], [56], and [95]. No computation is performed if m or n is 0 or if α is zero and β is one.

General Band Matrix

For SGBMV, DGBMV, CGBMV, and ZGBMV, the matrix-vector product for a general band matrix is expressed as follows:

y ← βy+αAx

where:

x is a vector of length n.

y is a vector of length m.

α is a scalar.

β is a scalar.

A is an m by n general band matrix, having a lower band width of ml and an upper band width of mu.
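The computation above can be sketched directly on the band array, without expanding A to a dense matrix. This is a minimal reference loop for unit strides, not the ESSL implementation; the function name gbmv_n and the 3 by 3 test matrix are hypothetical, and the indexing assumes that a(i, j) is stored in row mu+1+i-j of column j.

```python
def gbmv_n(m, n, ml, mu, alpha, band, x, beta, y):
    """Return beta*y + alpha*A*x for A held in band storage (unit strides)."""
    result = []
    for i in range(m):
        acc = 0.0
        # Only the columns inside the band of row i contribute.
        for j in range(max(0, i - ml), min(n, i + mu + 1)):
            acc += band[mu + i - j][j] * x[j]
        result.append(beta * y[i] + alpha * acc)
    return result


# Hypothetical 3-by-3 tridiagonal matrix (ml = mu = 1):
#     | 1 2 0 |
#     | 3 4 5 |
#     | 0 6 7 |
band = [[0.0, 2.0, 5.0],   # superdiagonal (first entry unused)
        [1.0, 4.0, 7.0],   # main diagonal
        [3.0, 6.0, 0.0]]   # subdiagonal (last entry unused)
result = gbmv_n(3, 3, 1, 1, 1.0, band, [1.0, 2.0, 3.0], 0.0, [0.0, 0.0, 0.0])
print(result)
```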

Transpose of a General Band Matrix

For SGBMV, DGBMV, CGBMV, and ZGBMV, the matrix-vector product for the transpose of a general band matrix is expressed as:

y ← βy+αA^Tx

where:

x is a vector of length m.

y is a vector of length n.

α is a scalar.

β is a scalar.

A^T is the transpose of an m by n general band matrix A, having a lower band width of ml and an upper band width of mu.
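The transposed product can be formed while A stays in its untransposed band array: column j of A supplies row j of A^T. The following is a minimal sketch for unit strides, under the same storage assumption (a(i, j) in row mu+1+i-j of column j); gbmv_t and the test matrix are hypothetical names and data.

```python
def gbmv_t(m, n, ml, mu, alpha, band, x, beta, y):
    """Return beta*y + alpha*A^T*x; the band array holds untransposed A."""
    result = []
    for j in range(n):
        acc = 0.0
        # Column j of A is row j of A^T; walk down it inside the band.
        for i in range(max(0, j - mu), min(m, j + ml + 1)):
            acc += band[mu + i - j][j] * x[i]
        result.append(beta * y[j] + alpha * acc)
    return result


# Hypothetical 3-by-3 tridiagonal matrix (ml = mu = 1):
#     | 1 2 0 |
#     | 3 4 5 |
#     | 0 6 7 |
band = [[0.0, 2.0, 5.0],   # superdiagonal (first entry unused)
        [1.0, 4.0, 7.0],   # main diagonal
        [3.0, 6.0, 0.0]]   # subdiagonal (last entry unused)
result = gbmv_t(3, 3, 1, 1, 1.0, band, [1.0, 2.0, 3.0], 0.0, [0.0, 0.0, 0.0])
print(result)
```

Here x has m elements and the result has n elements, the reverse of the untransposed case.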

Conjugate Transpose of a General Band Matrix

For CGBMV and ZGBMV, the matrix-vector product for the conjugate transpose of a general band matrix is expressed as follows:

y ← βy+αA^Hx

where:

x is a vector of length m.

y is a vector of length n.

α is a scalar.

β is a scalar.

A^H is the conjugate transpose of an m by n general band matrix A, having a lower band width of ml and an upper band width of mu.
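For the conjugate transpose, each stored element is conjugated as it is read, so again no data is moved. The sketch below is a reference loop for unit strides under the assumption that a(i, j) is stored in row mu+1+i-j of column j; gbmv_c and the 2 by 2 complex test matrix are hypothetical.

```python
def gbmv_c(m, n, ml, mu, alpha, band, x, beta, y):
    """Return beta*y + alpha*A^H*x; the band array holds untransposed A."""
    result = []
    for j in range(n):
        acc = 0j
        for i in range(max(0, j - mu), min(m, j + ml + 1)):
            # Conjugate on the fly instead of storing A^H.
            acc += band[mu + i - j][j].conjugate() * x[i]
        result.append(beta * y[j] + alpha * acc)
    return result


# Hypothetical 2-by-2 complex matrix (ml = mu = 1, so the band is full):
#     | 1+1j   2j   |
#     | 3      4-1j |
band = [[0j,     2j],       # superdiagonal (first entry unused)
        [1 + 1j, 4 - 1j],   # main diagonal
        [3 + 0j, 0j]]       # subdiagonal (last entry unused)
result = gbmv_c(2, 2, 1, 1, 1 + 0j, band, [1 + 0j, 1j], 0j, [0j, 0j])
print(result)
```

For real data the conjugation has no effect, which is consistent with 'C' being treated as 'T' by SGBMV and DGBMV.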

Error conditions

Resource Errors

Unable to allocate internal work area

Computational Errors

None

Input-Argument Errors

cblas_layout ≠ CblasRowMajor or CblasColMajor
transa ≠ 'N', 'T', or 'C'
cblas_transa ≠ CblasNoTrans, CblasTrans, or CblasConjTrans
m < 0
n < 0
ml < 0
mu < 0
lda ≤ 0
lda < ml+mu+1
incx = 0
incy = 0
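The input-argument conditions listed above can be expressed as a small checking routine. This is only an illustrative sketch for the Fortran-style arguments; the function name check_gbmv_args is hypothetical, the CBLAS enumerations are not covered, and ESSL's actual error reporting is not modeled.

```python
def check_gbmv_args(transa, m, n, ml, mu, lda, incx, incy):
    """Return the names of arguments that violate a condition (empty if valid)."""
    errors = []
    if transa.upper() not in ("N", "T", "C"):  # lowercase letters are accepted
        errors.append("transa")
    for name, value in (("m", m), ("n", n), ("ml", ml), ("mu", mu)):
        if value < 0:
            errors.append(name)
    if lda <= 0 or lda < ml + mu + 1:
        errors.append("lda")
    if incx == 0:
        errors.append("incx")
    if incy == 0:
        errors.append("incy")
    return errors


# Arguments of Example 1 pass every check.
print(check_gbmv_args("N", 5, 4, 3, 2, 8, 1, 2))
# lda = 5 is smaller than ml+mu+1 = 6, and incy = 0 is not allowed.
print(check_gbmv_args("n", 5, 4, 3, 2, 5, 1, 0))
```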

Examples

Example 1

This example shows how to use SGBMV to perform the computation y ← βy+αAx, where TRANSA is equal to 'N', and the following real general band matrix A is used in the computation. Matrix A is:

                                          
                     | 1.0  1.0  1.0  0.0 |
                     | 2.0  2.0  2.0  2.0 |
                     | 3.0  3.0  3.0  3.0 |
                     | 4.0  4.0  4.0  4.0 |
                     | 0.0  5.0  5.0  5.0 |

Call Statement and Input:

           TRANSA M   N   ML  MU  ALPHA  A  LDA  X  INCX  BETA   Y  INCY
             |    |   |   |   |     |    |   |   |   |     |     |   |
CALL SGBMV( 'N' , 5 , 4 , 3 , 2 ,  2.0 , A , 8 , X , 1  , 10.0 , Y , 2  )

                             
        |  .    .   1.0  2.0 |
        |  .   1.0  2.0  3.0 |
        | 1.0  2.0  3.0  4.0 |
A    =  | 2.0  3.0  4.0  5.0 |
        | 3.0  4.0  5.0   .  |
        | 4.0  5.0   .    .  |
        |  .    .    .    .  |
        |  .    .    .    .  |

X        =  (1.0, 2.0, 3.0, 4.0)
Y        =  (1.0, . , 2.0, . , 3.0, . , 4.0, . , 5.0, . )

Output:

Y        =  (22.0, . , 60.0, . , 90.0, . , 120.0, . , 140.0, . )
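The output of this example can be reproduced with a plain reference loop that reads the band array exactly as printed, including the stride of 2 for y. This is a sketch only: the function name sgbmv_n_strided is hypothetical, only positive strides are handled, and None stands for the positions shown as "." (they are never read).

```python
def sgbmv_n_strided(m, n, ml, mu, alpha, a, x, incx, beta, y, incy):
    """Update y in place: y <- beta*y + alpha*A*x (positive strides only)."""
    for i in range(m):
        acc = 0.0
        for j in range(max(0, i - ml), min(n, i + mu + 1)):
            acc += a[mu + i - j][j] * x[j * incx]
        y[i * incy] = beta * y[i * incy] + alpha * acc


_ = None  # positions shown as "." in the example
A = [[_,   _,   1.0, 2.0],
     [_,   1.0, 2.0, 3.0],
     [1.0, 2.0, 3.0, 4.0],
     [2.0, 3.0, 4.0, 5.0],
     [3.0, 4.0, 5.0, _],
     [4.0, 5.0, _,   _],
     [_,   _,   _,   _],
     [_,   _,   _,   _]]
X = [1.0, 2.0, 3.0, 4.0]
Y = [1.0, _, 2.0, _, 3.0, _, 4.0, _, 5.0, _]

sgbmv_n_strided(5, 4, 3, 2, 2.0, A, X, 1, 10.0, Y, 2)
print(Y[::2])
```

The positions of Y between the vector elements are left untouched by the update.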

Example 2

This example shows how to use SGBMV to perform the computation y ← βy+αA^Tx, where TRANSA is equal to 'T', and the transpose of a real general band matrix A is used in the computation. It uses the same matrix A and the same scalars α and β as Example 1. Because A^T is used, vector x has length m = 5 and vector y has length n = 4.

Call Statement and Input:

           TRANSA M   N   ML  MU  ALPHA  A  LDA  X  INCX  BETA   Y  INCY
             |    |   |   |   |     |    |   |   |   |     |     |   |
CALL SGBMV( 'T' , 5 , 4 , 3 , 2 ,  2.0 , A , 8 , X , 1  , 10.0 , Y , 2  )

Output:

Y        =  (70.0, . , 130.0, . , 140.0, . , 148.0, . )
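The input vectors are not listed for this example. With TRANSA = 'T', x needs m = 5 elements and y needs n = 4 elements. The following sketch assumes X = (1.0, 2.0, 3.0, 4.0, 5.0) and Y = (1.0, 2.0, 3.0, 4.0); these values are an assumption, chosen because they reproduce the output shown above with the matrix A of Example 1. The product is formed on the dense matrix for brevity.

```python
# Dense form of matrix A from Example 1 (m = 5, n = 4).
A = [[1.0, 1.0, 1.0, 0.0],
     [2.0, 2.0, 2.0, 2.0],
     [3.0, 3.0, 3.0, 3.0],
     [4.0, 4.0, 4.0, 4.0],
     [0.0, 5.0, 5.0, 5.0]]
X = [1.0, 2.0, 3.0, 4.0, 5.0]  # assumed input, length m = 5
Y = [1.0, 2.0, 3.0, 4.0]       # assumed input, length n = 4
alpha, beta = 2.0, 10.0

# Element j of the result uses column j of A, that is, row j of A^T.
result = [beta * Y[j] + alpha * sum(A[i][j] * X[i] for i in range(5))
          for j in range(4)]
print(result)
```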

Example 3

This example shows how to use CGBMV to perform the computation y ← βy+αA^Hx, where TRANSA is equal to 'C', and the conjugate transpose of the following general band matrix A is used in the computation. Matrix A is:

                                                              
             | (1.0, 1.0)  (1.0, 1.0)  (1.0, 1.0)  (0.0, 0.0) |
             | (2.0, 2.0)  (2.0, 2.0)  (2.0, 2.0)  (2.0, 2.0) |
             | (3.0, 3.0)  (3.0, 3.0)  (3.0, 3.0)  (3.0, 3.0) |
             | (4.0, 4.0)  (4.0, 4.0)  (4.0, 4.0)  (4.0, 4.0) |
             | (0.0, 0.0)  (5.0, 5.0)  (5.0, 5.0)  (5.0, 5.0) |

Call Statement and Input:

           TRANSA M   N   ML  MU  ALPHA   A  LDA  X  INCX  BETA   Y  INCY
             |    |   |   |   |     |     |   |   |   |     |     |   |
CALL CGBMV( 'C' , 5 , 4 , 3 , 2 , ALPHA , A , 8 , X , 1  , BETA , Y , 2  )

                                                         
        |     .           .       (1.0, 1.0)  (2.0, 2.0) |
        |     .       (1.0, 1.0)  (2.0, 2.0)  (3.0, 3.0) |
        | (1.0, 1.0)  (2.0, 2.0)  (3.0, 3.0)  (4.0, 4.0) |
A    =  | (2.0, 2.0)  (3.0, 3.0)  (4.0, 4.0)  (5.0, 5.0) |
        | (3.0, 3.0)  (4.0, 4.0)  (5.0, 5.0)      .      |
        | (4.0, 4.0)  (5.0, 5.0)      .           .      |
        |     .           .           .           .      |
        |     .           .           .           .      |

X        =  ((1.0, 2.0), (2.0, 3.0), (3.0, 4.0), (4.0, 5.0),
             (5.0, 6.0))
ALPHA    =  (1.0, 1.0)
BETA     =  (10.0, 0.0)
Y        =  ((1.0, 2.0), . , (2.0, 3.0), . , (3.0, 4.0), . ,
             (4.0, 5.0), . )

Output:

Y        =  ((70.0, 100.0), . , (130.0, 170.0), . ,
             (140.0, 180.0), . , (148.0, 186.0), . )
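The output can be checked with Python's built-in complex type. In the sketch below, the matrix is rebuilt from the band array shown in the call input, in which every element of row i inside the band equals (i, i), including the element in row 5, column 4. The dense formulation and the variable names are illustrative only.

```python
m, n, ml, mu = 5, 4, 3, 2

# a(i, j) = (i, i) inside the band, zero outside (1-based i and j).
A = [[complex(i, i) if -mu <= i - j <= ml else 0j
      for j in range(1, n + 1)]
     for i in range(1, m + 1)]

X = [1 + 2j, 2 + 3j, 3 + 4j, 4 + 5j, 5 + 6j]
Y = [1 + 2j, 2 + 3j, 3 + 4j, 4 + 5j]
alpha = 1 + 1j
beta = 10 + 0j

# Element j of the result uses the conjugate of column j of A.
result = [beta * Y[j]
          + alpha * sum(A[i][j].conjugate() * X[i] for i in range(m))
          for j in range(n)]
print(result)
```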

Example 4

This example shows how to use SGBMV to perform the computation y ← βy+αAx, where ml ≥ m and mu ≥ n, TRANSA is equal to 'N', and the following real general band matrix A is used in the computation. Matrix A is:

                                                 
                       | 1.0  1.0  1.0  1.0  1.0 |
                       | 2.0  2.0  2.0  2.0  2.0 |
                       | 3.0  3.0  3.0  3.0  3.0 |
                       | 4.0  4.0  4.0  4.0  4.0 |

Call Statement and Input:

           TRANSA M   N   ML  MU  ALPHA  A  LDA   X  INCX  BETA   Y  INCY
             |    |   |   |   |     |    |   |    |   |     |     |   |
CALL SGBMV( 'N' , 4 , 5 , 6 , 5 ,  2.0 , A , 12 , X , 1  , 10.0 , Y , 2  )

                                  
        |  .    .    .    .    .  |
        |  .    .    .    .   1.0 |
        |  .    .    .   1.0  2.0 |
        |  .    .   1.0  2.0  3.0 |
        |  .   1.0  2.0  3.0  4.0 |
A    =  | 1.0  2.0  3.0  4.0   .  |
        | 2.0  3.0  4.0   .    .  |
        | 3.0  4.0   .    .    .  |
        | 4.0   .    .    .    .  |
        |  .    .    .    .    .  |
        |  .    .    .    .    .  |
        |  .    .    .    .    .  |
                                  
 
X        =  (1.0, 2.0, 3.0, 4.0, 5.0)
Y        =  (1.0, . , 2.0, . , 3.0, . , 4.0, . )

Output:

Y        =  (40.0, . , 80.0, . , 120.0, . , 160.0, . )
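The technique used in this example can be sketched as follows: the band widths are reduced to m-1 and n-1 for the loop bounds, while the original mu is still used to locate each element in the array. This is a reference loop for unit strides, not the ESSL implementation; gbmv_n_clamped is a hypothetical name, and None stands for the positions shown as ".".

```python
def gbmv_n_clamped(m, n, ml, mu, alpha, a, x, beta, y):
    """Return beta*y + alpha*A*x when ml >= m or mu >= n may hold."""
    ml_eff = min(ml, m - 1)  # band widths used for the computation only
    mu_eff = min(mu, n - 1)
    result = []
    for i in range(m):
        acc = 0.0
        for j in range(max(0, i - ml_eff), min(n, i + mu_eff + 1)):
            acc += a[mu + i - j][j] * x[j]  # original mu locates the element
        result.append(beta * y[i] + alpha * acc)
    return result


_ = None  # positions shown as "." in the example
A = [[_,   _,   _,   _,   _],
     [_,   _,   _,   _,   1.0],
     [_,   _,   _,   1.0, 2.0],
     [_,   _,   1.0, 2.0, 3.0],
     [_,   1.0, 2.0, 3.0, 4.0],
     [1.0, 2.0, 3.0, 4.0, _],
     [2.0, 3.0, 4.0, _,   _],
     [3.0, 4.0, _,   _,   _],
     [4.0, _,   _,   _,   _],
     [_,   _,   _,   _,   _],
     [_,   _,   _,   _,   _],
     [_,   _,   _,   _,   _]]
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [1.0, 2.0, 3.0, 4.0]  # logical elements of y (the stride of 2 is omitted)

result = gbmv_n_clamped(4, 5, 6, 5, 2.0, A, X, 10.0, Y)
print(result)
```

Element a₁₁ sits in row mu+1 = 6 of the array even though the effective upper band width is only 4.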