Skip to main content

TechReview, Part 2: Program applications with the LAPACK library

Learn how to program and performance tune the LAPACK library linear algebra operations in high-performance computing

Kane Scarlett, Editor, Multicore acceleration, IBM
Kane Scarlett
Kane Scarlett is a technology journalist/analyst with 20 years in the business, working for such publishers as National Geographic, Population Reference Bureau, Miller Freeman, and IDG, and managing, editing, and writing for such august journals as JavaWorld, LinuxWorld, and of course, developerWorks.

Summary:  For application programmers using the IBM Software Development Kit for Multicore Acceleration (SDK), this article explains how to program with the IBM Linear Algebra Package (LAPACK) library using a sample application designed to get an inverse matrix. The article also offers 4 pieces of advice on optimizing LAPACK programs, and it outlines the package's optimized APIs. LAPACK is based on a published standard interface for commonly used linear algebra operations in high-performance computing and other scientific domains.

View more content in this series

Date:  30 Sep 2008
Level:  Introductory PDF:  A4 and Letter (33KB)Get Adobe® Reader®
Activity:  5290 views

Introduction

This review gives a quick look at the LAPACK library through the eyes of the original documentation, "LAPACK: Linear Algebra Package Library Programmer's Guide and API Reference" (see Resources). This article focuses on programming basics with the library, advice on optimizing, and the optimized APIs that come in the library. Use this article with the most current version of IBM SDK for Multicore Acceleration (also known as the Cell/B.E.® SDK), which is the version with fixpack 3.0.0.3.

Building the examples

The following sample application shows you how to use the LAPACK library. The application invokes the DGETRF and DGETRI routines to get an inverse matrix. To build the examples, do this:

  1. Cut and paste the Makefile source from Sample Makefile or from the source programmer's guide into an editor, and save the file as Makefile.
  2. Cut and paste the example source code from Example source code or from the programmer's guide into an editor, and save the file with a name such as doc_example.c.
  3. Edit the Makefile to include the name of the example source file that you chose in the previous step. If you did not use doc_example.c, substitute the name you chose into the lines that contain doc_example and doc_example.a.
  4. Copy the Makefile and the example source file into your development source directory (for example, /opt/sandbox/).
  5. From a shell prompt, type:
    $ cd /opt/sandbox
    $ export CELL_TOP=/opt/cell/sdk $ make

Sample Makefile

Here's that sample Makefile.

# -------------------------------------------------------------- 
# (C)Copyright 2007,2008 
# International Business Machines Corporation 
# All Rights Reserved. 
# 
# Redistribution and use in source and binary forms, with or 
# without modification, are permitted provided that the 
# following conditions are met: 
# 
# Redistributions of source code must retain the above copyright 
# notice, this list of conditions and the following disclaimer. 
# 
# Redistributions in binary form must reproduce the above 
# copyright notice, this list of conditions and the following 
# disclaimer in the documentation and/or other materials 
# provided with the distribution. 
# 
# Neither the name of IBM Corporation nor the names of its 
# contributors may be used to endorse or promote products 
# derived from this software without specific prior written 
# permission. 
# 
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND 
# CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, 
# INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF 
# MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 
# DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR 
# CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, 
# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT 
# NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; 
# LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) 
# HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
# CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR
# OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, 
# EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
# -------------------------------------------------------------- 
# PROLOG END TAG zYx 

######################################################################## 
#                               Target 
######################################################################## 

   PROGRAM_spu := doc_example 
   LIBRARY_embed := doc_example.a 

######################################################################## 
#                               Local Defines 
######################################################################## 

   INCLUDE = -I$(SDKINC) 
   LDFLAGS += -L$(SDKLIB) 

   IMPORTS := -lm -lmc_rand -lmisc 

######################################################################## 
#                               make.footer 
######################################################################## 

# If necessary, change this to the location of make.footer: 
   include /opt/cell/sdk/buildutils/make.footer 

########################################################################

Example source code

The following sample application shows you how to use the LAPACK library by invoking the DGETRF and DGETRI routines to get an inverse matrix.

#include <stdlib.h> 
#include <stdio.h> 
#include <lapack.h> 

int main( int argc, char *argv[] ) 
{
   int info = 0; 
   double *a; 
   int *ipiv; 
   int n; 
   int i,j; 

   if( argc < 2 ) 
   { 
      fprintf(stderr, "%s N\n", argv[0]); 
      return 0; 
   } 
   n = atoi(argv[1]); 

   posix_memalign((void **)(&ipiv), 128, sizeof(int)*n); 
   posix_memalign((void **)(&a), 128, sizeof(double)*n*n); 
   if( a == NULL || ipiv == NULL ) 
   { 
      fprintf(stderr, "a/ipiv malloc error\n"); 
      return 0; 
   } 

   for( i = 0; i < n; i++ ) 
   {
       for( j = 0; j < n; j++ )
       { 
           a[i+j*n]= (drand48() - 0.5f)*4; 
       } 
   } 

   /*---------Call Cell LAPACK library---------*/ 
   dgetrf_(&n, &n, a, &n, ipiv, &info); 
   if( info != 0 ) 
   { 
      fprintf(stderr, "Call dgetrf error\n"); 
      goto end; 
   } 

   /*---------Query workspace-------*/ 
   double workspace; 
   int tmp=-1; 
   int lwork; 
   double *work; 
   dgetri_(&n, a, &n, ipiv, &workspace, &tmp, &info); 
   lwork = (int)workspace; 
   work = malloc(sizeof(double)*lwork); 
   if(work == NULL) 
   { 
      printf("work malloc error\n"); 
      goto end; 
   } 

   /*---------Call Cell LAPACK library---------*/ 
   dgetri_(&n, a, &n, ipiv, work, &lwork, &info); 
   if( info != 0 ) 
   { 
      fprintf(stderr, "Call dgetri error\n"); 
      free(work); 
      goto end; 
   } 
   printf("Inverse matrix completed!\n"); 

   end: 
   free(ipiv); 
   free(a); 
   return 0; 
}


Getting in tune

The LAPACK library provides additional features for customizing the library. Use the features to efficiently leverage the available resources to boost your performance.

Some advice about optimizing LAPACK programs:

  • Make the matrices 128-byte aligned. Memory access is more efficient when data is 128-byte aligned.
  • If you want to use more than 8 SPEs, use the NUMA APIs to interleave the memory of the data on different nodes.
  • Supply a matrix in which the number of rows and columns is greater than 1024 for better library performance.
  • Realize that if you use DGETRI with an odd value of lda, the performance is approximately 25 percent worse than when the value of lda is even. A fix for this discrepancy should be on the horizon.

Your environment variable is LAPACK_NUMSPES. It specifies the number of SPEs to use, and its default value is 16.


Using optimized routines

These routines (which come with the LAPACK) have been optimized to use features of the SPEs:

  • DGETRF: Computes the LU factorization of a general matrix. LU is a matrix decomposition that writes a matrix as the product of a lower and upper triangular matrix.
  • DGETRI: Computes the inverse of a general matrix using the LU factorization.
  • DGEQRF: Computes the QR factorization of a general matrix. QR is a decomposition of the matrix into an orthogonal and a triangular matrix.
  • DPOTRF: Computes the Cholesky factorization of a symmetric positive matrix. Cholesky is a decomposition of a symmetric positive-definite matrix into a lower triangular matrix and the transpose of the lower triangular matrix.
  • DBDSQR: Computes the singular value decomposition of a real bi-diagonal matrix using the implicit zero-shift QR algorithm.
  • DSTEQR: Computes the singular value decomposition of a real symmetric tridiagonal matrix using the implicit QR algorithm.

Resources

Learn

Get products and technologies

Discuss

About the author

Kane Scarlett

Kane Scarlett is a technology journalist/analyst with 20 years in the business, working for such publishers as National Geographic, Population Reference Bureau, Miller Freeman, and IDG, and managing, editing, and writing for such august journals as JavaWorld, LinuxWorld, and of course, developerWorks.

Comments (Undergoing maintenance)



Trademarks  |  My developerWorks terms and conditions

Help: Update or add to My dW interests

What's this?

This little timesaver lets you update your My developerWorks profile with just one click! The general subject of this content (AIX and UNIX, Information Management, Lotus, Rational, Tivoli, WebSphere, Java, Linux, Open source, SOA and Web services, Web development, or XML) will be added to the interests section of your profile, if it's not there already. You only need to be logged in to My developerWorks.

And what's the point of adding your interests to your profile? That's how you find other users with the same interests as yours, and see what they're reading and contributing to the community. Your interests also help us recommend relevant developerWorks content to you.

View your My developerWorks profile

Return from help

Help: Remove from My dW interests

What's this?

Removing this interest does not alter your profile, but rather removes this piece of content from a list of all content for which you've indicated interest. In a future enhancement to My developerWorks, you'll be able to see a record of that content.

View your My developerWorks profile

Return from help

static.content.url=http://www.ibm.com/developerworks/js/artrating/
SITE_ID=1
Zone=Multicore acceleration
ArticleID=342722
ArticleTitle=TechReview, Part 2: Program applications with the LAPACK library
publish-date=09302008
author1-email=kane@us.ibm.com
author1-email-cc=

My developerWorks community

Tags

Help
Use the search field to find all types of content in My developerWorks with that tag.

Use the slider bar to see more or fewer tags.

Popular tags shows the top tags for this particular content zone (for example, Java technology, Linux, WebSphere).

My tags shows your tags for this particular content zone (for example, Java technology, Linux, WebSphere).

Use the search field to find all types of content in My developerWorks with that tag. Popular tags shows the top tags for this particular content zone (for example, Java technology, Linux, WebSphere). My tags shows your tags for this particular content zone (for example, Java technology, Linux, WebSphere).

Special offers