This article addresses the scenarios and errors most commonly encountered while porting High Performance Computing (HPC) Fortran applications to different systems. It focuses on large Fortran, C, and mixed-mode HPC applications, such as computational fluid dynamics (CFD) models, weather models, and linear algebra libraries (ScaLAPACK, LAPACK, and BLAS), built with IBM XL compilers (XLC and XLF) and GNU compilers (gcc and g77) on large clusters (System p5, Blue Gene®, OpenPOWER, and Linux® clusters). Many of the tips also apply to Fortran porting work on other systems.
The error messages you get while porting such applications are not always obvious, and it can take serious effort to track down their causes. This article covers:
- Undefined references, including underscore problems, wrong order of libraries in linking, recursive references, and mix up of 32-bit and 64-bit libraries
- Memory and segmentation errors
- Linking errors with multi-threaded applications
- Compilation of Fortran77 code using Fortran 90 or Fortran 95
- Errors that occur while using Fortran intrinsic functions
- Fortran I/O and input redirection
- Null-terminated string issues while passing strings from Fortran to C programs
- Working with compiler optimizations with XL Fortran compilers
- Errors related to null communicator in Message Passing Interface (MPI) programs
Undefined references: The most common error
Probably the most common error message observed while linking Fortran object files with C object files is shown in Listing 1 below.
Listing 1. Undefined reference error
undefined reference to 'dgamv_'
There are several reasons why you might get this error.
The library or object file that should contain the function, such as dgamv above, might be missing from the link step entirely.
More often, the function is present, but its symbol name has either no trailing underscore or extra underscores compared with the name the caller references.
With the IBM XL compilers, the flags that resolve the problem are -qextname and -qnoextname. -qextname appends an underscore to the names of external procedures, and -qnoextname leaves the names unchanged, so you can match whatever the calling functions expect. The underscore can also be applied to a subset of names: if a file has n functions and only m of them need underscores, list those m names, as in -qextname=name1:name2.
With g77/gcc, flags such as -fno-underscoring or -fno-second-underscore help; some other compilers offer comparable options, such as -Mnounderscore.
Order of libraries in link step matters
Undefined references can also come from the order in which libraries appear on the link line. Consider a program linked as shown in Listing 2.
Listing 2. Linking step
<Compiler> -o <Executable> <Object Code> <Library1> <Library2> <Library3>
Suppose a function in Library3 triggers the error undefined reference to dgamv, even though dgamv is present in Library2. Many linkers scan libraries from left to right and resolve only the references seen so far, so, depending on the compilers and the system under examination, the order in Listing 3 could work.
Listing 3. Modified linking step
<Compiler> -o <Executable> <Object Code> <Library1> <Library3> <Library2>
Placing Library2 to the right of Library3 lets the linker resolve the references made from Library3.
Recursive references of libraries
When you are chasing undefined references, it helps to isolate the link step from compilation and experiment with it separately, calling a linker such as ld directly to see what is happening. Library2 and Library3 might also refer to functions in each other: Library3 needs some functions from Library2, and Library2 needs some from Library3. For the ScaLAPACK and BLACS libraries, the most common practice on many parallel systems is to repeat a library on the link line, as shown in Listing 4.
Listing 4. Recursive references
<Compiler> -o <Executable> <Object Code> <Library1> <Library3> <Library2> <Library3>
Alternatively, when linking with gcc or GNU ld, you can group the libraries, as follows:
Listing 5. Using groups for recursive references
gcc -o <Executable> <Object Code> <Library1> -Wl,--start-group <Library2> <Library3> -Wl,--end-group
The libraries between -Wl,--start-group and -Wl,--end-group are searched repeatedly until no new references can be resolved, so they can refer to functions in each other in both directions. When you call ld directly, drop the -Wl, prefix and use --start-group and --end-group.
Mix up of 32-bit and 64-bit libraries
You can also get undefined references when 32-bit and 64-bit libraries are mixed, or when the libraries were built with different, incompatible compilers. In this case, it is advisable to run nm -s and file on each of the libraries (files with extensions such as .a and .so). This confirms that the libraries are valid and shows their file formats. Then use the -q32 or -q64 flag consistently for all the files and functions that are linked into the executable.
Memory errors, segmentation fault, and insufficient memory
While porting High Performance Computing applications, a program often runs successfully for small data sets but fails with errors such as Segmentation fault or Insufficient memory for large data sets, even though sufficient memory is available. With XLC and XLF on POWER machines, the error looks like Listing 6.
Listing 6. Memory-related error
1525-108 Error encountered while attempting to allocate a data object. The program will stop.
In this situation, run ulimit -a to check the limits on virtual memory and other resources, raise them or set them to unlimited, and run the program again.
Compilers such as XLF and XLC give programs a limited data segment by default. Programs on the POWER architecture access memory through segment-based addresses. For example, on POWER4 machines in the 32-bit environment, there are 16 segment registers, and each can reference a segment of up to 256MB. Each segment consists of a number of pages of the same size; by default, pages are 4KB. Programmers can prevent the stack from overwriting non-stack data by limiting the size of the stack, which is done by calling the linker (ld) with the -S option. The shell ulimit command (ksh: ulimit, csh: limit) can also limit the size of the stack or data area at run time.
Alternatively, you can call the linker with the -bmaxdata option, which specifies a maximum size for the user data area. A 32-bit program that needs to access more than 256MB of data must be built with this option. For example, -bmaxdata:0x80000000 enables the maximum possible data space of 2GB, and -bmaxdata:-1 enables unlimited memory on the system. Using -bmaxdata is one way out of the insufficient memory situation. In the 64-bit environment, the number of segment registers is effectively unlimited, and -bmaxdata should not be used because it limits the addressable data space.
Linking errors while compiling multi-threaded applications
Whenever multi-threaded programs show compile-time or run-time errors, the first thing to do is to build with the _r invocations of XLF and XLC, such as [mp]xlf[_r] or [mp]xlc[_r], and to use -qnosave for correctness. The _r invocations are the reentrant versions of the normal [mp]xlf/[mp]xlc compilers. -qnosave makes local variables automatic rather than static, so that threads do not share them unintentionally.
Compilation of Fortran77 programs using Fortran90 or 95 compilers
Because of advanced features or portability concerns, it is common practice to compile Fortran 77 programs with Fortran 90 or Fortran 95 compilers, as shown in Listing 7 below.
Listing 7. Syntax is incorrect error
xlf95 -c grad.f
The most common errors observed are as follows.
"grad.f", line 1.30: 1513-051 (S) The PROGRAM/SUBROUTINE/FUNCTION/BLOCK DATA statement is ignored due to the presence of syntax errors.
"grad.f", line 293.43: 1515-019 (S) Syntax is incorrect.
"grad.f", line 295.3: 1515-019 (S) Syntax is incorrect.
"grad.f", line 299.18: 1515-022 (S) Syntax Error: Extra token " , " was found. The token is ignored.
"grad.f", line 299.19: 1515-022 (S) Syntax Error: Extra token " k " was found. The token is ignored.
You might see a huge number of Syntax is incorrect or extra token messages. If you are confident of the source code, there is no need to panic when the errors pile up. The simplest approach is to look carefully at the error messages: they usually indicate that the compiler is reading the file in the wrong source form. The xlf90 and xlf95 invocations assume free source form by default, whereas Fortran 77 code is written in fixed form, and the line length that the compiler accepts also differs from system to system. So, start experimenting with the -qfixed or -qfree options, described below, and you should be able to resolve the error.
-qfixed[=<num>] states that Fortran code is in fixed source form, and optionally specifies the maximum line length.
-qfree[={f90|ibm}] states that Fortran code is in either Fortran 90 free source form (-qfree or -qfree=f90) or IBM free source form (-qfree=ibm).
In the above example, using -qfixed solves the problem.
Another flag that comes in handy while compiling Fortran 77 programs with Fortran 90 or Fortran 95 compilers is -qsuffix.
-qsuffix=<option>=<suffix> specifies the source file suffix on the command line instead of in the .cfg file. The options are as follows.
f=<suffix> where <suffix> is the new source file suffix
o=<suffix> where <suffix> is the new object file suffix
s=<suffix> where <suffix> is the new assembler source file suffix
cpp=<suffix> where <suffix> is the new preprocessor source file suffix
Minimum size of partition 5 exceeds partition limit
At the highest optimization levels, such as -O5, the compiler performs interprocedural analysis (IPA), which requires large amounts of temporary space. If the default amount is not sufficient, you get an error message such as The minimum size of partition 5 exceeds the partition limit.
Adding -qipa=partition=large to the compile command helps overcome the error.
Errors that occur while using Fortran intrinsic functions
Calls to Fortran intrinsic functions, such as sin and cos, can sometimes crash at run time. The most probable reason is that the wrong library was linked for these functions. Other functions, such as flush and exit, might compile without errors but fail to execute properly. If these functions show errors at link time, you might need to add an underscore to their names in the code, as in flush_ and exit_, or use -qextname for these function calls.
Fortran I/O and input redirection
Some basics of Fortran I/O can cause errors if they are not clearly understood. Unit numbers 5, 6, and 0 are predefined as standard input, standard output, and standard error, respectively, so a program needs no open statements for them. For any other unit number, if the code has read or write statements but no open statement, the file being read or written has a default name of the form fort.<unit no.>, such as fort.10 or fort.13. The default name varies from system to system.
If your code hangs or fails on a read or write, the file is either missing or invalid. You often see one of the errors shown in Listing 10.
Listing 10. Error message on missing file or input
No such file or directory
(or)
1525-001 The READ statement on the file grad.f cannot be completed because the end of the file was reached. The program will stop.
For Fortran I/O-related errors, try the run-time options that you can set through the XLFRTEOPTS environment variable or the setrteopts procedure, and check which units the code expects its data from. Formats used in different, incompatible ways in Fortran I/O can also cause the problem. If a namelist is not read, experiment with XLFRTEOPTS="namelist=old" or with the default, depending on whether the input file conforms to the old or the new namelist format.
To enable multiple, simultaneous reads on a single file while a parallel program is running, some systems provide a setting such as XLFRTEOPTS="multconn=yes". It is useful to be aware of such flags when running Fortran programs.
Standard input redirection in parallel programs
In a parallel environment, input redirection such as mpirun -np 4 run <infile might not work as intended. Check which options your particular parallel run script provides. Some mpirun implementations have options for standard input, such as -stdin infile or -args '-in infile', that might help overcome the problem.
Null-terminated string issues while passing strings from Fortran to C programs
In large applications such as ScaLAPACK, where you compile BLAS (Fortran), BLACS (C and Fortran), and ScaLAPACK (C and Fortran) and run the test programs built with XLC and XLF, the code can fail because a constant string passed as an argument from a Fortran subroutine or program to a C function is not received correctly. Fortran character constants are not null terminated, whereas C functions expect a terminating null. The solution (depending on the system and compilers used) is the -qnullterm option, which appends a null to constant strings so that they are passed to the C function correctly.
Working with compiler optimizations with XL Fortran compilers
A few simple guidelines help when optimizing code with the XL Fortran compilers. Always test your code first at a low optimization level, such as -O0, -O, or -O2, and make sure it works correctly. If there are any problems, correct the source code first. Then raise the optimization level one step at a time, in the order -O3, -O4, -O5, along with any additional options relevant to the code. The higher levels, such as -O3, -O4, and -O5, can alter certain floating-point semantics of your application to gain execution speed. A change in output is rare, but you can specify -qstrict if your application is sensitive to floating-point exceptions or if the order in which floating-point arithmetic is evaluated matters. Without -qstrict, the difference in the result of any single operation is very small compared with lower optimization levels. -qstrict is needed only in certain cases, such as when the effect accumulates in loops.
You might see loops that repeatedly update a real value in each iteration, where the variable stops changing (saturates) after some iteration when it should not, or where its values are truncated. In that case, try increasing the precision of the variable with -qrealsize=8 or -qautodbl=dbl4.
Macros or #define not working with Fortran programs
Preprocessor definitions, such as -Ddebug or -Dibm, might need a different syntax when used with Fortran programs that go through the cpp preprocessor. The definitions need the -WF prefix so that they are passed to cpp, as in -WF,-Ddebug. To specify which file suffixes are treated as Fortran source and as preprocessor input, you can add flags such as -qsuffix=f=f90:cpp=F90 to the normal compile flags. The details differ from case to case.
Errors related to null communicator in Message Passing Interface (MPI) programs
If you see the Null/Invalid communicator error in parallel programs that deal with communicators and sub-communicators, the most common reasons could be:
- Including a wrong MPI header file
- Wrong implementation of MPI_Comm_create
- Including mpi.h instead of mpif.h or vice versa
Errors can arise in many scenarios while porting Fortran and HPC code. This article discussed the most common scenarios and errors and offered practical solutions.
Learn
- IBM XL Fortran for AIX User's Guide: Provides more details on compiler flags.
- The POWER4 Processor Introduction and Tuning Guide: This Redbook has more details on compiler optimization flags.
Subba R. Bodda works as a Staff Software Engineer at IBM for the High Performance Computing team in the India Systems and Technology Lab. He has experience in porting scientific and engineering applications, benchmarking and performance analysis of clusters (such as Linux clusters, pSeries, Blue Gene/L), and with various benchmarks and applications (such as HPL, FFT, ScaLAPACK, CFD codes, and NAS). Subba has presented a technical paper on "Benchmarking and Performance of MPI Parallel Programs on PARAM 10000 -- a cluster of Symmetric Multiprocessors (SMPs)" at HPC Asia 2002. He also presented posters on P-COMS, "Measuring Communication Overhead Time on Message Passing Clusters (PCOMS)" at HiPC-2002, and "Grid Computing Tools" at HiPC 2003.