Navigating "C" in a "leaky" boat? Try Purify

Fixing memory usage errors and leaks using IBM Rational Purify

Inappropriate memory usage is one of the most intractable classes of programming errors. Analyzing and fixing these errors is extremely difficult for two reasons: the source of the memory corruption and the manifestation of the error are usually far apart, making it hard to correlate cause and effect, and the symptoms appear under exceptional conditions, making it hard to reproduce the error consistently. The IBM® Rational® Purify® tool can identify these errors automatically and accurately. By learning how to use Purify effectively, you can eliminate these errors from your C or C++ programs.

Satish Chandra Gupta (satish.gupta@acm.org), Senior Software Engineer, IBM

Satish Chandra Gupta is a programmer and loves building software engineering and programming tools. His interests include performance (CPU time, memory, energy) profiling tools, compilers, programming languages, type theory, software engineering, and software development environments. While at IBM, he was architect for UML Action Language tooling in Rational Software Architect, and tech lead for Rational PurifyPlus on AIX and Java leak detection tooling in Rational Application Developer. You can keep up with him through his Twitter and Google+ feeds.



Giridhar Sreenivasamurthy (giridhar.s@in.ibm.com), Programmer, PurifyPlus, Rational Software, IBM

Giridhar Sreenivasamurthy is a programmer with the Rational® PurifyPlus® group, IBM® Software, Bangalore, India. His interests are in the areas of runtime analysis, computer architecture, compilers, and object-oriented design. He received a B.E. from Visveswaraiah Technological University, Karnataka (India).



22 August 2006

Introduction

Most programmers agree that defects related to incorrect memory usage and management are the hardest to isolate, analyze, and fix. Therefore, they are the costliest defects to have in your programs. These defects are typically caused by using uninitialized memory, using un-owned memory, buffer overruns, or faulty heap management.

IBM® Rational® Purify® is an advanced memory usage error detecting tool that enables software developers and testers to detect memory errors in C and C++ programs. While a program runs, Purify collects and analyzes data to accurately identify memory errors that are about to happen. It provides detailed information, such as the error location (function call stack) and size of the affected memory, to assist you in quickly locating the problem areas. It also greatly reduces debugging time and complexity, so you can focus on fixing the flaw in the application logic that is causing the error.

Purify is available for all prominent platforms, including IBM® AIX® on Power PC®, HP-UX® on PA-RISC, Linux™ on x86 and x86/64, Sun™ Solaris™ on SPARC®, and Microsoft® Windows® on x86 (check the documentation for an updated list of supported platforms). In this article, you will first learn about various types of memory access errors with the help of examples, and then learn how to use Purify for detecting and fixing those errors. In the Download section, you will find the C source file (memerrors.c) with the code samples in this article, and you can use them to experiment with Purify.

Memory errors

Memory errors can be broadly classified into four categories:

  1. Using memory that you have not initialized
  2. Using memory that you do not own
  3. Using more memory than you have allocated (buffer overruns)
  4. Using faulty heap memory management

Purify detects errors in all of these categories and identifies the type of the error within a category. Understanding the types of errors helps you identify and isolate subtle mistakes in your program that may cause the program to act strangely and unpredictably. In the rest of this section, the various error types are explained using code samples.

Using memory that you have not initialized

When you read from memory that you forgot to initialize, you get a garbage value. This error looks deceptively innocent, but it has the potential to cause mysterious program behavior.

The garbage value that you get could fortuitously happen to be a meaningful value that your program can handle. For example, some operating systems initialize a memory block with zeros when it is allocated for the first time. If zero is a meaningful value for your program, it may run smoothly, initially. However, after the program runs for a while, the memory might be freed and reallocated. When a memory block is recycled, it has the values that were stored in it when it was last used. These values are unpredictable. Depending upon the value, your program may crash immediately, may run for a while and crash sometime later, or may run smoothly but produce strange results. Since the value could be different in each run, the behavior of the program can be baffling, making it hard to reproduce the problem consistently.

Purify detects such errors and reports an Uninitialized Memory Read (UMR) error for every use of uninitialized memory. It further differentiates between using uninitialized memory and copying a value from an uninitialized memory location to another memory location. When uninitialized memory is copied, Purify reports an Uninitialized Memory Copy (UMC) error. After the copy, the destination location also holds uninitialized memory; therefore, whenever this memory is used, Purify reports a UMR.

Listing 1 shows a simple example. There are two integers: i and j. The integer i is initialized with 10. Then the value of j is copied into i. Since j has not been initialized, i also has a garbage value after j is copied into it. Purify maintains the status of each memory location, so it can determine that, although i was initialized with 10, copying an uninitialized value into it has made i uninitialized as well. Therefore, Purify reports any usage of i (for example, as an argument to printf in the next line) as a UMR error.

Listing 1. An example of UMR and UMC errors
void uninit_memory_errors() {
    int i=10, j;
    i = j;     /* UMC: j is uninitialized, copied into i */
    printf("i = %d\n", i); /* UMR: Using i, which has junk value */
}

This example is intentionally trivial to make it easy for you to identify the problem just by inspecting the code. But real-world applications have many thousands of lines of code and complex control flow. The location where a valid value is corrupted by copying a garbage value into it could be in a different function, and potentially in a different sub-system or library. If you inspect the bar function in Listing 2 without knowing much about the foo function, you would not suspect that i is corrupted after calling foo. Depending upon the size and complexity of the source code, you may have to spend considerable time and effort to analyze and then rectify this type of defect. Purify eliminates this effort by reporting UMRs, which indicate the use of an uninitialized memory value.

Listing 2. Another example of UMR and UMC errors
void foo(int *pi) {
    int j;
    *pi = j; /* UMC: j is uninitialized, copied into *pi */
}

void bar() {
    int i=10;
    foo(&i);
    printf("i = %d\n", i); /* UMR: Using i, which is now junk value */
}

As you notice, whenever a memory location with a UMC error is finally used, Purify reports a UMR error for that same memory location. UMC errors may not always be critical, and Purify hides them by default. Later in this article, you will learn how to see UMC and other errors that Purify hides.
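
One way to remove the UMC and UMR errors in Listing 2 is to make sure that the value copied through the pointer is always initialized. The following is only a minimal sketch of that idea; the names foo_fixed and bar_fixed are illustrative and are not part of memerrors.c.

```c
#include <stdio.h>

/* Hypothetical corrected counterpart of foo from Listing 2. */
void foo_fixed(int *pi) {
    int j = 0;   /* initialize j at its declaration so it never holds junk */
    *pi = j;     /* copies a defined value, so there is nothing to report as UMC */
}

/* Hypothetical corrected counterpart of bar; returns i so the result can be checked. */
int bar_fixed(void) {
    int i = 10;
    foo_fixed(&i);
    printf("i = %d\n", i);  /* i holds a defined value here, so no UMR */
    return i;
}
```

Initializing at the point of declaration is a design choice made for this sketch; assigning j before its first use would work equally well.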

Using memory that you don't own

Explicit memory management and pointer arithmetic present opportunities for designing compact and efficient programs. However, incorrect use of these features can lead to complex defects, such as a pointer referring to memory that you don't own. In this case, too, reading memory through such a pointer may give a garbage value or cause a segmentation fault and core dump, and using garbage values can cause unpredictable program behavior or crashes.

Purify detects these errors. In addition to reporting the type of error, Purify indicates the memory area that the pointer refers to and where that memory was allocated. This is typically a good clue for identifying the cause of the error. This category includes the following types of errors:

  • Null pointer read or write (NPR, NPW)
  • Zero page read or write (ZPR, ZPW)
  • Invalid pointer read or write (IPR, IPW)
  • Free memory read or write (FMR, FMW)
  • Beyond stack read or write (BSR, BSW)

Null Pointer Read/Write (NPR, NPW) and Zero Page Read/Write (ZPR, ZPW):

If a pointer's value can potentially be null (NULL), the pointer should not be de-referenced without first checking it for null. For example, a call to malloc can return a null result if no memory is available, so before using the pointer returned by malloc, you need to check that it isn't null. Similarly, a linked list or tree traversal algorithm needs to check whether the next node or child node is null.

It is common to forget these checks. Purify detects any memory access through de-referencing a null pointer, and reports an NPR or NPW error. When you see this error, examine whether you need to add a null pointer check or whether you wrongly assumed that your program logic guaranteed a non-null pointer. On AIX, HP-UX, and under some linker options in Solaris, dereferencing a null pointer produces a zero value, not a segmentation fault signal.

Memory is divided into pages, and it is "illegal" to read from or write to a memory location on the zeroth page. This error is typically due to a null pointer or incorrect pointer arithmetic. For example, if you have a null pointer to a structure and you attempt to access various fields of that structure, it will lead to a zero page read error, or ZPR.

Listing 3 shows a simple example of both NPR and ZPR problems. The findLastNodeValue method has a defect, in that it does not check whether the head parameter is null. NPR and ZPR errors occur when the next and val fields are accessed, respectively.

Listing 3. An example of NPR and ZPR errors
typedef struct node {
    struct node* next;
    int          val;
} Node;

int findLastNodeValue(Node* head) {
    while (head->next != NULL) { /* Expect NPR */
        head = head->next;
    }
    return head->val; /* Expect ZPR */
}

void genNPRandZPR() {
    int i = findLastNodeValue(NULL);
}
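
The defect in Listing 3 can be avoided by checking the head pointer before de-referencing it. The sketch below is one possible shape for such a check; the function name findLastNodeValueChecked and its out parameter are assumptions made for illustration, chosen so that an empty list can be reported without inventing a special return value.

```c
#include <stddef.h>

typedef struct node {
    struct node *next;
    int          val;
} Node;

/* Hypothetical checked variant of findLastNodeValue.
 * Returns 1 and stores the last node's value in *out on success,
 * or returns 0 if the list is empty or out is null. */
int findLastNodeValueChecked(const Node *head, int *out) {
    if (head == NULL || out == NULL) {
        return 0;                 /* nothing to de-reference */
    }
    while (head->next != NULL) {  /* head is known to be non-null here */
        head = head->next;
    }
    *out = head->val;
    return 1;
}
```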

Invalid Pointer Read or Write (IPR, IPW):

Purify tracks all memory operations. When it detects a pointer to a memory location that has not been allocated to the program, it reports either an IPR or IPW error, depending on whether it was a read or write operation. The error can happen for multiple reasons. For example, you will get this type of error if you have an uninitialized pointer variable and the garbage value happens to be an invalid address. As another example, suppose that you wanted to write *pi = i;, where pi is a pointer to an integer and i is an integer, but by mistake you omitted the * and wrote just pi = i;. Through implicit conversion, an integer value is copied as a pointer value, and when you dereference pi again, you may get an IPR or IPW error. This can also happen when pointer arithmetic results in an invalid address, even when it is not on the zeroth page. (See Listing 4.)

Listing 4. An example of IPR and IPW errors
void genIPR() {
    int *ipr = (int *) malloc(4 * sizeof(int));
    int i, j;
    i = *(ipr - 1000); j = *(ipr + 1000); /* Expect IPR */
    free(ipr);
}

void genIPW() {
    int *ipw = (int *) malloc(5 * sizeof(int));
    *(ipw - 1000) = 0; *(ipw + 1000) = 0; /* Expect IPW */
    free(ipw);
}

IPR and IPW errors are common in 64-bit applications that call functions returning a pointer (for example, malloc) without a declaration in scope, because a pointer is 8 bytes long and an integer is 4 bytes long. If the function declaration is not included, the compiler assumes that the function returns an integer, implicitly converts the return value, and retains only the lower 4 bytes of the pointer value. Purify reports an IPR or IPW when this invalid pointer is used. (See Listing 5.)

Listing 5. Another example of IPR and IPW errors
/*Forgot to include following in a 64-bit application:
#include <malloc.h>
#include <stdlib.h>
 */

void illegalPointer() {
    int *pi = (int*) malloc(4 * sizeof(int));
    pi[0] = 10; /* Expect IPW */
    printf("Array value = %d\n", pi[0]); /* Expect IPR */
}
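
For comparison, here is a minimal sketch of the same function with the declaration of malloc in scope, so the full pointer value is preserved. The name legalPointer and its integer return code are illustrative assumptions, not part of memerrors.c.

```c
#include <stdio.h>
#include <stdlib.h>   /* declares malloc and free, so the return type is a pointer */

/* Hypothetical corrected counterpart of illegalPointer.
 * Returns 0 on success, -1 if the allocation fails. */
int legalPointer(void) {
    int *pi = malloc(4 * sizeof *pi);
    if (pi == NULL) {
        return -1;     /* also avoids an NPW on the next line */
    }
    pi[0] = 10;
    printf("Array value = %d\n", pi[0]);
    free(pi);
    return 0;
}
```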

Free Memory Read or Write (FMR, FMW):

When you use malloc or new, memory is allocated from the heap and a pointer to the location of that memory is returned. When you don't need this memory anymore, you de-allocate it by calling free or delete. After de-allocation, the memory at that location should not be accessed.

However, you may have more than one pointer in your program pointing to the same memory location. For instance, while traversing a linked list, you may have a pointer to a node, but a pointer to that node is also stored as next in the previous node. Therefore, you have two pointers to the same memory block. Upon freeing that node, these pointers become heap dangling pointers, because they point to memory that has already been freed. Another common cause of this error is the use of realloc, which may move the block and free the original. (See Listing 6.)

The heap management system may respond to another malloc call in the same program and allocate this freed memory to other, unrelated objects. If you use a dangling pointer and access the memory through it, the behavior of the program is undefined. It may result in strange behavior or a crash. The value read from that location would be completely unrelated garbage. If you modify memory through a dangling pointer, and that memory is later used by its new owner in an unrelated context, the behavior will be unpredictable. Of course, either an uninitialized pointer or incorrect pointer arithmetic can also result in pointing to already freed heap memory.

Listing 6. An example of FMR and FMW errors
int* init_array(int *ptr, int new_size) {
    ptr = (int*) realloc(ptr, new_size*sizeof(int));
    memset(ptr, 0, new_size*sizeof(int));
    return ptr;
}

int* fill_fibonacci(int *fib, int size) {
    int i;
    /* oops, forgot: fib = */ init_array(fib, size);
    /* fib[0] = 0; */ fib[1] = 1;
    for (i=2; i<size; i++)
        fib[i] = fib[i-1] + fib[i-2];
    return fib;
}

void genFMRandFMW() {
    int *array = (int*)malloc(10);
    fill_fibonacci(array, 3);
}
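
The usual remedy for the defect in Listing 6 is to treat the pointer passed to realloc as invalid once realloc succeeds, and to continue only with the returned pointer. The following is a sketch of that pattern under the assumption that the caller wants a null return on failure; the names init_array_checked and fill_fibonacci_checked are illustrative.

```c
#include <stdlib.h>
#include <string.h>

/* Resize ptr to hold new_size ints and zero them.
 * Returns the new block, or NULL on failure (the old block stays valid). */
int *init_array_checked(int *ptr, int new_size) {
    int *tmp = realloc(ptr, (size_t)new_size * sizeof *tmp);
    if (tmp == NULL) {
        return NULL;
    }
    memset(tmp, 0, (size_t)new_size * sizeof *tmp);
    return tmp;
}

/* Hypothetical corrected counterpart of fill_fibonacci. */
int *fill_fibonacci_checked(int *fib, int size) {
    int i;
    int *resized;
    if (size < 2) {
        return NULL;              /* need room for fib[0] and fib[1] */
    }
    resized = init_array_checked(fib, size);
    if (resized == NULL) {
        return NULL;              /* the caller still owns the old fib */
    }
    fib = resized;                /* the old pointer may dangle; use only the new one */
    fib[0] = 0;
    fib[1] = 1;
    for (i = 2; i < size; i++)
        fib[i] = fib[i-1] + fib[i-2];
    return fib;
}
```

A caller would then write array = fill_fibonacci_checked(array, n); only after checking the result for null, so that the original block is not lost on failure.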

Beyond Stack Read or Write (BSR, BSW) :

If the address of a local variable in a function is directly or indirectly stored in a global variable, in a heap memory location, or somewhere in the stack frame of an ancestor function in the call chain, upon returning from the function, it becomes a stack dangling pointer. When a stack dangling pointer is de-referenced to read from or write to the memory location, it accesses memory outside of the current stack boundaries, and Purify reports a BSR or BSW error. Uninitialized pointer variables or incorrect pointer arithmetic can also result in BSR or BSW errors.

In the example in Listing 7, the append function returns the address of a local variable. Upon returning from that function, its stack frame is freed and the stack boundary shrinks, so the returned pointer is now outside the stack bounds. If you use that pointer, Purify will report a BSR or BSW error. In the example, you would expect append("IBM ", append("Rational ", "Purify")) to return "IBM Rational Purify", but it returns garbage, manifesting BSR and BSW errors.

Listing 7. An example of BSR and BSW errors
char *append(const char* s1, const char *s2) {
    const int MAXSIZE = 128; 
    char result[128];
    int i=0, j=0;

    for (j=0; i<MAXSIZE-1 && j<strlen(s1); i++,j++) {
        result[i] = s1[j];
    }

    for (j=0; i<MAXSIZE-1 && j<strlen(s2); i++,j++) {
        result[i] = s2[j];
    }

    result[i] = '\0'; /* i already indexes the first unused slot */
    return result;
}

void genBSRandBSW() {
    char *name = append("IBM ", append("Rational ", "Purify"));
    printf("%s\n", name); /* Expect BSR */
    *name = '\0'; /* Expect BSW */
}
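
One common way to avoid returning a stack address is to build the result in heap memory and make the caller responsible for freeing it. The sketch below assumes that ownership convention; the name append_heap is hypothetical and is not part of memerrors.c.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical heap-based replacement for append.
 * Returns a newly allocated string that the caller must free,
 * or NULL if the allocation fails. */
char *append_heap(const char *s1, const char *s2) {
    size_t n1 = strlen(s1);
    size_t n2 = strlen(s2);
    char *result = malloc(n1 + n2 + 1);   /* +1 for the terminating '\0' */
    if (result == NULL) {
        return NULL;
    }
    memcpy(result, s1, n1);
    memcpy(result + n1, s2, n2 + 1);      /* copies s2 together with its terminator */
    return result;
}
```

With this convention, nesting the calls as in Listing 7 would leak the inner result, so the intermediate string should be kept in a variable and freed after the outer call.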

Using memory that you haven't allocated, or buffer overruns

When you don't check array boundaries correctly and then go beyond the array boundary, typically in a loop, that is called a buffer overrun. Buffer overruns are a very common programming error resulting from using more memory than you have allocated. Purify can detect buffer overruns in arrays residing in heap memory, and it reports them as array bound read (ABR) or array bound write (ABW) errors. (See Listing 8.)

Listing 8. An example of ABR and ABW errors
void genABRandABW() {
    const char *name = "IBM Rational Purify";
    char *str = (char*) malloc(10);
    strncpy(str, name, 10);
    str[11] = '\0'; /* Expect ABW */
    printf("%s\n", str); /* Expect ABR */
}
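
The overrun in Listing 8 comes from forgetting that a string of n characters needs n + 1 bytes. A minimal sketch of a bounded copy that always leaves room for the terminator follows; the helper name copy_prefix is an assumption made for this illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy at most max_chars characters of src into a
 * newly allocated, always null-terminated buffer. The caller frees it. */
char *copy_prefix(const char *src, size_t max_chars) {
    size_t len = strlen(src);
    char *dst;
    if (len > max_chars) {
        len = max_chars;
    }
    dst = malloc(len + 1);     /* len characters plus the terminator */
    if (dst == NULL) {
        return NULL;
    }
    memcpy(dst, src, len);
    dst[len] = '\0';           /* index len is the last byte of the block */
    return dst;
}
```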

Using faulty heap memory management

Explicit memory management in C and C++ programming puts the onus of managing memory on the programmers. Therefore, you must be vigilant while allocating and freeing heap memory. These are the common memory management mistakes:

  • Memory leaks and potential memory leaks (MLK, PLK, MPK)
  • Freeing invalid memory (FIM)
    • Freeing mismatched memory (FMM)
    • Freeing non-heap memory (FNH)
    • Freeing unallocated memory (FUM)

Memory leaks and potential memory leaks:

When all pointers to a heap memory block are lost, that is commonly called a memory leak. With no valid pointer to that memory, there is no way you can use or release it. You lose a pointer to a memory block when you overwrite it with another address, when a pointer variable goes out of scope, or when you free a structure or an array that has pointers stored in it. Purify scans all of the memory and reports all memory blocks without any pointers pointing to them as memory leaks (MLK). In addition, it reports blocks as potential leaks, or PLK (called MPK on Windows platforms), when there are no pointers to the beginning of the block but there are pointers to the middle of the block.

Listing 9 shows a simple example of a memory leak and a heap dangling pointer. In this example, interestingly, the functions foo and main each seem to be error-free on their own, but together they manifest both errors. This example demonstrates that interactions between functions may expose flaws that you may not find simply by inspecting individual functions. Real-world applications are very complex, so inspecting and analyzing their control flow and its consequences is tedious and time-consuming. Using Purify gives you vital help in detecting errors in such situations.

First, in the function foo, the pointer pi is overwritten with a new memory allocation, and all pointers to the old memory block are lost. This results in leaking the memory block that was allocated in main. Purify reports a memory leak (MLK) and specifies the line where the leaked memory was allocated. This eliminates the slow process of hunting down the memory block that is leaking, and therefore shortens the debugging time. You can start debugging at the memory allocation site where the leak is reported, and then track what you are doing with that pointer and where you are overwriting it.

Later, the function foo frees the memory it has allocated, but the pointer pi still holds the address (it is not set to null). After returning from foo to main, pi refers to memory that has already been freed, so it is a dangling pointer. When main writes through pi, Purify promptly reports an FMW error at that location.

Listing 9. An example of a memory leak and a dangling pointer
int *pi;
void foo() {	
    pi = (int*) malloc(8*sizeof(int)); /* Allocate memory for pi */
    /* Oops, leaked the old memory pointed by pi holding 4 ints */
    /* use pi */
    free(pi); /* foo() is done with pi, so free it */
}
void main() {
    pi = (int*) malloc(4*sizeof(int)); /* Expect MLK: foo leaks it */
    foo();
    pi[0] = 10; /* Expect FMW: oops, pi is now a dangling pointer */
}
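
A sketch of one way to avoid both defects in Listing 9 is shown below: release the old block before the shared pointer is overwritten, and reset the pointer to null after freeing it. The helper names resize_pi, release_pi, and demo_no_leak are hypothetical, introduced only for this illustration.

```c
#include <stdlib.h>

static int *pi = NULL;   /* shared pointer, as in Listing 9 */

/* Point pi at a fresh block of count ints, releasing the old block first.
 * Returns 0 on success, -1 on failure (pi is left unchanged on failure). */
int resize_pi(size_t count) {
    int *fresh = malloc(count * sizeof *fresh);
    if (fresh == NULL) {
        return -1;
    }
    free(pi);            /* old block is released, not leaked; free(NULL) does nothing */
    pi = fresh;
    return 0;
}

/* Free the block and clear the pointer so it cannot dangle. */
void release_pi(void) {
    free(pi);
    pi = NULL;
}

/* Returns 0 if the whole sequence ran without leaving pi dangling. */
int demo_no_leak(void) {
    if (resize_pi(4) != 0) return -1;
    if (resize_pi(8) != 0) { release_pi(); return -1; }
    pi[0] = 10;          /* pi refers to a live block here */
    release_pi();
    return pi == NULL ? 0 : -1;
}
```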

Listing 10 shows an example of a potential memory leak. After incrementing pointer plk, it points to the middle of the memory block, but there is no pointer pointing to the beginning of that memory block. Therefore, a potential memory leak is reported at the memory allocation site for that block.

Listing 10. An example of potential memory leak
int *plk = NULL;
void genPLK() {
    plk = (int *) malloc(2 * sizeof(int)); /* Expect PLK */
    plk++;
}
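
A potential leak of this kind can usually be avoided by keeping one pointer to the start of the block and walking the block with a second pointer. The following sketch assumes a hypothetical function, sum_with_cursor, written only to illustrate that pattern.

```c
#include <stdlib.h>

/* Hypothetical example: fill a block with 1..n and sum it using a cursor,
 * while base keeps pointing at the start of the block so it can be freed.
 * Returns the sum, or -1 if the allocation fails. */
int sum_with_cursor(size_t n) {
    int *base = malloc(n * sizeof *base);
    int *cursor;
    int sum = 0;
    size_t k;
    if (base == NULL) {
        return -1;
    }
    for (k = 0; k < n; k++) {
        base[k] = (int)(k + 1);
    }
    for (cursor = base; cursor < base + n; cursor++) {
        sum += *cursor;      /* only the cursor moves; base is untouched */
    }
    free(base);              /* free takes the pointer to the start of the block */
    return sum;
}
```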

Freeing invalid memory:

This error occurs whenever you attempt to free memory that you are not allowed to free. This may happen for various reasons: allocating and freeing memory through inconsistent mechanisms, freeing non-heap memory (say, freeing a pointer that points to stack memory), or freeing memory that you haven't allocated. When using Purify on the Windows platform, all such errors are reported as freeing invalid memory (FIM). On UNIX® systems, Purify further classifies these errors by reporting freeing mismatched memory (FMM), freeing non-heap memory (FNH), and freeing unallocated memory (FUM) to indicate the exact reason for the error.

Freeing mismatched memory (FMM) is reported when a memory location is de-allocated by using a function from a different family than the one used for allocation. For example, you use the new operator to allocate memory, but use the free function to de-allocate it. Purify checks for the following families, or matching pairs:

  • malloc() / free()
  • calloc() / free()
  • realloc() / free()
  • operator new / operator delete
  • operator new[] / operator delete[]

Purify reports any incompatible use of memory allocation and de-allocation routines as an FMM error. In the example in Listing 11, the memory was allocated using the malloc function but freed using the delete operator, which is not the correct counterpart and is therefore incompatible. Another common example of an FMM error is a C++ program that allocates an array using the new[] operator, but frees the memory using the scalar delete operator instead of the array delete[] operator. These errors are hard to detect through code inspection, because the memory allocation and de-allocation locations may not be located close to each other, and because there is no difference in syntax between an integer pointer and a pointer to an integer array.

Listing 11. An example of a freeing mismatched memory error
void genFMM() {
    int *pi = (int*) malloc(4 * sizeof(int));
    delete pi; /* Expect FMM/FIM: should have used free(pi); */
    pi = new int[5];
    delete pi; /* Expect FMM/FIM: should have used delete[] pi; */
}

A freeing non-heap memory (FNH) error is reported when you call free with a non-heap address (a stack address, for instance). Freeing unallocated memory (FUM) is reported when you try to free unallocated memory, such as memory that you have already freed, or when the pointer you are trying to free points to the middle of a memory block. Listing 12 shows examples of these errors.

Listing 12. Examples of freeing non-heap memory and freeing unallocated memory errors
void genFNH() {
    int fnh = 0;
    free(&fnh); /* Expect FNH: freeing stack memory */
}

void genFUM() {
    int *fum = (int *) malloc(4 * sizeof(int));
    free(fum+1); /* Expect FUM: fum+1 points to middle of a block */
    free(fum);
    free(fum); /* Expect FUM: freeing already freed memory */
}
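
Freeing the same block twice can be made harmless by clearing the pointer as part of freeing it, because calling free with a null pointer is defined to do nothing. The helper below, free_and_clear, is a hypothetical name used to sketch this idiom; it does not help with copies of the pointer held elsewhere.

```c
#include <stdlib.h>

/* Hypothetical helper: free the block *pp points to and set *pp to NULL,
 * so that a second call on the same variable does nothing. */
void free_and_clear(int **pp) {
    if (pp != NULL) {
        free(*pp);     /* free(NULL) is a no-op */
        *pp = NULL;
    }
}
```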

Using IBM Rational Purify

Purify detects memory errors in a program by performing a runtime analysis on the program. It instruments a program and inserts additional code at appropriate locations. When the instrumented program runs, the additional code collects necessary information and performs runtime memory validations. If any of the memory checks fail, Purify reports errors. At the end of the run, it performs a leak scan to locate leaked memory blocks (you can also request a leak scan at any time).
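
To give a feel for what runtime memory validation means, here is a toy sketch that tracks a state for every byte of a small arena and checks each access against it. This is not how Purify is implemented; the arena, the state names, and the toy_ functions are all assumptions invented for this illustration.

```c
#include <stddef.h>

/* Toy per-byte states and access results (invented for this sketch). */
enum byte_state    { UNALLOCATED, ALLOCATED_UNINIT, INITIALIZED };
enum access_result { ACCESS_OK, ERR_INVALID, ERR_UNINIT_READ };

#define ARENA_SIZE 64
static unsigned char   arena[ARENA_SIZE];
static enum byte_state shadow[ARENA_SIZE];   /* starts as UNALLOCATED (zero) */

/* Mark len bytes starting at off as allocated but not yet written. */
void toy_mark_allocated(size_t off, size_t len) {
    size_t k;
    for (k = off; k < off + len && k < ARENA_SIZE; k++)
        shadow[k] = ALLOCATED_UNINIT;
}

/* Mark len bytes starting at off as no longer owned by the program. */
void toy_mark_freed(size_t off, size_t len) {
    size_t k;
    for (k = off; k < off + len && k < ARENA_SIZE; k++)
        shadow[k] = UNALLOCATED;
}

/* Checked write: allowed only on allocated bytes; marks the byte initialized. */
enum access_result toy_write(size_t off, unsigned char v) {
    if (off >= ARENA_SIZE || shadow[off] == UNALLOCATED)
        return ERR_INVALID;
    arena[off] = v;
    shadow[off] = INITIALIZED;
    return ACCESS_OK;
}

/* Checked read: out must be non-null; fails on unowned or unwritten bytes. */
enum access_result toy_read(size_t off, unsigned char *out) {
    if (off >= ARENA_SIZE || shadow[off] == UNALLOCATED)
        return ERR_INVALID;
    if (shadow[off] == ALLOCATED_UNINIT)
        return ERR_UNINIT_READ;
    *out = arena[off];
    return ACCESS_OK;
}
```

In this toy model, a read of an allocated but unwritten byte plays the role of a UMR, and any access to an unallocated or freed byte plays the role of an invalid or free memory access.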

Purify instruments the program and the library object code. This process is called object code instrumentation (OCI). Because Purify relies on object code instrumentation, not source code instrumentation, it is possible for it to perform memory checks on third-party libraries that you link with your program even if you don't have the source code for those libraries.

Compile your code with debug flags. Purify uses debug information to relate errors to source line numbers, and it will display relevant source code along with error reports. For the part of your program where there is no debug information, Purify relates the errors to object code information, such as program counters and instructions.

Purify reports function call chains along with the memory usage errors. Even in cases where the error occurs in a third-party library, a call to that library must have been made from your code (for which source code is available). In a call chain, functions with debug information are displayed along with the source, thereby pinpointing the line where a call to a third-party function is made. This provides vital clues about the circumstances in which an error occurs. From this information, you can analyze whether the error is in your code (for example, you are passing an uninitialized argument to a third-party function) or in the third-party library (there is an uninitialized variable in the library, for instance).

Using Purify involves the following steps:

  1. Compile your code with the debug option
  2. Use Purify to instrument the binaries
  3. Run the instrumented program
  4. Examine and fix the errors reported by Purify

The process of instrumentation is different on Windows and UNIX platforms. The rest of this section explains how to use Purify on these platforms.

Using Purify on Windows

On the Windows platform, there are two ways you can use Purify:

  • First, Purify integrates with the Microsoft® Visual Studio® IDE, and you can engage or disengage Purify at the click of a button. If Purify is engaged, when you build a project, Purify is automatically used to instrument the executable you are building. When you run that executable program, Purify's error-checking code runs and reports any errors within the IDE, along with the memory-usage statistics.
  • The second option is to use the Purify GUI to instrument an executable and to set various instrumentation options. When you run the instrumented program, any errors are displayed in a window within the Purify GUI. The rest of this section walks through instrumenting an executable, running the instrumented program, and inspecting errors, with the help of screenshots.

Purify requires relocation data for precise error checking, which the Visual Studio linker does not generate by default. You can use the /fixed:no and /incremental:no linker options to add relocation data. Without relocation data, Purify does minimal error checking.

To instrument and run the program, start Purify in stand-alone mode and follow these steps:

  1. Compile the file named memerrors.c (provided in the Download section) with the debug option, and then generate an executable program.
  2. Select File > Run to bring up the Run Program dialog box (Figure 1).
  3. Specify the path to the executable location in the Program name box.
  4. Select the Error and leak data radio button under the Collect options.
  5. Select the Pause console after exit check box to retain the console after the program finishes running.
  6. (Optional) Change the Purify setting by clicking Settings.
  7. Click Run.
Figure 1. Run Program dialog to instrument an executable program

Purify instruments the binary files and then runs the program. A window like the one shown in Figure 2 displays the progress of instrumentation of the executable program and the various libraries required by the program.

Figure 2. Instrumentation Progress of the executable and DLLs

After instrumentation, the instrumented program runs. As it runs, Purify reports memory access errors as they are detected. At the end, Purify also reports memory leaks in the program. Figure 3 shows a typical Purify error and leak report.

Figure 3. Memory errors found by Purify (on Windows)

Purify reports errors, warnings, and summaries of memory leaks. You can click an error to see detailed information, such as statistics, stack trace, and line numbers. For example, if you click the array bounds read (ABR) error, Purify shows both the error location and the memory allocation location (see Figure 4). Looking at the source code at the error location, you can tell that the ABR error occurs at line 110 of the file memerrors.c when the string str is passed to printf. The printf function processes str until it encounters a null byte. Prior to calling printf, str is populated by copying 10 bytes from name, and str[11] is set to the null character to terminate the string, so printf must read past the first 10 bytes to reach the terminator. If you look at the memory allocation location, you will notice that str was allocated to hold 10 characters at line 107 of the file memerrors.c. Putting these three facts together, reading up to str[11] accesses bytes outside of the array boundary, hence the array bounds read (ABR) error. This is a typical error of miscomputing the size of the array needed to store a null-terminated string.

Figure 4. Details of ABR error including source code and line number information (on Windows)

Examine each error reported by Purify. With the helpful details provided by Purify, you can debug and eliminate the errors. After fixing any defects, run Purify again and verify that it no longer reports the error.

Purify is highly customizable through various options that give you the flexibility to choose the types of analyses that you need. You can also suppress the error reports you are not interested in (such as an error in a third-party library that you cannot fix).

Here are the steps for specifying on a Windows platform that you are interested in UMC errors, including those related to stack variables:

  1. Click Settings in the Run Program dialog (Figure 1).
  2. Go to the Errors and Leaks tab and check Show UMC messages (Figure 5).
  3. Go to the Files tab and type -stack-load-checking in the Additional options box (Figure 6), which tells Purify to check UMR errors on stack variables.
  4. Click OK, and then click Run in the Run Program dialog.
  5. Purify runs and reports UMC memory use errors in your program (Figure 3).
Figure 5. Tick the Checkbox "Show UMC messages" in the Settings dialog (on Windows)
Figure 6. Additional options to check UMR errors on stack variables (on Windows)

You can also define filters to suppress error reports that you are not interested in seeing. On the left pane of the Purify window, right-click on Run and select the Filter Manager dialog (Figure 7). Purify provides a rich set of choices to suppress certain types of error reports or to suppress a set of error reports only in a particular call stack or when in a particular library.

Figure 7. Invoking Filter Manager to suppress uninteresting errors (on Windows)

Detailed documentation for the various Purify features is provided under the Help menu.

Using Purify on UNIX and Linux

On the UNIX platform, there are multiple ways of building instrumented executable programs. The simplest is to add the word purify as a prefix to the command line that you use to build an executable.

For example, for the file memerrors.c (available in the Download section), you can build a.out as follows:

ksh% cc -g memerrors.c

If you add the prefix purify, it will build an instrumented a.out program, instead:

ksh% purify cc -g memerrors.c

If you use a Makefile to build your executable, you can add one more target to build an instrumented executable file:

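# Recipe lines must start with a tab character.
# All sources are listed explicitly, because $? expands only to the
# prerequisites that are newer than the target.
a.out: foo.c bar.c
	$(CC) $(FLAGS) -o $@ foo.c bar.c
a.out.pure: foo.c bar.c
	purify $(CC) $(FLAGS) -o $@ foo.c bar.c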

To do this, copy the rule that builds the target executable (a.out in this case) and make only two changes:

  • Change the name of the target (a.out.pure).
  • Add the prefix purify to the command that builds the target.

Instrumentation is performed at link time on all platforms. On AIX, instrumentation can also be done directly on executables:

ksh% purify a.out

Here is an example of a session from AIX showing steps of compiling the file memerrors.c, instrumenting it, and then running the instrumented executable program:

ksh % cc -g memerrors.c
ksh % purify a.out
Purify 7.0 AIX (32-bit) (C) Copyright IBM Corporation. 1992, 2006 All Rights Reserved.  
Instrumenting: a.out. libc.a,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,.......,,,,, libcrypt.a., 
Instrumented a.out is a.out.pure.
Done.
ksh % ./a.out.pure

When the instrumented program runs, the Purify GUI opens and shows memory errors as soon as they are detected (see Figure 8).

Figure 8. Memory errors found by Purify (on UNIX)

You can click on an error to see the details. Figure 9 shows the details of an ABR error in the program. They show that memory for the null-terminated string str is allocated at line 107 of the file memerrors.c. However, because of an incorrect size computation in the function genABRandABW, the terminating null byte is stored at str[11] after the name string is copied to str. As a result, printf reads past the end of str, which causes an array bounds read (ABR) error at line 110.

Figure 9. Details of ABR errors, including source code and line number information (on UNIX)
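
The source of memerrors.c is not reproduced here, so the following is only a sketch of this class of defect, not the actual genABRandABW code. The function names dup_name_buggy and dup_name_fixed are hypothetical. The sketch assumes the bug pattern described above: a buffer sized without room for the terminator, and a terminator written at the wrong index.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical buggy copy: the buffer has no room for the terminating
 * null byte, and the terminator is written past the end of the buffer. */
char *dup_name_buggy(const char *name)
{
    size_t len;
    char *str;

    assert(name != NULL);
    len = strlen(name);
    str = malloc(len);              /* BUG: should be len + 1 */
    if (str == NULL)
        return NULL;
    memcpy(str, name, len);
    str[len + 1] = '\0';            /* BUG: out-of-bounds write (ABW) */
    return str;                     /* printing str reads past the end (ABR) */
}

/* Fixed copy: allocate len + 1 bytes and terminate at index len. */
char *dup_name_fixed(const char *name)
{
    size_t len;
    char *str;

    assert(name != NULL);
    len = strlen(name);
    str = malloc(len + 1);
    if (str == NULL)
        return NULL;
    memcpy(str, name, len);
    str[len] = '\0';
    return str;
}
```

Passing the result of dup_name_buggy to printf with a %s conversion makes printf scan beyond the allocated block while it looks for a terminator, which is the read that an ABR report points at. The caller owns the returned buffer in both versions and must release it with free.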

Purify running on UNIX and Linux tracks all UMC errors, including those related to stack variables, but it suppresses them by default. You can see all suppressed errors by selecting View > Suppressed messages (Figure 10).

Figure 10. Seeing suppressed errors (on UNIX)

Suppressing an error report that you are not interested in is also very simple. Just select an error type, right-click on it, then select Suppress from the drop-down menu (Figure 11).

Figure 11. Suppressing an error (on UNIX)

Detailed documentation for the various Purify features is available under the Help menu.

Summary

You now know the various types of memory usage errors and potential programming mistakes that can cause them. Most of these errors result from interactions of seemingly error-free methods in your program. Therefore, it is almost impossible to detect these kinds of errors through code inspection in large, real-world applications, which typically have complex control flow. Finding and fixing such errors is extremely difficult, tedious and time-consuming because a defect and its manifestation may be far apart, and because, by nature, these errors are unpredictable, strange, and hard to reproduce consistently.
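
As a small illustration of how functions that each look correct can combine into a memory error, consider the following sketch. It is not taken from memerrors.c; the names set_label, peek_label, and copy_label are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

static char *current_label;         /* heap buffer owned by this module */

/* Replaces the stored label and frees the previous one.
 * Correct when read in isolation. Returns 0 on success, -1 on failure. */
int set_label(const char *text)
{
    char *copy;

    assert(text != NULL);
    copy = malloc(strlen(text) + 1);
    if (copy == NULL)
        return -1;
    strcpy(copy, text);
    free(current_label);            /* the old buffer is released here */
    current_label = copy;
    return 0;
}

/* Returns the stored label without copying it.
 * Also correct in isolation, but the pointer becomes stale
 * after the next call to set_label. */
const char *peek_label(void)
{
    return current_label;
}

/* Safer accessor: the caller receives its own copy and must free it. */
char *copy_label(void)
{
    char *copy;

    if (current_label == NULL)
        return NULL;
    copy = malloc(strlen(current_label) + 1);
    if (copy != NULL)
        strcpy(copy, current_label);
    return copy;
}
```

A caller that saves the result of peek_label, then calls set_label, and then prints the saved pointer reads freed memory (a free memory read, or FMR, in Purify terms). Neither function contains an obvious mistake, which is why inspection alone tends to miss this kind of defect. Using copy_label instead gives the caller a buffer with an unambiguous owner.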

The effects of these errors range from occasional crashes to incomprehensible results that differ each time you run the program. IBM Rational Purify can help you find, isolate, and fix memory use errors, thereby significantly reducing debugging time. This article showed you how to use Purify to instrument your program. When you run the instrumented program, Purify reports each error when it is about to happen and gives you details, such as line numbers and the call stack, that provide the vital clues you need to debug the problem and identify the root cause. With the help of Purify, you can develop better C and C++ applications, free of memory use defects.


Download

Description                                      Name            Size
Code samples for examples used in the article    memerrors.zip   3KB
