IBM Support

LINUX GDB: IDENTIFY MEMORY LEAKS

Technical Blog Post


Body

This short article describes how to track memory leaks using 'gdb' on Linux. If you are using a product such as 'db2', or any other product that has its own memory management routines, you should first track possible leaks at the product level using the tools it provides. For 'db2' that would be 'db2pd', for example. This article therefore applies only to memory leaks at the 'malloc()' level, that is, blocks of memory allocated using 'malloc()' but never freed.

 

Remember that running a process under a debugger with a breakpoint on every 'malloc()' and 'free()' call is very slow and might cause performance issues. If you can instead 'overload' the default 'malloc()/free()' routines, via LD_PRELOAD or another way, that will probably be faster and easier than using the debugger.

 

NOTE: The 'gdb' script provided below can be modified as needed. Remember that
      many things, such as the 'gdb' version, the OS version or the compilation
      options, can lead to slightly different results. You should therefore
      consider this script a 'basis' to work on rather than something
      that will work 100% on all platforms for whatever executable.

 

If you are still reading, this means you have no other choice but to use the debugger. So let's first describe what we need to do.

 

  - Every time we enter 'malloc()' we should 'save' the requested memory
    allocation size in a variable.
  - Every time we return from 'malloc()' we should print the size and the
    address returned by 'malloc()'.
  - Every time we enter 'free()' we should print the 'pointer' we are
    about to free.

 

Note that while the 'size' is not strictly needed, it is good to have because it might allow you to find some 'pattern' in the list of allocated blocks. Since we don't want to interact manually with the debugger each time a block of memory is allocated or freed, we put the commands in a script that 'gdb' takes as an argument and executes without any manual intervention.
Now let's show what the script looks like; then we will show a typical output and explain a few things about the script.

 

  == gdbcmd1 ==

  set pagination off
  set breakpoint pending on
  set logging file gdbcmd1.out
  set logging on
  hbreak malloc
  commands
  set $mallocsize = (unsigned long long) $rdi
  continue
  end
  hbreak *(malloc+191)
  commands
  printf "malloc(%lld) = 0x%016llx\n", $mallocsize, $rax
  continue
  end
  hbreak free
  commands
  printf "free(0x%016llx)\n", (unsigned long long) $rdi
  continue
  end
  continue

 

Then we attach the debugger to a running process (here the executable 'server2' with process id 11543) using the script:

 

  # gdb --command=gdbcmd1 server2 11543

 

The program will run and the output is written to a file named 'gdbcmd1.out'. Because the file also contains 'gdb' messages besides the output of the 'printf' commands we added to the script, we can get a cleaner output by doing this:

 

  # grep -e "^malloc" -e "^free" gdbcmd1.out

 

We have something like this:

 

  malloc(57) = 0x0000000012e1b260
  free(0x0000000012e1b260)
  malloc(57) = 0x0000000012e1b260
  free(0x0000000012e1b260)
  malloc(100) = 0x000000000e497d30
  malloc(15) = 0x0000000011bda010
  malloc(568) = 0x0000000011bda030
  free(0x0000000000000000)
  malloc(2248) = 0x0000000011bda270
  free(0x0000000011bda030)
  malloc(20) = 0x0000000011bda030
  malloc(20) = 0x0000000011bda050
  malloc(20) = 0x0000000011bda070
  malloc(20) = 0x0000000011bda090
  malloc(20) = 0x0000000011bda0b0
  malloc(20) = 0x0000000011bda0d0
  free(0x0000000011bda010)
  malloc(15) = 0x0000000011bda010
  free(0x0000000011bda010)
  malloc(15) = 0x0000000011bda010
  malloc(21) = 0x0000000011bda210
  malloc(21) = 0x0000000011bda230
  malloc(56) = 0x0000000011bdafe0

 

Once you have that, you can parse the output to find the blocks that are never freed. Later on you can add a 'where' command to the commands of the breakpoint on the entry of 'malloc()' to see where the allocation comes from. For example, you would replace this:

 

  hbreak malloc
  commands
  set $mallocsize = (unsigned long long) $rdi
  continue
  end

 

With this:

 

  hbreak malloc
  commands
  set $mallocsize = (unsigned long long) $rdi
  where
  continue
  end
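
The parsing step mentioned earlier can be sketched in a few lines of Python. This is only a minimal sketch and not part of the original procedure. It assumes the log lines have exactly the 'malloc(size) = 0xaddr' and 'free(0xaddr)' format produced by the 'printf' commands of the script, and the function name 'find_unfreed' is made up for the illustration. It keeps a table of the blocks that are currently allocated, removes a block when its address is freed, and reports what is left at the end.

```python
import re
from collections import Counter

# Line formats produced by the two printf commands of the gdb script.
MALLOC_RE = re.compile(r"^malloc\((\d+)\) = 0x([0-9a-fA-F]+)$")
FREE_RE = re.compile(r"^free\(0x([0-9a-fA-F]+)\)$")


def find_unfreed(lines):
    """Return {address: size} for blocks allocated but never freed in the log."""
    live = {}  # address -> size of the block currently allocated there
    for raw in lines:
        line = raw.strip()
        m = MALLOC_RE.match(line)
        if m:
            addr = int(m.group(2), 16)
            if addr != 0:  # malloc() returned NULL: nothing to track
                live[addr] = int(m.group(1))
            continue
        m = FREE_RE.match(line)
        if m:
            # free(NULL) and frees of blocks allocated before the trace
            # started are not in the table and are silently ignored.
            live.pop(int(m.group(1), 16), None)
        # Any other line (gdb messages, backtraces) is skipped.
    return live


# Small demonstration on a few lines taken from the sample output above.
sample = [
    "malloc(57) = 0x0000000012e1b260",
    "free(0x0000000012e1b260)",
    "malloc(100) = 0x000000000e497d30",
    "malloc(15) = 0x0000000011bda010",
    "malloc(568) = 0x0000000011bda030",
    "free(0x0000000000000000)",
    "free(0x0000000011bda030)",
    "malloc(20) = 0x0000000011bda030",
    "malloc(20) = 0x0000000011bda050",
]
unfreed = find_unfreed(sample)
for addr, size in sorted(unfreed.items()):
    print(f"never freed: 0x{addr:016x} ({size} bytes)")
# Grouping by size can help to spot a 'pattern' in the leaked blocks.
for size, count in Counter(unfreed.values()).most_common():
    print(f"size {size}: {count} block(s)")
```

To run it on a real trace you would pass the open log file instead of the sample list, for example 'find_unfreed(open("gdbcmd1.out"))'. Keep in mind that a block still allocated when the trace stops is not necessarily a leak; it only becomes suspicious when the same sizes keep accumulating over time.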

 

Now some words about the 'gdb' script 'gdbcmd1' itself:

 

  set pagination off

  - This avoids having to press 'return' each time the screen fills up
    with messages. It can be seen as a kind of 'auto-scroll' option.

 

  set breakpoint pending on

  - If the library containing the function is not yet loaded, the debugger
    waits for it before placing the breakpoint. In this case, of course,
    it should already be loaded in the running process.

 

  set logging file gdbcmd1.out

  - Instruct 'gdb' to save the output to a file named 'gdbcmd1.out'.

 

  set logging on

  - Instruct 'gdb' to start sending output to 'gdbcmd1.out'. Recent 'gdb'
    versions may prefer the spelling 'set logging enabled on'.

 

  hbreak malloc
  commands
  set $mallocsize = (unsigned long long) $rdi
  continue
  end

  - Place a 'hardware assisted' breakpoint on the entry of malloc. Save the
    requested allocation size in a variable named 'mallocsize'. Note that on
    x86-64 (AMD64) the first argument to a function is in register '$rdi'.

 

  hbreak *(malloc+191)
  commands
  printf "malloc(%lld) = 0x%016llx\n", $mallocsize, $rax
  continue
  end

  - Place a 'hardware assisted' breakpoint on the return from malloc.
    Print the size as well as the pointer returned by malloc. Note that the
    return value on x86-64 (AMD64) is in register '$rax'. The offset (191
    here) depends on the C library build and must be recomputed on your
    system. See below for finding the offset of the 'return' instruction
    from malloc.

 

  hbreak free
  commands
  printf "free(0x%016llx)\n", (unsigned long long) $rdi
  continue
  end
  continue

  - Place a 'hardware assisted' breakpoint on the entry of free and print
    the pointer about to be freed, which is the first argument and is
    therefore in register '$rdi'. The final 'continue' resumes the process
    once all the breakpoints are in place.

 

- Finding the 'return' instruction offset from malloc -

 

We could use the 'finish' command of 'gdb' but in many cases it simply did not work as expected for me, so this way of doing it is an alternative. To find the return instruction offset in malloc you need to check the assembly for malloc ('disas malloc' in gdb) and locate the 'retq' (64 bits) instruction, which some versions of the disassembler may print as 'ret'. Then you take the address of that instruction and subtract the address of the first instruction in malloc. The result is the offset to use in 'hbreak *(malloc+offset)'. For example:

 

  (gdb) disas malloc
  Dump of assembler code for function malloc:
  0x0000003f8a673fb0 <malloc+0>:  mov    %rbp,-0x10(%rsp)
  0x0000003f8a673fb5 <malloc+5>:  mov    %rbx,-0x18(%rsp)
  0x0000003f8a673fba <malloc+10>: mov    %rdi,%rbp
  0x0000003f8a673fbd <malloc+13>: mov    %r12,-0x8(%rsp)
  0x0000003f8a673fc2 <malloc+18>: sub    $0x18,%rsp
  0x0000003f8a673fc6 <malloc+22>: mov    0x2dbe53(%rip),%rax
  0x0000003f8a673fcd <malloc+29>: mov    (%rax),%rax
  ...
  0x0000003f8a674055 <malloc+165>:        mov    0x8(%rsp),%rbp
  0x0000003f8a67405a <malloc+170>:        mov    0x10(%rsp),%r12
  0x0000003f8a67405f <malloc+175>:        add    $0x18,%rsp
  0x0000003f8a674063 <malloc+179>:        retq
  0x0000003f8a674064 <malloc+180>:        mov    %rbx,%rdi
  0x0000003f8a674067 <malloc+183>:        mov    %rbp,%rsi
  0x0000003f8a67406a <malloc+186>:        xor    %r12d,%r12d
  ...

  0x0000003f8a674063 - 0x0000003f8a673fb0 = 179
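
The same computation can be automated. The following Python code is only a rough sketch: it assumes that the text given to it is the output of 'disas malloc' starting at the first instruction of the function, and the function name 'return_offsets' is made up for the illustration. It reports every return instruction it finds, which is useful because a function may contain more than one return path, and each of them would need its own breakpoint.

```python
import re

# Matches disassembly lines such as:
#   0x0000003f8a674063 <malloc+179>:        retq
# An optional '=> ' marker (current instruction) is tolerated.
INSN_RE = re.compile(r"^\s*(?:=>\s*)?0x([0-9a-fA-F]+)\s+<[^>]*>:\s+(.*)$")


def return_offsets(disas_text, mnemonics=("ret", "retq")):
    """Return (offset, address) pairs for each return instruction found."""
    found = []
    first_addr = None
    for line in disas_text.splitlines():
        m = INSN_RE.match(line)
        if not m:
            continue  # header, '...' or any other non-instruction line
        addr = int(m.group(1), 16)
        if first_addr is None:
            first_addr = addr  # assumed to be the entry point of the function
        # Look at the first two tokens to also catch forms like 'repz retq'.
        tokens = m.group(2).split()
        if any(tok in mnemonics for tok in tokens[:2]):
            found.append((addr - first_addr, addr))
    return found


# Demonstration on an excerpt of the disassembly shown above.
SAMPLE = """\
  Dump of assembler code for function malloc:
  0x0000003f8a673fb0 <malloc+0>:  mov    %rbp,-0x10(%rsp)
  0x0000003f8a673fb5 <malloc+5>:  mov    %rbx,-0x18(%rsp)
  ...
  0x0000003f8a67405f <malloc+175>:        add    $0x18,%rsp
  0x0000003f8a674063 <malloc+179>:        retq
  0x0000003f8a674064 <malloc+180>:        mov    %rbx,%rdi
"""
for offset, addr in return_offsets(SAMPLE):
    print(f"hbreak *(malloc+{offset})")  # prints: hbreak *(malloc+179)
```

For this particular disassembly the breakpoint would therefore be placed at 'malloc+179'. Hardware breakpoints are a limited resource, so if the function has many return instructions you might have to fall back to regular breakpoints for some of them.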

 

Once again, you might have to play a bit with the script to obtain the results
you are after, but this should give you a base to start from.
