IBM Support

== DEBUGGING CORE FILES [06] == STACK TRACE <-> SOURCE CODE [EXAMPLE 1]

Technical Blog Post


C program  : example1.c [see at the end of entry]
Platform   : AIX
Compilation: cc -q64 example1.c -o example1
Execution  : ./example1 example1.c 99

  # example1 example1.c 99
  Segmentation fault(coredump)

Let's load the core file and print the stack:

  # dbx example1 mycore
  Type 'help' for help.
  [using memory image in mycore]
  reading symbolic information ...warning: no source compiled with -g
  Segmentation fault in gettimeofday at 0x90000000013f468
  0x90000000013f468 (gettimeofday+0x128) f81f0000            std   r0,0x0(r31)
  (dbx) where
  gettimeofday(??, ??) at 0x90000000013f468
  main(0x300000003, 0xffffffffffff020) at 0x100000530
  (dbx)

This is an example where the core file was generated while the program
was executing an instruction inside a system library, for which we do
not have the source code. What we will do instead is work out which
arguments were passed to 'gettimeofday()'.

There are 2 calls to 'gettimeofday()' in 'example1.c', so first we need
to find out which one this is. To do that we take the address that
'where' reports for 'main()', here '0x100000530', and look at the
assembly around it. First we display the instruction at that address,
which also gives us its offset from the start of 'main()':

  (dbx) 0x100000530/i
  0x100000530 (main+0xb0) 480001f1             bl   0x100000720 (gettimeofday)

So this tells us that the instruction is at offset 0xb0 from the start
of function main. We can then compute the start of main:

  (dbx) p (void *) 0x100000530 - 0xb0
  0x0000000100000480

or, more simply:

  (dbx) p &main
  0x0000000100000480

And then we print the assembly code for 'main()':

  (dbx) 0x0000000100000480/128i

Let's keep only the part surrounding our call...

  ...
  0x100000520 (main+0xa0)          bl   0x1000006a8 (exit)
  0x100000524 (main+0xa4)          ld   r2,0x28(r1)
  0x100000528 (main+0xa8)          ld   r3,0x90(r1)
  0x10000052c (main+0xac)          li   r4,0x0
  0x100000530 (main+0xb0)          bl   0x100000720 (gettimeofday)
  0x100000534 (main+0xb4)          ld   r2,0x28(r1)
  ...

So we have found the call. Since the SEGV happens inside the system
library, the next thing to check is the arguments. As mentioned in
topic '5', the registers used to pass arguments to functions on AIX
are $r3 to $r10. Here that gives us the following:

  From man pages:

  int gettimeofday ( Tp,  Tzp)
  struct timeval *Tp;
  void *Tzp;

  From assembly:

  0x100000514 (main+0x94)          bl   0x100000680 (printf)
  0x100000518 (main+0x98)          ld   r2,0x28(r1)
  0x10000051c (main+0x9c)          li   r3,0x1
  0x100000520 (main+0xa0)          bl   0x1000006a8 (exit)
  0x100000524 (main+0xa4)          ld   r2,0x28(r1)
  0x100000528 (main+0xa8)          ld   r3,0x90(r1)
    $r3 = arg0 = Tp
  0x10000052c (main+0xac)          li   r4,0x0
    $r4 = arg1 = Tzp = 0
  0x100000530 (main+0xb0)          bl   0x100000720 (gettimeofday)

We do not have the value of 'Tp' because we have not yet learned how to
navigate the raw stack to recover arguments. However, we can see a few
things:

  - The second argument is 0, which according to the man pages is valid.
    So it must be the first argument (Tp) that is invalid.

  - Right before the call to 'gettimeofday()' we see a 'printf()' and an
    'exit()'. So this is most likely the first call to 'gettimeofday()'
    in our source code; before the last one we would expect to see a
    call to 'fclose()' instead.

So this would be here:

   +32      if ((bf = (char *) malloc(BUFSIZE)) == 0) {
   +33          printf("[error] malloc(bf), [errno = %d]\n", errno);
   +34          exit(1);
   +35      }
   +36
   +37      gettimeofday(t, 0);

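Now we know that 't' is wrong. Looking at the source code, we see that
't' is never set: it is declared as a pointer to 'struct timeval' but
never made to point at valid storage, so 'gettimeofday()' stores its
result through a garbage address. We could either allocate it or,
better, simply change it to a local variable and pass its address.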

== SOURCE FILE FOR example1.c ==

#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <strings.h>
#include <string.h>
#include <unistd.h>
#include <errno.h>


#define BUFSIZE 512


int
main(int ac, char **av)
{
    FILE                    *fp;
    char                    *bf;
    char                    *fname;
    int                      max;
    int                      num;
    struct timeval          *t;


    if (ac < 3) {
        printf("Usage: %s <filename> <maxloops>\n", av[0]);
        exit(1);
    }

    fname = av[1];
    max = atoi(av[2]);

    if ((bf = (char *) malloc(BUFSIZE)) == 0) {
        printf("[error] malloc(bf), [errno = %d]\n", errno);
        exit(1);
    }

    gettimeofday(t, 0);
    printf("start time: %ldsec %ldnsec\n", t->tv_sec, t->tv_usec);

    if ((fp = fopen(fname, "r")) == 0) {
        printf("[error] fopen(%s), [errno = %d]\n", fname, errno);
        exit(1);
    }

    num = 0;
    while (num < max) {

        if (fgets(bf, BUFSIZE, fp) == 0)
            break;

        printf("line = %s\n", bf);

        num++;
    }

    fclose(fp);

    gettimeofday(t, 0);
    printf("end time  : %ldsec %ldnsec\n", t->tv_sec, t->tv_usec);

    exit(0);
}

 

