Topic
  • 10 replies
  • Latest Post - ‏2010-07-04T17:05:41Z by PedroGonnet
SystemAdmin
SystemAdmin
10114 Posts

Pinned topic realloc_align bug

‏2007-07-08T13:19:50Z |
The realloc_align routine in
/opt/ibm/cell-sdk/prototype/src/lib/misc/realloc_align.h seems to have a bug.

If malloc_align(100,4) happens to get a result from malloc that matches you requested alignment then malloc_align will return a pointer that is sizeof(void*) from what was algined.

You then write some data, do some unrelated mallocs and call realloc_align(ptr,500,4)
If realloc then returns a pointer that no longer matches your alignment then realloc_align will offset the result to get your new alignment but it doesn't move your data.

The comment in the header indicates that realloc_align will preserve your data if the requested alignment matches the original alignment. This isn't true, data only gets preserved if the alignment of the original malloc() call matches the alignment of the underlying realloc() call (so the offset stays constant)

The attached test code demonstrates the problem on my PS3 with sdk 2.0

You can fix the problem by adding a memmove() to realloc_align to move the data the difference between the old offset and the new offset. This might mean that the data gets transfered twice (once by realloc and once by realloc_align)

Updated on 2010-07-04T17:05:41Z at 2010-07-04T17:05:41Z by PedroGonnet
  • CellServ
    CellServ
    1346 Posts

    Re: realloc_align bug

    ‏2007-07-09T22:36:37Z  
    Does this code print "Error" for you every time you run it? I tested this with SDK 2.1 (compiled on x86) and run both in simulator and blade and don't get the "Error" printed.

    I compiled with:

    /opt/cell/bin/ppu-gcc -W -Wall -Winline -I. -I /opt/ibm/cell-sdk/prototype/sysroot/usr/include -O3 realloc_align_test.c

    --
    IBM SDK Service Administrator
  • SystemAdmin
    SystemAdmin
    10114 Posts

    Re: realloc_align bug

    ‏2007-07-15T16:22:55Z  
    • CellServ
    • ‏2007-07-09T22:36:37Z
    Does this code print "Error" for you every time you run it? I tested this with SDK 2.1 (compiled on x86) and run both in simulator and blade and don't get the "Error" printed.

    I compiled with:

    /opt/cell/bin/ppu-gcc -W -Wall -Winline -I. -I /opt/ibm/cell-sdk/prototype/sysroot/usr/include -O3 realloc_align_test.c

    --
    IBM SDK Service Administrator
    If I compile the program like
    gcc -I /opt/ibm/cell-sdk/prototype/sysroot/usr/include realloc_align_test.c

    It fails everytime, even after upgrading to sdk2.1

    However,

    If I compile it with
    ppu-gcc -I /opt/ibm/cell-sdk/prototype/sysroot/usr/include realloc_align_test.c

    I don't get the error.
    For PPU code does ppu-gcc do something different than the stock gcc?
  • CellServ
    CellServ
    1346 Posts

    Re: realloc_align bug

    ‏2007-07-16T15:04:16Z  
    If I compile the program like
    gcc -I /opt/ibm/cell-sdk/prototype/sysroot/usr/include realloc_align_test.c

    It fails everytime, even after upgrading to sdk2.1

    However,

    If I compile it with
    ppu-gcc -I /opt/ibm/cell-sdk/prototype/sysroot/usr/include realloc_align_test.c

    I don't get the error.
    For PPU code does ppu-gcc do something different than the stock gcc?
    Your tests have answered your question :)

    You need to use "ppu-gcc" if you're using any code from the SDK since everything is written to properly align (memory or otherwise) with SPE code, even if you're not starting any SPE threads.

    --
    IBM SDK Service Administrator
  • SystemAdmin
    SystemAdmin
    10114 Posts

    Re: realloc_align bug

    ‏2007-08-08T14:45:57Z  
    • CellServ
    • ‏2007-07-16T15:04:16Z
    Your tests have answered your question :)

    You need to use "ppu-gcc" if you're using any code from the SDK since everything is written to properly align (memory or otherwise) with SPE code, even if you're not starting any SPE threads.

    --
    IBM SDK Service Administrator
    Hi,

    Compiling the test using the SDK ppu32-gcc does not work. It seems that the realloc_align only works correctly when compiling for 64 bits. Can anybody else check that bug. (I've checked it using the last SDK on a PS3 and on a blade).

    Thanks in advance,
    Salut!

    Ramon.
  • SystemAdmin
    SystemAdmin
    10114 Posts

    Re: realloc_align bug

    ‏2007-08-08T19:25:36Z  
    Hi,

    Compiling the test using the SDK ppu32-gcc does not work. It seems that the realloc_align only works correctly when compiling for 64 bits. Can anybody else check that bug. (I've checked it using the last SDK on a PS3 and on a blade).

    Thanks in advance,
    Salut!

    Ramon.
    My coworker and I corroborate ssinger's claim of the incorrectness of realloc_align.

    The following code will crash on a PS3 with sdk2.1. Compile with ppu-gcc -m32 file.c -o file

    code
    #include </opt/ibm/cell-sdk/prototype/src/lib/misc/malloc_align.h>
    #include </opt/ibm/cell-sdk/prototype/src/lib/misc/realloc_align.h>
    #include <stdint.h>
    #include <stdbool.h>

    #define INITIAL_ARRAY_SIZE 1024

    int main(int argc, char ** argv)
    {
    int i = 0;
    int icurrentArraySize = INITIAL_ARRAY_SIZE;
    unsigned long long * db_ppu;
    unsigned char * str;

    str = (unsigned char*)_malloc_align(256, 4);

    db_ppu = (unsigned long long *)_malloc_align(icurrentArraySize*sizeof(unsigned long long), 4);
    printf("db_ppu ptr = %lx, db_ppu realptr = %lx\n", db_ppu, *((void **)(db_ppu)-1));
    for(i = 0; i < icurrentArraySize; i++)
    db_ppu[i] = i;

    while(true)
    {
    icurrentArraySize = icurrentArraySize*2;
    db_ppu = (unsigned long long *)_realloc_align(db_ppu, icurrentArraySize*sizeof(unsigned long long), 4);
    printf("db_ppu ptr = %lx, db_ppu realptr = %lx\n", db_ppu, *((void **)(db_ppu)-1));

    if(db_ppu == NULL)
    {
    printf("Was enable to assign memory... Exiting\n");
    exit(0);
    }

    for(i = 0; i < INITIAL_ARRAY_SIZE; i++)
    {
    if( db_ppu[i] != i)
    {
    printf("ERROR: Data does not match what is expected. Check if the lower four bits of the real pointer have changed.\n");
    exit(1);
    }
    }

    }

    return 0;
    }
    [/code]

    The only difference between my code and ssinger's code is that I dirty the heap in a different way than he does. I'm going to guess that the difference between gcc and ppu-gcc as previously questioned is that either ppu-gcc has a different mallocater than the stock gcc.

    ssinger had it bang on - realloc_align is incorrect code. malloc_align and realloc_align are wrappers to malloc and realloc that ask for a bit more space than the caller requests and increments the pointer returned by the libc call until it is appropriately aligned. something like:

    code
    real_ptr = malloc(size + (1 << log2_align) + sizeof(void**))
    p = real_ptr;
    while(((int)p)&0xf) p++;
    *(p-1) = real_ptr;
    return p;
    [/code]

    Suppose your code looks like this:
    code
    p = malloc_align(some_space, some_alignment);
    p[0] = some_value;
    p = realloc_align(some_more_space, some_alignment);
    [/code]

    malloc_align wraps a call to malloc which returns some value real_p. malloc_align returns some p > real_p that is properly aligned - so the line p[0] write at address p.
    Now when realloc is called, real_p is computed from p, and realloc'ed, say real_p'. Now p' is computed, but is purely based on the offset of real_p'! This disregards the fact that the offset of the DATA in the data block is already fixed by the assignment before the realloc. Thus realloc_align works fine provided that libc's realloc returns a pointer with the same alignment, but that kind of defeats the purpose of the wrapper.

    It takes a special heap situation to have the mallocater return memory that's on a different byte boundary. The sample above needs to be compiled for 32bits because we don't particularily clutter the heap. The malloc on line 15 is the only other heap request, but it is enough; commenting out that line and recompiling and the program runs to completion.

    realloc_align can certainly be coaxed to fail even when working in a 64bit environment, but with the larger address space there has to be considerably more clutter on the heap to force the realloc or malloc to return a pointer with a peculiar alignment.

    Without access to the internals of malloc, our workarounds are either:
    • malloc new memory, do a memcopy, free the old memory (requires both objects to be resident simultaneously, thus may fail if the heap if pretty full)
    OR
    • realloc the memory, but do a memcopy if the alignment of the original pointer has changed (this requires an extra memcopy than the one already implicitly executed in realloc)

    however both of those require that we know the size of the old pointer so we don't generate a segfault copying off the address-space.

    v.
  • CellServ
    CellServ
    1346 Posts

    Re: realloc_align bug

    ‏2007-08-08T19:47:48Z  
    My coworker and I corroborate ssinger's claim of the incorrectness of realloc_align.

    The following code will crash on a PS3 with sdk2.1. Compile with ppu-gcc -m32 file.c -o file

    code
    #include </opt/ibm/cell-sdk/prototype/src/lib/misc/malloc_align.h>
    #include </opt/ibm/cell-sdk/prototype/src/lib/misc/realloc_align.h>
    #include <stdint.h>
    #include <stdbool.h>

    #define INITIAL_ARRAY_SIZE 1024

    int main(int argc, char ** argv)
    {
    int i = 0;
    int icurrentArraySize = INITIAL_ARRAY_SIZE;
    unsigned long long * db_ppu;
    unsigned char * str;

    str = (unsigned char*)_malloc_align(256, 4);

    db_ppu = (unsigned long long *)_malloc_align(icurrentArraySize*sizeof(unsigned long long), 4);
    printf("db_ppu ptr = %lx, db_ppu realptr = %lx\n", db_ppu, *((void **)(db_ppu)-1));
    for(i = 0; i < icurrentArraySize; i++)
    db_ppu[i] = i;

    while(true)
    {
    icurrentArraySize = icurrentArraySize*2;
    db_ppu = (unsigned long long *)_realloc_align(db_ppu, icurrentArraySize*sizeof(unsigned long long), 4);
    printf("db_ppu ptr = %lx, db_ppu realptr = %lx\n", db_ppu, *((void **)(db_ppu)-1));

    if(db_ppu == NULL)
    {
    printf("Was enable to assign memory... Exiting\n");
    exit(0);
    }

    for(i = 0; i < INITIAL_ARRAY_SIZE; i++)
    {
    if( db_ppu[i] != i)
    {
    printf("ERROR: Data does not match what is expected. Check if the lower four bits of the real pointer have changed.\n");
    exit(1);
    }
    }

    }

    return 0;
    }
    [/code]

    The only difference between my code and ssinger's code is that I dirty the heap in a different way than he does. I'm going to guess that the difference between gcc and ppu-gcc as previously questioned is that either ppu-gcc has a different mallocater than the stock gcc.

    ssinger had it bang on - realloc_align is incorrect code. malloc_align and realloc_align are wrappers to malloc and realloc that ask for a bit more space than the caller requests and increments the pointer returned by the libc call until it is appropriately aligned. something like:

    code
    real_ptr = malloc(size + (1 << log2_align) + sizeof(void**))
    p = real_ptr;
    while(((int)p)&0xf) p++;
    *(p-1) = real_ptr;
    return p;
    [/code]

    Suppose your code looks like this:
    code
    p = malloc_align(some_space, some_alignment);
    p[0] = some_value;
    p = realloc_align(some_more_space, some_alignment);
    [/code]

    malloc_align wraps a call to malloc which returns some value real_p. malloc_align returns some p > real_p that is properly aligned - so the line p[0] write at address p.
    Now when realloc is called, real_p is computed from p, and realloc'ed, say real_p'. Now p' is computed, but is purely based on the offset of real_p'! This disregards the fact that the offset of the DATA in the data block is already fixed by the assignment before the realloc. Thus realloc_align works fine provided that libc's realloc returns a pointer with the same alignment, but that kind of defeats the purpose of the wrapper.

    It takes a special heap situation to have the mallocater return memory that's on a different byte boundary. The sample above needs to be compiled for 32bits because we don't particularily clutter the heap. The malloc on line 15 is the only other heap request, but it is enough; commenting out that line and recompiling and the program runs to completion.

    realloc_align can certainly be coaxed to fail even when working in a 64bit environment, but with the larger address space there has to be considerably more clutter on the heap to force the realloc or malloc to return a pointer with a peculiar alignment.

    Without access to the internals of malloc, our workarounds are either:
    • malloc new memory, do a memcopy, free the old memory (requires both objects to be resident simultaneously, thus may fail if the heap if pretty full)
    OR
    • realloc the memory, but do a memcopy if the alignment of the original pointer has changed (this requires an extra memcopy than the one already implicitly executed in realloc)

    however both of those require that we know the size of the old pointer so we don't generate a segfault copying off the address-space.

    v.
    I believe you are correct about the failings of realloc_align and will open an internal defect to get this worked on.

    thanks
    --
    IBM SDK Service Administrator
  • SystemAdmin
    SystemAdmin
    10114 Posts

    Re: realloc_align bug

    ‏2008-04-09T15:38:25Z  
    • CellServ
    • ‏2007-08-08T19:47:48Z
    I believe you are correct about the failings of realloc_align and will open an internal defect to get this worked on.

    thanks
    --
    IBM SDK Service Administrator
    Is this fixed?
  • CellServ
    CellServ
    1346 Posts

    Re: realloc_align bug

    ‏2008-04-09T16:13:37Z  
    Is this fixed?
    Yes, this is fixed in SDK 3.0.

    --
    IBM SDK Service Administrator
  • SystemAdmin
    SystemAdmin
    10114 Posts

    Re: realloc_align bug

    ‏2009-02-18T17:23:05Z  
    • CellServ
    • ‏2008-04-09T16:13:37Z
    Yes, this is fixed in SDK 3.0.

    --
    IBM SDK Service Administrator
    In my opinion, there are still 2 bugs in realloc_align.h (SDK 3.1).

    • First bug: I would replace the 'prev_offset' computation
    
    prev_offset = ptr - real_ptr;
    

    with
    
    prev_offset = ptr - (real_ptr + 2*sizeof(
    
    void *));
    

    Otherwise 'prev_offset' is wrongly computed (especially with several
    successive calls to _realloc_align()).

    • Second bug: replace
    
    memmove(ret, ptr, (size > prev_size) ? prev_size : size);
    

    with
    
    memmove(ret, (
    
    void *)(real + 2*sizeof(
    
    void *)) + prev_offset, (size > prev_size) ? prev_size : size);
    

    since the call to realloc() can modify the content of the "old" memory area
    (this actually happens in some of my tests). Therefore we cannot rely on the content
    of this old memory area, and we should better move the content of the "new" memory area.

    Can you please confirm/invalidate these 2 points?
    Thank you.
  • PedroGonnet
    PedroGonnet
    12 Posts

    Re: realloc_align bug

    ‏2010-07-04T17:05:41Z  
    In my opinion, there are still 2 bugs in realloc_align.h (SDK 3.1).

    • First bug: I would replace the 'prev_offset' computation
    <pre class="jive-pre"> prev_offset = ptr - real_ptr; </pre>
    with
    <pre class="jive-pre"> prev_offset = ptr - (real_ptr + 2*sizeof( void *)); </pre>
    Otherwise 'prev_offset' is wrongly computed (especially with several
    successive calls to _realloc_align()).

    • Second bug: replace
    <pre class="jive-pre"> memmove(ret, ptr, (size > prev_size) ? prev_size : size); </pre>
    with
    <pre class="jive-pre"> memmove(ret, ( void *)(real + 2*sizeof( void *)) + prev_offset, (size > prev_size) ? prev_size : size); </pre>
    since the call to realloc() can modify the content of the "old" memory area
    (this actually happens in some of my tests). Therefore we cannot rely on the content
    of this old memory area, and we should better move the content of the "new" memory area.

    Can you please confirm/invalidate these 2 points?
    Thank you.
    I can confirm that realloc_align is broken in SDK 3.1 for 64-bit applications. Since the code that breaks it is part of a much larger program, I won't post it here, but I was getting shifted data after calling realloc_align and doing the re-allocation "by hand" (e.g. malloc_align, memcpy, free_align) fixed the problem.

    Cheers, Pedro.