A customer of ours ("us" being the Mentor Graphics Sourcery VSIPL++ team) recently noticed that ALF 4.0.0 has a performance bug: the scheduler thread's main loop busy-waits rather than waiting on pthread conditions as it's supposed to, and it stays in this busy-wait loop even after all of the ALF tasks complete. Obviously, this eats up a lot of CPU time that the rest of the application would like to use!
We found some fairly simple fixes for this, and delivered them to the customer -- but obviously it would be nice to deliver them to the rest of the community as well.
However, this raises the question: Is there a "rest of the community" for ALF these days? Any upstream to send the patches to?
I've attached the patches here, at least. Basically, the problem is that the "release tasks" function only drains the "destroyed" task list if its length exceeds ALF_GC_THRESHOLD, so the "do we have more work to do?" check in the main scheduler loop should use that as the condition on whether to go into a pthread_cond_wait, rather than simply asking whether the length is nonzero. And, when that is fixed, it exposes a latent bug in alf_exit where the scheduler thread needs to be signaled to come out of that pthread_cond_wait to do the final round of garbage collection.
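For anyone who wants to see the shape of the two fixes without opening the patches, here is a minimal sketch in C. It is not the ALF source and not the patch itself: apart from ALF_GC_THRESHOLD (whose value here is made up), every name in it (sched_t, release_tasks, has_work_buggy, has_work_fixed, scheduler_main, sched_exit) is hypothetical and only mirrors the description above.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define ALF_GC_THRESHOLD 4  /* name from the post; this value is an assumption */

/* Hypothetical scheduler state; not ALF's real data structures. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  wake;
    size_t destroyed_len;  /* destroyed tasks awaiting garbage collection */
    size_t pending_len;    /* other queued work */
    bool   exiting;        /* set by the exit path */
    size_t collected;      /* tasks reclaimed so far, for illustration */
} sched_t;

/* Drains the destroyed list only above the threshold, unless forced. */
static void release_tasks(sched_t *s, bool force) {
    if (force || s->destroyed_len > ALF_GC_THRESHOLD) {
        s->collected += s->destroyed_len;
        s->destroyed_len = 0;
    }
}

/* Buggy check: a short destroyed list counts as work, yet release_tasks
 * will not drain it, so the loop never reaches the condition wait. */
static bool has_work_buggy(const sched_t *s) {
    return s->pending_len > 0 || s->destroyed_len > 0;
}

/* Fixed check: uses the same condition as release_tasks. */
static bool has_work_fixed(const sched_t *s) {
    return s->pending_len > 0 || s->destroyed_len > ALF_GC_THRESHOLD;
}

static void *scheduler_main(void *arg) {
    sched_t *s = arg;
    pthread_mutex_lock(&s->lock);
    while (!s->exiting) {
        release_tasks(s, false);
        s->pending_len = 0;  /* stand-in for running the queued work */
        while (!has_work_fixed(s) && !s->exiting)
            pthread_cond_wait(&s->wake, &s->lock);  /* sleep, don't spin */
    }
    release_tasks(s, true);  /* final round of garbage collection */
    pthread_mutex_unlock(&s->lock);
    return NULL;
}

/* Exit path: without the signal, the scheduler would stay asleep in
 * pthread_cond_wait and never do the final collection. */
static void sched_exit(sched_t *s, pthread_t tid) {
    assert(s != NULL);
    pthread_mutex_lock(&s->lock);
    s->exiting = true;
    pthread_cond_signal(&s->wake);
    pthread_mutex_unlock(&s->lock);
    pthread_join(tid, NULL);
}
```

With two destroyed tasks and a threshold of four, has_work_buggy reports work forever while has_work_fixed lets the thread sleep; sched_exit then wakes it so the last two tasks are still reclaimed.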
Bug fixes for ALF 4.0.0 -- I have patches, but does anyone care?
Re: Bug fixes for ALF 4.0.0 -- I have patches, but does anyone care?
Posted by NLai on 2012-04-03T21:09:51Z, in response to BrooksMoses (accepted answer)
It could be a day of remembrance -- Sony removed another possibility of running a supercomputer in an affordable box, the PS3.
I still very much enjoy programming the Cell, with the power of 1 PPE + 6 SPEs.
Just to share a couple of programming tips:
1) Always try to avoid "for" loops; they hurt performance on the Cell badly. Use the SPE's 128-bit vectors instead, which is what can really set your Cell code apart from other architectures.
2) Never compare an "unsigned" quantity with zero; that not only hurts performance, it can also make a mess of your program.
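The post does not say which comparison tip 2 has in mind. One common pitfall it may be referring to is a countdown loop written as `i >= 0` on an unsigned index: decrementing zero wraps around to the largest value, so the test is always true and the loop never ends. A small sketch in plain C (the function names are made up for illustration, and nothing here is Cell-specific):

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Shows why "i >= 0" can never fail for an unsigned i:
 * stepping below zero wraps to UINT_MAX instead of going negative. */
static bool zero_minus_one_wraps(void) {
    unsigned int i = 0;
    --i;
    return i == UINT_MAX;
}

/* Sums an array back to front. The index runs from n down to 1 and the
 * element is read at i - 1, so the loop ends cleanly when i reaches 0
 * and never relies on an unsigned value becoming negative. */
static long sum_reverse(const int *a, size_t n) {
    assert(a != NULL || n == 0);
    long total = 0;
    for (size_t i = n; i != 0; --i)
        total += a[i - 1];
    return total;
}
```

The broken form would be `for (size_t i = n - 1; i >= 0; --i)`, which also misbehaves for n equal to 0 because n - 1 itself wraps.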