Topic
  • 8 replies
  • Latest Post - ‏2011-04-01T08:36:44Z by SystemAdmin
SystemAdmin
SystemAdmin
131 Posts

Pinned topic Generating for PowerPC 2.01 and not 2.03

‏2011-03-25T10:19:27Z |
Hello,

Being an optimist, I'm trying to use the OpenCL SDK 0.2 on a dual 970MP system running Fedora 12. The SDK installs just fine, and after very limited tweaking I got everything to work.

But because the SDK detects the system as "POWER", it generates code for the PowerPC ISA 2.03 (i.e. POWER6 & above), while my 970MPs only implement 2.01. So when the SDK emits an instruction such as "friz f26,f21", the OpenCL kernel dies with a SIGILL.

Is there any way to trick the SDK compiler (xlcl?) into generating 2.01 code only?
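
For readers who want to confirm which CPU the SDK is actually running on, here is a minimal sketch that reads the "cpu" field of /proc/cpuinfo. The ISA mapping only repeats the levels quoted in this thread and is not an authoritative table; the function names are made up for illustration.

```python
import re

# ISA levels as stated in this thread (not an authoritative table).
ISA_BY_CPU = {
    "PPC970": "2.01",   # 970 family, including the 970MP
    "POWER6": "2.03",   # the thread treats 2.03 as "POWER6 & above"
}


def cpu_models(cpuinfo_text):
    """Return the distinct values of the 'cpu' field from /proc/cpuinfo text."""
    models = []
    for line in cpuinfo_text.splitlines():
        match = re.match(r"^cpu\s*:\s*(.+)$", line)
        if match:
            # Drop trailing annotations such as ", altivec supported".
            model = match.group(1).split(",")[0].strip()
            if model not in models:
                models.append(model)
    return models


def isa_level(model):
    """Map a model string to the ISA level quoted in the thread, or None."""
    for prefix, level in ISA_BY_CPU.items():
        if model.upper().startswith(prefix):
            return level
    return None


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as handle:
            text = handle.read()
    except OSError:
        text = ""
    for model in cpu_models(text):
        print(model, "->", isa_level(model) or "unknown")
```

On a machine whose /proc/cpuinfo has no plain "cpu" field, the script simply prints nothing.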
  • SystemAdmin
    SystemAdmin
    131 Posts

    Re: Generating for PowerPC 2.01 and not 2.03

    ‏2011-03-25T13:49:13Z  
    > RomainDolbeau wrote:
    > Is there any way to trick the SDK compiler (xlcl?) into generating 2.01 code only?

    I think I found most of the answer, so in case someone else is interested...

    From the answer here, I checked the file
    
    /opt/ibmcmp/xlc/opencl/0.2/etc/vac.cfg
    
    to try and convince the compiler to produce 2.01-compliant code. But I couldn't get a "-qarch" value other than "pwr6" (regardless of "-qtune" value or lack thereof) to work. "ppc970", "pwr4", "pwr5", "ppc64v" are all unsupported, and most generic values refuse AltiVec.

    So far it seems the PPC970 just cannot work with OpenCL 0.2.
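
To see which architecture the configuration file currently requests before editing it, a small scan along these lines may help. It assumes the options appear in the usual XL form "-qarch=<value>"; the layout of vac.cfg itself is not described in this thread, so the script treats the file as plain text. The function names are hypothetical.

```python
import re


def qarch_values(config_text):
    """Return every -qarch value found in compiler configuration text, in order."""
    return re.findall(r"-qarch=([A-Za-z0-9_]+)", config_text)


def report(path):
    """Print each line of the file at `path` that sets -qarch, with its line number."""
    with open(path) as handle:
        for number, line in enumerate(handle, start=1):
            for value in qarch_values(line):
                print(f"{path}:{number}: -qarch={value}")
```

For example, `report("/opt/ibmcmp/xlc/opencl/0.2/etc/vac.cfg")` would list every place the file sets the target architecture.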
  • SystemAdmin
    SystemAdmin
    131 Posts

    Re: Generating for PowerPC 2.01 and not 2.03

    ‏2011-03-25T13:50:20Z  
    Hi Romain,

    I am glad to hear that you are interested in IBM OpenCL SDK. There is no guaranteed way to configure the compiler to support the older ISA, but you may try the "-noopt" build flag to see if it at least gets you past the SIGILL. Of course, this may impact performance.

    Regards.
  • SystemAdmin
    SystemAdmin
    131 Posts

    Re: Generating for PowerPC 2.01 and not 2.03

    ‏2011-03-25T14:05:43Z  
    Romain,

    I remind you that you are treading on undesigned and untested territory; the result may not be functional and will certainly not be optimal. However, you might find some success by modifying the build template to choose the Cell PPU (ppu-xlcl) compiler instead of the Power (xlcl) compiler. The Cell PPU conforms to the 2.02 Power ISA, which is closer to the 2.01 ISA. The build template is /usr/include/CL/device/cl_build_template_CPU.

    Dan B.
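
A rough sketch of the suggested swap, for anyone who prefers to script it. The contents of the build template are not shown in this thread, so this assumes the template names the compiler as a bare "xlcl" token; check the file by hand before relying on it. The helper names are hypothetical, and a backup copy is kept next to the original.

```python
import re
import shutil

# Matches "xlcl" as a standalone command name, but not "ppu-xlcl".
_XLCL = re.compile(r"(?<![\w-])xlcl\b")


def use_ppu_compiler(template_text):
    """Return template text with xlcl invocations replaced by ppu-xlcl."""
    return _XLCL.sub("ppu-xlcl", template_text)


def patch_template(path):
    """Rewrite the template in place, keeping the original as <path>.orig."""
    shutil.copy2(path, path + ".orig")
    with open(path) as handle:
        text = handle.read()
    with open(path, "w") as handle:
        handle.write(use_ppu_compiler(text))
```

Running `patch_template("/usr/include/CL/device/cl_build_template_CPU")` would need write permission on that file, and restoring the ".orig" copy undoes the change.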
  • SystemAdmin
    SystemAdmin
    131 Posts

    Re: Generating for PowerPC 2.01 and not 2.03

    ‏2011-03-25T14:53:31Z  
    > gbello wrote:

    > I am glad to hear that you are interested in IBM OpenCL SDK.

    I am, but unfortunately it's hard to convince the boss to part with the money for a full-fledged POWER6 system :-/ In the meantime, I tried on a colleague's old YDL PowerStation. Those 970MPs are still pretty fast.

    > There is no guaranteed
    > way to configure the compiler to support the older ISA, but you may try the "-noopt"
    > build flag to see if it at least gets you past the SIGILL. Of course, this may impact
    > performance.

    -noopt doesn't seem to be supported; I suspect you meant the standard "-cl-opt-disable".
    If I use this option, the CL compiler refuses the "__attribute__((reqd_work_group_size(64, 1, 1)))" on the kernel. If I remove the attribute, it compiles, but it seems all my results are 0.0 ... and they really shouldn't be.
  • SystemAdmin
    SystemAdmin
    131 Posts

    Re: Generating for PowerPC 2.01 and not 2.03

    ‏2011-03-25T15:11:02Z  
    > brokensh wrote:
    > I remind you that you are treading on undesigned and untested territory; the result may not be
    > functional and will certainly not be optimal.

    That's what makes things interesting :-)

    > However, you might find some success by modifying the build template to choose the
    > Cell PPU (ppu-xlcl) compiler instead of the Power (xlcl) compiler. The Cell PPU
    > conforms to the 2.02 Power ISA which is closer to the 2.01 ISA. The build template
    > is /usr/include/CL/device/cl_build_template_CPU.

    Guess what? It worked :-) I'm unsure about the results (the code is still running :-), but it's definitely doing something instead of crashing.

    Thanks everyone for the help,
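
Since the output on this unsupported setup is still unverified at this point (and the earlier "-cl-opt-disable" run returned all zeros), a host-side check against a reference computed on the CPU is a cheap safeguard. This is only a sketch: the function names and tolerances are assumptions, and the reference values have to come from your own host implementation of the kernel.

```python
import math


def all_zero(values):
    """True if every element is exactly 0.0, the symptom reported earlier."""
    return all(value == 0.0 for value in values)


def mismatches(device, reference, rel_tol=1e-5, abs_tol=1e-8):
    """Return the indices where device output disagrees with the host reference."""
    if len(device) != len(reference):
        raise ValueError("device and reference outputs differ in length")
    return [
        index
        for index, (got, expected) in enumerate(zip(device, reference))
        if not math.isclose(got, expected, rel_tol=rel_tol, abs_tol=abs_tol)
    ]
```

An empty list from `mismatches` means the device output agrees with the reference within the chosen tolerances; single-precision kernels may need looser values than the defaults used here.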
  • SystemAdmin
    SystemAdmin
    131 Posts

    Generating for PowerPC 2.01 and not 2.03, using SDK 0.3

    ‏2011-03-31T12:41:51Z  
    > RomainDolbeau wrote:
    > Guess what? It worked :-)

    Unfortunately, I can't get this to work using the newly released SDK 0.3. It works without tricks on the Cell of a PS3, but on the 970MPs nothing will run, not even the SDK samples. From my limited checking with strace & gdb, the code stalls on the pthread_cond_wait inside ibm::openclHost::Event::wait inside ibm::openclHost::CommandQueue::finish inside clFinish. It seems the kernel is never really started (top shows the process at 0% CPU), and so the wait goes on forever.

    Any ideas welcome...
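
When a sample can block forever inside clFinish like this, it helps to run it under a watchdog so a test script does not hang with it. A minimal sketch using only the Python standard library follows; the function name and the sample binary name in the usage note are hypothetical.

```python
import subprocess
import sys


def run_with_timeout(command, seconds):
    """Run command; return its exit code, or None if it exceeded the timeout.

    subprocess.run kills the child process when the timeout expires.
    """
    try:
        completed = subprocess.run(command, timeout=seconds)
    except subprocess.TimeoutExpired:
        return None
    return completed.returncode


if __name__ == "__main__":
    # Usage: python watchdog.py <seconds> <command> [args...]
    result = run_with_timeout(sys.argv[2:], float(sys.argv[1]))
    print("stalled" if result is None else f"exit code {result}")
```

For instance, `python watchdog.py 60 ./some_sdk_sample` would report "stalled" if the sample is still waiting after a minute.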
  • SystemAdmin
    SystemAdmin
    131 Posts

    Re: Generating for PowerPC 2.01 and not 2.03, using SDK 0.3

    ‏2011-03-31T19:53:01Z  
    Romain,

    Without actually recreating the environment that you have, we can only guess what could be going wrong. One of my colleagues thought your symptoms looked similar to the cpu map problem briefly described in this forum post.
    http://www.ibm.com/developerworks/forums/thread.jspa?threadID=357392&tstart=0
    When you run a sample, check that all the expected compute threads exist. If they don't, then it may be this problem.

    Since you have had to force install or manually install the ppu-xlcl compiler on a non-Cell system, make sure you don't have mismatched components (some from v0.2 and some from v0.3). Since this is a technology preview, backwards compatibility is not rigorously tested -- especially on an unsupported system like the 970.
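
One way to check the compute threads while a sample is running is to list the entries under /proc/<pid>/task, which Linux populates with one directory per thread. The sketch below is Linux-only, and the function name is made up. How many threads the SDK is expected to create is not stated in this thread, so the count has to be compared against a known-good system.

```python
import os
import sys


def thread_ids(pid):
    """Return the kernel thread IDs of a process, read from /proc/<pid>/task."""
    return sorted(int(name) for name in os.listdir(f"/proc/{pid}/task"))


if __name__ == "__main__":
    # Usage: python threads.py <pid of the running sample>
    ids = thread_ids(int(sys.argv[1]))
    print(f"{len(ids)} thread(s): {ids}")
```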
  • SystemAdmin
    SystemAdmin
    131 Posts

    Re: Generating for PowerPC 2.01 and not 2.03, using SDK 0.3

    ‏2011-04-01T08:36:44Z  
    > brokensh wrote:
    > Without actually recreating the environment that you have, we can only guess what could be going wrong. One of my colleagues thought your symptoms looked similar to the cpu map problem briefly described in this forum post.
    > http://www.ibm.com/developerworks/forums/thread.jspa?threadID=357392&tstart=0
    > When you run a sample, check that all the expected compute threads exist. If they don't, then it may be this problem.

    The test code in the forum post you linked was returning 'FAIL'. After reconfiguring the boot loader to use a slightly newer kernel (2.6.32.26-175.fc12.ppc64 instead of 2.6.31.5-127.fc12.ppc64), the test started to return 'PASS'... and OpenCL is working again, using ppu-xlcl :-)

    Thanks a lot for the help.
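
For anyone comparing their own kernel against the two versions reported above, here is a small sketch that parses a release string such as the output of `uname -r`. The thread does not say which kernel change fixed the problem, so treating 2.6.32.26-175 as a threshold is only an assumption based on this one report; the function name is hypothetical.

```python
import platform
import re


def kernel_version(release):
    """Parse '2.6.32.26-175.fc12.ppc64' into ((2, 6, 32, 26), 175)."""
    match = re.match(r"^(\d+(?:\.\d+)*)(?:-(\d+))?", release)
    if not match:
        raise ValueError(f"unrecognised release string: {release!r}")
    numbers = tuple(int(part) for part in match.group(1).split("."))
    build = int(match.group(2)) if match.group(2) else 0
    return numbers, build


if __name__ == "__main__":
    # The kernel on which the cpu map test started to PASS in this thread.
    working = kernel_version("2.6.32.26-175.fc12.ppc64")
    try:
        current = kernel_version(platform.release())
    except ValueError:
        print("could not parse", platform.release())
    else:
        relation = "at or above" if current >= working else "older than"
        print(platform.release(), "is", relation, "the working kernel")
```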