I have a rather straightforward kernel and associated host program (used to measure memory latency) which executes fine using the IBM platform on Cell PPEs and SPEs, and on AMD, nVidia and Intel platforms. On Power7 no errors are reported for any host function call, but the kernel does not appear to run (the result is 0 and the kernel execution takes 0 ns). I checked the generated binary (dumped from OpenCL) and it seems to be correct.
Other OpenCL programs work on our Power7 machine so it doesn't appear to be a problem with the setup.
Is there any known problem with the Power7 implementation that could cause this? If not, I'm happy to provide more details or the complete program.
Kernel execution fails on Power7, no errors reported
Updated on 2011-07-27T08:49:48Z at 2011-07-27T08:49:48Z by PeterTh
Re: Kernel execution fails on Power7, no errors reported 2011-07-21T16:02:44Z
Try running with our 'debug' runtime library, which might show any errors that didn't get caught --
If that doesn't show anything, then your best bet is to post your code here and we'll take a look at it.
Re: Kernel execution fails on Power7, no errors reported 2011-07-21T16:05:32Z
PeterTh,
Have you tried running your program with the DEBUG library? Perhaps an error is being issued that is not being seen. If the debug library doesn't help, then the complete program will be helpful in debugging the problem.
Re: Kernel execution fails on Power7, no errors reported 2011-07-25T09:25:07Z
I tried using the debug runtime, and the only difference is that it takes a bit longer to run. The code is part of a larger suite of microbenchmarks, which could be a bit hard to compile/run in its current state, so I isolated the parts needed and tar'd it here:
It should compile with
g++ main.cpp ../cllib/clcommon.cpp ../cllib/clinfo.cpp -I../cllib -DUNIX -lOpenCL -o mem_latency
The code in the cllib part in particular is quite messy, but it's not really relevant. The functionality in the main.cpp is very straightforward, and as I said it works on every other platform.
I'm looking forward to finding out what is going on here.
Re: Kernel execution fails on Power7, no errors reported 2011-07-25T11:36:35Z
- PeterTh 270004DR9T
$ file cllib/clcommon.cpp
cllib/clcommon.cpp: POSIX tar archive (GNU)
$ tar -tvf cllib/clcommon.cpp
-rwxr--r-- petert/dps 1535 2011-07-11 05:40 cllib/clcommon.h
-rwxr--r-- petert/dps 16657 2011-07-21 04:45 cllib/clinfo.cpp
-rwxr--r-- petert/dps 3430 2011-07-25 04:03 cllib/clinfo.h
-rwxr--r-- petert/dps 3380 2011-07-21 03:50 mem_latency/constant_latency.cl
-rwxr--r-- petert/dps 3243 2011-07-21 03:50 mem_latency/global_latency.cl
-rwxr--r-- petert/dps 4604 2011-07-21 03:51 mem_latency/local_latency.cl
-rwxr--r-- petert/dps 15647 2011-07-25 03:53 mem_latency/main.cpp
-rwxr-xr-x petert/dps 55077 2011-07-25 04:10 mem_latency/mem_latency
Re: Kernel execution fails on Power7, no errors reported 2011-07-25T12:05:32Z
Sorry about that, I have no idea how it happened.
I regenerated the file at http://www.dps.uibk.ac.at/~petert/web/ocl/mem_latency.tar and it should work now. (I actually downloaded, extracted and compiled it to test)
Re: Kernel execution fails on Power7, no errors reported 2011-07-25T15:38:50Z
- PeterTh 270004DR9T
In run_benchmark you declare iterations as an unsigned long:
unsigned long size, _MEM_TYPE mt, _DATA_LAYOUT layout, unsigned long iterations, // benchmark settings
but then pass its address to the kernel with the size of a cl_uint:
err |= clSetKernelArg(kernel, 2, sizeof(cl_uint), &iterations);
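so the kernel just sees the top half of the data, which is 0. In a 64-bit build on Power7, unsigned long is 8 bytes and the byte order is big-endian, so the first 4 bytes at that address are the high-order half. On little-endian hosts the same 4 bytes happen to be the low-order half, which is why the bug stays hidden there.
s/long/int/ in run_benchmark and/or build as a 32-bit program; either change makes it work.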
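The effect described above can be reproduced without OpenCL. The following is a minimal sketch, not the code from the tarball: first_four_bytes, host_is_big_endian and to_uint32_checked are hypothetical helper names, and std::uint64_t stands in for unsigned long in a 64-bit build. It assumes the runtime simply copies the requested number of bytes from the supplied address, which is what passing sizeof(cl_uint) together with a pointer to a wider variable amounts to.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

// Hypothetical helper: copies the first four bytes found at p, which is
// what happens when a pointer is handed over together with a size of 4.
inline std::uint32_t first_four_bytes(const void* p) {
    std::uint32_t v = 0;
    std::memcpy(&v, p, sizeof v);
    return v;
}

// True when the most significant byte is stored first in memory.
inline bool host_is_big_endian() {
    const std::uint16_t probe = 1;
    unsigned char first = 0;
    std::memcpy(&first, &probe, 1);
    return first == 0;
}

// Hypothetical fix: narrow explicitly to a 32-bit value before taking its
// address, so the size passed and the storage pointed to agree.
inline std::uint32_t to_uint32_checked(std::uint64_t value) {
    assert(value <= std::numeric_limits<std::uint32_t>::max());
    return static_cast<std::uint32_t>(value);
}
```

With iterations = 1000 held in a 64-bit variable, first_four_bytes(&iterations) yields 1000 on a little-endian host and 0 on a big-endian one. Copying the value into a 32-bit variable first (here via to_uint32_checked) and passing that variable's address gives 1000 on both.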
Also, you declare iterations as unsigned but then use -1 in several places, which could cause you problems.
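How -1 is used in main.cpp is not shown in this thread, so the following is only a sketch of the general pitfall, with hypothetical function names. It illustrates two standard C++ behaviours: subtracting 1 from an unsigned zero wraps to the maximum value, and a -1 sentinel stored in a 32-bit unsigned no longer compares equal to -1 once it has been widened to 64 bits.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical: "iterations - 1" wraps around instead of going negative.
inline std::uint32_t naive_last_index(std::uint32_t iterations) {
    return iterations - 1;  // 0 becomes 4294967295
}

// Hypothetical guard: handle the zero case before subtracting.
inline std::uint32_t safe_last_index(std::uint32_t iterations) {
    return iterations == 0 ? 0 : iterations - 1;
}

// A 32-bit "-1" sentinel widened to 64 bits is 0x00000000FFFFFFFF,
// while -1 converted directly to 64 bits is 0xFFFFFFFFFFFFFFFF.
inline bool widened_sentinel_still_matches() {
    const std::uint32_t narrow = static_cast<std::uint32_t>(-1);
    const std::uint64_t wide = narrow;
    return wide == static_cast<std::uint64_t>(-1);
}
```

If a sentinel is needed, comparing against std::numeric_limits of the variable's exact type, or using a signed type throughout, avoids both surprises.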