jzola
jzola
2 Posts

CL_OUT_OF_RESOURCES on QS20

2011-01-21T04:11:53Z
The OpenCL kernel below generates CL_OUT_OF_RESOURCES at barrier(...) when executed on IBM QS20 (device accelerator). Any idea what could be causing it?
__kernel void wdthf(__const unsigned int n, __const unsigned int m,
                    __global float* in, __global float* out,
                    __local float4* buf) {
    unsigned int gIDx = get_global_id(0);
    unsigned int gIDy = get_global_id(1);
    unsigned int block = get_local_size(1);

    __global float* tab = in + (gIDx * m);

    unsigned int l = 0;

    float x = 0.0f;
    float delta = 0.0f;

    float mean = 0.0f;
    float M2 = 0.0f;

    unsigned int offset = gIDy;

    while (offset < m) {
        l++;
        x = tab[offset];
        delta = x - mean;
        mean += (delta / l);
        M2 += (delta * (x - mean));
        offset += block;
    } // while

    buf[gIDy].x = mean;
    buf[gIDy].y = M2;
    buf[gIDy].z = l;

    barrier(CLK_LOCAL_MEM_FENCE);

    // snip
}
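For checking the kernel's partial results on the host, the per-work-item loop above (a running mean and sum of squared deviations, as in Welford's online update) can be sketched in plain C. This is only a reference sketch: the names partial_t and welford_strided are hypothetical, and it covers only the code shown before the barrier, not the snipped part.

```c
#include <assert.h>
#include <stddef.h>

/* Partial statistics of one work-item; mirrors buf[gIDy].x / .y / .z.
 * (Hypothetical type, not part of the original post.) */
typedef struct {
    float mean;      /* running mean */
    float m2;        /* running sum of squared deviations from the mean */
    unsigned count;  /* number of samples visited */
} partial_t;

/* Same access pattern as the kernel loop: tab[start], tab[start + stride], ...
 * where start plays the role of gIDy and stride the role of get_local_size(1). */
partial_t welford_strided(const float *tab, unsigned m,
                          unsigned start, unsigned stride)
{
    assert(stride > 0);  /* a zero stride would never terminate */
    partial_t p = {0.0f, 0.0f, 0u};
    for (unsigned offset = start; offset < m; offset += stride) {
        float x = tab[offset];
        float delta = x - p.mean;
        p.count++;
        p.mean += delta / (float)p.count;
        p.m2 += delta * (x - p.mean);
    }
    return p;
}
```

Comparing welford_strided(row, m, y, block) against buf[y] for each y in a work-group would show whether the part before the barrier computes what is expected.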
Platform and device info:

  • IBM; OpenCL 1.0 201006161447
  • CL_DEVICE_NAME: ACCELERATOR CellBE processor
CL_DEVICE_AVAILABLE: 1
CL_DEVICE_MAX_COMPUTE_UNITS: 16
CL_DEVICE_HOST_UNIFIED_MEMORY: 1
CL_DEVICE_GLOBAL_MEM_SIZE: 195035136
CL_DEVICE_LOCAL_MEM_SIZE: 249600
CL_DEVICE_MAX_MEM_ALLOC_SIZE: 134217728
CL_DEVICE_MAX_WORK_GROUP_SIZE: 256
CL_DEVICE_MAX_WORK_ITEM_SIZES: 256 256 256
CL_DEVICE_NATIVE_VECTOR_WIDTH_FLOAT: 16
  • CL_DEVICE_NAME: ACCELERATOR CellBE processor
CL_DEVICE_AVAILABLE: 1
CL_DEVICE_MAX_COMPUTE_UNITS: 16
CL_DEVICE_HOST_UNIFIED_MEMORY: 1
CL_DEVICE_GLOBAL_MEM_SIZE: 195035136
CL_DEVICE_LOCAL_MEM_SIZE: 249600
CL_DEVICE_MAX_MEM_ALLOC_SIZE: 134217728
CL_DEVICE_MAX_WORK_GROUP_SIZE: 256
CL_DEVICE_MAX_WORK_ITEM_SIZES: 256 256 256
CL_DEVICE_NATIVE_VECTOR_WIDTH_FLOAT: 16
Updated on 2011-01-24T13:42:05Z by SystemAdmin
  • SystemAdmin
    SystemAdmin
    131 Posts

    Re: CL_OUT_OF_RESOURCES on QS20

    ‏2011-01-21T20:17:29Z  
    jzola,

    I'm not sure exactly what you are saying. What OpenCL function is generating a CL_OUT_OF_RESOURCES error?

    If I were to guess, clEnqueueNDRangeKernel() is returning CL_OUT_OF_RESOURCES because you requested a local_work_size that exceeds the maximum work-group size the kernel allows after compilation (see clGetKernelWorkGroupInfo(CL_KERNEL_WORK_GROUP_SIZE)). I suspect that inserting the barrier built-in is what triggers the error, and that you are hitting it on the v0.2 runtime/compiler either because of a limitation in the compiler or because the compiler is forced to limit the maximum work-group size due to the SPE's limited local storage. The entire kernel source and the requested work-group size are needed to determine the real cause.

    Dan B.
  • jzola
    jzola
    2 Posts

    Re: CL_OUT_OF_RESOURCES on QS20

    ‏2011-01-23T21:29:34Z  
    OK, I was a bit imprecise. Yes, it is clEnqueueNDRangeKernel that returns CL_OUT_OF_RESOURCES, and your guess is wrong. I never exceed CL_DEVICE_MAX_WORK_GROUP_SIZE (nor any of the CL_DEVICE_MAX_WORK_ITEM_SIZES). The dimensions of the kernel are also correct, and the local storage allocated does not exceed CL_DEVICE_LOCAL_MEM_SIZE (it is actually negligible).
  • SystemAdmin
    SystemAdmin
    131 Posts

    Re: CL_OUT_OF_RESOURCES on QS20

    ‏2011-01-24T13:42:05Z  
    jzola:

    You may not have exceeded CL_DEVICE_MAX_WORK_GROUP_SIZE, but did you exceed the size reported by clGetKernelWorkGroupInfo(CL_KERNEL_WORK_GROUP_SIZE)? While this is not my area of expertise, I think the v0.2 compiler backs off support for work-group sizes greater than 1 if it detects that the kernel is "too complex". We have been working on improving the compiler so that these limitations occur less often, and I hope this one will be fixed in the next release of the SDK. If you wish to share your complete kernel, I can verify whether it is fixed.

    Dan B.
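The distinction drawn in this thread, the device-wide limit versus the per-kernel limit, can be checked on the host before enqueueing. The following is a minimal sketch in C, not the poster's code: the helper names kernel_max_work_group and local_size_fits and the WITH_OPENCL build macro are hypothetical, while clGetKernelWorkGroupInfo and CL_KERNEL_WORK_GROUP_SIZE are the standard OpenCL 1.0 API.

```c
#include <assert.h>
#include <stddef.h>

#ifdef WITH_OPENCL   /* hypothetical macro: define when building against an OpenCL SDK */
#include <CL/cl.h>

/* Query the largest work-group size this compiled kernel supports on this
 * device. Returns 0 if the query fails. */
size_t kernel_max_work_group(cl_kernel kernel, cl_device_id device)
{
    size_t max_wg = 0;
    cl_int err = clGetKernelWorkGroupInfo(kernel, device,
                                          CL_KERNEL_WORK_GROUP_SIZE,
                                          sizeof(max_wg), &max_wg, NULL);
    return (err == CL_SUCCESS) ? max_wg : 0;
}
#endif

/* Returns 1 if the total number of work-items in one work-group
 * (the product of local[0..dims-1]) is non-zero and does not exceed
 * kernel_max, and 0 otherwise. */
int local_size_fits(size_t kernel_max, const size_t *local, unsigned dims)
{
    assert(local != NULL);
    size_t total = 1;
    for (unsigned d = 0; d < dims; ++d) {
        if (local[d] == 0)
            return 0;
        /* total * local[d] > kernel_max, tested without overflow */
        if (total > kernel_max / local[d])
            return 0;
        total *= local[d];
    }
    return total <= kernel_max;
}
```

If local_size_fits(kernel_max_work_group(kernel, device), local_work_size, 2) returns 0, the requested local_work_size is larger than what the compiled kernel supports, even when it is within CL_DEVICE_MAX_WORK_GROUP_SIZE. If the compiler has backed off to a work-group size of 1, as suggested above, any larger local_work_size would fail this check.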