Allocating large buffers

Hi,

How can I allocate a single large buffer of 8 GB?

My Core i7 reports

CL_DEVICE_GLOBAL_MEM_SIZE:    16780570624

but

CL_DEVICE_MAX_MEM_ALLOC_SIZE:    4195142656

How can I increase the MAX ALLOC SIZE limit in Intel OpenCL SDK?

Thanks

Sorry for the slow response; I'm not sure whether you are still looking for an answer. You cannot allocate a single buffer larger than CL_DEVICE_MAX_MEM_ALLOC_SIZE (OpenCL doesn't allow that). If you need more than CL_DEVICE_MAX_MEM_ALLOC_SIZE, you will have to allocate multiple buffers and implement logic in your code to merge the results from them. What exactly are you trying to do?

Thanks,
Raghu
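
The multiple-buffer approach suggested above can be sketched as a small planning step that is independent of any particular OpenCL binding. This is only a minimal sketch under stated assumptions, not the SDK's method: `plan_chunks` and `locate` are hypothetical helper names, the 8 GB request is read as 8 * 1024**3 bytes, the element size of 4 bytes is assumed, and the limit is the CL_DEVICE_MAX_MEM_ALLOC_SIZE value reported in the original question. Each planned size would then go to its own buffer allocation.

```python
# Hypothetical helpers (not part of any OpenCL API): split one logical
# buffer into pieces that each fit under CL_DEVICE_MAX_MEM_ALLOC_SIZE, and
# map a global byte offset back to (chunk index, offset inside that chunk).

def plan_chunks(total_bytes, max_alloc_bytes, align=1):
    """Return a list of chunk sizes in bytes, each <= max_alloc_bytes.

    `align` keeps every full chunk a multiple of the element size so that
    no element straddles two buffers.
    """
    if total_bytes <= 0 or align <= 0:
        raise ValueError("total_bytes and align must be positive")
    chunk = (max_alloc_bytes // align) * align  # round the limit down
    if chunk == 0:
        raise ValueError("max_alloc_bytes is smaller than one element")
    full, rest = divmod(total_bytes, chunk)
    sizes = [chunk] * full
    if rest:
        sizes.append(rest)  # last, smaller chunk holds the remainder
    return sizes


def locate(offset, sizes):
    """Translate a byte offset in the logical buffer to (chunk, local offset)."""
    if offset < 0:
        raise IndexError("offset must be non-negative")
    for index, size in enumerate(sizes):
        if offset < size:
            return index, offset
        offset -= size
    raise IndexError("offset is past the end of the logical buffer")


MAX_ALLOC = 4195142656   # CL_DEVICE_MAX_MEM_ALLOC_SIZE from the question
TOTAL = 8 * 1024**3      # assumed meaning of "8 GB"
sizes = plan_chunks(TOTAL, MAX_ALLOC, align=4)  # align=4: assumed 4-byte elements
```

For the numbers in the question this yields three buffers: two of 4195142656 bytes and one of 199649280 bytes. A kernel launch per chunk, with `locate` used on the host to find where a given element lives, is one way to handle the "merge" step.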

 

I know this is an old thread, but I have the same issue: I cannot allocate buffers larger than 1/4 of my available memory because someone implemented an artificial limit.

The OpenCL specification does not seem to require this limit. See table 4.3 on page 39 of https://www.khronos.org/registry/cl/specs/opencl-1.2.pdf, which states:

CL_DEVICE_MAX_MEM_ALLOC_SIZE 
cl_ulong 
Max size of memory object allocation in bytes. The minimum value is max (1/4th of CL_DEVICE_GLOBAL_MEM_SIZE , 128*1024*1024) for devices that are not of type CL_DEVICE_TYPE_CUSTOM.

Can someone please tell me why the Intel OpenCL drivers (v4.4 of July 2014) report only the minimum value instead of letting CL_DEVICE_MAX_MEM_ALLOC_SIZE be as large as the available memory?

Thanks,
Josh
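
As a quick check of the quoted table entry, the floor can be computed from the values reported in the original question. This is a minimal sketch: `spec_min_max_alloc` is a hypothetical helper name, and integer division is assumed for the "1/4th" term.

```python
MIB = 1024 * 1024


def spec_min_max_alloc(global_mem_size):
    """Minimum CL_DEVICE_MAX_MEM_ALLOC_SIZE allowed by the quoted OpenCL 1.2
    table entry for non-custom devices:
    max(1/4 of CL_DEVICE_GLOBAL_MEM_SIZE, 128*1024*1024)."""
    return max(global_mem_size // 4, 128 * MIB)


# Values reported by the Core i7 in the original question.
reported_global = 16780570624      # CL_DEVICE_GLOBAL_MEM_SIZE
reported_max_alloc = 4195142656    # CL_DEVICE_MAX_MEM_ALLOC_SIZE

floor = spec_min_max_alloc(reported_global)
at_floor = reported_max_alloc == floor  # True when the driver reports only the minimum
```

For these values the reported limit equals the floor exactly (16780570624 / 4 = 4195142656), which is consistent with the observation that the driver reports the specification's minimum.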

 

I know this thread is old and has been unearthed before, but this is exactly the question I would like to ask, and I did not want to open another thread. Maybe somebody has found the answer in the meantime.

Why is the limit set to 1/4th?

Thank you

Do you need to know just for CPU, or do you need more info about the GPU max alloc size too?

Especially CPU but if you can tell me about the GPU limitation as well I would appreciate that. :)

>>My Core i7 reports
>>
>>CL_DEVICE_GLOBAL_MEM_SIZE: 16780570624
>>
>>but
>>
>>CL_DEVICE_MAX_MEM_ALLOC_SIZE: 4195142656

I just checked the video card in my Dell Precision Mobile Workstation (M4700), and it looks like the 1/4th rule applies there as well:

...
oclDeviceQuery.exe Starting...

OpenCL SW Info:

CL_PLATFORM_NAME: NVIDIA CUDA
CL_PLATFORM_VERSION: OpenCL 1.2 CUDA 8.0.0
OpenCL SDK Revision: 7027912

OpenCL Device Info:

1 devices found supporting OpenCL:

---------------------------------
Device Quadro K1000M
---------------------------------
CL_DEVICE_NAME: Quadro K1000M
CL_DEVICE_VENDOR: NVIDIA Corporation
CL_DRIVER_VERSION: 369.26
CL_DEVICE_VERSION: OpenCL 1.2 CUDA
CL_DEVICE_OPENCL_C_VERSION: OpenCL C 1.2
CL_DEVICE_TYPE: CL_DEVICE_TYPE_GPU
CL_DEVICE_MAX_COMPUTE_UNITS: 1
CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS: 3
CL_DEVICE_MAX_WORK_ITEM_SIZES: 1024 / 1024 / 64
CL_DEVICE_MAX_WORK_GROUP_SIZE: 1024
CL_DEVICE_MAX_CLOCK_FREQUENCY: 850 MHz
CL_DEVICE_ADDRESS_BITS: 64
CL_DEVICE_MAX_MEM_ALLOC_SIZE: 512 MByte
CL_DEVICE_GLOBAL_MEM_SIZE: 2048 MByte
CL_DEVICE_ERROR_CORRECTION_SUPPORT: no
CL_DEVICE_LOCAL_MEM_TYPE: local
CL_DEVICE_LOCAL_MEM_SIZE: 48 KByte
CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE: 64 KByte
CL_DEVICE_QUEUE_PROPERTIES: CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE
CL_DEVICE_QUEUE_PROPERTIES: CL_QUEUE_PROFILING_ENABLE
CL_DEVICE_IMAGE_SUPPORT: 1
CL_DEVICE_MAX_READ_IMAGE_ARGS: 256
CL_DEVICE_MAX_WRITE_IMAGE_ARGS: 16
CL_DEVICE_SINGLE_FP_CONFIG: denorms INF-quietNaNs round-to-nearest round-to-zero round-to-inf fma
...

>>...Why is the limit set to 1/4th?

I think this question should be addressed to somebody at Khronos who is responsible for the OpenCL specification.

I'm not an expert on the OpenCL specification, but this is what the OpenCL 1.2 specification says: "The minimum value is max (1/4th of CL_DEVICE_GLOBAL_MEM_SIZE , 128*1024*1024) "

So the specification sets a minimum value. That does not mean the minimum value has to be implemented as the maximum value!

There is a second part of the statement: "...for devices that are not of type CL_DEVICE_TYPE_CUSTOM..."

In the case of my Quadro K1000M card, the device type is:
...
CL_DEVICE_TYPE: CL_DEVICE_TYPE_GPU
...
I wonder if Altera FPGA cards are classified as CL_DEVICE_TYPE_CUSTOM and do not have that restriction. Could somebody from Intel verify this?
