Compiling OpenCL 2.0 atomics


I am trying to compile a simple kernel that uses OpenCL 2.0 atomics, with exactly the device, driver, and kernel described in:

However, I cannot even get the kernel to compile, as it does not seem to recognize the atomic types and functions. My error log (along with some environment info) is:
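The error log is not reproduced here, so the cause cannot be confirmed. One common reason for unrecognized atomic types is that kernels are compiled as OpenCL C 1.x unless `-cl-std=CL2.0` is passed in the build options, and types such as `atomic_int` only exist in OpenCL C 2.0. A minimal sketch, assuming that is the problem: the helper below (`build_options_for` is a hypothetical name, not an OpenCL call) picks the build options from the string reported for `CL_DEVICE_VERSION`.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: choose build options from the string returned by
 * clGetDeviceInfo(..., CL_DEVICE_VERSION, ...). The specification fixes its
 * layout as "OpenCL <major>.<minor> <vendor-specific text>".
 * Returns NULL when the string does not follow that layout. */
static const char *build_options_for(const char *device_version)
{
    int major = 0, minor = 0;

    assert(device_version != NULL);
    if (strncmp(device_version, "OpenCL ", 7) != 0)
        return NULL;
    if (sscanf(device_version + 7, "%d.%d", &major, &minor) != 2)
        return NULL;

    /* Assumption: only 2.x devices are asked for OpenCL C 2.0 here.
     * Everything else gets the compiler's default language version. */
    return (major == 2) ? "-cl-std=CL2.0" : "";
}
```

The returned string would be passed as the `options` argument of `clBuildProgram`. If the device reports a 1.x version, the 2.0 atomics are not expected to compile on it at all.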

Required Product License Upgrade for Intel® Parallel Studio XE 2016 and Intel® System Studio 2016

If you have an older version of Intel® Parallel Studio XE or Intel® System Studio, and your subscription for the current product is active and eligible for the upgrade, you will be prompted to upgrade your product. You will receive a product update notification email with a download link. When you upgrade to the 2016 version you will receive a new serial number to use during installation. You can also upgrade your product by following the steps below.
  • Developers
  • Partners
  • Professors
  • Students
  • Apple OS X*
  • Linux*
  • Microsoft Windows* (XP, Vista, 7)
  • Microsoft Windows* 10
  • Microsoft Windows* 8.x
  • C/C++
  • Fortran
  • Intel® Parallel Studio XE
  • Intel® System Studio
  • Development Tools

    Change image format of image object in OpenCL

    I have written a short piece of code to create an image object in OpenCL, as below:

    img_fmt.image_channel_order = CL_R;
    img_fmt.image_channel_data_type = CL_UNSIGNED_INT8;
    memobj_in_luma = clCreateImage2D(p->context, CL_MEM_READ_ONLY, &img_fmt, p->width, p->height, 0, NULL, &ret);

    After creating this object I want to change the image format to CL_RGBA. Is there any way to do this?
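As far as the OpenCL API goes, an image object's format is fixed when the object is created, so the usual route is to create a second image with the new format and copy the data into it. A minimal sketch of the host-side repacking step, assuming 8-bit pixels and assuming the single channel should be replicated into R, G, and B with an opaque alpha (`expand_r8_to_rgba8` is a hypothetical helper, not part of OpenCL):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: expand a width*height single-channel 8-bit image
 * into tightly packed RGBA8. dst must hold 4 * width * height bytes. */
static void expand_r8_to_rgba8(const uint8_t *src, uint8_t *dst,
                               size_t width, size_t height)
{
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < width * height; ++i) {
        dst[4 * i + 0] = src[i];  /* R */
        dst[4 * i + 1] = src[i];  /* G: assumed copy of the source channel */
        dst[4 * i + 2] = src[i];  /* B: assumed copy of the source channel */
        dst[4 * i + 3] = 255;     /* A: assumed fully opaque */
    }
}
```

The expanded buffer could then be supplied as `host_ptr` (with `CL_MEM_COPY_HOST_PTR`) to a second `clCreateImage2D` call whose `image_channel_order` is `CL_RGBA`.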

    HD4400 bitwise and operation on uchar2 data

    We are seeing different results when implementing a bitwise AND in an OpenCL kernel that operates on uchar2 data. Kernel code like this:
    uchar2 val1;
    uchar2 val2;
    uchar2 res;
    res = val1 & val2;

    produces wrong results, while code like the following:

    res = (uchar2)(val1.x & val2.x, val1.y & val2.y);
    produces the correct result.

    The same behaviour occurs for bitwise OR/XOR and for uchar3/uchar4 data, although the attached test case was prepared only for bitwise AND on uchar2 data.
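Since the attached test case is not included here, the following is a minimal host-side reference sketch for checking the output of such a kernel after reading it back. `uchar2_host` is a hypothetical stand-in for the device type, and both function names are illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical host-side stand-in for the device uchar2 type. */
typedef struct { uint8_t x, y; } uchar2_host;

/* Component-wise AND, the result the vector operator is expected to give. */
static uchar2_host and_uchar2(uchar2_host a, uchar2_host b)
{
    uchar2_host r;
    r.x = (uint8_t)(a.x & b.x);
    r.y = (uint8_t)(a.y & b.y);
    return r;
}

/* Returns the index of the first element where the device output differs
 * from the host reference, or -1 if all n elements agree. */
static long first_mismatch(const uchar2_host *a, const uchar2_host *b,
                           const uchar2_host *device_out, size_t n)
{
    assert(a != NULL && b != NULL && device_out != NULL);
    for (size_t i = 0; i < n; ++i) {
        uchar2_host ref = and_uchar2(a[i], b[i]);
        if (ref.x != device_out[i].x || ref.y != device_out[i].y)
            return (long)i;
    }
    return -1;
}
```

Running both kernel variants against the same reference shows which elements differ and whether the mismatch affects one component or both.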

    Meshcentral & Commander at IDF2015

    A quick note to say that my team has excellent representation and location at IDF this year. If you are at IDF and want to know what we are doing with platform manageability, Intel® AMT, or the many other technologies our group is working on, please stop by. We are located on the 2nd floor at the top of the escalators. Not only are the technologies being shown, but the people who work on them day-to-day are often present as well, which makes it a fun and interesting experience.

    Fast Gathering-based SpMxV for Linear Feature Extraction

    This algorithm can be used to improve sparse matrix-vector and matrix-matrix multiplication in any numerical computation. As we know, there are lots of applications involving semi-sparse matrix computation in High Performance Computing. Additionally, in popular perceptual computing low-level engines, especially speech and facial recognition, semi-sparse matrices are found to be very common. Therefore, this invention can be applied to those mathematical libraries dedicated to these kinds of recognition engines.
  • Developers
  • Partners
  • Professors
  • Students
  • Linux*
  • Microsoft Windows* (XP, Vista, 7)
  • Microsoft Windows* 8.x
  • Code for Good
  • Windows*
  • C/C++
  • Intermediate
  • Intel® Advanced Vector Extensions
  • Intel® Streaming SIMD Extensions
  • Sparse Matrix-Vector multiplication
  • sparse matrix
  • Feature Extraction
  • speech recognition
  • Intel® Core™ Processors
  • Optimization
  • Parallel Computing
  • Vectorization

    Setting work_group_size crashes OpenCL on Intel CPU


    I am porting the reduction kernel from the AMD APP SDK.

    It requires setting the work-group size when you call clEnqueueNDRangeKernel. With a local_work_size other than 8, it crashes inside TBB in the Intel OpenCL runtime for Intel CPUs. The clEnqueueNDRangeKernel call itself launches the kernel successfully.

    When you query the work-group size from the device it returns 8192 (it should be 8 in this case), and the kernel work-group size is 2048. It crashes with both settings.

    It works only when the local size equals the number of cores.

    I have an Intel Haswell 4770K.
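The post does not establish what causes the crash. Two constraints are still worth ruling out: in OpenCL 1.x the global size must be a multiple of `local_work_size`, and the local size must not exceed the value reported for `CL_KERNEL_WORK_GROUP_SIZE`. A minimal sketch of helpers for both checks follows; the function names are hypothetical and do not come from the AMD sample.

```c
#include <assert.h>
#include <stddef.h>

/* Round the global size up to the next multiple of the local size.
 * The kernel must then ignore work-items whose global id is past the
 * real element count. */
static size_t round_up_global(size_t global, size_t local)
{
    assert(local > 0);
    return ((global + local - 1) / local) * local;
}

/* Clamp a requested local size to both the device limit
 * (CL_DEVICE_MAX_WORK_GROUP_SIZE) and the per-kernel limit
 * (CL_KERNEL_WORK_GROUP_SIZE). */
static size_t clamp_local(size_t requested, size_t device_max,
                          size_t kernel_max)
{
    size_t limit = (device_max < kernel_max) ? device_max : kernel_max;
    return (requested < limit) ? requested : limit;
}
```

With the values quoted above (device limit 8192, kernel limit 2048), any requested local size above 2048 is clamped to 2048. Reduction kernels that index local memory by the local id also need the `__local` buffer sized for the chosen local size, which is another thing to check when the crash depends on that value.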
