Xeon Phi wrong behavior

Hi,

I am seeing a new 'strange' behavior from an OpenCL kernel on the Xeon Phi.

In this case, I have a small example written in HPL that runs correctly on the CPU and the GPU but not on the Xeon Phi.

I've attached the example as a .cpp file. You can download the HPL library to test it, or you can reproduce it with plain OpenCL (if you need the OpenCL code, please ask me). The problem is in the following loop:

inline Double ipow(Double aa, Int a, Int b)
{
  Double q, qaux;
  Int n, n2;
  Int two_pow = 0;
  q = aa;
  n = a;
  while_(two_pow < 100) {
    n2 = n / 2;
    if_(n2 * 2 == n) {
      qaux = 1.5*q;
      q = qaux;
      n = n2;
    }
    else_ {
      n = n * b;
      two_pow = 200;
    }
//////// UNCOMMENT THIS LINE AND THE PROBLEM WILL BE SOLVED :(
//    two_pow++;
  }

 return(q);
}

As the comment in the code says, if you uncomment the (superfluous) line, the code runs correctly on all three platforms (Intel CPU, NVIDIA GPU, and Xeon Phi), but with the line commented out, the code fails on the Xeon Phi. The global and local work sizes are {1,1,1}.
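
For comparison, the same loop can be written as plain host C++ without HPL. This is only a sketch: the name ipow_ref is hypothetical, and the arguments used in the note below (3.0, 128, 3) are assumptions chosen to match the 1.5^7 x 3 value reported later in the thread, not values taken from the attachment.

```cpp
// Plain C++ sketch of the HPL ipow loop (hypothetical name ipow_ref).
// Assumes a != 0: for a == 0 the even branch is taken forever,
// because two_pow is only changed in the odd branch.
double ipow_ref(double aa, int a, int b)
{
    double q = aa;
    int n = a;
    int two_pow = 0;
    while (two_pow < 100) {
        const int n2 = n / 2;
        if (n2 * 2 == n) {   // n is even: scale q and halve n
            q = 1.5 * q;
            n = n2;
        } else {             // n is odd: same exit as the HPL version
            n = n * b;
            two_pow = 200;
        }
    }
    return q;
}
```

With these assumed arguments, ipow_ref(3.0, 128, 3) takes the even branch seven times before leaving through the odd branch, so it returns 3 x 1.5^7 = 51.2578125. A result of 1.5 x 3 = 4.5 would correspond to the loop body running only once.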

Why does it fail on the Xeon Phi? Thank you so much.

 

Moisés Viñas

http://gac.udc.es/~moises/index_en.html

 


Hi Moisés,

I can't compile the attached reproducer for the Xeon Phi: I get a compilation error for the 'ACCELERATOR' macro. I guess this is because the public version of HPL doesn't support the accelerator device type. Could you please provide a more recent version?

Meanwhile for CPU I got:
2.96439e-323
51.2578
51.2578
51.2578
...

The first value fluctuates slightly from run to run. Is the output correct?

Thanks,
Yuri

Hi Yuri,

 

Yes, the public version is outdated :-/ But the problem is solved if you replace the following files in the /src subfolder with these ones: CLbinding.cpp, Device.h and Device.cpp

 

The expected result is a[i] = 1.5^7 x 3 for i != 0 (the 51.2578 in your CPU output). For i = 0, the value is a random number. However, on the Xeon Phi, the value is 1.5 x 3. Why?

 

Thanks in advance,

 

Moisés Viñas

http://gac.udc.es/~moises/index_en.html

 

OK, now it's working for me, and I was able to reproduce the issue using the latest public release (XE R3).
However, it works correctly on our internal development version, so please expect a fix in the next release (no estimate for the date, as usual).
If time permits, I will also try to add this reproducer to our test system to make sure no regression is introduced.

Thanks,
Yuri

Yuri, my Xeon Phi behaves strangely with a simple convolution kernel. I believe the results it produces are incorrect. Is there a known problem with OpenCL support on the Xeon Phi?

 

For kernel enqueue: 

global size set to 1024x1024

local size set to NULL.

 

Thanks 

//KERNEL_SIMPLE
__kernel void Convolve(const __global  float * pInput,
                        __constant float * pFilter,
                        __global  float * pOutput,
                        const int nInWidth,
                        const int nFilterWidth)
{
    const int nWidth = get_global_size(0);

    const int xOut = get_global_id(0);
    const int yOut = get_global_id(1);

    const int xInTopLeft = xOut;
    const int yInTopLeft = yOut;

    float sum = 0;
    for (int r = 0; r < nFilterWidth; r++)
    {
        const int idxFtmp = r * nFilterWidth;

        const int yIn = yInTopLeft + r;
        const int idxIntmp = yIn * nInWidth + xInTopLeft;

        for (int c = 0; c < nFilterWidth; c++)
        {
            const int idxF  = idxFtmp  + c;
            const int idxIn = idxIntmp + c;
            sum += pFilter[idxF]*pInput[idxIn];
        }
    }
    const int idxOut = yOut * nWidth + xOut;
    pOutput[idxOut] = sum;
}
//KERNEL_SIMPLE
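
One way to narrow this down is to compare the device output against a host-side reference that uses the same indexing as the kernel. The sketch below is an assumption-laden illustration, not part of the original post: the name convolve_ref and the outWidth/outHeight parameters (standing in for the global size) are hypothetical. It also checks that the input is large enough, since the kernel reads rows yOut..yOut+nFilterWidth-1 and columns xOut..xOut+nFilterWidth-1 for every work-item.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Host-side reference mirroring the Convolve kernel's indexing.
// outWidth/outHeight play the role of the 2D global size (hypothetical names).
std::vector<float> convolve_ref(const std::vector<float>& input,
                                const std::vector<float>& filter,
                                int nInWidth, int nFilterWidth,
                                int outWidth, int outHeight)
{
    // Extent of the input region touched by the kernel in each dimension.
    const int needW = outWidth + nFilterWidth - 1;
    const int needH = outHeight + nFilterWidth - 1;
    if (nInWidth < needW ||
        input.size() < static_cast<std::size_t>(nInWidth) * needH ||
        filter.size() < static_cast<std::size_t>(nFilterWidth) * nFilterWidth)
        throw std::invalid_argument("input or filter too small for this output size");

    std::vector<float> output(static_cast<std::size_t>(outWidth) * outHeight);
    for (int yOut = 0; yOut < outHeight; ++yOut) {
        for (int xOut = 0; xOut < outWidth; ++xOut) {
            float sum = 0.0f;
            for (int r = 0; r < nFilterWidth; ++r) {
                const int idxFtmp = r * nFilterWidth;
                const int idxIntmp = (yOut + r) * nInWidth + xOut;
                for (int c = 0; c < nFilterWidth; ++c)
                    sum += filter[idxFtmp + c] * input[idxIntmp + c];
            }
            output[static_cast<std::size_t>(yOut) * outWidth + xOut] = sum;
        }
    }
    return output;
}
```

If the reference throws for the sizes used with the 1024x1024 global size, the kernel is reading past the end of pInput, and the result of such reads can legitimately differ between devices. Small differences in the last digits are also possible because the summation order or fused multiply-add usage may differ on the device.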
