Intel® Many Integrated Core Architecture (Intel MIC Architecture)

two dimensional array offload issue

I have a two-dimensional dynamic array that I offload to the Phi. I don't really pass any data; all I want is to allocate memory via offload_transfer and then access that memory via nocopy in each iteration later on.

void foo()

unsigned int ** twoDimArray = new ... etc ... [n*m]

#pragma offload_transfer target(mic:MIC_DEV) in(twoDimArray :length(n*m) alloc_if(1) free_if(0))

while (condition) {

//nocopy offload each iteration of external loop
#pragma offload target(mic:MIC_DEV) nocopy(twoDimArray :length(n*m) alloc_if(0) free_if(0))
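A minimal sketch of the allocate-once, reuse-with-nocopy pattern described above, assuming the array can be stored as one contiguous block of n*m elements rather than as an array of row pointers (an `unsigned int **` is not a single bitwise-copyable buffer). The function name `run_iterations`, the value of `MIC_DEV`, and the loop body are placeholders, not the poster's code. The pragmas are guarded by `__INTEL_OFFLOAD` so the same source also builds and runs on the host alone.

```cpp
#include <cstddef>

// MIC_DEV appears in the post; the value 0 here is an assumption.
#define MIC_DEV 0

// Hypothetical driver: allocate the buffer on the coprocessor once,
// reuse it in every iteration, then copy it back and free it.
unsigned int run_iterations(std::size_t n, std::size_t m, int iterations) {
    // One contiguous, zero-initialized block; element (i, j) is flat[i * m + j].
    unsigned int *flat = new unsigned int[n * m]();

#ifdef __INTEL_OFFLOAD
    // Allocate on the card, copy the initial contents, keep the buffer alive.
    #pragma offload_transfer target(mic:MIC_DEV) \
        in(flat : length(n * m) alloc_if(1) free_if(0))
#endif

    for (int it = 0; it < iterations; ++it) {
#ifdef __INTEL_OFFLOAD
        // Reuse the existing card buffer: no allocation, no transfer, no free.
        #pragma offload target(mic:MIC_DEV) \
            nocopy(flat : length(n * m) alloc_if(0) free_if(0))
#endif
        {
            for (std::size_t i = 0; i < n; ++i)
                for (std::size_t j = 0; j < m; ++j)
                    flat[i * m + j] += 1u;  // placeholder work
        }
    }

#ifdef __INTEL_OFFLOAD
    // Copy the result back to the host and release the card buffer.
    #pragma offload_transfer target(mic:MIC_DEV) \
        out(flat : length(n * m) alloc_if(0) free_if(1))
#endif

    unsigned int last = flat[(n - 1) * m + (m - 1)];
    delete[] flat;
    return last;
}
```

Calling `run_iterations(3, 4, 5)` increments every element five times, so the value returned for the last element is 5 whether the loop body ran on the host or on the coprocessor.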


mic0 reset failed


I am trying to get the coprocessors working on Ubuntu 14.04, but I am stuck with the error: mic0 reset failed. I have tried several methods noted in this forum, but I have not been able to make it work, so I am seeking help to get my MIC working. Please find my micbug attached.

Thank you.


preventing execution of remainder loop on xeon phi coprocessor

Hey everyone, consider the sample code below.

Compiling with ifort -O3 -align array64byte -openmp -vec-report6 reports something to the effect that nlist is aligned, that the SIMD directive generated vectorization, and that position is 64-bit indexed in the offloaded inner loop at line 93. However, in the remainder loop, as we expect, nothing is aligned, but the remainder code is vectorized. The !dir$ vector aligned directive prevents the creation of a peel loop, as I want.
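One general way to keep a remainder loop from doing any work, sketched here in C++ rather than the poster's Fortran, is to pad the trip count up to a multiple of the vector length so that no iterations are left over. This is not the poster's approach. The helper name `padded_length` is hypothetical, and the default of 16 lanes assumes 32-bit elements in the coprocessor's 512-bit vector registers. If the compiler also unrolls the loop, the count would have to be a multiple of the unrolled width instead.

```cpp
#include <cstddef>

// Hypothetical helper: round n up to the next multiple of `lanes`.
// lanes = 16 assumes 32-bit elements in a 512-bit vector register.
inline std::size_t padded_length(std::size_t n, std::size_t lanes = 16) {
    return (n + lanes - 1) / lanes * lanes;
}
```

The padding elements have to hold values that do not change the result, for example a neighbor index whose contribution is zero.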

Offload with persistent MIC buffer: are global pointers required?

We have been through this once, but here we go again, because the latest results confuse me. My question is: in order to re-use a previously allocated memory buffer on the coprocessor, is the programmer required to supply a global pointer with __attribute__((target(mic))) in pragma offload?

The reason for this question is that I observe that global variables work in all cases, but local variables work in all cases except one (ouch!). So either it is a bug in the compiler or COI, or it is a sign that one programming practice is better than another.
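For reference, a minimal sketch of the global-pointer variant being asked about, assuming the Intel compiler's offload pragmas. The names `MIC_TARGET`, `g_buf`, and `scale_twice` are hypothetical, and device 0 is an assumption. The sketch shows the pattern only and does not answer whether a local pointer should work equally well. The pragmas are guarded by `__INTEL_OFFLOAD` so the code also builds and runs on the host alone.

```cpp
#include <cstddef>

// Hypothetical macro: apply the target attribute only when offload is enabled.
#ifdef __INTEL_OFFLOAD
#define MIC_TARGET __attribute__((target(mic)))
#else
#define MIC_TARGET
#endif

// File-scope pointer, declared for both the host and the coprocessor.
MIC_TARGET float *g_buf = nullptr;

// Hypothetical example: two offloads that share one persistent card buffer.
float scale_twice(std::size_t n) {
    g_buf = new float[n];
    for (std::size_t i = 0; i < n; ++i) g_buf[i] = 1.0f;

#ifdef __INTEL_OFFLOAD
    // First offload: allocate on the card, copy in, keep the buffer afterwards.
    #pragma offload target(mic:0) in(g_buf : length(n) alloc_if(1) free_if(0))
#endif
    {
        for (std::size_t i = 0; i < n; ++i) g_buf[i] *= 2.0f;
    }

#ifdef __INTEL_OFFLOAD
    // Second offload: reuse the card buffer, copy the result out, then free it.
    #pragma offload target(mic:0) out(g_buf : length(n) alloc_if(0) free_if(1))
#endif
    {
        for (std::size_t i = 0; i < n; ++i) g_buf[i] *= 2.0f;
    }

    float result = g_buf[n - 1];
    delete[] g_buf;
    g_buf = nullptr;
    return result;
}
```

Each element starts at 1.0 and is doubled in both regions, so `scale_twice(8)` returns 4.0 when the buffer persists between the two offloads as intended.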

Xeon Phi 7120P always runs at lowest frequency

I recently installed a 7120P in one of my servers. It seems to be working fine, but I noticed that it always runs at the lowest available frequency. Even when I run the benchmark application that comes with the Intel compiler, the frequency stays at 0.57 GHz.

Any idea about this?

Here is some information about my machine

Expected performance gain ... 5960X vs Xeon Phi?

I am a retired theoretical physical chemist with a long association with computers and computing.
As briefly as possible, my interests are in the behavior of fluids at a phase boundary, such as a real gas at a solid
surface: the attractive forces of the solid cause an increased concentration (density) of the gas in the region near the surface, 
a measurable phenomenon called "adsorption". Thermodynamics requires that, at equilibrium at a constant temperature and 

Poor MKL Dfti complex to complex performance


I'm new to MIC programming and trying to get a grip on how to do things with the beast. I stumbled across very bad FFT performance (using a matrix size often used at our institution) for DFTI complex-to-complex transforms. In the following, no OMP, KMP, or MKL variables are set, except when stated. Setting the number of threads or specifying the placement does not change much for this comparison: the MIC is much slower than the host!

Any hints how to improve the situation?



Naive Hardware Configuration Question.


Yet another naive question. If I establish 2 compute nodes in my sandbox, am I generally better off with a MIC and 2 GPGPUs per node? I'm guessing the answer is "it depends". But assuming that the MICs leverage the vector processing in the GPUs, PCI seems like less of a bottleneck than QDR. My googling isn't showing big boxes with Frankenstein nodes, but in my empty head it seems like a good idea.


Thanks again Robert

