micnativeloadex problem


I am trying to run a program natively using micnativeloadex but ran into a few problems.

a) I set: export SINK_LD_LIBRARY_PATH=/opt/intel/composer_xe_2015.3.187/compiler/lib/mic/

b) I compiled with: "icpc -mmic -qopenmp -o test_native test_native.cc"

c) micnativeloadex test_native

It seems SINK_LD_LIBRARY_PATH isn't being set. The result I get is:

Fortran OpenMP on Intel Xeon Phi


My questions are very simple. We have Intel Visual Fortran 2015 and Fortran subroutines parallelized with OpenMP directives. Will the compiled code be able to use all the available threads on the Intel Xeon Phi? Would we need to modify the code to make it compatible with these new processors?

Our intention is to use already-parallelized code on Intel Xeon Phi or a similar MIC processor. Any suggestions or links on how to do this?


Supplied binary does not match the Intel(R) Xeon Phi(TM) coprocessor that is installed.

I've been using an Intel Xeon Phi A3102 coprocessor with no issues for about 9 months now, but I recently started having issues running code natively on the device. When attempting to run a program via micnativeloadex I get the error message "Supplied binary does not match the Intel(R) Xeon Phi(TM) coprocessor that is installed", which I've never seen before. I updated my software (Composer, etc.), but that does not seem to have solved the problem. For reference, I'm using RHEL 6.6. Some system information collected with micinfo:

MPSS 3.5.1, CentOS 7.1 x86_64-k1om-linux-ld: relocation error

I updated my MIC system to CentOS 7.1, and when trying to compile hello_world with

ifort -mmic hello_world.f90 -o hello_world

I get a link error:

x86_64-k1om-linux-ld: relocation error: /usr/lib64/libc.so.6: symbol _dl_starting_up, version GLIBC_PRIVATE not defined in file ld-linux-x86-64.so.2 with link time reference

Any suggestions?



Xeon Phi crashes on too-large SCIF memory registration

Is there a mechanism with SCIF to register a memory region with all endpoints? At the moment, I have a for-loop with scif_register() on this memory region with each endpoint. Memory registration is rather expensive and I would like to avoid unnecessarily incurring this cost repeatedly if there is possibly a faster way to register with all endpoints.

With my current method, if the memory region is sufficiently large (e.g., 6 GB+), the coprocessor crashes during scif_register():

Force Xeon-level precision on Xeon Phi or vice versa

Hi all,

I have been running a program in which the precision of doubles matters a great deal.

However, for some reason the Xeon Phi seems to be rounding off a few bits (at around 10^-8), and this seems to be causing instabilities in my model. A small round-off error grows with each time-step iteration, and my model fails to converge.

Here are some sample differences in error.

Xeon Phi value
