Parallel computing

Running OpenMP on the Xeon Phi

I wrote a program that measures floating-point operations per second (FLOPS) on the Xeon Phi. The program is meant to run natively on the Xeon Phi. I used OpenMP and compiled the program with icc as follows:

$ icc -openmp -mmic -vec-report=3 -O3 helloflops3.c -o helloflops3

However, when I ran it on the Xeon Phi, I found no difference in speed between running the program with 1 thread and with 240 threads. The program reports that it is using 240 threads, but it runs no faster. I ran the program as follows:

$ export OMP_NUM_THREADS=240
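The source of helloflops3.c is not shown in the post, so the cause cannot be determined from here. One thing worth checking in a case like this is whether the timed loop actually sits inside a parallel region, and how many threads that region really gets. The following is a minimal sketch under that assumption; `active_threads` and `run_kernel` are hypothetical names and are not taken from the original program.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#endif

#define CHUNK 16  /* floats per private work item (illustrative value) */

/* Hypothetical helper: number of threads a parallel region really gets.
   Returns 1 when the file is compiled without OpenMP support. */
int active_threads(void)
{
    int n = 1;
#ifdef _OPENMP
    #pragma omp parallel
    {
        #pragma omp single
        n = omp_get_num_threads();
    }
#endif
    return n;
}

/* Hypothetical kernel: every outer iteration owns a private array and
   performs `iters` multiply-add passes over it. The outer loop is the
   one that is split across threads. Returns the number of floating-point
   operations performed and stores a checksum so the work is not
   optimized away. */
double run_kernel(int chunks, int iters, double *checksum)
{
    assert(chunks > 0 && iters > 0 && checksum != 0);
    double sum = 0.0;

    #pragma omp parallel for reduction(+:sum)
    for (int c = 0; c < chunks; c++) {
        float a[CHUNK];
        for (int k = 0; k < CHUNK; k++)
            a[k] = (float)k;
        for (int i = 0; i < iters; i++)
            for (int k = 0; k < CHUNK; k++)
                a[k] = a[k] * 0.5f + 1.0f;   /* 2 FLOPs per element */
        for (int k = 0; k < CHUNK; k++)
            sum += a[k];
    }

    *checksum = sum;
    return 2.0 * CHUNK * (double)chunks * (double)iters;
}
```

Dividing the returned operation count by the elapsed wall-clock time (for example from `omp_get_wtime()`) gives the rate. If the rate does not change with `OMP_NUM_THREADS` while `active_threads()` reports the expected count, the timed work is probably not inside the parallel loop, or there are too few `chunks` to keep all threads busy.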

MIC to MIC to HOST MPI bandwidth issue

Running Intel's IMPI benchmark (mpi ver I've got some strange results.

mpirun -genv I_MPI_FABRICS=shm:dapl -np 2 -ppn 1 -hosts mic0,mic1 ./IMB-MPI1 PingPong 

   36 us latency for 0-byte messages, max 868 Mbytes/sec for 4MB messages.

Using tcp instead of dapl (I have an external bridge configuration for the MIC Ethernet ports with an MTU of 1500):

 mpirun -genv I_MPI_FABRICS=shm:tcp -np 2 -ppn 1 -hosts mic0,mic1 ./IMB-MPI1 PingPong

  496 us latency for 0-byte messages and 16 Mbytes/sec max throughput for 4MB messages!
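To put the two throughput figures side by side, they can be converted back into per-message transfer times. The sketch below assumes that 1 Mbyte means 2^20 bytes and that the time is the one-way transfer time in microseconds; both are assumptions about how the benchmark reports its numbers, and the function names are hypothetical.

```c
#include <assert.h>

/* Assumption: 1 Mbyte = 2^20 bytes; time is one-way, in microseconds. */
#define BYTES_PER_MBYTE 1048576.0

/* Throughput in Mbytes/sec for a message of `bytes` bytes that took
   `time_usec` microseconds to transfer. */
double throughput_mbytes_per_sec(double bytes, double time_usec)
{
    assert(time_usec > 0.0);
    return (bytes / BYTES_PER_MBYTE) / (time_usec * 1e-6);
}

/* Inverse: transfer time in microseconds implied by a throughput. */
double transfer_time_usec(double bytes, double mbytes_per_sec)
{
    assert(mbytes_per_sec > 0.0);
    return (bytes / BYTES_PER_MBYTE) / mbytes_per_sec * 1e6;
}
```

Under these assumptions, a 4MB message at 868 Mbytes/sec takes about 4.6 ms, while the same message at 16 Mbytes/sec takes 250 ms, roughly 54 times longer.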

A good MIC card goes into 'reset failed' after flashing firmware

I have a MIC card that works well: it boots correctly and I can log in. But it goes into the 'reset failed' state when I try to update its flash.

1. I ran the micflash command to update mic0:

#/opt/intel/mic/bin/micflash -update /opt/intel/mic/mnt/opt/intel/mic/flash -device 0
mic0: Flash image: /opt/intel/mic/mnt/opt/intel/mic/flash/EXT_HP2_B1_0386-02.rom.smc
micflash: mic0: Flash operation timed out

mic0: Flash update started
mic0: Flash update done
mic0: SMC update started
mic0: Resetting

Disable vectorization

Hi, I have a question about MIC. When using -O2 (or -O3), the compiler vectorizes the code automatically. If we add the -no-vec option, the compiler should not vectorize the code, right? But when I looked at the assembly code (*.s), I found that there are still a lot of vector instructions. So what is the difference between using and not using the -no-vec option? Does the compiler vectorize the code anyway?
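One way to narrow this down is to compile a single small loop both ways and compare the results. The kernel below is a hypothetical test case, not code from the post. Seeing instructions that use vector registers does not by itself prove that a loop was vectorized, since a compiler may also use vector registers for scalar arithmetic, so the vectorization report is the more direct check.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical test kernel: y[i] = a * x[i] + y[i].
   A simple unit-stride loop like this is a typical candidate for
   automatic vectorization. */
void saxpy(size_t n, float a, const float *x, float *y)
{
    assert(x != NULL && y != NULL);
    for (size_t i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}
```

Compiling this file once with `-O3 -vec-report=3` and once with `-O3 -no-vec -vec-report=3`, each time adding `-S` to keep the assembly, should show whether the report still lists the loop as vectorized and whether the loop body itself changes between the two builds.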


Which compiler?


I've just received a Phi 3120 on a Windows machine. The code we've written is in the current Intel Fortran, but I see that the Intel web pages discuss a beta version of a Fortran compiler and Amplifier for the Phi. I'm confused. Should we be using the up-to-date current Intel Fortran compiler or the beta version discussed in

MKL function load error: cpu specific dynamic library is not loaded.

I'm just getting started with MIC and I'm hitting what must be a simple configuration issue.

I want to run a program using OpenMP and MKL on the MIC directly (i.e. no hybrid offload magic). OpenMP is working fine, but when I try to call a simple MKL function (e.g. vdLn) I get the following error: "MKL function load error: cpu specific dynamic library is not loaded."

I've been able to catch this in gdb. The stack trace when the error is printed is:
