Cluster computing

CentOS 7.2 + MLNX_OFED 3.1-1 + MPSS 3.6.1

I'm trying to get MPSS running on our CentOS 7.2 cluster.

We're using kernel 3.10.0-327.4.4.el7.x86_64.

MLNX_OFED_LINUX-3.1- (OFED-3.1-1.0.3).

On Mellanox Technologies MT27500 Family [ConnectX-3] adapters.

Installation of both MOFED and MPSS runs flawlessly, except when I try to use them together.

So I can install MPSS, set up Ethernet networking, SSH to and from the Xeon Phi, and run code on the Xeon Phi, all without a problem.

I can install Mellanox OFED and use InfiniBand (ON THE HOST) (ibv_*_pingpong) without problems.

Poor speed in MIC

Dear All:

For learning purposes, I tried to write a program that finds the total number of primes in a given range. The isprime function tests whether a number is prime or not. I added !$omp declare simd to vectorize that function. I do not know why, but the program performs three times slower on the Intel Phi than on the host.


Host: 16 sec

MIC: 43 sec

MODULEFILE creation the easy way

If you use Environment Modules (from SourceForge, SGI, Cray, etc.) to set up and control your shell environment variables, we've created a new article on how to quickly and correctly create a modulefile. The technique is fast and produces a correct modulefile for any Intel Developer Products tool.
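For readers unfamiliar with the format, a modulefile is a small Tcl script read by the `module` command. Below is a minimal sketch of what one looks like; the tool name and install paths here are assumptions for illustration, not the article's actual content:

```tcl
#%Module1.0
## Hypothetical modulefile sketch; paths and versions are assumptions.
proc ModulesHelp { } {
    puts stderr "Loads an Intel compiler environment (example)"
}
module-whatis "Intel C/C++ compiler environment (example)"

set root /opt/intel/compilers_and_libraries/linux
prepend-path PATH            $root/bin/intel64
prepend-path LD_LIBRARY_PATH $root/lib/intel64
```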

The article is here:


Rebuild ofed-driver-3.6.1-1.src.rpm: MPSS installation issues


I'm installing MPSS 3.6.1 on two Xeon Phi nodes in a cluster connected via InfiniBand, running CentOS 6.6 with kernel 2.6.32-504.8.1.el6.x86_64. I updated kernel-headers and kernel-devel and rebuilt the MPSS host drivers as the user guide says, and so far so good. The problem comes when I try to rebuild the OFED drivers with rpmbuild --rebuild ofed-driver-3.6.1-1.src.rpm; I get the following error message:

Profiling an MPI application with VTune

Hi, folks

I'd like to profile my MPI application with VTune.

In order to see the inter-node behavior, I definitely need to use the '-gtool' option to aggregate the profiling results into one file.

1) When I run the application without profiling, the following command works perfectly:

  • $ mpiexec.hydra -genvall -n 8 -machinefile /home/my_name/machines ARGS1 ARGS2 ...

2) The following command also does the job (running multiple MPI processes on one machine), and I can see their aggregated results.
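For context, Intel MPI's -gtool option attaches a tool invocation to a selected set of ranks. A hypothetical invocation that runs a VTune hotspots collection on all 8 ranks might look like the following; the collection type, result-directory name, and application name are assumptions, not the poster's actual command:

```shell
# Hypothetical: attach VTune (amplxe-cl) to ranks 0-7 via -gtool.
mpiexec.hydra -genvall -n 8 -machinefile /home/my_name/machines \
    -gtool "amplxe-cl -collect hotspots -r my_result:0-7" \
    ./my_app ARGS1 ARGS2
```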

Using InfiniBand network fabrics to allocate globally shared memory for processes on different nodes

Dear Colleagues,

My MPI program implements globally shared memory for processes on multiple nodes (hosts) using the MPI_Win_allocate_shared and MPI_Comm_split_type function calls. Unfortunately, the allocated memory address space is not actually shared between processes on different nodes. I'm wondering what would happen if I ran my MPI program on a cluster with an InfiniBand network and changed the network fabrics to I_MPI_FABRICS=shm:dapl or something like that. Could this be a solution to the problem?
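For context, here is a minimal sketch of the pattern described (names like nodecomm are my own). Note that MPI-3 defines MPI_Win_allocate_shared only over communicators whose processes share a memory domain, which is why MPI_Comm_split_type with MPI_COMM_TYPE_SHARED yields per-node communicators; changing I_MPI_FABRICS affects the transport between nodes but cannot extend a shared window across them:

```c
#include <mpi.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    /* Split COMM_WORLD into per-node communicators: only ranks that
       can actually share memory end up in the same nodecomm. */
    MPI_Comm nodecomm;
    MPI_Comm_split_type(MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED, 0,
                        MPI_INFO_NULL, &nodecomm);

    int noderank;
    MPI_Comm_rank(nodecomm, &noderank);

    /* Rank 0 on each node allocates the segment; others contribute 0. */
    MPI_Aint size = (noderank == 0) ? 1024 * sizeof(int) : 0;
    int *base;
    MPI_Win win;
    MPI_Win_allocate_shared(size, sizeof(int), MPI_INFO_NULL,
                            nodecomm, &base, &win);

    /* Every rank queries rank 0's segment to get a usable pointer
       into the node-local shared region. */
    MPI_Aint qsize;
    int disp_unit;
    int *shared;
    MPI_Win_shared_query(win, 0, &qsize, &disp_unit, &shared);

    /* ... use 'shared' for node-local communication ... */

    MPI_Win_free(&win);
    MPI_Comm_free(&nodecomm);
    MPI_Finalize();
    return 0;
}
```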

Thanks in advance.

Cheers, Arthur.
