Intel® Clusters and HPC Technology

Documentation for configuring Intel Cluster Studio XE with Moab/Torque queue manager

 

The introductory training documents that Intel provides with the Cluster Studio XE suite expect the nodes of the cluster to be configured with the Intel-provided mpd ring. We use the Adaptive Computing resource manager Moab with the Torque hardware manager.

Do you know of any good source of documentation for getting the Intel Cluster Suite to play nicely with the Moab/Torque queue manager?

mpiexec hangs for 5 seconds when launching notepad

I'm running Intel(R) MPI Library for Windows* OS, Version 4.0 Update 3 Build 8/24/2011 3:07:12 PM on 64-bit Windows 8. mpiexec takes 5 seconds to launch notepad. I'm trying to figure out where the time is being spent. The computer this happens on is an Intel Core i7 3.20 GHz with 16 GB of RAM. The test runs mpiexec while the machine is otherwise using minimal CPU resources.

The command line: mpiexec -verbose -localonly notepad

I've marked below where the hang happens in the log.

MPI Library 4.1 and Torque

Dear all,

I'm trying to run a classical MPI test code on our cluster, and I'm still in trouble with it. I have installed the Intel Cluster Studio XE 2013 for Linux and Torque 4.1.3. 

If I don't use Torque ("mpirun -f machine -np 18 ./code"), it runs fine (machine is the list of nodes). If I use Torque, it runs and then stops at the end of the walltime with the following errors
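For anyone comparing the two launch modes: Torque writes the hosts allocated to a job into the file named by $PBS_NODEFILE, one line per slot, so a host with several slots appears several times. The following is only a minimal Python sketch that collapses such a list into `host:count` lines. It assumes the `host:count` form is accepted by the machine file option of your MPI launcher (check the Intel MPI reference for your version); the function names and the sample host names in the usage note are hypothetical.

```python
from collections import Counter


def slots_per_host(nodefile_text):
    """Return (host, slot_count) pairs in first-seen order.

    Assumes one host name per line, repeated once per allocated slot,
    which is how Torque fills the file named by $PBS_NODEFILE.
    """
    counts = Counter()
    order = []
    for line in nodefile_text.splitlines():
        host = line.strip()
        if not host:
            continue  # skip blank lines
        if host not in counts:
            order.append(host)
        counts[host] += 1
    return [(host, counts[host]) for host in order]


def to_machinefile(nodefile_text):
    """Render the slot counts as 'host:count' lines (assumed format)."""
    return "\n".join(f"{host}:{n}" for host, n in slots_per_host(nodefile_text))
```

Inside a job script this could be fed with the contents of the node file, for example `to_machinefile(open(os.environ["PBS_NODEFILE"]).read())` after `import os`. With the hypothetical input `"n1\nn1\nn2\n"` the result is the two lines `n1:2` and `n2:1`.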

mpitune question

I am trying to use mpitune to tune runs on my new cluster. The cluster runs under Torque and has Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz CPUs with 16 cores. So I run mpitune using 96 CPUs under Torque (#PBS -l nodes=96). It then generates the following tuning files:

mpiexec_shm-ofa_nn_1_np_16_ppn_16.conf mpiexec_shm:ofa_nn_1_np_16_ppn_16.conf mpiexec_shm-ofa_nn_1_np_2_ppn_2.conf

mpiexec_shm:ofa_nn_1_np_2_ppn_2.conf mpiexec_shm-ofa_nn_1_np_4_ppn_4.conf mpiexec_shm:ofa_nn_1_np_4_ppn_4.conf
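The file names listed above look like they encode the fabric and the process layout. As a rough aid for checking which layouts a tuning run covered, here is a minimal Python sketch that splits such names into fields. The reading of `nn` as node count, `np` as process count and `ppn` as processes per node is an assumption inferred from the names alone, and `parse_tuning_name` and `node_counts` are hypothetical helpers, not part of mpitune.

```python
import re

# Pattern inferred from the file names in the post; not an official format.
TUNING_NAME = re.compile(
    r"^mpiexec_(?P<fabric>[^_]+)_nn_(?P<nn>\d+)_np_(?P<np>\d+)_ppn_(?P<ppn>\d+)\.conf$"
)


def parse_tuning_name(name):
    """Split a tuning file name into its fields, or return None if it does not match."""
    m = TUNING_NAME.match(name)
    if m is None:
        return None
    return {
        "fabric": m.group("fabric"),
        "nn": int(m.group("nn")),    # assumed: number of nodes
        "np": int(m.group("np")),    # assumed: number of processes
        "ppn": int(m.group("ppn")),  # assumed: processes per node
    }


def node_counts(names):
    """Return the sorted distinct 'nn' values found in the matching names."""
    parsed = (parse_tuning_name(n) for n in names)
    return sorted({p["nn"] for p in parsed if p is not None})
```

Applied to the six names in the post, `node_counts` returns `[1]`, meaning every listed file carries `nn_1`.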

assertion in MPI_GATHERV

Hello,

I receive the following message when I call MPI::COMM_WORLD.Gatherv with large data:

Assertion failed in file ../../i_rtc_cache.c at line 631: buf_end_palign > buf_start_palign

internal ABORT - process 1
[0:node0066] unexpected disconnect completion event from [1:node0067]
Assertion failed in file ../../dapl_conn_rc.c at line 1128: 0
internal ABORT - process 0

Binary Instrumentation - ITCPIN and OpenMPI

Hi,

I am getting the following error message while trying to do binary instrumentation using itcpin.

Error : E:Function RTN_FindByName called without holding lock. Call PIN_LockClient()/PIN_UnlockClient()

Command used: mpirun -np 12 itcpin --verbose on --run --profile on --mpi /usr/lib64/openmpi/1.4-gcc/lib/libmpi.so --insert libVT <exe>

Regards,

Nitin G.

Infiniband and MPI_THREAD_MULTIPLE

Hi!

I would like IMPI to use Infiniband together with the MPI_THREAD_MULTIPLE mode. I tested this combination with the Intel MPI Benchmark Suite which mostly runs fine, but crashes at the Bcast Benchmark.

Is it safe to use IMPI that way?

Edit: Some additional information.
OS: Windows Server 2008 R2
IMPI version:  4.1 Build 08/22/2012
Intel Benchmark Suite Version: 3.2.3
Infiniband Hardware Connect-X 2 from Mellanox with OFED Providers
Provider used by IMPI: DAPL
 

everything runs on one core

Hi,

I hope you are doing well. I am trying to build an application that uses multiple CPUs connected via MPI over a network. I am able to compile my code. However, I have an issue for which I seek your help through this forum.

The toolchain I am using is

Compiler XE for applications running on Intel(R) 64, Version 12.1.6.361 Build 20120821
Copyright (C) 1985-2012 Intel Corporation.

I wish to use the Intel compilers with Intel MKL and Intel MPI.

This is the link line that I got from the link line advisor

MPI_Barrier bug with defined communicators.

Hi,

I am working with IntelMPI version 4.1.0.024 and I detected a problem with the MPI_Barrier() function (maybe a bug).

In the attached code I create a new process via the MPI_Comm_spawn function. Then I merge the intercomm and the parent communicator with the MPI_Intercomm_merge function and I call the MPI_Barrier() function with the new communicator.

The problem is some processes don't continue the execution (they remain held in the MPI_Barrier() function).

I have tested the code with other MPI implementations and it works fine.

MPI MPMD fault tolerance support

Hi,

I would really appreciate some help. I would like to know whether Intel MPI supports fault tolerance (run-through stabilisation) for multiple programs multiple data (MPMD) applications?

I have read the Intel MPI fault tolerance documentation. I am running a master-worker application, where the master and worker code are separate and where there is no communication among workers. My configure command looks like this:
