Intel MPI and Torque 3.0.2

Hi,

on our university cluster we have one of the latest Intel MPI versions (perhaps not the most recent update) and Torque 3.0.2. The cluster runs Red Hat Enterprise Linux 6.2 (Santiago). Torque was compiled by the company that installed the cluster, but they cannot tell us which configure options they used...

It seems that there is an interaction problem between Intel MPI and our Torque.
Without Torque, I can start this job:

mpirun -machinefile machine_file -n 1 program1 : -n 1 program2 : -n 82 program3

I can start the following job:

#!/bin/bash

#PBS -N test

#PBS -l walltime=71:59:00

#PBS -q special
cd $PBS_O_WORKDIR
export I_MPI_FABRICS=ofa
cat $PBS_NODEFILE > machine_file
mpirun -machinefile ./machine_file -n 1 program1 : -n 1 program2 : -n 82 program3


But I cannot start this job:
#!/bin/bash

#PBS -N test

#PBS -l walltime=71:59:00

#PBS -q special
cd $PBS_O_WORKDIR
export I_MPI_FABRICS=ofa
mpirun -n 1 program1 : -n 1 program2 : -n 82 program3

It's strange, because on other clusters this last job starts fine. So I think we have missed an option in the Torque configuration.

Does anyone have an idea? Or which Torque configure options do you use when installing it so that it works with Intel MPI?

Best regards,
Guillaume


Hi Guillaume,

Inside your batch script, check the value of the environment variable PBS_ENVIRONMENT; it should be either PBS_BATCH or PBS_INTERACTIVE for mpirun to detect that it is running inside PBS.
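One quick way to do this check is a small snippet at the top of the batch script. This is only an illustrative sketch; the check_pbs_env function name is my own, not part of Intel MPI or Torque:

```shell
#!/bin/bash
# Sketch: report whether this script appears to be running under PBS/Torque.
# check_pbs_env is a hypothetical helper name, not part of any tool.
check_pbs_env() {
  case "${PBS_ENVIRONMENT:-unset}" in
    PBS_BATCH|PBS_INTERACTIVE)
      echo "PBS detected: $PBS_ENVIRONMENT" ;;
    *)
      echo "PBS not detected (PBS_ENVIRONMENT=${PBS_ENVIRONMENT:-unset})" ;;
  esac
}

check_pbs_env
```

If this prints "PBS not detected" inside a batch job, mpirun has no way to recognize the scheduler either.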

Sincerely,
James Tullos
Technical Consulting Engineer
Intel Cluster Tools

Hi,

$PBS_ENVIRONMENT is set to PBS_BATCH.

Regards

Hi Guillaume,

As I recall, you are using version 4.0.2. If this is correct, please try adding the following to your script:

cd $PBS_O_WORKDIR

. /intel64/bin/mpivars.sh

mpiexec.hydra -rmk pbs ./a.out

Change the mpivars.sh path to match where you have the Intel MPI Library installed, and adjust the mpiexec.hydra line to match your job. You can also set the I_MPI_HYDRA_RMK environment variable to pbs, in which case you won't need to pass -rmk pbs on the command line.
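Putting this together with the original job, the batch script would look something like the sketch below. The mpivars.sh path is a placeholder for wherever the Intel MPI Library is installed; the program names and resource requests are taken from the original script:

```shell
#!/bin/bash
#PBS -N test
#PBS -l walltime=71:59:00
#PBS -q special
cd $PBS_O_WORKDIR

# Source the Intel MPI environment; adjust this path to your installation.
. /path/to/impi/intel64/bin/mpivars.sh

export I_MPI_FABRICS=ofa
# Tell Hydra to query PBS/Torque for the node list, so neither
# -machinefile nor -rmk pbs is needed on the command line.
export I_MPI_HYDRA_RMK=pbs

mpiexec.hydra -n 1 program1 : -n 1 program2 : -n 82 program3
```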

Please try this and let me know if it works. Or if you're using a different version, please let me know which one.

Sincerely,
James Tullos
Technical Consulting Engineer
Intel Cluster Tools

With mpiexec.hydra it seems OK, with or without mpivars.sh.

Strange. Are there differences between mpiexec.hydra and mpirun? Which one is recommended?

Thanks for your help
Guillaume

Best Reply

Hi Guillaume,

Normally, there should be no difference between the two (mpirun calls mpiexec.hydra); however, version 4.0.2 of the Intel MPI Library does not correctly acquire the node list. This is corrected in the current version (4.0.3). Using mpiexec.hydra with -rmk pbs (you could try mpirun with -rmk pbs as well) works correctly.

We generally recommend using mpirun, as it checks more environment variables and sets additional flags before calling mpiexec.hydra. It can also be used to run mpiexec (using MPDs) if needed, by setting an appropriate environment variable. Basically, mpirun is more flexible. However, there are times when directly calling mpiexec.hydra is necessary.
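If I understand the description correctly, the relationship can be summarized as below. The I_MPI_PROCESS_MANAGER setting is my assumption about which environment variable selects the MPD-based launcher; please check the Intel MPI reference manual for your version:

```shell
# mpirun is a wrapper: it reads extra environment variables, sets
# additional flags, and then calls mpiexec.hydra (or mpiexec with MPDs,
# if so configured).

# These two should normally be equivalent under PBS/Torque:
mpirun -rmk pbs -n 84 ./a.out
mpiexec.hydra -rmk pbs -n 84 ./a.out

# Assumption: selecting the MPD-based process manager instead of Hydra.
export I_MPI_PROCESS_MANAGER=mpd
mpirun -n 84 ./a.out
```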

I recommend, if possible, upgrading to the current version of the Intel MPI Library.

Sincerely,
James Tullos
Technical Consulting Engineer
Intel Cluster Tools
