Intel® MPI Library Developer Guide for Linux* OS
  • 2019 Update 7
  • 03/31/2020

Job Schedulers Support

Intel® MPI Library supports the majority of commonly used job schedulers in the HPC field.
The following job schedulers are supported on Linux* OS:
  • Altair* PBS Pro*
  • Torque*
  • OpenPBS*
  • IBM* Platform LSF*
  • Parallelnavi* NQS*
  • SLURM*
  • Univa* Grid Engine*
The Hydra process manager detects job schedulers automatically by checking specific environment variables. These variables are used to determine how many nodes were allocated, which nodes they are, and the number of processes per node.
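For illustration only, the kind of environment-variable check described above can be sketched in shell. This mimics the documented variables for each scheduler and is not Intel MPI Library's actual detection code:

```shell
#!/bin/sh
# Illustrative sketch: report which supported job scheduler appears to be
# active, based on the environment variables the documentation describes.
# This is NOT the Hydra process manager's real detection logic.
detect_scheduler() {
    # PBS Pro*, Torque*, and OpenPBS* set $PBS_ENVIRONMENT to
    # PBS_BATCH or PBS_INTERACTIVE inside a job.
    case "$PBS_ENVIRONMENT" in
        PBS_BATCH|PBS_INTERACTIVE)
            echo "PBS Pro/Torque/OpenPBS"
            return
            ;;
    esac
    if [ -n "$LSB_MCPU_HOSTS" ] && [ -n "$LSF_BINDIR" ]; then
        echo "LSF"
    elif [ -n "$SLURM_JOBID" ]; then
        echo "SLURM"
    elif [ -n "$PE_HOSTFILE" ]; then
        echo "Grid Engine"
    else
        echo "none"
    fi
}

detect_scheduler
```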

Altair* PBS Pro*, TORQUE*, and OpenPBS*

If you use one of these job schedulers, and the $PBS_ENVIRONMENT environment variable exists with the value PBS_BATCH or PBS_INTERACTIVE, mpirun uses $PBS_NODEFILE as its machine file. You do not need to specify the -machinefile option explicitly.
An example of a batch job script may look as follows:
#PBS -l nodes=4:ppn=4
#PBS -q queue_name
cd $PBS_O_WORKDIR
mpirun -n 16 ./myprog

IBM* Platform LSF*

The IBM* Platform LSF* job scheduler is detected automatically if the $LSB_MCPU_HOSTS and $LSF_BINDIR environment variables are set.
The Hydra process manager uses these variables to determine how many nodes were allocated, which nodes they are, and the number of processes per node. To run processes on remote nodes, the Hydra process manager uses the blaunch utility by default. This utility is provided by IBM* Platform LSF*.
The number of processes, the number of processes per node, and the node names can be overridden with the usual Hydra options (-n, -ppn, -hosts).
Examples:
bsub -n 16 mpirun ./myprog
bsub -n 16 mpirun -n 2 -ppn 1 ./myprog

Parallelnavi NQS*

If you use the Parallelnavi NQS* job scheduler and the $ENVIRONMENT, $QSUB_REQID, and $QSUB_NODEINF environment variables are set, the file specified by $QSUB_NODEINF is used as the machine file for mpirun. In addition, /usr/bin/plesh is used as the remote shell by the process manager during startup.

SLURM*

If $SLURM_JOBID is set, the $SLURM_TASKS_PER_NODE and $SLURM_NODELIST environment variables are used to generate a machine file for mpirun. The machine file is named /tmp/slurm_${username}.$$ and is removed when the job completes.
For example, to submit a job, run the commands:
$ srun -N2 --nodelist=host1,host2 -A
$ mpirun -n 2 ./myprog
To enable PMI2, set I_MPI_PMI_LIBRARY and specify the --mpi option:
$ I_MPI_PMI_LIBRARY=<path to libpmi2.so>/libpmi2.so srun --mpi=pmi2 <application>

Univa* Grid Engine*

If you use the Univa* Grid Engine* job scheduler and $PE_HOSTFILE is set, two files are generated: /tmp/sge_hostfile_${username}_$$ and /tmp/sge_machifile_${username}_$$. The latter is used as the machine file for mpirun. Both files are removed when the job completes.

Intercepting SIGINT and SIGTERM Signals

If the resources allocated to a job exceed the limit, most job schedulers terminate the job by sending a signal to all of its processes.
For example, Torque* sends SIGTERM to a job three times, and if the job is still alive, SIGKILL is sent to terminate it.
For Univa* Grid Engine*, the default signal for terminating a job is SIGKILL. Intel® MPI Library cannot process or catch that signal, which causes mpirun to kill the entire job. You can change the termination signal through the queue configuration:
  1. Use the following command to see available queues:
    $ qconf -sql
  2. Execute the following command to modify the queue settings:
    $ qconf -mq <queue_name>
  3. Find terminate_method and change the signal to SIGTERM.
  4. Save the queue configuration.
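After step 3, the terminate_method line in the queue configuration should read as follows (a minimal sketch; field alignment and the other queue settings are site-specific):

```
terminate_method          SIGTERM
```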

Controlling Per-Host Process Placement

When using a job scheduler, by default Intel MPI Library uses the per-host process placement provided by the scheduler. This means that the -ppn option has no effect. To change this behavior and control process placement through -ppn (and related options and variables), use the I_MPI_JOB_RESPECT_PROCESS_PLACEMENT environment variable:
$ export I_MPI_JOB_RESPECT_PROCESS_PLACEMENT=off

Product and Performance Information

Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.

Notice revision #20110804