Integrating Intel MPI Library with Sun Grid Engine

So, you want to use Intel® MPI Library with the Sun* Grid Engine* (SGE) batch scheduler?

The instructions below describe how to run Intel MPI Library jobs using Sun Grid Engine. This document covers Linux*. Microsoft* Windows* involves some differences and additional steps, but the general procedure is the same.

All optional steps are recommended but not necessary for successful integration.

  1. [Optional] Visit sun.com and get a brief overview of SGE
  2. Installation

    See the Installation Guide from sun.com for details. Roughly, the steps are as follows:

    • Install Master Host (see ‘How to Install the Master Host’ section);
    • Install Execution Host (see ‘How to Install Execution Hosts’ section);
    • Register Administration Hosts (see the corresponding section in the Installation Guide);
    • Register Submit Hosts (see corresponding section);
    • Verify the installation (see corresponding section).

    IMPORTANT NOTES:

    • To finalize the installation process, you’ll have to configure the network services manually (by modifying /etc/services), which requires root privileges.
    • It’s possible to install/run SGE as a non-privileged user, but
      1. there are some limitations in that case;
      2. you need root privileges for the complete installation process (at least, for modifying /etc/services).
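Before moving on, it can be worth checking that the network services were actually registered. The following is a minimal sketch, assuming the services are registered under the usual names sge_qmaster and sge_execd; check_sge_services is a hypothetical helper, not part of SGE.

```shell
# check_sge_services [FILE]: report whether the SGE service names appear in
# FILE (default: /etc/services). Returns non-zero if any name is missing.
# Assumption: the services are named sge_qmaster and sge_execd.
check_sge_services() {
    file="${1:-/etc/services}"
    missing=0
    for svc in sge_qmaster sge_execd; do
        if grep -Eq "^${svc}[[:space:]]" "$file"; then
            echo "$svc: found"
        else
            echo "$svc: missing"
            missing=1
        fi
    done
    return "$missing"
}
```

Calling check_sge_services with no argument inspects /etc/services on the current host.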
  3. Create a new Parallel Environment (PE) for Intel MPI Library
    1. Create the appropriate configuration file for the new PE. It should contain the following lines:
      pe_name impi
      slots 999
      user_lists NONE
      xuser_lists NONE
      start_proc_args NONE
      stop_proc_args NONE
      allocation_rule $round_robin
      control_slaves FALSE
      job_is_first_task FALSE
      urgency_slots min
    2. Add the new PE using the following command:
      qconf -Ap <config_file>

    USEFUL COMMANDS:
    * qconf -spl - view all PEs currently available;
    * qconf -sp <PE_name> - view settings for a particular PE;
    * qconf -dp <PE_name> - remove a PE;
    * qconf -mp <PE_name> - modify an existing PE.

    Also see the ‘Managing Special Environment’ section in the Administration Guide from sun.com if you need more details about PE configuration.
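Steps 3.1 and 3.2 can be combined into a short script. This is only a sketch, assuming a POSIX shell; the file name impi_pe.conf is a hypothetical choice, and qconf is called only when it is found on PATH.

```shell
#!/bin/sh
# Hypothetical file name for the PE definition; any writable path works.
pe_file="impi_pe.conf"

# Quoted 'EOF' keeps $round_robin literal so that SGE, not the shell, sees it.
cat > "$pe_file" <<'EOF'
pe_name impi
slots 999
user_lists NONE
xuser_lists NONE
start_proc_args NONE
stop_proc_args NONE
allocation_rule $round_robin
control_slaves FALSE
job_is_first_task FALSE
urgency_slots min
EOF

# Register the PE and print it back, but only on hosts where SGE is set up.
if command -v qconf >/dev/null 2>&1; then
    qconf -Ap "$pe_file"
    qconf -sp impi
else
    echo "qconf not found; wrote $pe_file only" >&2
fi
```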

  4. Associate a queue with the new PE

    Use the following commands for that:

    1. qconf -sql - to see all queues available;
    2. qconf -mq <queue_name> - to modify the queue’s settings. In the editor that opens, find the ‘pe_list’ property and add ‘impi’ to it.

    USEFUL COMMANDS:
    * qconf -sq <queue_name> - view the queue’s settings.

    See the Administration Guide if you need more details about the queue configuration process.
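After editing the queue, it may help to confirm that the PE really is listed. The helper below is a rough sketch: has_pe is a hypothetical name, and it assumes that 'qconf -sq' prints a single 'pe_list' line with whitespace-separated PE names (a list wrapped across several lines is not handled).

```shell
# has_pe PE_NAME: read queue settings on stdin (as printed by `qconf -sq`)
# and succeed only if the pe_list line contains PE_NAME as a whole word.
has_pe() {
    awk -v pe="$1" '
        $1 == "pe_list" { for (i = 2; i <= NF; i++) if ($i == pe) found = 1 }
        END { exit (found ? 0 : 1) }
    '
}
```

A typical call would be: qconf -sq <queue_name> | has_pe impi && echo "impi is attached"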

  5. Add Intel MPI Library environment to your current environment by sourcing the appropriate mpivars.[c]sh script located in the <install_dir>/bin[64] directory
  6. Build the MPI application to be run
  7. [Optional] Make sure that Intel MPI Library works on the desired hosts. To do so, run your application manually on each of those hosts
  8. Submit your MPI job to SGE

    Use the following command for that:

    qsub -N <job_name> -pe impi <num_of_processes> \
    -V <mpirun_absolute_name> -r ssh -np <num_of_processes> <app_absolute_name>

    where the -V option exports all environment variables available in the current shell to the job.

    USEFUL COMMANDS to monitor and control jobs:
    * qstat - show status of SGE jobs and queues;
    * qstat -j - show detailed information about jobs (can be useful for pending jobs);
    * qdel - remove existing job.

    After submitting the job, you can monitor its status using the qstat command. When the job is finished, you can find the job’s output and error output in your HOME directory; just look for the <job_name>.o<jobID> and <job_name>.e<jobID> files.

    See the User’s Guide if you need more information about the job submission process.
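Because the process count appears twice in the qsub line from step 8, a small wrapper can keep the two values in sync. This is a sketch only: submit_impi and the DRY_RUN variable are hypothetical names, and by default the function just prints the command instead of running it.

```shell
# submit_impi JOB_NAME NUM_PROCS MPIRUN_PATH APP_PATH
# Builds the qsub command line from step 8. With DRY_RUN=0 the command is
# executed; otherwise (the default) it is only printed for inspection.
submit_impi() {
    # Reuse $2 for both the PE slot request and mpirun's -np.
    set -- qsub -N "$1" -pe impi "$2" -V "$3" -r ssh -np "$2" "$4"
    if [ "${DRY_RUN:-1}" = 0 ]; then
        "$@"
    else
        echo "$*"
    fi
}
```

For example, 'submit_impi <job_name> 4 <mpirun_absolute_name> <app_absolute_name>' prints the command for a 4-process job; set DRY_RUN=0 to actually submit it.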

Closer integration with SGE

Read the 'Tight Integration of Parallel Environments and Grid Engine Software' section in SGE's Administration Guide first.

To enable tight integration for Intel MPI Library, follow the same procedure as above, but use a different configuration file for the PE in step 3.

The configuration file should contain the following lines:

pe_name impi_tight
slots 999
user_lists NONE
xuser_lists NONE
start_proc_args <SGE_install_dir>/mpi/startmpi.sh -catch_rsh $pe_hostfile
stop_proc_args <SGE_install_dir>/mpi/stopmpi.sh
allocation_rule $round_robin
control_slaves TRUE
job_is_first_task FALSE
urgency_slots min
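
The tight-integration PE file can be generated in the same way as the basic one. The sketch below assumes that the SGE_ROOT environment variable points at <SGE_install_dir>; the fallback /opt/sge and the file name impi_tight_pe.conf are placeholders chosen for illustration.

```shell
#!/bin/sh
# Assumption: SGE_ROOT is the SGE installation directory.
# /opt/sge is only a placeholder used when SGE_ROOT is unset.
sge_dir="${SGE_ROOT:-/opt/sge}"
pe_file="impi_tight_pe.conf"   # hypothetical file name

# The heredoc is unquoted so $sge_dir expands; \$ keeps $pe_hostfile and
# $round_robin literal for SGE to interpret.
cat > "$pe_file" <<EOF
pe_name impi_tight
slots 999
user_lists NONE
xuser_lists NONE
start_proc_args $sge_dir/mpi/startmpi.sh -catch_rsh \$pe_hostfile
stop_proc_args $sge_dir/mpi/stopmpi.sh
allocation_rule \$round_robin
control_slaves TRUE
job_is_first_task FALSE
urgency_slots min
EOF

# Register the PE only on hosts where qconf is available.
if command -v qconf >/dev/null 2>&1; then
    qconf -Ap "$pe_file"
fi
```

Jobs submitted against this PE would then request '-pe impi_tight <num_of_processes>' instead of '-pe impi <num_of_processes>'.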