Sun Grid Engine tight integration for Intel MPI

salmr0

Hi,

Are there any plans for a version of Intel MPI with tight-integration support for the Sun Grid Engine queuing system, much in the same way that OpenMPI supports it now?

Thanks
Rene

Andrey Derbunovich (Intel)

Hi Rene,

Yes, we are considering including such functionality in our product.

In the meantime, I can provide some current recommendations on how to configure SGE to achieve tight integration with the Intel MPI Library. Just let me know if you are interested.

Best regards,
Andrey

Gergana Slavova (Intel)
Hi Rene,

As Andrey mentioned, we do have a "manual", in a way, on how to integrate Intel MPI with Sun Grid Engine. The instructions are now available online at:

http://software.intel.com/en-us/articles/integrating-intel-mpi-sge/

Let us know if this helps, or if you have any questions or problems.

Regards,
~Gergana

Gergana Slavova
Technical Consulting Engineer
Intel® Cluster Tools
E-mail: gergana.s.slavova_at_intel.com
salmr0
Quoting - Gergana Slavova (Intel) Hi Rene,

As Andrey mentioned, we do have a "manual", in a way, on how to integrate Intel MPI with Sun Grid Engine. The instructions are now available online at:

http://software.intel.com/en-us/articles/integrating-intel-mpi-sge/

Let us know if this helps, or if you have any questions or problems.

Regards,
~Gergana

Sorry, I was out of town for a few days and am just getting back to this. Thanks, Andrey and Gergana! I will look over the manual instructions, give them a try, and let you know how it goes.

Rene

salmr0
Quoting - Gergana Slavova (Intel) Hi Rene,

As Andrey mentioned, we do have a "manual", in a way, on how to integrate Intel MPI with Sun Grid Engine. The instructions are now available online at:

http://software.intel.com/en-us/articles/integrating-intel-mpi-sge/

Let us know if this helps, or if you have any questions or problems.

Regards,
~Gergana

Gergana/Andrey,

We followed the directions on the website and set up SGE as you suggested for tight integration with Intel MPI. One of the reasons we want this is so that SGE can properly clean up the MPD Python daemons that are left running on servers after a job is deleted or killed.

For example, with OpenMPI/SGE tight integration, all OpenMPI processes are forked as children of the SGE execd daemon, so when a job is deleted or killed, SGE has full control of the job and can terminate all of its OpenMPI children and clean up.

With Intel MPI, here is what I see when I submit a job:

grdadmin 4788 1 4788 4694 0 Mar30 ? 00:02:00 /hpc/SGE/bin/lx24-amd64/sge_execd
root 4789 4788 4788 4694 0 Mar30 ? 00:04:15 /bin/ksh /usr/local/bin/load.sh
grdadmin 16949 4788 16949 4694 0 09:33 ? 00:00:00 sge_shepherd-1712429 -bg
salmr0 17023 16949 17023 17023 1 09:33 ? 00:00:00 -csh /var/spool/SGE/hpcp7781/job_scripts/1712429
salmr0 17127 17023 17023 17023 0 09:33 ? 00:00:00 /bin/sh /hpc/soft/intel/x86_64/ict-3.1.1/impi/3.1/bin64/mpirun -perhost 1 -env I
salmr0 17174 17127 17023 17023 1 09:33 ? 00:00:00 python /hpc/soft/intel/x86_64/ict-3.1.1/impi/3.1/bin64/mpiexec -perhost 1 -env
salmr0 17175 17174 17023 17023 1 09:33 ? 00:00:00 [sh]
.
.
.
salmr0 17166 1 17165 17165 0 09:33 ? 00:00:00 python /hpc/soft/intel/x86_64/ict-3.1.1/impi/3.1/bin64/mpd.py --ncpus=1 --myhost=hpcp7
salmr0 17176 17166 17176 17165 2 09:33 ? 00:00:00 python /hpc/soft/intel/x86_64/ict-3.1.1/impi/3.1/bin64/mpd.py --ncpus=1 --myhost=hpc
salmr0 17178 17176 17178 17165 87 09:33 ? 00:00:04 /bphpc7/vol0/salmr0/MPI-Bench/bin/x86_64/IMB-MPI1.intelmpi.3.1

As you can see, my MPI job is running as a forked child of sge_execd and is under full SGE control. However, the MPDs that were started are completely independent processes, not forked children of SGE. The problem comes when I qdel my job, or otherwise delete or kill it while it is running: SGE kills all of its forked children, but it knows nothing about the MPD daemons. As a result, after SGE deletes, kills, and cleans up my job, I still have this left running on every node that ran the MPI job:

salmr0 17166 1 17165 17165 0 09:33 ? 00:00:00 python /hpc/soft/intel/x86_64/ict-3.1.1/impi/3.1/bin64/mpd.py --ncpus=1 --myhost=hpcp7

Each time I submit and delete a job, another stray Python process like the one above is left hanging around. Any ideas on how to get MPD cleanup working properly?

Thanks
Rene
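[Editor's note: one common workaround for stray daemons like this, not taken from this thread, is an SGE epilog script that sweeps up leftover mpd.py processes after the job is torn down. The function name and the pgrep pattern below are illustrative assumptions; adapt them to your site.]

```shell
#!/bin/sh
# Sketch of an SGE epilog helper that kills leftover MPD daemons.
# Assumes a Linux system with pgrep; the pattern 'mpd\.py' matches the
# daemons shown in the ps listing above.

cleanup_stray_mpds() {
    user="${1:-$USER}"
    # Find every process owned by $user whose command line mentions mpd.py.
    for pid in $(pgrep -u "$user" -f 'mpd\.py'); do
        [ "$pid" = "$$" ] && continue   # never kill our own shell
        kill "$pid" 2>/dev/null || true
    done
}

# In an SGE epilog the script runs as the job owner, so it would be invoked as:
# cleanup_stray_mpds "$USER"
```

Note the caveat: a blanket kill like this also removes MPD rings belonging to other jobs the same user still has on that node, so it is only safe with a one-job-per-node policy or a smarter match on the job's own MPD console name.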

bleedinge
Quoting - salmr0

Did you ever come up with a solution for this?

nixter
Quoting - bleedinge

Did you ever come up with a solution for this?

I have the same problem. Is there any solution?

thanks.

san

I'm curious to know why Intel based their MPI on MPICH2/MVAPICH2. Why not on OpenMPI?

- Sangamesh

Tim Prince

OpenMPI was not well developed, and had not supplanted LAM, at the time the decision was made, and it didn't support Windows until recently. Not all subsequent developments were foreseen. Are you suggesting that the cooperative developments between OpenMPI and SGE should have been foreseen? Do you know the future of SGE?

eev
Quoting nixter Quoting - bleedinge

Did you ever come up with a solution for this?

I have the same problem. Is there any solution?

thanks.

I have the same problem, too. What can I do?

Gergana Slavova (Intel)

Hello everyone,

I'm hoping this reply will reach everyone subscribed to this thread.

As a first order of business, I would suggest you give the new Intel MPI Library 4.0 a try. It came out last month and includes quite a few major changes. You can download it from the Intel Registration Center if you still have a valid license, or grab an evaluation copy from intel.com/go/mpi.

Secondly, we plan to improve our tight-integration support for SGE and other schedulers in future releases, so stay tuned.

Regards,
~Gergana

reuti_at_intel

Hi,

Please have a look at:

http://gridengine.sunsource.net/howto/mpich2-integration/mpich2-integration.html

for a tight integration with correct accounting and control of all slave tasks by SGE. The howto was originally written for MPICH2; since Intel MPI is based on MPICH2, the "mpd startup method" also applies to Intel MPI.

-- Reuti
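[Editor's note: for reference, the tight-integration parallel environment from that howto looks roughly like the sketch below. The startmpich2.sh/stopmpich2.sh helpers and -catch_rsh flag come with the howto's material; the install paths shown here are assumptions for illustration.]

```
$ qconf -sp mpich2_mpd
pe_name            mpich2_mpd
slots              999
user_lists         NONE
xuser_lists        NONE
start_proc_args    /usr/sge/mpich2_mpd/startmpich2.sh -catch_rsh \
                   $pe_hostfile /opt/mpich2_mpd
stop_proc_args     /usr/sge/mpich2_mpd/stopmpich2.sh -catch_rsh /opt/mpich2_mpd
allocation_rule    $round_robin
control_slaves     TRUE
job_is_first_task  FALSE
```

The key setting is control_slaves TRUE, which allows qrsh -inherit to start the remote daemons under SGE's control: that is the crux of the tight integration.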

jbp1atdukeedu

Reuti -- Looks like a bad link ... maybe the new gridengine.org has it?

John

jbp1atdukeedu

Not exactly what you're looking for, but you can hack the Intel "stock" mpirun script to do a better job of tight integration. A version that I hacked together is available at:

As was noted elsewhere, if a process detaches from sge_shepherd then you've lost tight integration. The script above should keep open connections to each child process, so they all stay attached to sge_shepherd.
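[Editor's note: the usual trick behind such a hack, sketched here as an assumption since the linked script is unavailable, is to boot the MPD ring through SGE's own qrsh -inherit instead of ssh, so each remote mpd stays a child of its node's sge_execd. A small helper that converts SGE's $PE_HOSTFILE into the host:ncpus format that mpdboot's hosts file expects might look like this. The function name is made up, and you should check your mpdboot's --rsh option before relying on it.]

```shell
#!/bin/sh
# Each line of $PE_HOSTFILE is "hostname slots queue processor-range";
# mpdboot wants "hostname:ncpus" lines in its hosts file.
pe_hostfile_to_mpd() {
    awk '{print $1 ":" $2}' "$1"
}

# Hypothetical use inside a job script running under a tight-integration PE:
# pe_hostfile_to_mpd "$PE_HOSTFILE" > mpd.hosts
# mpdboot -n "$(wc -l < mpd.hosts)" -f mpd.hosts --rsh="qrsh -inherit"
# mpiexec -n "$NSLOTS" ./a.out
# mpdallexit
```

Because qrsh -inherit launches the remote daemons through the slots SGE has already granted, the mpds remain under sge_shepherd and are killed along with the job on qdel.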
