qsub syntax changed?

Hello MTL-ers,

Yesterday this used to work:

qsub -lncpus=40 -vnThreads=0 myscript.sh

Now it gives:

qsub: Rejecting job, Pls. use syntax -l select=1:ncpus=xx ..

So I tried:

$ qsub -l select=1:ncpus=40 -vnThreads=0 retime2log.gurobi.sh
qsub: "-lresource=" cannot be used with "select" or "place", resource is: host
[sels@acano01 Debug]$

Simplifying doesn't help:

$ qsub -lselect=1:ncpus=40 retime2log.gurobi.sh
qsub: "-lresource=" cannot be used with "select" or "place", resource is: host
[sels@acano01 Debug]$

How can I solve this, please? I need it urgently... The original qsub worked just fine...

Thanks,
Peter

As you have noticed, we have just changed the qsub command to require ncpus to be specified. Here is the latest segment from the getting started guide:

How do I run a program in an exclusive mode (say for benchmarking)?

Qsub is the command used to submit jobs to the batch system. Using a script to wrap the actual program or programs is the most effective way to submit jobs. Here is a very simple script that merely prints a series of hello world when submitted to PBS using the following qsub command:

$ qsub -l select=1:ncpus=xx $HOME/myjob

Note: the value of ncpus should typically match the number of threads that your application plans to use. If that number is unknown or greater than 40, use the value 40. The reason for specifying the ncpus argument to qsub is to make better use of the batch nodes: if fewer than 40 cores are in use on a specific batch node, the remaining free cores can be made available to other jobs/users.

If the argument above is not used, qsub will return with the following error:

$ qsub: Rejecting job, Pls. use syntax -l select=1:ncpus=xx ..
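
As a rough sketch of the rule in the note above (match ncpus to the thread count, capped at 40), a small wrapper could compute the value before calling qsub. The function name choose_ncpus is hypothetical, treating 0 as "unknown" is an assumption, and the qsub line is only printed here, not submitted.

```shell
#!/bin/sh
# Hypothetical helper: pick an ncpus value from a requested thread count.
# Falls back to 40 when the count is unknown (empty, non-numeric, or 0)
# or greater than 40, as the guide suggests.
choose_ncpus() {
    threads=$1
    max=40
    case $threads in
        ''|*[!0-9]*) echo "$max"; return ;;   # empty or not a plain number
    esac
    if [ "$threads" -gt "$max" ] || [ "$threads" -lt 1 ]; then
        echo "$max"
    else
        echo "$threads"
    fi
}

# Print (do not submit) the command that would be used for 16 threads.
echo "qsub -l select=1:ncpus=$(choose_ncpus 16) \$HOME/myjob"
```

Printing the command first makes it easy to check the resulting select statement before actually submitting anything.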

The contents of the 5-line myjob (for OpenMP) could be:

#!/bin/sh
#PBS -N myjob
#PBS -j oe
export OMP_NUM_THREADS=16
./hello_world

The output of the batch job will be left in a file in the working directory where the qsub command was entered. The file will be named using the PBS job name (e.g. myjob) followed by a suffix built using .o + the job number, e.g. myjob.o28802.
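
To illustrate the naming rule just described, here is a minimal sketch that builds the expected output file name from the job name and the job ID printed by qsub. The function name pbs_output_name is hypothetical, and the ID format "number.server" is an assumption based on the IDs shown elsewhere in this thread.

```shell
#!/bin/sh
# Hypothetical helper: derive the expected output file name from the job name
# and the job ID that qsub prints (assumed to look like "28802.acaad01").
pbs_output_name() {
    job_name=$1
    job_id=$2
    job_number=${job_id%%.*}      # keep only the part before the first dot
    echo "${job_name}.o${job_number}"
}

pbs_output_name myjob 28802.acaad01    # prints myjob.o28802
```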

For the latest PBSPro v10.2 qsub commands, please read:

/opt/docs/PBSProUserGuide10.2.pdf - example subsection

Note: Because of the shared environment on the MTL login node (acano01), it is not recommended to perform repeatable testing (and collect results) on this node; the batch system is set up for this purpose.

------------------------------------------------------------------------------------------------------------

With respect to the "-lresource=" option that I believe is part of your script: the PBS manual says that -lselect and -lresource should not be used together. We believe that -lselect could be used instead of -lresource.

If this is not the case, and you can share your job script with us along with the reason you'd prefer to use -lresource, maybe we can come up with a solution.

Note: the above change to qsub is just a trial, and we welcome feedback.

Hi Mike, All,

Thanks for your answer.

(1) Getting it to work:
I mixed old and new syntax without knowing. My qsub command had to change to use the new qsub syntax, from

qsub -lncpus=40 -vnThreads=$1 retime2log.gurobi.sh

into

qsub -l select=1:ncpus=40:host=acano02 -vnThreads=$1 retime2log.gurobi.sh

since I used to set the host in my script (retime2log.gurobi.sh) itself as:

#PBS -l host=acano02

This last line implicitly uses the old -lresource, which explains the error message about mixing -lselect and -lresource:

qsub: "-lresource=" cannot be used with "select" or "place", resource is: host

So I removed this #PBS command and it seems to work now. :)

(2) Multiple jobs on a machine at the same time:
It now seems to be possible to run multiple jobs at the same time. This is beneficial for getting more things done, of course, but I fear that the timing of each job will be less stable and not reproducible, because all jobs interface with memory and I guess that jobs will still influence each other (memory wall effects?).
If my guess about interference is right, I would vote for allowing only one job per machine at any time.
Of course, if one also enters the maximal memory available as the memory requirement, one can still force the job to run as the only one, right? In that case there is no real disadvantage to the new method, except for having to spend some time changing one's scripts again. For example:

qsub -l select=1:ncpus=40:host=acano02:mem=54gb -vnThreads=$1 retime2log.gurobi.sh

seems to work.
(Preferably use a node other than acano02, if you can, since I am stuck there for license reasons...)

Best regards,
Peter
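
The cause described in (1), an old-style "#PBS -l" directive hiding inside the job script, can be checked for before submitting. The following is only a heuristic sketch: the function name find_old_style_resources is hypothetical, and it will also flag job-wide requests such as walltime, which may be perfectly valid alongside select.

```shell
#!/bin/sh
# Hypothetical helper: list '#PBS -l' directives in a job script that do not
# use select= or place=. These are candidates for the old-style syntax that
# qsub refuses to combine with "-l select=...".
find_old_style_resources() {
    grep -E '^#PBS[[:space:]]+-l' "$1" | grep -Ev 'select=|place='
}

# Demo on a throwaway script (contents are illustrative only).
demo=$(mktemp)
printf '%s\n' '#!/bin/sh' '#PBS -l host=acano02' '#PBS -j oe' './main' > "$demo"
find_old_style_resources "$demo"    # prints: #PBS -l host=acano02
rm -f "$demo"
```

Any line it reports can then either be removed or folded into the select statement on the qsub command line, as was done with host=acano02 above.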

Peter,
your assumptions are correct with respect to memory.

What we have tried to do with this change is particularly focused on the "casual" batch user. As you and others have seen, some users will run long batch jobs (many hours/days) with only a single core, but this ties up an exclusive batch server, which is not the best use of a 40-core batch server. Thus we implemented a shared batch resource and required users to specify ncpus.

There are ways to gain exclusive use of a batch node, and I believe you are aware of those ways.

Hi Peter,
I have a problem when I use

qsub -l select=1:ncpus=40:host=acano02 -vnThreads=$1 test.sh

I can't get the output of the batch job. What can I do?

Hello, I am having an issue when I try to submit a job. I am using the following script:

#!/bin/sh
#PBS -N $HOME/fitzSort
#PBS -l select=1:ncpus=32
./SequentialSort

Error in sort.qsub.e21848:

-bash: /var/spool/PBS/mom_priv/jobs/15083.acaad01.SC: /bin/sh^M: bad interpreter: No such file or directory

Thanks for your help.

I assume that the -vnThreads=$1 is an environment variable passed to your test script. But without seeing test.sh it's almost impossible to diagnose your problem.

I can say that with a simple batch script, your qsub command line generates an output file of the format:
<job name>.o<job number>

This is test.sh:

#!/bin/sh
#PBS N /home/dasayed
#PBS -j oe
export OMP_NUM_THREADS=16
./main

I can't get the .o file.
Thanks in advance

If this is an exact copy of your PBS script, then I believe the character in front of the "N" is the wrong one; please use a plain '-'. If you copied the script from the getting started PDF file, then you have picked up the incorrect dash character.

Others have had similar problems.

When I replace the character with '-', I get this in the command window:
19083.acaad01
but I still don't get the .o file.

So this means that PBS now runs your script.

I suspect that the problem now has to do with the program ./main. You might like to replace ./main with a fully qualified path, for starters.

I suspect that there is a problem running your program ./SequentialSort. Does it require any shared libraries, or do you source an environment? If so, you'll need to set up that same environment within the PBS script.

OK, I get it. Thanks.

Hi,
I get the same error
/bin/sh^M: bad interpreter: No such file or directory
in the job file
25053.acaad01

How did you solve it? Can anyone help, please?

Usually it means that you've created your script on a Windows machine (with Windows-specific line endings etc.). You have several ways to solve the issue:
1) If the dos2unix utility is available on the target Unix machine, use it: "dos2unix YOUR_SCRIPT"
2) Rewrite your script from the very beginning using, for example, nano or vim.
3) Search for similar tools on your PC, edit the script using them, and send it to the target machine.
etc.

Regards,
kikobyte
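
If dos2unix is not installed, the same effect can be approximated with standard tools. This is a minimal sketch, not a full replacement for dos2unix: the helper names has_cr and strip_cr are hypothetical, and strip_cr removes every carriage return in the file, not only those at line ends.

```shell
#!/bin/sh
# Hypothetical helpers for spotting and removing Windows line endings.

# Succeeds if the file contains at least one carriage return character.
has_cr() {
    grep -q "$(printf '\r')" "$1"
}

# Remove every carriage return. The cleaned text is written back into the
# original file (rather than moved over it) so that the file's permissions,
# such as the executable bit, are kept.
strip_cr() {
    tr -d '\r' < "$1" > "$1.nocr" && cat "$1.nocr" > "$1" && rm -f "$1.nocr"
}

# Usage (YOUR_SCRIPT is a placeholder):
#   has_cr YOUR_SCRIPT && strip_cr YOUR_SCRIPT
```

The "^M" in the error message is how the shell displays that trailing carriage return after /bin/sh, which is why the interpreter cannot be found.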
