Cilkplus on MTL

I am trying to run a simple Cilk Plus program on MTL. The program runs both a serial (non-threaded) and a parallel (using cilk_spawn) version of the same code and reports the timing results for both versions.

I can compile it and run it on the login node, but it shows no speedup in the parallel version because it does not have access to multiple CPUs.

When I try to submit the job using qsub (hoping to get access to multiple cores), I get the following output file:

-----

Warning: no access to tty (Bad file descriptor).
Thus no job control in this shell.
MANPATH: Undefined variable.
/home/knag/knag-s01/01/code/sol/stocks: error while loading shared libraries: libcilkrts.so.5: cannot open shared object file: No such file or directory

------

I'd like to fix the first two errors (tty & MANPATH), but I am more concerned about the third. How can I let whatever core is running my job know where libcilkrts.so is?

I can update my LD_LIBRARY_PATH to point to the right place (adding /opt/cilk/lib64), but this does not seem to help. (When *I* run the program, it already knows where the library is and does not give me this error in the first place.) I'm not sure how to fix the problem when the job is submitted via qsub.

Any help much appreciated. Bonus points if you know what to do about the tty and MANPATH warnings. :)

Thanks!


You will need to include the source command in your qsub script, because the batch system can't find the libraries without it, i.e.
$ source /opt/intel/bin/compilervars.sh intel64
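For anyone landing here later, a job script along these lines is roughly what the reply describes. It is only a sketch: the job name, the ./stocks program name and the choice of bash as the job shell are assumptions, and the compilervars.sh path is simply the one quoted above. Asking PBS for bash with -S may also avoid the tty and MANPATH messages, which look like they come from csh startup files, but that part is a guess.

```shell
#!/bin/bash
# Hypothetical job script, e.g. saved as run_stocks.sh and submitted with:
#   qsub run_stocks.sh
#PBS -N stocks
#PBS -S /bin/bash
#PBS -l select=1:ncpus=16

# Set up the Intel compiler environment so the loader can find libcilkrts.so.5.
# The path is the one quoted in the reply; it may differ on other systems.
ENV_SCRIPT=/opt/intel/bin/compilervars.sh
if [ -f "$ENV_SCRIPT" ]; then
    . "$ENV_SCRIPT" intel64
else
    echo "warning: $ENV_SCRIPT not found, library path not set" >&2
fi

# PBS starts jobs in the home directory; go back to where qsub was run.
cd "${PBS_O_WORKDIR:-.}"

# Placeholder program name; replace with the real binary.
PROGRAM=./stocks
if [ -x "$PROGRAM" ]; then
    # Show which Cilk runtime library the loader resolves, then run.
    ldd "$PROGRAM" | grep cilkrts
    "$PROGRAM"
else
    echo "program $PROGRAM not found" >&2
fi
```

The ldd line is just a sanity check: if it prints "not found" next to libcilkrts.so.5, the environment script did not take effect inside the job.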

Great. Now it runs!

However, I'm still not seeing any speedup on the parallel part of the code, so I think I'm still not being given access to multiple cores when I spawn cilk threads. The code itself is recursive and the number of spawns will simply depend on the input size, so I'm guessing I should ask for 40 CPUs and then play around with it a little.

What is the correct way to make sure my job will be able to spawn threads on separate CPUs?
qsub -l select=1:ncpus=16 ... doesn't seem to be doing it...

Many thanks in advance.

You have access to all the cores/threads on the MTL. The qsub command with ncpus does not strictly limit the number of threads or cores you have access to. It merely tells the batch system that you wish to use ncpus; if you use more or fewer, the batch system doesn't care.

I would suggest that you test your code on the login node to ensure that it uses all the cores/threads you expect. There is no reasonable ceiling in place on the number of threads you can use.
If your code is recursive, it may mean that you're still using only a single thread. Why not try a non-recursive version of your code to see whether it performs truly parallel computation?
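One way to run the check suggested above is to vary the number of Cilk workers and compare the timings the program reports. This is a rough sketch: ./stocks is a placeholder for the real binary, the worker counts are arbitrary, and it assumes the program is built against the Intel Cilk Plus runtime, which reads the CILK_NWORKERS environment variable.

```shell
#!/bin/bash
# Number of logical CPUs the system reports as online.
CORES=$(getconf _NPROCESSORS_ONLN)
echo "logical CPUs online: $CORES"

# Placeholder program name; replace with the real binary.
PROGRAM=./stocks

# Run with increasing worker counts (values chosen arbitrarily).
for WORKERS in 1 2 4 8; do
    if [ -x "$PROGRAM" ]; then
        echo "--- CILK_NWORKERS=$WORKERS ---"
        CILK_NWORKERS=$WORKERS "$PROGRAM"
    fi
done
```

If the parallel timing does not change between 1 worker and several, the spawned work is probably too small or too shallow to be worth distributing, rather than the cores being unavailable.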

thank you, thank you!
this is super helpful.

(In fact, I just wasn't recursing deep enough to overcome the cost of thread spawning. Thanks for letting me know how the thread allocation works so that I could figure this out. It's much easier to troubleshoot with the correct big picture of what's going on.)

Anne
