cilkifying

When you have a call to a function such as lgamma that reads like this, how does one cilkify it?

lg = lgamma(2.0 - ALPHA);

where lgamma is the function being called.

Is it like this?

lg = cilk_spawn lgamma(2.0 - ALPHA);

and secondly

Create_Thread(thread[i], P_TRA_RAY2, &WPP, 0);

I am unsure how to cilkify this one, or even whether I should. P_TRA_RAY2 is the function being called.

and thirdly

if (unfold2f(1, PSRF, GzSRC, ...

where unfold2f is the function being called.

Any help appreciated. Thanks in advance.

Newport_j


Hello,

Without knowing the design of your code, it is not possible to give specific advice on how to benefit from Intel Cilk Plus.
Instead, I'm going to answer your questions at a higher level. That reasoning should help you find the solution that applies to your case.

First, spawning tasks in Intel Cilk Plus is best suited to expensive computations that can run in parallel. This works quite naturally when the data each spawned task uses is independent of the other tasks (e.g. dedicated data structures or stack/thread-local data). If it is not, you can use reducers, which help you create task-independent data, at the cost of revising your code design, depending on how flexible it is.

With that in mind, let's assume "lgamma(...)" is an expensive computation that operates on task-independent data. Then, your proposal

lg = cilk_spawn lgamma(2.0 - ALPHA);

...is a good approach, provided that you really need the code after the call to "lgamma(...)" to continue in parallel with it. So, the following would be a meaningless (counter-)example:

lg = cilk_spawn lgamma(2.0 - ALPHA); // task A calls function; new task B spawned to continue
if (lg) { ... } else { ... } // task B depends on the result of task A, which runs in parallel
cilk_sync;

...as the newly spawned task (B) would immediately require the result of the first/original task (A), which is still computing. Also, keep in mind that in your example the variable "lg" is only valid once you have synchronized (cilk_sync)! So, in the counter-example, the condition reads "lg" before it is guaranteed to have been assigned, which is not valid.
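
For contrast, here is a minimal sketch of the pattern that does pay off: spawn the expensive call, do unrelated work in the continuation, and only read the result after cilk_sync. The names "costly", "independent_work" and "spawn_then_sync" are hypothetical stand-ins (they are not from your code), and the #else branch is only a serial fallback so the sketch also compiles without Cilk Plus support.

```c
/* Serial fallback: without Cilk Plus the keywords vanish and the code runs sequentially. */
#ifdef __cilk
#include <cilk/cilk.h>
#else
#define cilk_spawn
#define cilk_sync
#endif

/* Hypothetical stand-in for a costly call such as lgamma(2.0 - ALPHA). */
static double costly(double x)
{
    double acc = 0.0;
    for (int i = 0; i < 1000; ++i)
        acc += x;
    return acc;
}

/* Hypothetical independent work that never reads the spawned result. */
static long independent_work(int n)
{
    long sum = 0;
    for (int i = 1; i <= n; ++i)
        sum += i;
    return sum;
}

double spawn_then_sync(double x, int n)
{
    double lg;
    lg = cilk_spawn costly(x);     /* child may run in parallel with what follows */
    long s = independent_work(n);  /* continuation: must not touch lg */
    cilk_sync;                     /* lg is valid from here on */
    return lg + (double)s;
}
```

The spawn only helps if the continuation has real work to do before the sync; if there is nothing to overlap with, a plain call is just as good.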

The same reasoning also answers your third question.
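
For a call whose result feeds a condition, a rough sketch could look like the following. As far as I know, cilk_spawn is only accepted as a whole statement or as the right-hand side of an assignment or initializer, so it cannot sit directly inside the "if (...)"; the result goes into a variable first and is tested after the sync. "check", "prepare" and "branch_on_spawned_result" are hypothetical names, with "check" standing in for a call like unfold2f(...), whose real arguments are not shown in your post.

```c
/* Serial fallback: without Cilk Plus the keywords vanish and the code runs sequentially. */
#ifdef __cilk
#include <cilk/cilk.h>
#else
#define cilk_spawn
#define cilk_sync
#endif

/* Hypothetical stand-in for a call such as unfold2f(...); nonzero means "take the branch". */
static int check(int value)
{
    return value > 0;
}

/* Hypothetical independent work done while check() may still be running. */
static int prepare(int n)
{
    return n * 2;
}

int branch_on_spawned_result(int value, int n)
{
    int ok;
    ok = cilk_spawn check(value);  /* spawn into a variable, not inside the if */
    int prepared = prepare(n);     /* independent of ok */
    cilk_sync;                     /* ok is valid from here on */
    if (ok)
        return prepared;
    return -1;
}
```

If there is nothing useful to do between the call and the "if", leave the call as it is.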

Regarding your second question:
I'm not sure what you're creating this thread for.
If it's a worker thread that runs for most of the lifetime of your application, it won't make sense to force it into Intel Cilk Plus, as the benefit is very low. You can still do so if you create a function that spawns two tasks (and only returns/synchronizes at the end of the application), one for the worker and one for the "main" task of your application.
Intel Cilk Plus pays off most, however, when you need to spawn many tasks that each run for a (relatively!) short time. So, you might want to focus first on the parts of your code where the heavy computations are made and revise them so that the work is distributed across multiple tasks.
Assuming "P_TRA_RAY2" is such a compute function (kernel), you might simply call it directly and prepend "cilk_spawn", if applicable.
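
As a sketch of that last case, assuming the thread function really is a short compute kernel: the loop that created one thread per index becomes a loop of spawns followed by a single sync. Everything named here is hypothetical ("ray_job", "trace_ray", "trace_all"); "trace_ray" merely stands in for a kernel like P_TRA_RAY2, and I'm assuming each task can be given its own data slot, which may not match how WPP is used in your code.

```c
/* Serial fallback: without Cilk Plus the keywords vanish and the code runs sequentially. */
#ifdef __cilk
#include <cilk/cilk.h>
#else
#define cilk_spawn
#define cilk_sync
#endif

/* Hypothetical per-task data; the real WPP structure is not shown in the thread. */
struct ray_job {
    int index;
    double result;
};

/* Hypothetical stand-in for a compute kernel such as P_TRA_RAY2. */
static void trace_ray(struct ray_job *job)
{
    job->result = (double)job->index * 2.0;
}

void trace_all(struct ray_job jobs[], int count)
{
    for (int i = 0; i < count; ++i) {
        jobs[i].index = i;
        cilk_spawn trace_ray(&jobs[i]);  /* each task writes only its own slot */
    }
    cilk_sync;  /* wait for all spawned tasks before reading any result */
}
```

If all tasks have to share one structure instead, that is where reducers or a revised data layout come in.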

Finally, you do not need to convert all threads in your application to tasks in order to use Intel Cilk Plus. You can mix it with other threading/tasking models; this works well with native (Windows*/POSIX*) threads, Intel Threading Building Blocks and OpenMP*, as well as with Intel Math Kernel Library and Intel Integrated Performance Primitives.

That is, in short, a high-level answer to your questions.

Best regards,

Georg Zitzlsberger
