I'm coding in CnC and have a step which does an unbounded number of gets, and may block in any of them.
Is there any way to avoid aborting and restarting the step on a blocking get?
There are (at least) two options.
1. Depends: pre-declare the data dependencies in the depends method of your tuner. The step will not be executed until all declared dependencies are satisfied. See http://software.intel.com/sites/landingpage/icc/api/tuning.html for a more detailed explanation. Other examples in the samples directory using this feature are matrix_inverse/matrix_inverse, fib/fib_tuner, and rtm_stencil/halo.
2. Unsafe gets: use unsafe_get instead of get to get items, and put a call to context::flush_gets() before using any of the items. The runtime will call the step at most twice; the second call is triggered when all items are available. To further reduce scheduling overhead, you can provide a tuner with prescheduling enabled (see http://software.intel.com/sites/landingpage/icc/api/struct_cn_c_1_1step__tuner.html#5d503afad1d44b7660d29c27856e159f).
Note: neither solution works if your gets are data-dependent, e.g. if one get depends on the result of a previous get.
Does this help?
Thanks for your answer!
I'm afraid my gets are data-dependent, so I might have to try a different approach.
I'll investigate the options you mention anyway, as I might reformulate my current code to apply those.
Is your access pattern a single chain of dependent gets? unsafe_get/flush_gets can help if data items are used to get more than a single other item, or if there are several independent data chains.
Here's an example where unsafe_get/flush_gets can help:
    a  = get(tag)
    b  = get(a)
    c  = get(a+1)
    cc = get(a+2)
    d  = get(c)
    dd = get(cc)
    e  = get(c+1)
    ee = get(cc+1)
    f  = get(c+2)
    ff = get(cc+2)
    ...
Here we can limit the maximum number of re-schedules to the number of items actually needed to get other items, by calling flush_gets() when we use a "newly got" item for the first time (one can call flush_gets more than once!). In this example it would lead to at most 4 re-schedules (3 with pre-scheduling), down from 10. The actual number might in fact be much lower, depending on the concurrency in your application. The above example would look like this:
    a  = unsafe_get(tag)
    ctxt.flush_gets()   // don't pass this line unless 'a' is available
    b  = unsafe_get(a)
    c  = unsafe_get(a+1)
    cc = unsafe_get(a+2)
    ctxt.flush_gets()   // don't pass this line unless 'c' and 'cc' are available (as well as 'b')
    d  = unsafe_get(c)
    dd = unsafe_get(cc)
    e  = unsafe_get(c+1)
    ee = unsafe_get(cc+1)
    f  = unsafe_get(c+2)
    ff = unsafe_get(cc+2)
    ctxt.flush_gets()
    ...
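To make the counting a bit more concrete, here is a small self-contained C++ toy model of the two variants above. It does not use the CnC library: ToyContext, Abort, run_until_done, make_context and all item values are made up for illustration, and its get/unsafe_get/flush_gets only imitate the behaviour described in this thread. It assumes the worst case, where no item is available until the step has asked for it and been suspended.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Toy stand-ins only; none of this is the real CnC API.
struct Abort {};  // thrown to abandon the current execution of the step

struct ToyContext {
    std::map<int, int> world;      // values that producers will eventually put
    std::map<int, int> available;  // items that have been put so far
    std::vector<int> missing;      // keys requested but not yet available
    int result = 0;                // filled in by a step that ran to completion

    // Blocking-style get: abandon the step at the first missing item.
    int get(int key) {
        auto it = available.find(key);
        if (it == available.end()) { missing.push_back(key); throw Abort{}; }
        return it->second;
    }
    // Deferred get: remember the missing key and continue with a placeholder.
    int unsafe_get(int key) {
        auto it = available.find(key);
        if (it == available.end()) { missing.push_back(key); return 0; }
        return it->second;
    }
    // Abandon the step only if an earlier unsafe_get was not satisfied.
    void flush_gets() { if (!missing.empty()) throw Abort{}; }
};

// Hypothetical data set: tag 0 -> a; a, a+1, a+2 -> b, c, cc; and so on.
ToyContext make_context() {
    ToyContext c;
    c.world = {{0, 10}, {10, 1}, {11, 20}, {12, 30}, {20, 2},
               {21, 3}, {22, 4}, {30, 5}, {31, 6}, {32, 7}};
    return c;
}

void step_with_gets(ToyContext& c) {
    int a = c.get(0);
    int b = c.get(a), cv = c.get(a + 1), cc = c.get(a + 2);
    int d = c.get(cv), dd = c.get(cc);
    int e = c.get(cv + 1), ee = c.get(cc + 1);
    int f = c.get(cv + 2), ff = c.get(cc + 2);
    c.result = b + d + dd + e + ee + f + ff;
}

void step_with_unsafe_gets(ToyContext& c) {
    int a = c.unsafe_get(0);
    c.flush_gets();  // 'a' must be available before it is used as a tag
    int b = c.unsafe_get(a), cv = c.unsafe_get(a + 1), cc = c.unsafe_get(a + 2);
    c.flush_gets();  // 'b', 'c' and 'cc' must be available from here on
    int d = c.unsafe_get(cv), dd = c.unsafe_get(cc);
    int e = c.unsafe_get(cv + 1), ee = c.unsafe_get(cc + 1);
    int f = c.unsafe_get(cv + 2), ff = c.unsafe_get(cc + 2);
    c.flush_gets();  // all remaining items must be available before use
    c.result = b + d + dd + e + ee + f + ff;
}

// Re-runs the step until it completes; returns the number of executions.
// After each abort, the items the step was waiting for become available.
int run_until_done(ToyContext& ctx, void (*step)(ToyContext&)) {
    int executions = 0;
    for (;;) {
        ++executions;
        try {
            step(ctx);
            return executions;
        } catch (const Abort&) {
            for (int k : ctx.missing) ctx.available[k] = ctx.world.at(k);
            ctx.missing.clear();
        }
    }
}
```

In this toy model, run_until_done(ctx, step_with_gets) on a fresh make_context() executes the step body 11 times (10 aborted runs plus the final one), while step_with_unsafe_gets needs 4 executions (3 aborted runs plus the final one). The real runtime may count and schedule things differently, so treat these numbers as an illustration of the idea only.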
Hope this helps
PS: Your problem looks very interesting. Is it possible that you tell us a little more about your usage pattern and/or your application (potentially in a less public context)?