I would like to be able to detect when a worker has stolen work without using a holder. Ideally, I would like to add a "hook" into the runtime system so that a function I define is called upon each successful steal, or to have access to a worker-local "successful steal" counter that is incremented each time a worker executes a successful steal. Is either of these possible using documented or undocumented features of the Cilk runtime system in gcc or icc?
I also have a higher-level question about the cost of hypermap lookups. Why are they so expensive? I have a benchmark using a single reducer in which __cilkrts_hyper_lookup accounts for ~35% of the run time with Cilk in gcc 4.9 and ~25% of the run time with icc 13.1.1.