I have a set of tasks 1 to N, where task i accesses data[i], and W workers, with W << N. Much of the data processed by certain tasks is the same, so those tasks must be executed on the same core for cache reuse. The data is allocated in the memory of specific NUMA nodes. Which tasks process similar chunks of data is only determined at runtime.
In the OpenMP version of this code, thread IDs are obtained via omp_get_thread_num(), memory is allocated via numa_alloc_onnode(size_t size, int node), and threads are bound to nodes via numa_run_on_node(int node). In Cilk++, I was not able to get the thread ID via either __cilkrts_get_worker_number() (always returns 0) or cilk::current_worker_id() (does not compile). Also, setting CILK_NPROC in bash does not affect the values returned by __cilkrts_get_nworkers() (always returns 8, the number of logical CPUs in my quad-core Intel system) or __cilkrts_get_total_workers() (always returns 23).