Is there any efficient prefix scan library for Cilk Plus accessible from C?
I was not able to find any and my implementation can hardly compete with the sequential version :-)
An interface similar to the reducers will work nicely.
http://parallelbook.com/sites/parallelbook.com/files/code20131121.zip has the best implementation of prefix-scan in Cilk that I've been able to write. See the files in code/src/scan/cilkplus . The code is in C++, so you could instantiate it in a C++ object file and call that from C. Whether it beats the sequential code will depend on circumstances.
It is difficult to compete with the sequential version for small core counts, because prefix-scan is inherently a 2-pass algorithm, and thus requires twice the bandwidth of the sequential version. So for small core counts, it's not going to help. With high core counts, it can sometimes come out ahead. On an Intel(R) Xeon Phi(TM), I've seen speedups as high as 64x over sequential code, and even higher by using intrinsics to vectorize subsidiary prefix-scans over chunks.
The scan code that Arch described is also now included as part of Cilkpub.
The C++ file to include is "scan.h".
I also created a short C header file (scan_c.h) which is my initial attempt at creating macros for a C version. There is one macro for declaring a scan function that operates on an array of the specified type, and one macro for invoking the scan function. There is a sample C file using scan_c.h in samples/csample_scan.c that shows how to use the macros. I haven't tried out the interface on more complicated examples yet, but perhaps it or something similar might work for your problem?
Thank you Jim and Arch.
I am very excited to be part of such an active and responsive community.
I will return with my impressions, after I give it a try.