I am curious what exactly the compiler does beyond translating keywords into calls to the run-time library. What intelligence and heuristics does it have? Your STM compiler does a lot of work analyzing function bodies, generating function wrappers, and so on. But what does the Parallelism Exploration Compiler do?
The example I see in the documentation:
void f_sum ( int length, int *a, int *b, int *c ){
    int i;
    __par for (i=0; i<length; i++) {
        c[i] = a[i] + b[i];
    }
}
suggests that the compiler determines the right granularity for parallelization. Is that the case? If so, it is really cool! It is a big step forward.
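To make clear what I mean by granularity: if I parallelized this loop by hand, I would have to decide how many iterations go into one unit of work, roughly as in the sketch below. The names sum_chunk, f_sum_chunked and grain are my own and do not come from the documentation. The chunks run sequentially here just to show the splitting; I assume a real run-time would hand each chunk to a worker thread.

```c
#include <assert.h>

/* Hypothetical unit of work: one contiguous chunk [begin, end) of the loop. */
static void sum_chunk(int begin, int end, const int *a, const int *b, int *c) {
    for (int i = begin; i < end; i++) {
        c[i] = a[i] + b[i];
    }
}

/* Splits the iteration space into chunks of at most `grain` iterations
   and returns the number of chunks produced. `grain` is the granularity
   that the programmer (or, hopefully, the compiler) has to choose.
   Chunks are executed one after another in this sketch. */
static int f_sum_chunked(int length, const int *a, const int *b, int *c, int grain) {
    assert(grain > 0);
    int chunks = 0;
    for (int begin = 0; begin < length; ) {
        /* The last chunk may be shorter than `grain`. */
        int end = (length - begin < grain) ? length : begin + grain;
        sum_chunk(begin, end, a, b, c);
        chunks++;
        begin = end;
    }
    return chunks;
}
```

With a small grain there are many tiny chunks and the scheduling overhead dominates; with a large grain there are too few chunks to keep all cores busy. If the compiler picks this value for me, that is exactly the part I would like to understand.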

