Hi everyone,
I have a short question about Intel Xeon Phi coprocessors: do they suffer performance penalties when the execution paths of
different threads diverge?
For example, on GPUs the SIMD (SIMT) execution model imposes heavy performance penalties on kernels with control flow where threads
in the same warp follow different paths of execution. As you know, the hardware executes these divergent paths one after another.
Is this issue also present on Intel Xeon Phi coprocessors? I work mostly on Monte Carlo (MC) simulation of particle transport in matter, where the particle histories diverge very quickly. As a first step I want to parallelize the MC codes using OpenMP. Can I run the codes directly on a Xeon Phi coprocessor, or will I see performance penalties because each particle follows a different path of execution?
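To make the question concrete, here is a minimal sketch of the kind of loop I mean. It is a toy 1-D slab model, not my real code: the function names (`next_uniform`, `run_history`, `count_transmitted`), the xorshift generator, the uniform step length and the seeding scheme are all made up for illustration. The point is only that each history takes a data-dependent number of steps and branches, and that one OpenMP thread handles one history at a time.

```c
#include <stdint.h>

/* Hypothetical per-history generator (xorshift64); returns a value in [0, 1). */
static double next_uniform(uint64_t *state) {
    uint64_t x = *state;
    x ^= x << 13;
    x ^= x >> 7;
    x ^= x << 17;
    *state = x;
    return (double)(x >> 11) / 9007199254740992.0; /* divide by 2^53 */
}

/* Follow one particle through a slab of the given thickness.
   Returns 1 if it leaves through the far side, 0 if it is absorbed
   or backscattered. The step length is a toy uniform sample, an
   assumption standing in for a real free-path distribution. */
static int run_history(uint64_t seed, double thickness, double absorb_prob) {
    uint64_t state = seed ? seed : 1; /* xorshift state must be nonzero */
    double x = 0.0, dir = 1.0;
    for (;;) {
        x += dir * next_uniform(&state);           /* move to next collision */
        if (x >= thickness) return 1;              /* transmitted */
        if (x < 0.0) return 0;                     /* backscattered */
        if (next_uniform(&state) < absorb_prob) return 0; /* absorbed */
        dir = (next_uniform(&state) < 0.5) ? -1.0 : 1.0;  /* scatter */
    }
}

/* One history per loop iteration; every thread follows its own control flow. */
long count_transmitted(long n_histories, double thickness, double absorb_prob) {
    long transmitted = 0;
    #pragma omp parallel for reduction(+:transmitted) schedule(dynamic, 1024)
    for (long i = 0; i < n_histories; ++i) {
        /* Odd multiplier gives every history a distinct nonzero seed. */
        uint64_t seed = 0x9E3779B97F4A7C15ULL * (uint64_t)(i + 1);
        transmitted += run_history(seed, thickness, absorb_prob);
    }
    return transmitted;
}
```

I picked `schedule(dynamic, 1024)` only because the histories have very different lengths; the chunk size is a guess. Compiled without OpenMP support the pragma is ignored and the loop runs serially with the same result.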