For a statistical application, I run a set of bootstrap replications: completely independent ("embarrassingly parallel") iterations of a computation procedure using random weights. Using coarrays with shared memory (-coarray=shared), I divide these iterations over the images, each of which carries out a do-loop of iterations. Finally, I merge the statistics computed on all images into summary statistics.

This works fine for a small number of replications; my program finishes without errors. If I increase the number of replications, the program stops with the following error (the rank and signal can vary):

rank 8 in job 1 ### (deleted host name) caused collective abort of all ranks
exit status of rank 8: killed by signal 7

Intuitively, I would suspect a memory problem, but none of the following helped: increasing the stack space with ulimit -s unlimited (as well as setting a large number like 999999999); increasing the stack size via export OMP_STACKSIZE=32g (or does that apply only to OpenMP?); and putting automatic arrays and arrays created for temporary computations on the heap instead of the stack (-heap-arrays).

Could this somehow be an MPI error? It is also strange that the images run their do-loops at very different speeds. In one example, one image managed almost twice as many iterations as another before the program aborted, even though there is no mathematical reason for the speeds to differ. The image causing the abort (it is [rank]+1, correct, since the rank numbering starts at 0?) ran at average speed.

Are there any obvious error sources or known issues that could apply here? Unfortunately, I'm not at liberty to post our code, but I will try to supply as much information as needed to work this out. Some of the coarrays I use are allocatable, and I read about a possible memory leak with allocatable components of derived types: http://objectmix.com/fortran/243080-allocatable-components-derived-type.... Could this be an issue here?

Thanks!

Intel 12.3.174 on Unix, shared memory
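Since I can't post the real code, here is a minimal sketch of the structure I described (all names are made up, the real replication computes statistics from random weights, and for simplicity this assumes the replication count divides evenly over the images):

```fortran
program bootstrap_coarray
  implicit none
  integer, parameter :: n_total = 10000      ! total bootstrap replications
  integer :: n_local, i, img
  real :: local_sum[*]                       ! coarray: one copy per image
  real :: grand_total

  ! divide the replications evenly over the images
  n_local = n_total / num_images()
  local_sum = 0.0

  ! each image runs its share of independent replications
  do i = 1, n_local
     local_sum = local_sum + one_replication()   ! uses random weights
  end do

  sync all    ! wait until every image has finished its loop

  ! image 1 merges the per-image statistics into a summary
  if (this_image() == 1) then
     grand_total = 0.0
     do img = 1, num_images()
        grand_total = grand_total + local_sum[img]
     end do
     print *, 'mean over replications:', grand_total / n_total
  end if

contains

  real function one_replication()
     ! placeholder for the actual weighted computation
     call random_number(one_replication)
  end function one_replication

end program bootstrap_coarray
```

The real program additionally uses allocatable coarrays and larger temporary arrays inside the loop body, which is where I suspect the memory trouble may come from.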