Colleagues:

I'm upgrading some legacy engineering code to take advantage of Coarray Fortran. The routine I've started with has only one coarray. It runs correctly with eight images and produces the same values as before. I can see the expected speedup in all the work done until, at the very end, I need to sum across the coarray that each image has contributed to.

The array is roughly 3000 x 400, with 8 images. The work in each image that manipulates its local copy of the array takes < 1 sec. But the summation performed at the very end of the code (on the 1st image only) takes 10 seconds.

    sync all
    if( this_image() == 1 ) then
       do I = 2,num_images()
          flux = flux + flux[I]
       end do
    end if

Is coarray summing that inefficient? I reserve the right to be doing this improperly! It is actually faster (by an order of magnitude) to have each image write out its local copy of the array and have the 1st image read the files into local array(s) and perform the sum.

Any suggestions?
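For anyone who wants to reproduce or time this, here is a minimal, self-contained sketch of the pattern described above. The array sizes come from the post; the program name, the `total` array, and the fill value are hypothetical stand-ins for the real code. It also calls the Fortran 2018 collective `co_sum` as a second variant. That variant assumes the compiler and coarray runtime support collectives, and whether it is any faster in this case is untested.

```fortran
program coarray_sum_sketch
  implicit none
  ! Sizes taken from the post; all names below are hypothetical.
  integer, parameter :: n1 = 3000, n2 = 400
  real, allocatable :: flux(:,:)[:]   ! coarray each image contributes to
  real, allocatable :: total(:,:)     ! ordinary local array for variant 1
  integer :: i

  allocate(flux(n1, n2)[*])
  allocate(total(n1, n2))

  ! Stand-in for the real per-image work.
  flux = real(this_image())

  sync all

  ! Variant 1: the loop from the post, accumulating into a local
  ! (non-coarray) array and fetching whole-array sections explicitly.
  if (this_image() == 1) then
    total = flux
    do i = 2, num_images()
      total = total + flux(:,:)[i]
    end do
  end if

  ! Make sure image 1 has finished reading remote data before the
  ! collective below is allowed to overwrite flux on the other images.
  sync all

  ! Variant 2: Fortran 2018 collective sum; the result lands in flux
  ! on image 1 (flux becomes undefined on the other images).
  call co_sum(flux, result_image=1)

  if (this_image() == 1) then
    ! Self-check: both variants should agree.
    if (any(abs(total - flux) > 1.0e-3)) error stop "sums differ"
    print *, "flux(1,1) after reduction =", flux(1,1)
  end if
end program coarray_sum_sketch
```

Wrapping each variant in `system_clock` calls would show where the 10 seconds actually goes. The build flags depend on the compiler; for a shared-memory Intel build this is typically a `-coarray` style option, but check the documentation for the version in use.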