I have a problem when auto-parallelizing forall constructs that contain a sum() or spread() function. The problem with sum() occurs when a logical mask is used. For instance:
real, dimension(100,100) :: a
real, dimension(100,100,100) :: b,c
a(j,i) = sum(b(:,:,j)*c(:,:,i))
works fine and is parallelized. However, if I change the above example to something like
a(j,i) = sum(b(:,:,j)*c(:,:,i), mask=(d(:,:) > 0.))
i.e. add a logical mask, the forall construct cannot be parallelized. Is this a problem with the ifc compiler, or can masked summations simply not be parallelized?
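To make the two cases easy to reproduce, here is a minimal self-contained sketch. The program name, the size parameter n, the forall header over i and j, the declaration of d as a real 100x100 array, and the random test data are assumptions added for illustration; only the two assignment statements come from the snippets above.

```fortran
program masked_sum_forall
  implicit none
  integer, parameter :: n = 100          ! assumed size, matching the declarations above
  real, dimension(n,n)   :: a, d         ! d is assumed to be real, shape (n,n)
  real, dimension(n,n,n) :: b, c
  integer :: i, j

  ! Arbitrary test data (an assumption; any values would do).
  call random_number(b)
  call random_number(c)
  call random_number(d)
  d = d - 0.5                            ! shift so that part of the mask is .false.

  ! Unmasked case: the one reported to be parallelized.
  forall (i = 1:n, j = 1:n)
    a(j,i) = sum(b(:,:,j)*c(:,:,i))
  end forall
  print *, 'unmasked checksum:', sum(a)

  ! Masked case: the one reported not to be parallelized.
  forall (i = 1:n, j = 1:n)
    a(j,i) = sum(b(:,:,j)*c(:,:,i), mask=(d(:,:) > 0.))
  end forall
  print *, 'masked checksum:  ', sum(a)
end program masked_sum_forall
```

The two forall constructs differ only in the mask argument, so any difference in the parallelization report should come from the mask alone.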
Also, a forall construct that uses the spread() function, as in
real, dimension(100,100,100) :: a
real, dimension(100) :: b
a(i,:,:) = spread(b,dim=1,ncopies=100 )
does not parallelize. Still, these functions can be used within a forall construct without the compiler complaining, so shouldn't it be possible to run them in parallel?
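A similar minimal sketch for the spread() case follows. Again, the program name, the parameter n, the forall header over i, and the random contents of b are assumptions for illustration; the assignment itself is the one shown above.

```fortran
program spread_forall
  implicit none
  integer, parameter :: n = 100          ! assumed size, matching the declarations above
  real, dimension(n,n,n) :: a
  real, dimension(n)     :: b
  integer :: i

  ! Arbitrary test data (an assumption; any values would do).
  call random_number(b)

  ! spread(b, dim=1, ncopies=n) has shape (n,n) with element (j,k) equal to b(k),
  ! so after the forall every a(i,j,k) holds b(k).
  forall (i = 1:n)
    a(i,:,:) = spread(b, dim=1, ncopies=n)
  end forall

  ! Spot check: each printed pair should contain two identical values.
  print *, a(1,1,1), b(1)
  print *, a(n,n,n), b(n)
end program spread_forall
```

Here each iteration of the forall writes a separate slice a(i,:,:) and only reads b, so the iterations do not depend on one another.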
auto parallelization with sum and spread function