Loop vectorization / Complicated array access

Hi,
I have some issues with loop vectorization:

module global
    implicit none
   
    type type_A
        real(kind=4), allocatable, dimension(:) :: val
    end type type_A
 
end module global
program test
    use global
    implicit none
    type(type_A), target, allocatable, dimension(:) :: A
    type(type_A), target, allocatable               :: AA
    real(kind=4), pointer, dimension(:) :: ptr
    integer :: i
   
    !---
    allocate(AA)
    allocate(AA%val(10000))
    AA%val = 1.0
    
    ptr => AA%val
  
    do i = 1, 100
        !
        ptr(i) = exp(-ptr(i) + 1.0)
        !
    end do
    !---
    
    !---
    allocate(A(1))
    allocate(A(1)%val(10000))
    A(1)%val = 1.0
    
    ptr => A(1)%val
   
    do i = 1, 100
        !
        ptr(i) = exp(-ptr(i) + 1.0)
        !
    end do
    !---    
    
    write(*,*) ptr(500)
end program test

Compiled with /Qvec-report3, it produces:

1>main.f90(20): (col. 5) remark: loop was not vectorized: unsupported loop structure.
1>main.f90(24): (col. 5) remark: LOOP WAS VECTORIZED.
1>main.f90(32): (col. 5) remark: loop was not vectorized: unsupported loop structure.
1>main.f90(34): (col. 5) remark: loop was not vectorized: unsupported loop structure.
1>main.f90(38): (col. 5) remark: loop was not vectorized: existence of vector dependence.
1>main.f90(40): (col. 9) remark: vector dependence: assumed FLOW dependence between (unknown) line 40 and (unknown) line 40.
1>main.f90(40): (col. 18) remark: vector dependence: assumed ANTI dependence between (unknown) line 40 and (unknown) line 40.
1>main.f90(40): (col. 9) remark: vector dependence: assumed FLOW dependence between (unknown) line 40 and (unknown) line 40.
1>main.f90(40): (col. 18) remark: vector dependence: assumed ANTI dependence between (unknown) line 40 and (unknown) line 40.
1>main.f90(40): (col. 18) remark: vector dependence: assumed ANTI dependence between (unknown) line 40 and (unknown) line 40.
1>main.f90(40): (col. 9) remark: vector dependence: assumed FLOW dependence between (unknown) line 40 and (unknown) line 40.
1>main.f90(40): (col. 18) remark: vector dependence: assumed ANTI dependence between (unknown) line 40 and (unknown) line 40.
1>main.f90(40): (col. 9) remark: vector dependence: assumed FLOW dependence between (unknown) line 40 and (unknown) line 40.

The only way to get the second loop vectorized seems to be adding the !dir$ ivdep directive before it.
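For reference, here is a minimal sketch of that workaround applied to the second loop. It reuses the type and variable names from the code above; the program name ivdep_sketch is only illustrative. The directive is Intel-specific, and it is only safe when the loop really has no loop-carried dependence, which is the case here because each iteration reads and writes the same element.

```fortran
module global
    implicit none

    type type_A
        real(kind=4), allocatable, dimension(:) :: val
    end type type_A

end module global

program ivdep_sketch   ! illustrative name, not from the original post
    use global
    implicit none
    type(type_A), target, allocatable, dimension(:) :: A
    real(kind=4), pointer, dimension(:) :: ptr
    integer :: i

    allocate(A(1))
    allocate(A(1)%val(10000))
    A(1)%val = 1.0

    ptr => A(1)%val

    ! Intel-specific directive: asks the compiler to ignore assumed
    ! (unproven) dependences in the loop that immediately follows.
    ! Compilers that do not know it treat the line as a plain comment.
    !dir$ ivdep
    do i = 1, 100
        ptr(i) = exp(-ptr(i) + 1.0)
    end do

    write(*,*) ptr(1), ptr(500)
end program ivdep_sketch
```

The drawback is that the directive has to be repeated in front of every such loop.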
From an old post by Steve (Mon, 02/06/2006 - 18:33):

Quote:

Steve Lionel (Intel) wrote:

[...]
The compiler does not try to vectorize loops where the array access is complicated.[...]
It is a fact that arrays that are components of derived types, especially in conjunction with pointer or allocatable, complicate life for the compiler and as such some optimization opportunities may be missed.
[...]

My understanding is that my issue is related to the complicated array access. Is there a way to make this clear to the compiler without using the vectorization directive on every loop in the code?
Cheers,

Nick

Best Reply

The compiler has changed a lot since 2006, and processors have changed to include new instructions that can help with vectorization. For example, I tried your code with the 14.0 compiler and got this:

C:\Projects\U480019.f90(20): (col. 5) remark: LOOP WAS VECTORIZED
C:\Projects\U480019.f90(24): (col. 5) remark: LOOP WAS VECTORIZED
C:\Projects\U480019.f90(34): (col. 5) remark: LOOP WAS VECTORIZED
C:\Projects\U480019.f90(38): (col. 5) remark: LOOP WAS VECTORIZED
C:\Projects\U480019.f90(32): (col. 5) remark: loop was not vectorized: nonstandard loop is not a vectorization candidate

Looks pretty good to me.

Steve - Intel Developer Support

Indeed, I failed to realize how many versions have been released since my XE 2011 (12.1.3526.2010) (February 2012).
My bad luck strikes again.

Nick

A lot has changed in Fortran!
I don't understand the need for such a complex data structure. Each of the 4 effective loop structures in the following code vectorises, without resorting to the more complex data structures of the original post. I really don't know what can be achieved by your coding approach.
My suggestion is KISS: keep it simple.

module global
    implicit none
    real(kind=4), allocatable, dimension(:) :: A_val, AA_val
end module global
program test
    use global
    implicit none
    integer :: i
    !---
    allocate (AA_val(10000))
    AA_val = 0.5
    AA_val(1:100) = exp(-AA_val(1:100) + 1.0)
    write(*,*) AA_val(100), AA_val(500)
    !---
    allocate (A_val(10000))
    A_val = 1.0
    do i = 1, 100
        A_val(i) = exp(-A_val(i) + 1.0)
    end do
    !---
    write(*,*) A_val(500)
end program test

Well,
Indeed, I failed to realize how many versions of the compiler had been released since my XE 2011 12.1.3526.2010 (February 2012), but this has nothing to do with changes in Fortran.
The more complex data structure of the original post is there precisely to simplify the coding: the pointer approach makes the derived data type (and the number of objects, a runtime parameter) transparent to the developers and to the algorithm.
Unfortunately, not all codes are equal when it comes to their requirements.

Nick

John,

NsK produced a small sample code that exhibited his issue. In his case he had an array of arrays. This type of structure can be used for sparse arrays, among other things. Use of pointers can sometimes cause optimization issues due to the possibility of aliasing and non-unit stride. If NsK's compiler is new enough to have ASSOCIATE, he might try that instead of a pointer.
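A minimal sketch of what that could look like, reusing the type from the original post (the program name associate_sketch and the associate name v are only illustrative). Whether a given compiler version actually vectorizes this loop is an assumption to check with the vectorization report, not something this sketch guarantees.

```fortran
module global
    implicit none

    type type_A
        real(kind=4), allocatable, dimension(:) :: val
    end type type_A

end module global

program associate_sketch   ! illustrative name, not from the original post
    use global
    implicit none
    type(type_A), allocatable, dimension(:) :: A
    integer :: i

    allocate(A(1))
    allocate(A(1)%val(10000))
    A(1)%val = 1.0

    ! ASSOCIATE (Fortran 2003) gives A(1)%val a short local name.
    ! Unlike a pointer, v cannot be re-pointed inside the construct,
    ! and A no longer needs the TARGET attribute.
    associate (v => A(1)%val)
        do i = 1, 100
            v(i) = exp(-v(i) + 1.0)
        end do
    end associate

    write(*,*) A(1)%val(1), A(1)%val(500)
end program associate_sketch
```

The loop body stays as short as in the pointer version, so the derived type is still hidden from the algorithm.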

Jim Dempsey

www.quickthreadprogramming.com
