16 byte alignment of stack variables

hello,

i am compiling Fortran source code for use on Pentium 4 CPUs with SSE2. to improve speed, all my double precision arrays are 16-byte aligned. to improve speed further, i have set the
!DIR$VECTOR ALIGNED
directive to tell the compiler which arrays are aligned.

the following example fails, however, because the DOUBLE PRECISION variable FAC1 is not aligned on a 16-byte boundary.

!DIR$VECTOR ALIGNED
      DO I=1,N
        E1(I,J)=E1(I,J)-FJAC(I,J)/FAC1
      END DO

FAC1 is an argument of a subroutine. to force 16-byte alignment i used the compiler options

/Qsfalign16
/Zp16

however, FAC1 is still not aligned on a 16-byte boundary. what am i doing wrong, or can somebody tell me what to do?

regards,
hans

here is the subroutine:

      SUBROUTINE TEST(N,FJAC,LDJAC,M1,M2,NM1,FAC1,E1,LDE1,IP1,IER,IJOB)
      IMPLICIT REAL*8 (A-H,O-Z)
      DIMENSION FJAC(LDJAC,N),E1(LDE1,NM1),IP1(NM1)
C
      DO J=1,NM1
        JM1=J+M1
        DO I=1,NM1
          E1(I,J)=-FJAC(I,J)
        END DO
        E1(J,J)=E1(J,J)+FAC1
      END DO
      DO J=1,M2
!DIR$VECTOR ALIGNED
        DO I=1,NM1
          E1(I,J)=E1(I,J)-(FJAC(I,J))/FAC1
        END DO
      END DO
      RETURN
      END

You can't force alignment of a dummy argument. You have to attack this in the caller where the associated variable is declared.
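
A minimal sketch of what handling this in the caller could look like. It assumes the Intel-specific `!DIR$ ATTRIBUTES ALIGN` directive, which is not mentioned in this thread, and the non-standard `LOC` intrinsic to inspect the address; `SHOW_ALIGNMENT` is a hypothetical helper, and the source is written in free form. Whether the directive is honored for a local scalar may depend on the compiler version, so the printed offset is the thing to check.

```fortran
! Sketch only. !DIR$ lines are Intel Fortran directives; other compilers
! treat them as comments, in which case no alignment is requested.
PROGRAM CALLER
  IMPLICIT NONE
  DOUBLE PRECISION :: FAC1
  ! Assumption: request 16-byte alignment where FAC1 is actually declared.
  !DIR$ ATTRIBUTES ALIGN: 16 :: FAC1
  FAC1 = 2.0D0
  CALL SHOW_ALIGNMENT(FAC1)
END PROGRAM CALLER

! Hypothetical helper: reports how far the actual argument is from a
! 16-byte boundary. The dummy argument simply inherits the address of
! whatever the caller passed in.
SUBROUTINE SHOW_ALIGNMENT(FAC1)
  USE, INTRINSIC :: ISO_C_BINDING, ONLY: C_INTPTR_T
  IMPLICIT NONE
  DOUBLE PRECISION, INTENT(IN) :: FAC1
  INTEGER(C_INTPTR_T) :: ADDR
  ! LOC is a non-standard extension that returns the address of its argument.
  ADDR = LOC(FAC1)
  ! An offset of 0 means FAC1 sits on a 16-byte boundary.
  PRINT *, 'offset of FAC1 from a 16-byte boundary:', MOD(ADDR, 16_C_INTPTR_T)
END SUBROUTINE SHOW_ALIGNMENT
```

The same check can be dropped temporarily into the real subroutine to confirm what the caller is passing.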

Steve

Steve - Intel Developer Support

steve,

thank you very much for your quick reply.

i am not 100% sure what you mean by dummy arguments. could you please explain this?

so if i understand you correctly, if i have a subroutine, let's say

SUBROUTINE FUNC(A,B,C,D)
code here...
RETURN
END

where A, B, C and D are, for example, double precision scalars, then when i call this subroutine with, let's say

CALL FUNC(PARAM1,PARAM2,PARAM3,PARAM4)

i have to ensure that PARAM1 to PARAM4 are already 16-byte aligned? am i correct?

what happens if i call this function with

CALL FUNC(1.0D+0, 1.0D+0, 1.0D+0, 1.0D+0)

?

regards,
hans

i have to ensure that PARAM1 to PARAM4 are already 16-byte aligned? am i correct?

Yes.

As for constants - hmm, I'm not sure how you can force those to be aligned. There is a switch /Qsfalign16 that supposedly aligns the stack on 16-byte boundaries for functions, but I don't see how that would interact with arguments. You may want to consider putting those constants in variables that are then 16-byte aligned.
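
A rough sketch of the "constants in variables" idea, under the assumption that the Intel-specific `!DIR$ ATTRIBUTES ALIGN` directive (not mentioned above) is used in the caller. The names `ONE1` to `ONE4` are made up for illustration, `FUNC` has a placeholder body, and the source is free form.

```fortran
! Sketch only. !DIR$ lines are Intel Fortran directives; other compilers
! treat them as comments, in which case no alignment is requested.
PROGRAM CALL_WITH_CONSTANTS
  IMPLICIT NONE
  ! Hypothetical variables that hold the former literal constants.
  DOUBLE PRECISION :: ONE1, ONE2, ONE3, ONE4
  ! Assumption: ask for each variable to start on a 16-byte boundary.
  !DIR$ ATTRIBUTES ALIGN: 16 :: ONE1
  !DIR$ ATTRIBUTES ALIGN: 16 :: ONE2
  !DIR$ ATTRIBUTES ALIGN: 16 :: ONE3
  !DIR$ ATTRIBUTES ALIGN: 16 :: ONE4
  ONE1 = 1.0D+0
  ONE2 = 1.0D+0
  ONE3 = 1.0D+0
  ONE4 = 1.0D+0
  ! Pass the variables instead of CALL FUNC(1.0D+0, 1.0D+0, 1.0D+0, 1.0D+0).
  CALL FUNC(ONE1, ONE2, ONE3, ONE4)
END PROGRAM CALL_WITH_CONSTANTS

! Placeholder body standing in for the real routine.
SUBROUTINE FUNC(A, B, C, D)
  IMPLICIT NONE
  DOUBLE PRECISION, INTENT(IN) :: A, B, C, D
  PRINT *, 'sum of arguments:', A + B + C + D
END SUBROUTINE FUNC
```

Four separate variables are used here rather than one shared variable so that each dummy argument is associated with its own storage.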

Steve

Steve - Intel Developer Support
