Typical VTune output from typical FORTRAN code:
Line  Source                                             Cloc
133   WTERM=WTERM-0.5*DELTA*W2(L,K)*                       65
134   & (UUU(L+1,K+1)+UUU(L,K+1)+VVV(LN,K+1)+VVV(L,K+1)     6
135   & +WWW(L,K)+WWW(L,K+1))                              11
137   IF(ISFCT(MVAR).GE.2)THEN                             11
143   WW=WTERM/(CON(L,K+1)+CON(L,K)+BSMALL)                 9
144   FWU(L,K)=MAX(WW,0.)*CON(L,K)                        316
145   & +MIN(WW,0.)*CON(L,K+1)                              4
150   ENDDO                                                 4
My question is: how can you (or the compiler) possibly optimize code when you have no clue how long instructions will take to execute? I realize some of these are continuation lines, so I combine their times and treat each statement as one (82 for lines 133-135, 320 for lines 144-145). Even so, I look over the code and can see no consistent relationship between the complexity of a statement and its number of clock cycles! This is maddening.
I know why line 144 takes so long. It has a MAX and a MIN in it, which are two branches. I'm not that stupid.
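To spell out what I mean by "two branches": as I read it, lines 144-145 are equivalent to an explicit IF on the sign of WW. Here is a minimal free-form sketch that compares the two forms. The names con_lo and con_hi and the sample values are hypothetical stand-ins for CON(L,K), CON(L,K+1) and WW, and whether the compiler really emits branches for MAX/MIN (rather than branch-free min/max instructions) is an assumption I have not checked in the disassembly.

```fortran
program upwind_check
  implicit none
  ! Hypothetical stand-ins for WW, CON(L,K) and CON(L,K+1) from the listing.
  real :: ww, con_lo, con_hi, f_minmax, f_branch
  real, parameter :: samples(3) = (/ -2.0, 0.0, 3.0 /)
  integer :: i

  con_lo = 1.5
  con_hi = 4.0

  do i = 1, 3
    ww = samples(i)

    ! Form used on lines 144-145 of the listing.
    f_minmax = max(ww, 0.) * con_lo + min(ww, 0.) * con_hi

    ! Explicit-branch form: the sign of WW selects which CON is used.
    if (ww >= 0.) then
      f_branch = ww * con_lo
    else
      f_branch = ww * con_hi
    end if

    ! The last two printed values should agree on every row.
    print *, ww, f_minmax, f_branch
  end do
end program upwind_check
```

If the two columns agree, then only one of the two products ever contributes for a given WW, which is why I think of the statement as a hidden IF/ELSE.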
Am I interpreting this correctly? Do the numbers in the right column (Cloc) mean the average number of clock cycles required to execute that statement, or is it not that specific? I mentioned in another message that I've seen a C statement, if(z>Z[i]), take 1784 clocks!
Message Edited by firstname.lastname@example.org on 06-15-2006 07:50 AM