DATAN2D seems too slow

I'm comparing the speed and accuracy of several coordinate transformation algorithms and find that using the function DATAN2D doubles the execution time of the whole program compared with using DATAN2. These two functions should really differ only by one multiplication (i.e. by 57.3 deg/rad). Does anybody have any ideas?

I use VS 2008, IVF 11.1.035, x64 platform, IDE environment.

This code illustrates the problem:

     program main  ! Compute longitude in equatorial xy plane.
     real*8 :: x, y, w, dw, LON, LonDeg, LonRad, t1, err, mx, t2
     real*8,parameter :: delt = 1.d0/360.d0, rpd = datan(1.d0)/45.,&
     drad = delt * rpd, dpr = 1./rpd
     integer*4 :: i,j
     write (*,'("computing longitude")')
     mx = 0.d0
     w = 6000000.d0
     call cpu_time (t1)
     do i = 1, 10000
          w = w + 10.                       ! increment by 10 m
          do j = 1, 32000
               LonDeg = j * delt            ! 1/360 deg to 88.9 degrees
               LonRad = j * drad            ! same thing in radians
               x = w * dcos (LonRad)
               y = w * dsin (LonRad)        ! call datan2 to recover Long.
               LON = dpr * datan2(y, x)     ! this way takes 11.3 sec
!              LON = datan2d (y, x)         ! this way takes 22.7 sec  <------
               err = dabs (LON - LonDeg)    ! error in degrees
               if (err > mx) mx = err       ! max err, degrees
          enddo
     enddo
     call cpu_time(t2)
     write (*,'("max error:",f20.14,"  degrees")') mx
     write (*,'("time: ",f20.1,"  seconds")') t2-t1
     end program main


It may not help, but why not use the generic names (e.g. SIN, COS, ATAN2, ABS)? It is implied that they provide the maximum precision of the variable kind used.
Your conversion from Lat, Lon to X, Y could have a long way to go. The use of DATAN2D is the least of your problems.

John

To John and anyone else: I did a poor job of describing my situation. More details:

My project (in the IDE environment) is to test 5 double-precision algorithms to convert geocentric (x, y, z) to geodetic (latitude, longitude, height). I have already determined accuracy and am now measuring how much CPU time is required to process a billion points. Optimization = fast and floating point model = fast. The times will indicate relative speed, hence will help decide which algorithm is best.

Here's my problem. Common to all the algorithms is the one-line computation

lon = dpr * datan2 (y,x) or
lon = datan2d (y,x)

where dpr is degrees/radian. During the timing runs, I tried each of these equations in several of the algorithms and although accuracy was the same, total CPU time doubled when I used datan2d instead of datan2. I'm baffled. I thought the times would be the same.

To isolate the problem, I wrote the small program above simply to compute longitude 320 million times. If the first longitude computation is commented out, the program takes 23 seconds. If the second is commented out, it takes 11 seconds. So the problem remains, at least on my computer.

While performing the "paste" operation above, the exclamation mark in line 17 (column 1) disappeared and the "greater than" symbol became a "gt;" in line 20.

Perhaps DATAN2 is intrinsic or inlined whereas DATAN2D is a function call.

WRITE(*,*) "LOOK HERE"
LON = dpr * datan2(y, x)     ! this way takes 11.3 sec
WRITE(*,*) LON
LON = datan2d(y, x)          ! this way takes 22.7 sec  <------
WRITE(*,*) LON

Modify your code to contain both functions and the write statements. Compile in Debug build with full optimizations. Place break point on "LOOK HERE". N.B. The write statements should still have valid breakpoint line number information. See if the code is inlined or via call.

Jim Dempsey

www.quickthreadprogramming.com

The call to DATAN2 with the separate multiply calls an optimized ATAN2 routine directly. DATAN2D calls a library routine which in turn calls the optimized ATAN2, and does the multiply, though I see some extra code that seems to be protecting against some edge condition. I'm a bit surprised that DATAN2D takes twice as long, but there is an extra level of function call in there.
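
To make the two call paths concrete, here is a minimal sketch. The function atan2_deg is a hypothetical stand-in written for illustration only; it is not the actual library routine behind DATAN2D, and it omits whatever edge-condition handling that routine contains. It only shows the shape of "direct ATAN2 plus multiply" versus "wrapper that calls ATAN2 and multiplies". The names dp, dpr and atan2_deg are assumptions.

```fortran
module atan2_deg_sketch
  implicit none
  integer, parameter :: dp = selected_real_kind(15, 307)
  real(dp), parameter :: dpr = 45.0_dp / atan(1.0_dp)   ! degrees per radian
contains
  ! Hypothetical wrapper: one extra call level around the generic ATAN2.
  function atan2_deg(y, x) result(deg)
    real(dp), intent(in) :: y, x
    real(dp) :: deg
    deg = dpr * atan2(y, x)
  end function atan2_deg
end module atan2_deg_sketch

program compare_paths
  use atan2_deg_sketch
  implicit none
  real(dp) :: x, y, direct, wrapped

  x = -3.0_dp
  y =  4.0_dp
  direct  = dpr * atan2(y, x)    ! ATAN2 called directly, multiply at the call site
  wrapped = atan2_deg(y, x)      ! same arithmetic behind a function call
  write (*,*) 'direct  =', direct
  write (*,*) 'wrapped =', wrapped
  ! Both paths should agree to within rounding.
  if (abs(direct - wrapped) > 1.0e-12_dp) error stop 'results differ'
end program compare_paths
```

Whether the extra call level costs anything measurable depends on the compiler, the optimization settings and whether the wrapper gets inlined, so any timing should be done on the target build.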

Steve - Intel Developer Support

Thanks to both of you for commenting. Will stick with datan2.

Last quick question: Is it POOR programming practice to let the compiler "promote" the constants to double precision, as I have done in the PARAMETER statement above? I seem to keep doing this to save a little space, to make it all look a teensy bit "cleaner", without all the "d"s, because I know the compiler will take care of it.

The "modern" style is to define a parameter constant that uses SELECTED_REAL_KIND to determine the kind that has the precision you want, say, DP, and then use 1.0_DP. I know, even longer than D0, but it helps with portability. I would discourage you from allowing the compiler to promote types, and would also discourage you from using the specific intrinsic names such as DATAN. Use the generic names and properly typed arguments.
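
A minimal sketch of that style, applied to the constants from the test program at the top of the thread. The kind name dp and the precision request (15 digits, exponent range 307) are assumptions chosen to match typical double precision; adjust them to what you actually need.

```fortran
program kinds_demo
  implicit none
  ! dp: kind with at least 15 decimal digits (illustrative name)
  integer, parameter :: dp = selected_real_kind(15, 307)
  ! Every literal carries its kind explicitly, so nothing is promoted.
  real(dp), parameter :: rpd = atan(1.0_dp) / 45.0_dp   ! radians per degree
  real(dp), parameter :: dpr = 1.0_dp / rpd             ! degrees per radian
  real(dp) :: x, y, lon

  x = 1.0_dp
  y = 1.0_dp
  ! Generic ATAN2 picks the version matching the kind of its arguments.
  lon = dpr * atan2(y, x)
  write (*,'("lon = ",f20.14,"  degrees")') lon
end program kinds_demo
```

Changing the precision of the whole program then means editing the one line that defines dp.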

Steve - Intel Developer Support

In addition to Steve's comment, a value that cannot be represented exactly in single precision will not get exactly the same approximate value if specified as single or double precision.

Try:

real(kind=sp) :: x = 1.2
real(kind=dp) :: y = 1.2_dp

write(*,*) x-y

(with obvious definitions for sp and dp)

So it is better (IMHO) to be explicit about what you want.
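
For anyone who wants to run this, here is a complete version of the snippet as a sketch. The definitions of sp and dp via SELECTED_REAL_KIND are one possible choice, and the variable z is an extra added here to show what happens when a single-precision literal is promoted into a double-precision variable.

```fortran
program literal_precision
  implicit none
  integer, parameter :: sp = selected_real_kind(6, 37)     ! assumed single precision
  integer, parameter :: dp = selected_real_kind(15, 307)   ! assumed double precision
  real(kind=sp) :: x = 1.2
  real(kind=dp) :: y = 1.2_dp
  real(kind=dp) :: z = 1.2       ! single-precision literal, promoted on assignment

  ! x is converted to dp before the subtraction, but it still holds the
  ! single-precision approximation of 1.2, so the difference is not zero.
  write (*,*) 'x - y =', x - y
  ! z has dp storage but inherited the single-precision value of the literal.
  write (*,*) 'z - y =', z - y
end program literal_precision
```

On typical IEEE hardware both differences should come out nonzero and equal to each other, since promotion does not restore the digits the single-precision literal never had.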

Regards,

Arjen

Thanks for your tips, Steve and Arjen. I'm going to read up on them and put them into practice right now (kind of precision, generic functions).

Don

You don't always need the long syntax and precision definition, as the abbreviated syntax can work in suitable cases, for example:
real*8, parameter :: one = 1
real*8 pi
pi = 4 * atan (one)
write (*,*) pi
end

Another important issue that is often lost in this debate about the difference between 1.2 and 1.2_dp is what the original intention of the supplied value of 1.2 was. Potentially it had an accuracy of 1.2 +/- .05, or quite possibly +/- .005 if a trailing zero was left out in the recording.
Unlike pi, 1.2 could represent, say, a 20% factor applied to an original estimate.
It is important not to lose focus on the accuracy of the measurement that can be provided to the numerical algorithm. Debates about the difference between kind = 4, 8, or 16 often mask the real precision available.

John
