Fortran DLL slows down 2 times when called from Visual C++ in comparison to calling from Fortran

Posted by manh nguyen

Hello everyone,

I have this problem and hope you can help me out.

I have been given a subroutine in Fortran. The code is actually a simulation model of a physical process, so simulation speed is a criterion. I compiled the Fortran code into a DLL.

When the DLL subroutine is called from a Fortran program using the Windows API functions LoadLibrary and GetProcAddress, I get a speed ratio of 770, meaning that 1 second of run time simulates 770 seconds of the physical process.

Next, I did the same thing in C++, and this gives a speed ratio of 380, which is 2 times slower than calling from the Fortran program.

I tried to work around the problem by setting the Runtime Library option of the C++ program to "Multithreaded DLL", the same setting used for the DLL, but that did not resolve it.

Does anyone have any idea what is happening with the call from C++?

Thanks for your help,

Posted by Steve Lionel (Intel)

Please show the Fortran and C++ code used to call the routine. In particular, knowing how the input and output arguments are declared would be useful.  Changing the runtime library option is not relevant.  My guess is that you're not calling with the same inputs.

Steve
Posted by manh nguyen

deleted

Posted by bmchenry

I have called simulation DLLs from Intel Fortran and Microsoft C++ (both 32-bit) and found no significant difference in simulation performance.
My guess is that an input parameter or a simulation option differs between your Fortran call and your C++ call.
That would show up as different simulation results between the Fortran call and the C++ call.
Are the simulation results identical between the two calls?
If not, fix things so that you are comparing the same simulation across the two calls.
If so, please post the calling routines for Fortran and C++; they may reveal the reason for the speed degradation.

Posted by bmchenry

What's up with these double postings? Is it me or the forum? Sorry for the duplicate post!

Posted by manh nguyen

Steve, you can see the Fortran code in the .f90 file and the C++ code in the .cpp and .h files. bmchenry, I used Dev-C++ and Intel Fortran; the results are identical with both Fortran and C++.

Thx

Posted by IanH

In the absence of compiler options to the contrary, a Fortran default logical in ifort uses four bytes.  In the absence of compile options to the contrary, a MS VC++ bool is one byte.  It is quite probable that your fortran dll goes and stomps on the C++ program's stack.  Your C++ program probably gets quite upset about this.  Depending on how the C++ or Fortran compiler feels about packing structures you may have a related problem with the inlet structure (and the outlet structure too, in a slightly different scenario).

Consider using Fortran 2003's C interoperability features to make the link between Fortran and your C++ code robust to this sort of issue.  While you are at it you can get rid of the non standard structure business (make them BIND(C) types).
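
The scalar case of this size mismatch can be sketched in C++. The routine below is only a stand-in for a Fortran subroutine with a default LOGICAL dummy argument; the name `fortran_like_set_flag` is hypothetical and not taken from the posted code, and the sketch assumes the 4-byte LOGICAL size described above. Passing the address of a 1-byte bool to such a routine would let it write past the variable, so the wrapper hands it a 4-byte integer and converts afterwards.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Stand-in for a Fortran routine taking a default LOGICAL by reference.
// Assumption: the callee stores a full 4-byte value at the address it is given.
void fortran_like_set_flag(std::int32_t* flag) {
    assert(flag != nullptr);
    *flag = 1;  // any nonzero 4-byte store would illustrate the point
}

// Number of bytes such a store would write beyond a C++ bool.
constexpr std::size_t bytes_written_past_bool() {
    return sizeof(std::int32_t) > sizeof(bool)
               ? sizeof(std::int32_t) - sizeof(bool)
               : 0;
}

// Safe pattern: pass a 4-byte variable, then convert the result to bool.
bool call_with_matching_size() {
    std::int32_t flag = 0;  // same size as the assumed 4-byte LOGICAL
    fortran_like_set_flag(&flag);
    return flag != 0;
}
```

Calling `call_with_matching_size()` keeps the callee's store inside the variable it was given, whereas `bytes_written_past_bool()` reports how far a 4-byte store would overrun a bool on the current compiler.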

Posted by manh nguyen

IanH, I tried changing the bool variables in the C++ program to "int", which has 4 bytes like ifort's "logical" type, but nothing changed.

thx

Posted by mecej4

A look at your code indicates that the program executes a loop ten times, with a call to the DLL inside the loop.

If the program executes in less than, say, 0.1 second, comparisons of run-times in the C++ and Fortran versions are going to be misleading.

What is the order of magnitude of the running time?
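
One way to avoid the short-run problem described above is to repeat the call until a minimum total time has elapsed and then report the average. A minimal C++ sketch follows; `time_repeated`, `RepeatTiming`, and the 0.1-second default are illustrative choices, and `work` stands for whatever calls the DLL routine.

```cpp
#include <cassert>
#include <chrono>

struct RepeatTiming {
    long   calls;          // how many times the work was executed
    double total_seconds;  // wall-clock time for all calls together
    double seconds_per_call() const { return total_seconds / calls; }
};

// Repeat `work` until at least `min_total_seconds` of wall-clock time has passed.
template <typename Work>
RepeatTiming time_repeated(Work&& work, double min_total_seconds = 0.1) {
    assert(min_total_seconds > 0.0);
    using steady = std::chrono::steady_clock;
    const auto start = steady::now();
    long calls = 0;
    double elapsed = 0.0;
    do {
        work();
        ++calls;
        elapsed = std::chrono::duration<double>(steady::now() - start).count();
    } while (elapsed < min_total_seconds);
    return RepeatTiming{calls, elapsed};
}
```

With a run time of tens of seconds per run, a single call already exceeds any sensible threshold, so the helper mainly matters for much shorter routines.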

Posted by manh nguyen

mecej4, the running time is about 37 seconds with Fortran and about 80 seconds with C++.

thx

Posted by manh nguyen

One thing seems strange to me: the .exe file built from the ifort code is only 13k, while the one built from C++ is 1264k. Do you think that is normal?

thx

Posted by manh nguyen

With the simulation in Visual Fortran, when I set the runtime library of the DLL project to Multithreaded, the speed ratio drops to about 380 (it was about 780 with the runtime library set to Multithreaded DLL).

Then, still with the Multithreaded option, I added Libifcoremd.lib to Additional Dependencies, and the simulation speeds back up to the maximum (about 780).
Does this make sense to anyone?

Posted by mecej4

You have found that a compiler not so well known for optimization (dev-C) produces code that runs slower than code output by a Fortran compiler known for its optimization (Intel Fortran). The other tweaking that you did, with compiler options, different RTLs, etc., is probably not going to make much of a difference. There is always Amdahl's Law to consider in explaining how multi-threaded programs behave.

In such circumstances, one may rejoice that the Fortran program is "fast", or lament that the C program is "slow", or take a position somewhere in between.

Your quoting "real" time/simulation times is misleading, because it suggests that the ratio is of some significance. In solving a heat diffusion problem, for example, one may change "real" time by changing the diffusion coefficient, without causing any change to the run time of the simulation. Similarly, simple models of climate change can run a simulation of the entire (known) life of the Earth in a few hours.

Posted by manh nguyen

In fact, the case I'm working on involves mathematical optimization with iterative calls to the simulation model, so simulation speed really matters. I don't know much about numerical modeling, so I cannot tell whether in my case I could change some coefficient to change "real" time as in the example you mentioned.

My last comment was asking why merely changing the runtime library option from MD to MT halved the simulation speed of my Fortran code. Answering that would, I think, answer the initial question.

Posted by manh nguyen

Hi, I solved the problem.

It turned out that the problem comes from a line in the Fortran code: IMPLICIT INTEGER(i-n)
I changed it to IMPLICIT INTEGER*2(i-n), and this answers the initial question. I hope this helps someone who has the same problem as mine.
thx

Posted by rase

I can only repeat the advice to use IMPLICIT NONE and to declare every variable explicitly instead of using the automatic declaration feature of ancient Fortran. I learned this the hard way some years ago with a similar error.

Posted by mecej4

I do not understand what you mean by "initial question", but if adding IMPLICIT INTEGER*2(..) "answered" the question, your code has serious problems. If this change was required for the program to run correctly, the speed comparisons that you started out with are invalid because you were comparing a correctly running program with an incorrectly running one.

The Fortran program was probably developed to work on a 16-bit CPU, and code with INTEGER*2 is going to run more slowly on today's 64-bit CPUs than code with default INTEGER for the platform.

Posted by manh nguyen

mecej4, you're right, there is a bug: when I add IMPLICIT INTEGER*2(i-n), the code gives NaN results. :(

Posted by mecej4

Quote:

mecej4, you're right, there is a bug: when I add IMPLICIT INTEGER*2(i-n), the code gives NaN results. :(

I'd have used the emoticon ":)", instead, since the NaNs in the result give a strong warning that there is something definitely wrong.

Bugs are harder to track down and fix when they do not affect the results so much that questions of plausibility arise. There have been cases where such bugs stayed hidden for decades in highly used and well-reputed software.

Posted by manh nguyen

How can I find out whether the code was developed for a 16-bit CPU?

Posted by mecej4

The usage of 2-byte integers is a good indication. Such half- (or quarter-) word integers make sense in new code only (i) when interfacing Fortran to hardware that needs to exchange 16-bit integers, or (ii) to pack more integers into memory than natural-size integers would allow.
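
The trade-off in point (ii) can be made concrete. The sketch below uses C++ `std::int16_t` and `std::int32_t` as stand-ins for INTEGER*2 and a 4-byte default INTEGER (an assumption about the Fortran kinds in use); the helper names are illustrative. It shows that the 2-byte type halves storage but cannot represent values beyond 32767.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>

// Bytes needed to store `count` integers of type T.
template <typename T>
constexpr std::size_t storage_bytes(std::size_t count) {
    return count * sizeof(T);
}

// True if `value` is representable in a 16-bit signed integer.
constexpr bool fits_in_int16(long long value) {
    return value >= std::numeric_limits<std::int16_t>::min() &&
           value <= std::numeric_limits<std::int16_t>::max();
}

// Narrow to 16 bits only after checking the range.
std::int16_t narrow_to_int16(long long value) {
    assert(fits_in_int16(value));
    return static_cast<std::int16_t>(value);
}
```

For example, `storage_bytes<std::int16_t>(1000)` is 2000 bytes against 4000 for `std::int32_t`, while `fits_in_int16(32768)` is false, so any counter or index that can exceed 32767 is unsafe to shrink.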

Posted by Tim Prince

I stopped programming for 8-bit CPUs over 20 years ago, yet someone decided to invest in 16-bit floating point data types for the Intel architectures which are currently being released, presumably on the basis of demand from influential customers.

The 16-bit data types involve use of 32-bit registers, using the same arithmetic instruction set as for 32-bit data types.  As mecej4 says, their use could be justified only if a compensating benefit can be demonstrated in reduced memory usage.  Even the benefit from using approximate divide and sqrt is reduced in the Ivy Bridge CPU.

Posted by manh nguyen

I made a DLL with no input arguments (something like "void DLL(void)") and simply call the function from both C/C++ and Fortran (there are no loops this time). The C/C++ program is always much slower than the Fortran program.

Posted by IanH

Show all code.  Show all compiler options.

Posted by manh nguyen

Here are the calling programs from Dev-C++ and Visual Fortran. I don't understand why there is a slight difference in the results (comparison.JPG).

IanH, I can't find a way to show all the compiler options for you. Is there a particular part you want to see?

Posted by IanH

For Fortran projects you can see the compile options in effect by right clicking on the project in the Visual Studio solution explorer and selecting Properties, then Fortran > Command line. 

You don't show the code for the DLL.

Those two console window screenshots appear to have different working directories.  Are you loading exactly the same DLL?  How do you ensure that?

Posted by manh nguyen

I had been copying files from folder to folder to ensure comparability, but that was not reliable. Now I have copied the two .exe files into the same folder, and they give the same results. I'm sorry about that.

Posted by manh nguyen

Linker options for the DLL and the calling program

Posted by IanH

The fact that the factor between the two cases is very close to two is suspicious.  Is the CPU_TIME result printed by the program consistent with the elapsed time measured by your wrist watch?

The source code for the DLL is incomplete - you've only provided one subroutine.  That means readers of the forum can't compile your code and investigate things - they have to "head compile" what you've provided and hope that things that are called are innocuous.  Speaking for myself - my head compiler is notoriously buggy at the best of times.  If you don't want to post the full source, then chop it down yourself to a compilable and linkable subset that still exhibits the problem (as a debugging/diagnostic strategy you should be doing this anyway).  Otherwise we're all guessing.

What options are you using for the C program?
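
The consistency check asked about above can also be done in code by measuring processor time and elapsed time around the same call. A minimal C++ sketch follows (the program in this thread uses Fortran's CPU_TIME; `measure`, `Measurement`, and `busy_work` are illustrative names). Note that `std::clock` is specified to return processor time, but the Microsoft C runtime documents it as returning elapsed wall-clock time, so on that runtime the two numbers may agree regardless of load.

```cpp
#include <cassert>
#include <chrono>
#include <ctime>

struct Measurement {
    double cpu_seconds;   // from std::clock
    double wall_seconds;  // from std::chrono::steady_clock
};

// Run `work` once and record both processor time and elapsed time.
template <typename Work>
Measurement measure(Work&& work) {
    const std::clock_t cpu_start = std::clock();
    const auto wall_start = std::chrono::steady_clock::now();
    work();
    const auto wall_end = std::chrono::steady_clock::now();
    const std::clock_t cpu_end = std::clock();
    // std::clock returns -1 when processor time is unavailable.
    assert(cpu_start != static_cast<std::clock_t>(-1));
    return Measurement{
        static_cast<double>(cpu_end - cpu_start) / CLOCKS_PER_SEC,
        std::chrono::duration<double>(wall_end - wall_start).count()};
}

// Illustrative CPU-bound workload standing in for the DLL call.
double busy_work() {
    volatile double sum = 0.0;
    for (int i = 0; i < 2000000; ++i) sum = sum + i * 0.5;
    return sum;
}
```

Calling `measure(busy_work)` returns both figures; an elapsed time much larger than the processor time would suggest the process is waiting (for I/O or other processes) rather than computing.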

Posted by manh nguyen

Yes, it is consistent with the wrist watch.
