More info on file and errors on code ported from VBA/Excel

More info on file and errors on code ported from VBA/Excel

Attached is the latest version of the ported code from VBA/Excel and a listing of errors that the compiler is throwing up

AttachmentSize
Download poisson-pb-v2.txt3.99 KB
Download errors.txt1.4 KB
27 posts / 0 new
Last post
For more complete information about compiler optimizations, see our Optimization Notice.

Sorry, the original query is awaiting the forum administrators' approval before it is posted. I'm trying to port some working VBA/Excel code to fortran, but I am falling foul of some of the Fortran rules , I have read around but I still can not see why the code is throwing up these particular errors.

 

Thanks in advance

I thought I could 'end' the main code then have the functions appended but that did not seem to work, too many end statements, but taking out the first end statement then seems to run into the problem of having funtions after executable statemnets

Attachments: 

AttachmentSize
Download poisson.f90457 bytes

I thought I could 'end' the main code then have the functions appended but that did not seem to work, too many end statements, but taking out the first end statement then seems to run into the problem of having funtions after executable statemnets

John,

In the help look at CONTAINS and try having something like :

program console2

..... your code

contains

function poisp

...

end function poisp

function poislow
end function poislow

...

end program console2

But also check that the functions have a value when the error conditions occur. (e.g. poisp = 0.0 at the start of the function poisp)

You should also get into the habit of specifying "implicit none" in your code. It will help identify typos for example.

HTH

Les

Thanks Les,

 

In the meantime (without trying contain)  the code is now building, but is not working correctly. I named  the main program Poisson, then ended it  with an

 end program poisson statement,

 then placed my user defined function after that end statement.

 

Its not giving the right answers though. It is as if it it not going to those user defined function to compute the value of the functions.

Attachments: 

AttachmentSize
Download latest.f904.91 KB

There are several problems here.  The first, as Les suggested, is that you are not ending one program unit before starting another. Les' suggestion of the contains is a good one - the contains line would go in place of the "199 continue".

Les also is hinting that you have argument type mismatches due to the lack of declarations in the functions. You also have the functions returning single precision REAL but you're assigning values to double precision variables - you should be consistent throughout.

Steve - Intel Developer Support

John,

I have attached a modified version of your fortran code that addresses many of the problems Steve and Les have pointed out with the goal of being helpful and illustrating what they are saying.  The code sample compiles and runs.

Changes I made include:

1) removing goto statements in favor of better use of if constructs and return statements.

2) using a parameter to declare the precision of your reals and using this throughout your code and on all your floating point constants.

3) Put the "contains" line where it needs to go and moved "end program" where it needs to go.

4) Re-worked the variable declarations on your internal functions.  You only need to declare the dummy arguments and local variables.  It is not neccesary to declare variables you are not using, nor should you declare the names of your internal functions.  A second note, your internal functions can access variables in their host (your program), though I do consider it good practice to pass those variables (as you are doing) rather than accessing the host scope.

5) Took the simplest approach to making your argument types consistent.  It would be more proper (in my opinion) to make x1, x2 integers to remove the need for checking that in your functions, but that would require reworking more code than I cared to spent time on.

Take a look at the code and verify it performs properly and feel free to ask questions about my changes.

Attachments: 

AttachmentSize
Download latest-cw.f905.21 KB

Thanks Steve and Casey,

 

I was going to revert to old textbook example of the user defined function average to discover how to call such functions.

 

I'll check the type consistency event should be integer, although Poisson events are always integer, their mean can and often is real. Double Precision floating point is needed in most variable instances, otherwise the results will be unreliable.

 

Thanks again for the code, I'll read and digest and apply and see how it goes.

Casey,

 

I'm afraid my Fortran is a bit stuck in the past, normally I dip into Fortran 77 'Principles of programming' by Jerrold L. Wagner, then use Clive Page's, 'Professional Programmer's guide to Fortran 77', with another short document he did ,'Fortran90 for Fortran77 Programmers', and that has in section 2.3 'new forms for specification statements' info about new Type statements:-

The DOUBLE PRECISION data type is now just a special case of REAL so all facilities are identical (???); this means that double precison complex is fully standardised.

 

(In my old '77 manual it appears that there was a somewhat inconsistent prohibition of the use of  DOUBLE PRECISION in certain operations with complex numbers in F77.)

"new form with double-colon allows all attributes of variables to be specified at once."

 

There was no mention however of real(kind=wp).

In the attached document I have put a selection of  information that I found about this topic. It may be of use to others.

 

I found that 'wp' is short for working precision in the section that I lifted from a NAG document (attached), this gives an explanation, but I will have to read it very carefully in order to understand it, as it is not described as clealry as one would hope.

 

 

Attachments: 

AttachmentSize
Download realkindwp.doc28 KB

I have also found a document online by Bo Einarsson and Yuij Shokin, "Fortran 90 for the Fortan 77 Programmer",

 

(aside- copy and paste does not seem to work when posting to this forum. I get a pop-up window, but it is non-functional.)

Anyway, section 13 of the above document 'The new Precision Concept'   the statement REAL(KIND=WP) is explained quite nicely.

Arrgh, my earlier comment/post on REAL(KIND=WP) has been queued again by the admin procedures on this forum, so they will be out of order, sorry.

Casey,

 

There must be a mathematical error in part of my porting of the VBA code, which I'll try to fix, but the part that is working is giving a slightly different answer to the VBA/Excel code, which also using a similar 'R' statistical package code to compute the same value, seems to be the more correct answer, than mine. These values are notorious for the demands they place on the precision of numerical calculations, ref. L. Knusel, 'Computation of  Statistical Distributions - documentation of the Program ELV'. I see that there is now, sort of a double double precision data type in Fortran, and your wp 'trick', seems to be a very convenient way to set the precision. What are the limits on the acceptable values for wp ?

 

Thanks,

 

John

 

ps in the Einarsson document they give the following example

     INTEGER, PARAMETER       :: WP = SELECTED_REAL_KIND(14,300)

    REAL(KIND=WP)                  :: working_precision_variable

 

But they note "Regrettably now we have to give all floating point constants (with?) the additional _WP,

for example

            REAL(KIND=WP)           :: PI

            PI =3.141592653589793_WP

while sine the the intrinsic functions are generic, they will automatically use the corrcet data type and kind, which means that the argument determines which 'kind' the result should have (usually the same as the argument).

 

With this method you will in practice obtain double precision on systems based on IEEE 754 abnd single precision on coputers like Cray or compures based on the Digital Equipment Alpha-processor, which in all cases means a precision of about 15 significant digits."

 

Casey,

 

I sorted out the error in the PoisP routine where I should have had .GE. I had incorrectly written .GT. in the logic test. That got the probablities to match those of R and of some online probablity calculators.  The PoisHigh function was also working ok, but not PoisLow, on its first few circuiots round the loop it was reporting that x2 was not a whole number.

 

I looked through it and could not find any errors in comparison with the original VBA/excel code, but one thing I did note was that in PoisP I was using the same variable names as for the main program unit.

 

Now like subroutines in the function itself we can use new names and thus not worry about any local change to a variable affecting global or main program variables.

i.e. if we send PoisP(x,y,z) we can use in the function PoisP(a,b,c) the function just uses the position of the variable to translate between them so that

in this case global x = local a, global y = local b etc.

 

So I thought I would use this idea and simply rename the repeated names within the PoisP as x1 -> px1, x2 -> px1, z -> pz.

 

So now there should be no danger of 'crosstalk' causing weird (but actually logical) behaviour witin the code.

 

Problem was after that simple change, the code refused to compile giving a linking error cannot open file 'Debug/Casy.exe'.

 

So I'm stuck and I have no idea what is happening.

 

Is it something simple like I need to rename a file in the debug folder ?

 

ANSWER NO this latter problem is because I had an instance of the code running (command window open). When I exited that DOS/command box. The code has compiled, now to see what happens !

 

The offending code is attached,

 

ps I have now made the majority the function variables 'unique' so using the passing of dummy arguments to subroutines method.

 

Attachments: 

AttachmentSize
Download casy-poisson.f905.38 KB

PoisLow still not working, everything else looks ok, any ideas why it is failing mathematically, the porting of the functions looks sound, perhaps I'm getting code blindness though ?

NB confint.xls can be used to check the correct working of the fortran code.

Attachments: 

Now this gets strange.

 

Instead of using 'function' I thought I woulod code up a version using a PoisP as a subroutine to PoisLow just to test the code in this way. Doing it this way seems to have worked ??!!!

 

How come ?

 

 

Attachments: 

AttachmentSize
Download subroutine-version.txt2.61 KB

John,

There are quite a few comments to work through.  Just a note, my perspecive is a bit opposite to yours, and I primarily deal with fortran 90 and use features of 03 and 08 when I can and do not deal much with '77 code.  Books I keep on my shelf related to fortran are "Modern Fortran explained" by Metcalf and "Modern Fortran in Practice" by Markus.  

The (kind=wp) and the _wp attached to constants is just a convientent way to ensure everything is the same type and kind in a way that you can easily change the precision of all the variables in one place.  You'll see I set wp=kind(1.d0), which just makes wp mean double precision.  You can also select an arbitrary precision by using wp = selected_real_kind(sig_digits, exponent_rang), and let fortran worry about what type it will need to hold that number.  E.g. wp = selected_real_kind(10,40) will ensure your variables of that kind have enough precision to represent 10 digits and an exponenet range 10*(+/- 40).  The name "wp" is arbitrary, but as you found is a convention for "working precision".   Your book might call adding _wp to your constants regrettably, but knowing that all math, named variables and temporary variables will be of your desired precision means that you are not losing precision unkowningly somewhere.  

I'll also comment on scope.  In your internal functions (functions between "contains" and "end program") you have access to all variables in your host scope, but if you have dummy arguments or local variables of the same name, you will block access to those in the host scope and only have access to the local ones.  The dummy argument names should not be a problem as they were named (nor are they with the new names).

I don't have time now deal with the mathematical accuracy issues, but if I find some later I may take a shot at it.

Casey,

I found the problem (although I'm not sure why it causing the problem) sort of by accident.

 

This might sound a bit lame, but I use in conjunction with the microsoft intel compiler an old freebie Fortran compiler called Force. Although it has one foot stuck in the old punch card environament, it is very good when it comes to defining errors. Anyway it siad that the BigNum 'integer' !!?? was too large. This has the form of 1 followed by ten zeroes, now in the excel/VBA this causes no problems and no warnings at any stage when doing the lower limit with 3 as the number of events and 0.025 as probability limit. In Fortran this triggered the repeated warning "x2 not a whole number" (i.e. when the PoisLow function was in turn using the PoisP function) and then gave a number way too high. So in Force to get it to compile I had to drop a zero off the length of BigNum, when I did that the subroutine version ran without any warnings. Then re-loading the function version back into the Intel complier and knocking a zero off BigNum, got that to run with out the warnings and gave the correct answer.

 

I've no idea why that number is ok in excel/VBA but causes probelms in Fortran.

 

Now I can move on to getting the code to work with input and output files instead of interactive input.

 

There aren't going to be any open unit surprises, are there ?

 

Thanks for your help !

 

I'm not surprised by that actually.  I did not mention it (i forgot), but in the code I sent you I took the easy way out and declared BigNum to be a real(kind=wp) rather than an integer.   If you want BigNum to be an integer you can make it type integer(kind=selected_int_kind(11)) to ensure it is allocated enough storage to hold a number that big.  A default 4 byte integer can only hold numbers +/- 2**31 (~2 billion), which is not enough to hold your BigNum. 

Quote:

John B. wrote:

Now I can move on to getting the code to work with input and output files instead of interactive input.

 There aren't going to be any open unit surprises, are there ?

 Thanks for your help !

 

The only potential surprise I can think of is regarding record lengths in unformatted files.  If you are specifying a record length (recl=) in bytes, you'll want to add the compile flag /assume:byterecl because the intel default is not bytes (as it is on many other fortran compilers).  

In this thread much has been said about the various versions of Fortran code that is intended to calculate the  Poisson cumulative distribution function and its inverse. Another side of the question is whether the mathematics is correct. Unless I am mistaken, the calculations are missing an important element: the Poisson density function is eλx/x! , but I see no instance of the EXP function in any of the codes. What is the explanation?

My second comment is concerning the use of bisection of the interval in solving for the inverse. It is simple to code, but very inefficient, taking roughly three calls to POISP() to obtain each decimal digit in the answer.

The third comment is regarding BigNum.  Because of the exponential function in the PDF, for values of λ from 0 to 20, using BigNum = 40 is a good approximation to the usage of BigNum = ∞. Therefore, there is little reason to worry about using 8-byte integers, etc. Even single byte integers are sufficient!

Thanks for adding those comments mecej4.  I only wanted to address the fortran and keep out of the statistics / math aspect of the problem.    

Casey,

My comments were addressed mainly to the OP ( I wish the forum software assigned sequence numbers to responses, so that it would be easy to say, "re #12..."). Specifically, in http://software.intel.com/en-us/comment/reply/413921/1747524?quote=1#com... he wrote about having suspicions that double precision might not be sufficient, and speculated that quadruple precision might be needed.

This is a Fortran related forum and, properly so, you (Casey) responded within that context. When the results are not quite right, one is often tempted into asking whether tweaking the code or the compiler options would remove the errors. This thread furnishes an example that shows why one must get the problem specified correctly before it makes sense to discuss the reasons for the failure of the program to deliver correct results.

Fri, 08/16/2013 - 13:02>> ( I wish the forum software assigned sequence numbers to responses, so that it would be easy to say, "re #12...")

You could use the date and time (more to type)

Another gripe related to this:

While you can Select, Drag and Drop from prior post, performing a Select, Ctrl-C (copy), Ctrl-V (paste) crashes the comment edit box.

Jim Dempsey

www.quickthreadprogramming.com

mecej4,

I was also wondering how the algorithm worked.

When I looked at it and started going through the loop manually (and then I temporarily put print statements into the code in order to see what was happening to all the variables from loop to loop), it gave me the idea that a recurrence relationship was involved in computing the Poisson probabilities. If you want to know why the textbook exponential equation is actually not used in statistical code ('behind the scenes') then you have to look at the paper by prof. Knusel, which I attached to an earlier post, (see page 37) (I don't see it in the thread so I'll re-attach). Sections 6 onwards deal with this problem.

 

Anyway to be doubly sure I contacted retired Prof. John Pezzullo, who wrote the original Excel/VBA code, and his response is given in the attached word doc.

Yes, I remember reading in one of my old computing textbooks that the bisection algorithm is not the most efficient at finding roots of equations, alternatives being  secant method, method of false positives, and Newton-Raphson,  (see 'Numerical methods for engineers and scientists' by A.C. Bajpai, I.M. Calus, and J.A. Fairley, section 'Solution of non-linear equation', pp 28-81), but it is robust, and by some yardsticks termed 'efficient', but I guess not in terms of machine cycles!

 

Finally ,I found a less technical article on the recurrence relationship for computing Poisson probablities, see page 10 and the pages leading up to it in the article (attached) on Poisson distributions.

 

Cheers,

 

John

 

Attachments: 

John B.:

I read the references that you gave and reexamined the VB code. What it does is to sum terms in the series expansion of e, stopping when the last term is smaller than the current sum by a factor of 1E-10. That works because the exponential series is absolutely convergent with infinite radius of convergence.

Here is a shorter replacement for the part of your code that computes the CDF using the intrinsic EXP function. To compensate for the forum software's harsh treatment of leading blanks in the source code, I have also attached the source file.

This program gives the result 0.879107, which agrees with the value given by Knüsel's ELV.EXE (http://knuesel.userweb.mwn.de/elv/). [Notes: Prof. Knüsel died in April this year, and some of the links in his Web pages are extinct. ELV.EXE is a DOS program, and does not run in Windows 8. However, it runs inside DosBox, which is available at http://www.dosbox.com/download.php?main=1 ].

The VBA code in your XLS file was rejected by Excel 2013 running on W8-64.

program rpoisp
integer :: x1=2,x2=8
double precision :: z = 5.3
write(*,'(F11.8)')PoisP(z,x1,x2)
CONTAINS
 double precision Function PoisP(z,x1,x2)
 implicit none
 double precision z,q,S
 integer x1,x2,k
 !
 If(z < 0) Then
 write(*,*) 'PoisP: z < 0'
 PoisP = 0
 Return
 Endif
 !
 q = 1
 do k=1,x1-1
 q=q*z/k
 end do
 S = 0
 DO k = x1,x2
 q = q*z/k
 S = S + q
 End Do
 PoisP = S*exp(-z)
 Return
 End Function PoisP
End Program RPoisP
 

Attachments: 

AttachmentSize
Download poisp1.f90596 bytes

Hi mecej4,

The reason for the code work in this particular fashion is to avoid underflow and over flow in the cumulative probabilities. What Prof. Pezzullo does is arbitrarily set P(0)=1, so in the normal recurrence formula for Poisson probabilities he sidesteps having to evaluate P(0) at all, i.e. by using the normal formula for the first term, i.e. e^(-lambda).  [In some problems lambda can easily be 300 or much more ! Such numbers can get rounded down to zero]. Now you can construct the cumulative probabilities for the Poisson distribution from P(0) upwards,  as P(0) * (some terms), due to the recurrence relationship between successive probability terms, Pk=(lambda/k)*Pk-1. He computes those 'some terms' correctly, even though P(0) is deliberately incorrect (he watches for the possible overflow of the variable Tot at large lambda, and just rescales everything if the numbers involved do get too large, but his real 'trick' is to use the same P(0) for the partial sum of probabilities, S, between the desired limits. Thus when he divides S/Tot, the same incorrect P(0) s in the numerator and denominator simply cancel out, and you have the correct answer for the integrated probability between the limits, without having to calculate P(0) using the exponential formula, which as Knusel clearly illustrates can be a significant computational problem. A bit of reflection indicates why P(0)=1 is good choice as a 'dummy' value for P(0).

 

His other neat 'trick' is to avoid the difficulties with the range of lambda, i.e. from zero to infinity, in PoisHigh and PoisLow. This he does by mapping lambda to another variable v, defined as v=lambda/(1+lambda+obs)   [obs = observed number of events, pz, or vz, etc. depending upon if you are looking at the VBA code or the fortran code]. When lambda =0, v=0, and when lambda = infinity, v=1; and when lambda = obs, v has some value between 0 and 1 (near 0.5 for large obs)  That makes the half interval search more well behaved, and once the routine has found a value for v which makes the appropriate sum of Poisson terms equal to say 0.025, then he can calculate lambda as (1 + obs ) * v/(1-v), as done in the code.

Original explanation is all due to Prof. Pezzullo, I have just interpreted his comments and filled in bits for my own future reference. Originally I could not figure out from any of the Poisson equations in the literature where v came from ! It was driving me crazy trying to manipulate the various Poisson equations to get a variable like v. But the answer was that it wasn't really from those Poisson equations. No it came as clever idea from Prof. Pezzullo ! 

 

I guess a mathematician would have spotted this 'trick' and why it was being employed here more quickly, I was just thinking why ''v', where has that come from, which Poisson equation, (or the related Chi squared or exponential relationships).

 

Anyway there you have it.

 

Regards,

 

John

 ps And when the expression [see VBA version]  (1 + vz) * v/(1 - v),  or (1 + obs) * v/(1 - v) [my version] is sent from PoisHigh or PoisLow to the PoisP function or subroutine, he is actually sending lambda, as a bit of algebraic manipulation of  his mapping relationship v = lambda/(1 + lambda + obs), shows, making lambda the subject of the equation, gives as required, lambda = (1 + obs)*v/(1 - v). So unlike other cases of mathematical substitution nothing else has to be altered, e.g. like the limits in an integration when making a trigonometric substitution. The 'chopping/bisection' is in v, but the Poisson probablitity computations use lambda.

and Special thanks to Prof. Pezullo for his patience and help on this !!

Leave a Comment

Please sign in to add a comment. Not a member? Join today