"Everything you've always wanted to know about VB Arrays of Strings*
(*but were afraid to ask)"
May 1998
Lorri Menard
As promised in the last newsletter, here is an article on "How to pass
arrays of strings from Visual Basic to Visual Fortran". Actually, it
should be called "How DVF can receive arrays of strings from VB", because
Visual Basic doesn't need to do anything special to pass the arrays to
Fortran.
The structure that VB uses to pass arrays of strings is called a "Safe
Array". These are often used in COM interfaces, and contain information
about the dimensions and bounds of the arrays within them.
Appended is an example Fortran subroutine that receives a one-dimensional
SafeArray of strings from Visual Basic and writes the contents of each
string out to a data file. It then modifies the strings, within the
SafeArray structure, and passes them back to VB. I've noted the areas of
interest with the keystring "!**", and included a long and involved
explanation of why you need to do it that way. (Of course, I reserve the
right to claim "Because I said so!")
The call from the Basic routine is as simple as this:
Dim MyArray(2) as String
MyArray(0) = "First element"
MyArray(1) = "Second element"
MyArray(2) = "Third element"
Call ForCall(MyArray)
Now, let's get into the Fortran program.
! ARRAYS.F90
! This subroutine takes as input an array of strings from Visual Basic,
! and writes each string out to a datafile.
! It also writes various pieces of information about the array to that
! file, for illustrative purposes.
!
subroutine ForCall (VBArray)
!dec$ attributes alias : "ForCall" :: ForCall
!dec$ attributes dllexport :: ForCall
!dec$ attributes stdcall :: ForCall
!** Declare the array of strings (SafeArray) as being passed by REFERENCE.
!** This must be explicit.
!dec$ attributes reference :: VBArray
!** The following module declares the interfaces to SafeArrayxxx
use dfcom
implicit none
!** Declare the SafeArray as a pointer. Use a generic
!** integer as something to point to, because the POINTER statement
!** requires it.
!** When this is declared as a pointer it will automatically expand
!** to fit the size of a pointer for the particular platform. Today
!** that is 32 bits - in the future, that may expand.
pointer (VBArray,SADummy) !Pointer to a SafeArray structure
integer SAdummy
!** What is returned by SafeArrayGetElement is a BSTR. The structure of
!** a BSTR is such that the length of the BSTR is returned in the word
!** preceding the pointer, and the string itself is pointed to by
!** the pointer. When using COM, BSTRs are coded in Unicode. Through
!** experimentation with Visual Basic V5.0 I've found that it passes
!** BSTRs coded in 8-bit ASCII.
!** Please note: This may not be true with future releases of VB!
!** The good news is that it allows us to take some shortcuts for now.
!**
!** Set up the appropriate structures. Declare a character string
!** that is "long enough". It doesn't actually take up any space
!** in your program; it is used as a template to describe the memory
!** pointed to by the pointer StringPtr
character*2000 mystring
pointer (StringPtr, mystring)
integer i, result, lbound, ubound, length
! Create the data file
open (2, file="test.out", status="unknown")
write (2, *) "Details of the array passed by VB"
! Get the lower array bound
result = SafeArrayGetLBound(VBArray, 1, lbound)
write (2, *) "GetLBound gives ", lbound
! Get the upper array bound
result = SafeArrayGetUBound(VBArray, 1, ubound )
write (2, *) "GetUBound gives ", ubound
!** In this next loop, get each element of the array. This returns a
!** pointer to a copy of the string, which can then be referenced through
!** mystring. The length of the string is retrieved by the routine
!** SysStringByteLen.
!** This copy must be freed when we're done with it.
write (2, *) "Strings from the array:"
do i = lbound, ubound
result = SafeArrayGetElement(VBArray, i, LOC(StringPtr))
length = SysStringByteLen(StringPtr)
write (2, *) mystring(1:length)
call SysFreeString(StringPtr)
end do
!Done with the data file.
close (2)
!** This next loop writes a string back into each element of the array.
!** Through experimentation I've discovered that you MUST write back as
!** many characters as were there before: no more, no less. This loop
!** gets the length of the element, and writes back that many characters.
!** Once again, the SafeArrayGetElement makes a copy, which must be
!** freed.
!** SafeArrayPutElement also makes a copy, which is then passed back
!** to Visual Basic. Unfortunately, the memory occupied by the original
!** strings passed in is still allocated, and no longer pointed to.
!Let's try writing back into VB's array
do i = lbound, ubound
result = SafeArrayGetElement(VBArray, i, LOC(StringPtr))
length = SysStringByteLen(StringPtr)
mystring(1:length) = "Element#" // char(i+1+48)
mystring(length+1:length+1) = char(0)
result = SafeArrayPutElement(VBArray, i, LOC(mystring))
call SysFreeString(StringPtr)
end do
return
end
Steve
Ask Dr. Fortran
May 1998
"Ask Dr. Fortran"
Steve Lionel
Dear Dr. Fortran,
I know this program who seems to be OK, but he is a little different from
all the other programs. (Just between you and me, he is a legacy program.
Don't let that get out. It would not be politically correct.)
He started out life written for the IBM 1130 Disk Monitor System with 8k of
core storage. He was written in 1130 FORTRAN. The original documentation
gives direction as to which switches on the computer must be flipped to
invoke certain options. But he is still alive and works well. We still add
things to him. He lives on PCs now.
We really don't view him as belonging to us. His original programmers have
retired and some have died. So in that respect he is doing better than his
creators. We are like park rangers taking care of some national treasure
that is to be passed on to our successors.
But let's give this a go. I need some help understanding what really goes on
in this guy's head. There are many cases of the following coding:
subroutine xyz(n,array)
integer n
real array(1) <-- Note the size is only 1
. . . <-- code in here loops from 1 to n.
return
end
My questions are:
- Why does this work? It seems an out of bounds exception should be
generated for array since its size is only 1.
- In the main program the arrays are explicit shape. What type is array
in subroutine xyz?
- Is array(1) standard FORTRAN, or is this something that most compilers
just allow?
Sincerely,
Robert Magliola
De Leuw, Cather and Co.
Dear Mr. Magliola,
When this charming program was written, in the days of keypunches and
storage drums, FORTRAN IV (FORTRAN-66) was the current standard. While
FORTRAN IV did have the "adjustable array" feature, (which could have been
used in the above example by using "array(n)" instead of "array(1)"), it did
not have the "assumed-size array" feature (where the rightmost upper bound
is specified as "*") that was to be introduced in FORTRAN 77. Therefore,
programmers who wanted to write subroutines which would accept an array of
unknown total size would use a last dimension of 1. This worked because the
last upper bound is not needed to calculate the position of an element in
a Fortran array, and compilers of the time didn't have array bounds checking
(or if they did it could be disabled).
Now fast-forward to 1978 when the FORTRAN 77 standard was adopted. It included
a new "assumed-size" array feature (which had already shown up as an extension
in many vendors' compilers). So now there was a standard-conforming way to
say "I don't know what the upper bound is", yet there were still thousands of
existing programs that used the old (1) convention and more compilers
supported bounds-checking, even at compile-time (VAX FORTRAN did this, for
example.) These old programs would suddenly start getting errors, which
was not desirable - the Fortran tradition is to provide as much upward
compatibility as possible. What to do?
The solution was to have compilers treat a last upper bound of 1 as a special
case that was equivalent to *, disabling bounds checking (which answers your
first question). The a
rray in the above example has a single dimension with
lower bound 1 and an implicit upper bound of the total number of elements in
the array that was passed, though most compilers don't pass that information
and just treat the upper bound as infinite (questions two and three.) It is
valid to have a multi-dimension assumed-size array, but only the rightmost
(last) dimension can have an upper bound of * (or 1 treated as *). If any
dimension other than the last is declared with an upper bound of 1, then 1
is what you get.
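For anyone tending a program like this one, here is a minimal sketch of the same routine written with the FORTRAN 77 assumed-size form instead of the (1) convention. The routine name xyz and its arguments come from the letter above; the driver program and the doubling loop are made up purely for illustration.

```fortran
! Sketch: the legacy "array(1)" idiom rewritten with an assumed-size
! dummy argument. The driver and the loop body are illustrative only.
program assumed_size_demo
  implicit none
  real :: values(5)
  integer :: i

  values = (/ (real(i), i = 1, 5) /)
  call xyz(5, values)
  write (*, *) values            ! each element has been doubled
end program assumed_size_demo

subroutine xyz(n, array)
  implicit none
  integer n
  real array(*)                  ! assumed-size: upper bound left open
  integer i
  do i = 1, n                    ! the caller must supply the true extent
     array(i) = 2.0 * array(i)
  end do
  return
end
```

Either way, the subroutine still trusts the caller to pass an honest value of n, because an assumed-size array carries no upper bound for the compiler to check against.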
I hope you have enjoyed this trip back into the history of the Fortran
language.
Steve
Dr. Fortran - Obsolescent Features
October 1998
Ask Dr. Fortran
Hey! Who are you calling "obsolescent"?
Steve Lionel, DVF Development Team
Dr. Fortran didn't receive any appropriate questions for his column this
time, so he's going to take on a topic that is sure to raise a ruckus each
time it is brought up in the comp.lang.fortran newsgroup: Obsolescent and
Deleted Features.
Fortran (or FORTRAN) has had a long history of general upward compatibility
- Fortran 77 included almost all of Fortran 66, and Fortran 90 included all
of Fortran 77. But Fortran 90 formally introduced the concept of "language
evolution" with the goal of removing from the language certain features that
had more modern counterparts in the new language.
The Fortran 90 standard added two lists of features, "Deleted" and
"Obsolescent". The "Deleted" list, features no longer in the language, was
empty in Fortran 90. The "Obsolescent" list contained nine features of
Fortran 77 which, to quote the standard, "are redundant and for which better
methods are available in Fortran 77." Furthermore, the F90 standard said:
If the use of these features has become insignificant in Fortran programs,
it is recommended that future Fortran standards committees consider
deleting them from the next revision.
It is recommended that the next Fortran standards committee consider for
deletion only those language features that appear in the list of
obsolescent features.
It is recommended that processors supporting the Fortran language continue
to support these features as long as they continue to be widely used in
Fortran programs.
Proponents of "cleaning up" the language argued that it would make compiler
implementors' jobs easier. The compiler vendors disagreed; most said that
they would not remove support for any features since they knew that users
continue to compile old programs. Furthermore, deleting a feature from the
language means that there is no official description of how that feature, if
still supported, interacts with other language features. (The Fortran 77
standard included an appendix describing how Hollerith constants, a
FORTRAN IV feature not included in Fortran 77, should work if a compiler
chose to support them.) For the record, DIGITAL will not remove support for
any "deleted" language features from its Fortran compilers.
In Fortran 90, the list of "obsolescent" features was as follows:
1. Arithmetic IF
2. Real and double precision DO control variables and DO loop control
   expressions
3. Shared DO termination and termination on a statement other than END
   DO or CONTINUE
4. Branching to an END IF statement from outside its IF block
5. Alternate return
6. PAUSE statement
7. ASSIGN statement and assigned GO TO statements
8. Assigned FORMAT specifiers
9. cH edit descriptor
Descriptions of obsolescent features in the standard appeared in a small font
and compilers were to provide the ability to issue diagnostics for the use
of obsolescent features.
Now we come to Fortran 95. Keep in mind that the Fortran 90 standard did
not say that the next standard HAD to delete any of the
previously-designated "obsolescent" features, but that's exactly what the
standards committee did. Six of the nine "obsolescent" features (numbers
2, 4, 6, 7, 8 and 9) above were "deleted". Poof! Gone! And guess what -
that meant that a valid Fortran 77 program was no longer a valid Fortran 95
program! But never fear: DVF (and indeed most vendors' compilers) will
continue to support the deleted features (with optional diagnostics
informing you of the fact, of course.)
The Fortran 95 list of obsolescent features includes the remaining items of
the above list from Fortran 90 (1, 3 and 5), as well as several new
additions. Are you sitting down? Here's the new list:
- Arithmetic IF
- Shared DO termination and termination on a statement other than END
DO or CONTINUE
- Alternate return
- Computed GO TO statement (use CASE)
- Statement functions (use CONTAINed procedures)
- DATA statements amongst executable statements (betcha didn't know
they could go there!)
- Assumed length character functions (this means CHARACTER*(*)
FUNCTIONs)
- Fixed form source (!!!!)
- CHARACTER* form of CHARACTER declaration (use CHARACTER([LEN=]))
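Two of the suggested replacements in the list can be sketched in a few lines. This is only an illustration: the program name, the variable K and the function SQUARE are invented here, and the old forms are shown in comments so the example itself stays free of obsolescent features.

```fortran
! Sketch: SELECT CASE in place of a computed GO TO, and a CONTAINed
! function in place of a statement function. All names are illustrative.
program modern_forms
  implicit none
  integer :: k
  do k = 1, 3
     select case (k)          ! was: GO TO (10, 20, 30), K
     case (1)
        write (*, *) 'one, squared =', square(k)
     case (2)
        write (*, *) 'two, squared =', square(k)
     case default
        write (*, *) 'other, squared =', square(k)
     end select
  end do
contains
  ! was a statement function: SQUARE(I) = I*I
  integer function square(i)
    integer, intent(in) :: i
    square = i * i
  end function square
end program modern_forms
```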
Needless to say, the inclusion of fixed-form source on this list has raised
a LOT of eyebrows... Assumed-length CHARACTER functions (and the CHARACTER*
form of declaration) are deemed to be an "irregularity" in the language,
which they are, and there are alternatives available, but Dr. Fortran
doesn't see these disappearing from users' code anytime soon.
So does this mean that some of these features will be deleted in the next
standard, currently called "Fortran 2000"? [Fortran 2003 - ed.] At present, the answer is "no".
The standards committee has agreed to NOT move any features from the
"obsolescent" list to the "deleted" list for F2K, and furthermore, is not
proposing any additions to the "obsolescent" list. So it would appear that,
for now, anyway, the "concept of language evolution" excludes extinction,
and that should make Fortran programmers around the world breathe easier.
Steve
Re: Visual Fortran Newsletter Articles
July(?) 1999
Dr. Fortran says "Better SAVE than sorry!"
Steve Lionel, Compaq Fortran Engineering
In this issue, Dr. Fortran takes on another less-understood feature of the Fortran language, the SAVE attribute.
Back in the "good old days" of Fortran programming, when lowercase letters hadn't been invented yet and we strung our core memory wires by hand, programmers knew that local variables lived in fixed memory locations and, of course, took advantage of that, writing code such as this:
SUBROUTINE SUB
INTEGER I
I = I + 1
WRITE (6,*) 'New I=',I
END
The idea was that the value of the local variable I was preserved between calls to routine SUB, so that subsequent calls would get successive values of I. (Many of these same programmers assumed that variables were zero-initialized as well.) However, the Fortran language didn't make such promises and, with the advent of improved optimization and "split lifetime analysis" which could make variables live in registers or on the stack, programs which made such assumptions could break.
To accommodate the useful notion of a local variable whose definition status is preserved across routine calls, Fortran 77 added the SAVE statement. If a local variable (not a dummy argument) was named in a SAVE statement, its value at the point of the RETURN or END statement was preserved to the next call to that subroutine or function. Named COMMON blocks could also be SAVEd, but didn't need to be if there was always a routine active in the call tree which used that COMMON. Blank COMMON was implicitly SAVEd. An interesting tidbit is that in Fortran 77, a DATA-initialized variable's value was preserved without needing SAVE as long as you didn't redefine the variable. Fortran 90 removed that last clause for local variables, so that an "initially defined" local variable is implicitly SAVEd, but the catch about redefinition still applies to variables in named COMMON blocks.
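As a minimal sketch, here is routine SUB from above with the assumptions made explicit, using the Fortran 90 attribute form of SAVE. The driver program and the initial value of zero are additions for illustration, not part of the original example.

```fortran
! Sketch: the counter from SUB with SAVE and the starting value spelled
! out rather than assumed. The driver program is illustrative only.
program save_demo
  implicit none
  integer :: n
  do n = 1, 3
     call sub
  end do
end program save_demo

subroutine sub
  implicit none
  integer, save :: i = 0    ! explicit SAVE; initial value stated, not assumed
  i = i + 1
  write (6, *) 'New I=', i
end subroutine sub
```

Three calls should report 1, 2 and 3, whatever the compiler decides about where I lives.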
One common misconception is that SAVE implies static (fixed address) allocation for a variable. This is not so - in fact, if the compiler can determine that a SAVEd variable is always defined before use, then it could decide to make that variable live in a non-static (register or stack) location. The Fortran standard has no mechanism for saying "static and I mean it" - even the Compaq Fortran STATIC extension doesn't do this. Right now, the best way to ensure that a variable is allocated statically is to put it in COMMON and give it the VOLATILE attribute (VOLATILE is an extension [but is standard in F2003 - ed.]).
Fortran 90 added a new twist to this - ALLOCATABLE arrays. Fortran 90 implied, and Fortran 95 makes clear, that local ALLOCATABLE variables get automatically deallocated and become undefined when the routine in which they are declared is returned from. This has been a big shock to some programmers who figured that the values would stay around. If you want the array to remain there, use SAVE.
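A small sketch of that last point, assuming a hypothetical routine FILL_CACHE that wants its work array to survive between calls:

```fortran
! Sketch: a local ALLOCATABLE array kept alive across calls with SAVE.
! FILL_CACHE and CACHE are made-up names for illustration.
program alloc_save_demo
  implicit none
  call fill_cache
  call fill_cache
end program alloc_save_demo

subroutine fill_cache
  implicit none
  real, allocatable, save :: cache(:)   ! SAVE keeps the allocation alive
  integer :: i
  if (.not. allocated(cache)) then
     allocate (cache(4))
     cache = (/ (real(i), i = 1, 4) /)
     write (*, *) 'allocated on this call'
  else
     write (*, *) 'reusing cache, sum =', sum(cache)
  end if
end subroutine fill_cache
```

Without the SAVE attribute, the array would be deallocated on return and the second call would have to allocate it again.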
Given that many programs assumed SAVE semantics for variables, most vendors, including DIGITAL, had their Fortran 77 compilers give implicit SAVE semantics to variables which were used before being defined. (Note that this doesn't apply to ALLOCATABLE variables.) [Intel Fortran does NOT give SAVE semantics by default.] So why use SAVE? First, it is always a good idea to tell the compiler what you want, rather than making assumptions based on a current implementation. Compilers keep getting smarter and what "works" today might not work next year. Proper use of SAVE can also aid with error reporting - some compilers will suppress "use before defined" warnings for variables with an explicit SAVE attribute. It's also good to let the human who reads your code know that you are assuming the variable's value is preserved. That's why Dr. Fortran says, "Better SAVE than sorry!"
Steve
Dr. Fortran and "The Dog That Did Not Bark"
October 1999
Dr. Fortran and "The Dog
That Did Not Bark"
By Steve Lionel
In past issues of the newsletter, Dr. Fortran has discussed
an assortment of things, sometimes obscure, that the Fortran
standard says. In this issue, he's going to take a page
from Sherlock Holmes and talk about things that the standard
doesn't say, and how they can bite you as well.
Let's start with a simple observation that the standard describes
a "standard-conforming program". That is,
it establishes the rules to which a program must conform in
order to produce results as specified by the standard.
If your program is not standard-conforming, then all bets
are off - the processor (compiler and run-time environment)
can do anything (a common example used in comp.lang.fortran
is "Start World War III", though the Doctor is not
aware of any implementations which would do this - he would
consider this a "quality of implementation issue").
You've probably written many non-conforming programs without
realizing it. Got INTEGER*4 in your programs?
Non-standard. Use LOGICAL variables in arithmetic
expressions (or use logical operators such as .AND.
on integers)? Non-standard. What these do is implementation-dependent.
If a compiler supports these and similar uses, it does so
as extensions to the standard and is generally required to
have the ability to detect the non-conformance at compile-time.
If your program uses such extensions, it is non-portable and
may execute differently on different platforms or with different
compilers.
However, there is another class of non-conformity that, in
general, can't be detected at compile time and which can cause
big headaches for programmers who make unwarranted assumptions.
Let's start with one of the Doctor's favorites - order of
evaluation of LOGICAL expressions.
Many programmers write something like this:
IF ((I .NE. 0) .AND. (ARRAY(I) .NE. -1)) THEN
and expect that if I is zero, then the reference to ARRAY(I)
won't happen. The program may work on one platform,
but get array bounds errors when ported to another.
However, the standard allows the operands of a logical operator
to be evaluated in any order, and at any level of completeness,
as long as the result is algebraically correct. For
logical expressions, Fortran does NOT have strict left-to-right
ordering nor does it have short-circuit evaluation.
The standard-conforming way of writing this is:
IF (I .NE. 0) THEN
IF (ARRAY(I) .NE. -1) THEN
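Fleshed out into something you can compile, the nested form might look like the sketch below. The array contents and the helper function USABLE are invented for illustration; the point is only that ARRAY(IDX) is never referenced until the index has been tested.

```fortran
! Sketch: guarding an array reference with nested IFs instead of
! relying on .AND. to short-circuit. All names and data are illustrative.
program guarded_index
  implicit none
  integer :: array(3), i
  array = (/ 5, -1, 7 /)
  do i = 0, 3
     if (usable(i)) then
        write (*, *) 'element', i, 'is usable'
     else
        write (*, *) 'index', i, 'skipped'
     end if
  end do
contains
  logical function usable(idx)
    integer, intent(in) :: idx
    usable = .false.
    if (idx .ne. 0) then               ! test the index first...
       if (array(idx) .ne. -1) then    ! ...only then touch the array
          usable = .true.
       end if
    end if
  end function usable
end program guarded_index
```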
Here's another place where the standard's silence can trap
the unwary. What do you see when you execute the following
statement?
WRITE (*,'(F3.0)') 2.5
Many Fortran programmers expect "3.". But
try this in Visual Fortran, as well as in most other PC and
UNIX workstation Fortran implementations and you'll get "2."!
Why? Well, the Fortran standard says that the value
is to be "rounded", but doesn't define what that
means! On systems which implement IEEE floating arithmetic,
the IEEE default rounding rules are used and they specify
that if the rounding digit is exactly half-way between two
representable results, you round so that the low-order digit
is even. If you're a VAX user, you'll get "3."
because VAX rounding uses the "5-9 rounds up" rule,
and an OpenVMS Alpha user can see it either way, depending
on whether or not IEEE float was selected! The Doctor
notes that the Fortran standards committee is working on a
proposal for a future standard that would allow the programmer
to specify the rounding method, but for now, the standard
is silent and you get whatever the compiler writers think
is right.
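If you want to see what your own compiler does, a sketch along these lines will show it. The first WRITE uses whatever rounding the implementation picks. The last two assume a compiler that supports the rounding edit descriptors of the later Fortran 2003 standard (RC for "compatible", where ties go away from zero, and RN for "nearest"); treat that support as an assumption and check your compiler's documentation.

```fortran
! Sketch: printing exact half-way values to see how they are rounded.
! The RC and RN edit descriptors are Fortran 2003 and may not be
! available in older compilers.
program rounding_demo
  implicit none
  real :: x(4)
  x = (/ 0.5, 1.5, 2.5, 3.5 /)      ! all exactly representable in binary
  write (*, '(4F4.0)') x            ! result depends on the implementation
  write (*, '(RC, 4F4.0)') x        ! request "compatible" rounding
  write (*, '(RN, 4F4.0)') x        ! request "nearest" rounding
end program rounding_demo
```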
Pop quiz time - in a CHARACTER(LEN=n) declaration,
what is the lowest value of n that a compiler is required
to support, according to the standard? Is it A) 1?
B) 11? C) 255? D) 1000? The standard doesn't
explicitly say, but one can make a good argument for one of
these. Go to the end of the column to see which
one and why. The Doctor's
point is that there are many compiler limits which the standard
does not specify (including things such as the number of nested
parentheses in an expression, number of actual arguments supported,
etc.). While most implementations have reasonable limits
for such things, the Doctor has seen programs which exceed
the limits of some implementations (for example, using hundreds
of actual arguments) and become non-portable. Just because
one compiler supports something, that doesn't mean that all
will!
There are many other things the standard doesn't say that
programmers often take for granted. For example,
the standard doesn't even say that 1+1=2, or how accurate
the SIN intrinsic must be. An implementation which grossly
violates reasonable expectations here would probably be a
commercial failure, but it wouldn't be violating the standard!
In summary, writing standard-conforming and portable programs
is not just a matter of throwing the "standards checking
switch". You also need to be aware of things the
standard doesn't say and to make sure that your application
doesn't depend on implementation-dependent features and behaviors.
The more platforms you port your application to, the more
likely it is that you'll uncover such assumptions in your
code.
Answer to Dr. Fortran's pop
quiz: B) 11. Why? Because INQUIRE(FORM=) is
supposed to assign the value "UNFORMATTED" to the
specified variable (for unformatted connections) and that's
11 characters long, the longest of the set that INQUIRE returns.
No other language rule implies a longer minimum length.
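For the curious, here is a small sketch of the case behind the answer. The unit number and the use of a scratch file are arbitrary choices made for illustration.

```fortran
! Sketch: INQUIRE(FORM=) on an unformatted connection needs an
! 11-character variable to hold the whole answer.
program inquire_form
  implicit none
  character(len=11) :: form_value     ! 11 = length of 'UNFORMATTED'
  open (10, status='scratch', form='unformatted')
  inquire (10, form=form_value)
  write (*, *) 'FORM=', form_value, ' length ', len_trim(form_value)
  close (10)
end program inquire_form
```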
Steve
It's only LOGICAL
April 2000
Doctor Fortran in "To .EQV. or to .NEQV., that is the question", or "It's only LOGICAL"
By Steve Lionel Visual Fortran Engineering
Most Fortran programmers are familiar with the LOGICAL data type, or at least they think they are.... An object of type LOGICAL has one of only two values, true or false. The language also defines two LOGICAL constant literals .TRUE. and .FALSE., which have the values true and false, respectively. It seems so simple, doesn't it? Yes... and no.
The trouble begins when you start wondering about just what the binary representation of a LOGICAL value is. An object of type "default LOGICAL kind" has the same size as a "default INTEGER kind", which in Visual Fortran (and most current Fortran implementations) is 32 bits. Since true/false could be encoded in just one bit, what do the other 31 do? Which bit pattern(s) represent true, and which represent false? And what bit patterns do .TRUE. and .FALSE. have? On all of these questions, the Fortran standard is silent. Indeed, according to the standard, you shouldn't be able to tell! How is this?
According to the standard, LOGICAL is its own distinct data type unrelated to and not interchangeable with INTEGER. There is a restricted set of operators available for the LOGICAL type which are not defined for any other type: .AND., .OR., .NOT., .EQV. and .NEQV.. Furthermore, there is no implicit conversion defined between LOGICAL and any other type.
"But wait," you cry! "I use .AND. and .OR. on integers all the time!" And so you do - but doing so is non-standard, though it's an almost universal extension in today's compilers, generally implemented as a "bitwise" operation on each bit of the value, and generally harmless. What you really should be using instead is the intrinsics designed for this purpose: IAND, IOR and IEOR.
Not so harmless is another common extension of allowing implicit conversion between LOGICAL and numeric types. This is where you can start getting into trouble due to implementation dependencies on the binary representation of LOGICAL values. For example, if you have:
INTEGER I,J,K
I = J .LT. K
just what is the value of I? The answer is "it depends", and the result may even vary within a single implementation. Compaq Fortran traditionally (since the 1970s, at least) considers LOGICAL values with the least significant bit (LSB) one to be true, and values with the LSB zero to be false. All the other bits are ignored when testing for true/false. Many other Fortran compilers adopt the C definition of zero being false and non-zero being true. (Visual Fortran offers the /fpscomp:logicals switch to select the C method, since PowerStation used it as well.) Either way, the result of the expression J.LT.K can be any value which would test correctly as true/false. For example, the value 1 or 999 would both test as true using Compaq Fortran. Just in case you were wondering, Compaq Fortran uses a binary value of -1 for the literal .TRUE. and 0 for the literal .FALSE..
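Both extensions have standard replacements, sketched below. IAND, IOR and IEOR do the bitwise work on integers, and MERGE is one way (the Doctor's choice here, not the only one) to turn a LOGICAL into an integer whose values you pick yourself. The TRANSFER line is included only to peek at the internal bit pattern; what it prints depends on your compiler and should not be relied upon.

```fortran
! Sketch: standard replacements for .AND./.OR. on integers and for
! implicit LOGICAL-to-INTEGER conversion. Values are illustrative.
program portable_logicals
  implicit none
  integer :: i, j, k
  j = 1
  k = 2
  ! Bitwise work belongs to the integer intrinsics, not .AND./.OR.
  write (*, *) 'IAND(12,10) =', iand(12, 10)   ! 8  (1100 and 1010)
  write (*, *) 'IOR(12,10)  =', ior(12, 10)    ! 14 (1100 or 1010)
  write (*, *) 'IEOR(12,10) =', ieor(12, 10)   ! 6  (1100 xor 1010)
  ! MERGE states which integers stand for true and false
  i = merge(1, 0, j .lt. k)
  write (*, *) 'MERGE gives  ', i              ! 1, since J is less than K
  ! TRANSFER exposes this compiler's internal pattern for true; not portable
  write (*, *) 'raw true is  ', transfer(j .lt. k, i)
end program portable_logicals
```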
The real trouble with making assumptions about the internal value of LOGICALs is when you try testing them for "equality" against another logical expression. The way many Fortran programmers would naturally do this is as follows:
IF (LOGVAL1 .EQ. LOGVAL2) ...
but the results of this can vary depending on the internal representation. The Fortran language defines two operators exclusively for use on logical values, .EQV. ("equivalent to") and .NEQV. ("not equivalent to"). So the above test would be properly written as:
IF (LOGVAL1 .EQV. LOGVAL2) ...
In the Doctor's experience, not too many Fortran programmers use .EQV. and .NEQV. where they should, and get into trouble when porting software to other environments. Get in the habit of using the correct operators on LOGICAL values, and you'll avoid being snared by implementation differences.
However, there is one aspect of these operators you need to be aware of... A customer recently sent us a program that contained the following statement:
DO WHILE (K .LE. 2 .AND. FOUND .EQV. .FALSE.)
The complaint was that the compiler "generated bad code." What the programmer didn't realize was that the operators .EQV. and .NEQV. have lower precedence than any of the other predefined logical operators. This meant that the statement was treated as if it had been:
DO WHILE (((K .LE. 2) .AND. FOUND) .EQV. .FALSE.)
What was wanted instead was:
DO WHILE ((K .LE. 2) .AND. (FOUND .EQV. .FALSE.))
The Doctor's prescription here is to always use parentheses! That way you'll be sure that the compiler interprets the expression the way you meant it to! (And you therefore don't have to learn the operator precedence table you can find in chapter 4 of the Compaq Fortran Language Reference Manual!)
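To see the difference for yourself, here is a sketch that evaluates the customer's condition both ways. The values chosen for K and FOUND are made up so that the two readings disagree.

```fortran
! Sketch: how .EQV. precedence changes the customer's condition.
! K = 3 and FOUND = .FALSE. are illustrative values.
program eqv_precedence
  implicit none
  integer :: k
  logical :: found
  k = 3
  found = .false.
  ! Parsed as ((K .LE. 2) .AND. FOUND) .EQV. .FALSE., which is true here
  write (*, *) 'unparenthesized: ', k .le. 2 .and. found .eqv. .false.
  ! What the programmer meant, which is false here because K exceeds 2
  write (*, *) 'parenthesized:   ', (k .le. 2) .and. (found .eqv. .false.)
  ! Simpler still: test the logical directly with .NOT.
  write (*, *) 'with .NOT.:      ', (k .le. 2) .and. .not. found
end program eqv_precedence
```

With these values the unparenthesized form is true and the other two are false, so the original loop would have kept running when the programmer expected it to stop.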
Steve
December 2000
Don't Touch Me There - What error 157 (Access Violation) is trying to tell you
Steve Lionel - Compaq Fortran Engineering
One of the more obscure error messages you can get at run time is Access Violation, which the Visual Fortran run-time library reports as error number 157. The documentation says that it is a "system error," meaning that it is detected by the operating system, but many users think they're being told that their system itself has a problem. In this article, I'll explain what an access violation is, what programming mistakes can cause it to occur, and how to resolve them.
Windows (the 95/98/Me/NT/2000 varieties) is a 32-bit virtual memory operating system. The "32 bit" means that a memory address is 32 bits in size, potentially having over four billion possible addresses. "Virtual memory" means that not every memory address in use corresponds 1-to-1 with a physical memory address - some may be "resident" in RAM and others "paged out" to a disk file. The other important aspect of virtual memory is that only those address ranges currently being used exist at all - others are not represented. It's like a telephone book, which has pages for only those names of people who live in the city. If a phone book had to include a space for every possible name, every city and town's phone book would fill rooms!
When your program starts to run, Windows allocates (creates) just enough virtual memory to hold the static (fixed) code and data in the executable. As the program runs, it may ask to allocate additional memory, for example, through calls to ALLOCATE or malloc, either directly by your code or indirectly by the run-time library. Each allocation creates a new range of now-valid virtual addresses which didn't exist before. When the program ends, Windows automatically deallocates all the virtual memory the program used.
Since not every possible 32-bit value represents a currently valid address, what happens if you try to access (read from or write to) an invalid address? Yes, that's right, you get an "Access Violation"! Probably the most common address involved in an access violation is zero. Because a zero address is typically reserved as meaning "not defined", Windows (and most operating systems) deliberately leaves unallocated the first group of addresses (page) starting at zero. This means that an attempt to access through an uninitialized address will result in an error. You can also get an access violation trying to access memory with a non-zero address when that memory's address range hasn't yet been allocated.
Common causes of the "invalid address" type of access violation are:
- Mismatches in argument lists, so that data is treated as an address
- Out of bounds array references
- Mismatches in C vs. STDCALL calling mechanisms, causing the stack to become corrupted
- References to unallocated pointers
Another type of access violation is where the address space exists but is protected. Usually, the address space in question is set up as "read only," so an attempt to write to it will result in an access violation. In Visual Fortran, the most common cause of this is passing a constant as an argument to a routine that then tries to modify the argument. Visual Fortran, as of version 6, asks the linker to put constants in a read-only address space. Windows NT/2000 honors this, so trying to modify a constant gets an error, but Windows 95/98 (not sure of Me) does not, so the modification is allowed. This is why some programs that run on Windows 9x don't on NT/2000. (It is a violation of the standard to modify a constant argument.)
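The constant-argument mistake is easy to sketch. In the example below the routines BUMP and SHOW are invented for illustration, and the offending call is left commented out. Because the routines are internal procedures with declared INTENT, a compiler can see the interface and should reject the bad call at compile time instead of leaving it to fail at run time.

```fortran
! Sketch: modifying an argument is fine for a variable, but not for a
! constant. BUMP and SHOW are made-up names for illustration.
program constant_argument
  implicit none
  integer :: n
  n = 1
  call bump(n)              ! fine: N is a variable and may be modified
  write (*, *) 'n is now', n
  ! call bump(1)            ! non-conforming: BUMP would store into a constant
  call show(1)              ! fine: SHOW only reads its argument
contains
  subroutine bump(count)
    integer, intent(inout) :: count
    count = count + 1
  end subroutine bump
  subroutine show(count)
    integer, intent(in) :: count     ! INTENT(IN) documents read-only use
    write (*, *) 'count is', count
  end subroutine show
end program constant_argument
```

With old-style external routines and no explicit interface, the compiler generally cannot see the mismatch, which is how the run-time access violation described above comes about.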
If you are running your application in the debugger, the debugger will stop at the point of the access violation. You may need to use the Context menu in the debugger to look at the statements of a caller of the code where the error occurred, but this can usually give you a good idea of what might be wrong. Compare argument lists carefully and look for the mistake of trying to modify a constant. Rebuild with bounds and argument checking enabled, if it's not already on (it is by default in Debug configurations created with V6 and later).
So now you know that when you see "Access Violation", Windows is trying to tell you "Don't Touch Me There".
Note from Steve - As of December 2005, Intel Fortran does not put constants in read-only image sections. That will be enabled in an update due in January 2006. Current versions of the compiler do support the
/assume:noprotect_constants
switch which tells the compiler to pass constants in a stack temporary so that the called procedure can safely store to it, with the changes being discarded on return.
Steve
Doctor Fortran and the Virtues of Omission
December 2000
Doctor Fortran and the Virtues of Omission
Steve Lionel - Compaq Fortran Engineering
As I was walking up the stair
I met a man who wasn't there.
He wasn't there again today.
I wish, I wish he'd stay away.
Hughes Mearns (1875-1965)
Up through Fortran 77, there was no Fortran standard-conforming way of calling a routine with a different number of arguments than it was declared as having. This didn't stop people from omitting arguments, but whether or not it worked was highly implementation and argument dependent. For example, you can often get away with omitting numeric scalar arguments but not CHARACTER or arguments used in adjustable array bounds expressions, as code in the called routine's "prologue" tries to reference the missing data, often resulting in an access violation (see Don't Touch Me There.)
Fortran 90 introduced the concept of optional arguments and a standard-conforming way of omitting said optional arguments. Many users eagerly seized upon this and started using the new feature, but soon got tripped up and confused because they didn't follow all of the rules the standard lays out. The Doctor is here to help.
First things first. To be able to omit an argument when calling a routine, the dummy argument in the called routine must be given the OPTIONAL attribute. For example:
SUBROUTINE WHICH (A,B)
INTEGER, INTENT(OUT) :: A
INTEGER, INTENT(IN), OPTIONAL :: B
...
If an argument has the OPTIONAL attribute, you can test for its presence with the PRESENT intrinsic. The standard prohibits you from accessing an omitted argument, so use PRESENT to test to see if the argument is present before touching it. That part is simple.
The part that people tend to miss, though, is that the use of OPTIONAL arguments means that an explicit interface for the routine is required to be visible to the caller. Generally, this means an INTERFACE block (which must match the actual routine's declaration), but this rule is satisfied if you are calling a CONTAINed procedure. If you don't have an explicit interface, the compiler doesn't know that it has to pass an "I'm not here" value (usually an address of zero) for the argument being omitted, and you could get an access violation or wrong results.
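As a rough illustration of the two rules so far, here is a sketch that fleshes out the WHICH example. The body of WHICH and the fallback value of -1 are assumptions made up for this example. Because WHICH is CONTAINed in a module, any caller that USEs the module sees an explicit interface automatically:

```fortran
module which_mod
  implicit none
contains
  subroutine which(a, b)
    integer, intent(out) :: a
    integer, intent(in), optional :: b
    ! Touch B only after PRESENT confirms it was supplied.
    if (present(b)) then
      a = b
    else
      a = -1        ! illustrative fallback when B is omitted
    end if
  end subroutine which
end module which_mod

program optional_demo
  use which_mod
  implicit none
  integer :: r

  call which(r)            ! B omitted
  print *, r               ! -1
  call which(r, 42)        ! B supplied positionally
  print *, r               ! 42
  call which(a=r, b=7)     ! B supplied by keyword
  print *, r               ! 7
end program optional_demo
```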
An interesting aspect of OPTIONAL arguments is that it's ok to pass an omitted argument to another routine (which declares the argument as OPTIONAL) without first checking to see if it is PRESENT. The "omitted-ness" is passed along and can be tested by the other routine. What's even more interesting is that the standard allows you to pull this trick on intrinsics such as MAX, PRODUCT, etc.!
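The pass-along behavior can be sketched as follows. The names passthru_mod, outer, inner and scale are hypothetical. OUTER hands its optional argument straight to INNER without testing it, and INNER is the one that calls PRESENT:

```fortran
module passthru_mod
  implicit none
contains
  ! Returns SCALE if it was supplied, otherwise 1.
  function inner(scale) result(s)
    integer, intent(in), optional :: scale
    integer :: s
    s = 1
    if (present(scale)) s = scale
  end function inner

  ! Passes SCALE along untested; its "omitted-ness" travels with it.
  function outer(x, scale) result(y)
    integer, intent(in) :: x
    integer, intent(in), optional :: scale
    integer :: y
    y = x * inner(scale)
  end function outer
end module passthru_mod

program passthru_demo
  use passthru_mod
  implicit none
  print *, outer(5)        ! 5  (scale omitted all the way down)
  print *, outer(5, 3)     ! 15
end program passthru_demo
```

Note that OUTER may pass SCALE along but still may not use its value itself (for example, in an expression) unless it first checks PRESENT.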
There are some additional aspects of optional arguments, such as the use of keyword names in argument lists, that are worth learning about. For more information, see the section "Optional Arguments" in the Language Reference Manual. Another very important reference is the section "Determining When Procedures Require Explicit Interfaces" in the same manual. The Doctor highly recommends this for your reading pleasure. There will be a quiz next week (just kidding!).
Steve
The Perils of Real Numbers (Part I)
April 2001
The Perils of Real Numbers (Part 1)
Dave Eklund
Compaq Fortran Engineering
One of Fortran's greatest strengths is its ability to manipulate
real numbers. It is astonishing, however, that many Fortran
programmers lack even a rudimentary understanding of them.
In this series, perhaps we can acquire a better understanding
and, at the very least, see how some of the "experts"
deal with problems.
Let's begin by asking the simple question, "Which real
numbers can be represented EXACTLY?" If I gave you a
number, how would you find out if the number actually had
a precise representation on any given machine?! In what follows
I am going to ALWAYS use a decimal point (.) when I am discussing
real (floating point) numbers, and I will NEVER use a decimal
point when discussing integers.
So the following would be integers:
17
150
-12
0
1000000000000000000
and the following would be reals:
1.0
-12.5
.1234
0.567
-7.00
3.14159265
0.000030517578125
1000000000000000000.
Which of the above do you believe are EXACTLY representable
as integers or as reals? Why?
Think POWERS OF TWO. If we start with the positive whole
numbers, what we find is that both integers and real numbers
are internally represented as sums of powers of 2. Now integers
are easier to look at, and real numbers do have a pesky exponent
field that needs to be considered, but an integer or real
like "9" is the sum of 8 and 1, both of which are
powers of 2 (2**3 and 2**0 respectively). The main exceptional
value is zero. If you view zero as a power of two, perhaps
it's time to increase your medication...
Now negative numbers are somewhat different. For integers,
we are talking two's complement arithmetic (normally), but
for real numbers we just turn on the sign bit in the real
number, which the hardware designers so nicely provided. That's
all well and good for whole numbers, but how about decimal
fractions like .5 and .25? Well, continue to think POWERS
OF TWO. Only now it's the negative powers of two. So, for
example, .5 is 2.**(-1) and .25 is 2.**(-2) and so on. In
point of fact .5 and .25 look identical as far as the "fraction"
part of each real number is concerned, and only the exponent
changes! As powers of two, both ARE exactly representable.
When you REALLY need to look at numbers, there are several
formats that we find useful (alphabetically):
B   Binary
E   Real values with E exponents
F   Real values with no exponent
I   Integer values
O   Octal values
Z   Hexadecimal values
My personal favorites tend to be F and Z. So let's take an
up-close and personal look at some whole numbers first and
then some fractions.
Try the following program (printing small whole numbers,
both as integers and as reals):
integer, parameter :: lower = 0
integer, parameter :: upper = 8
do i = lower, upper
type 1, i, i, i
1 format (' integer: ', i, 1x, b, 1x, z)
enddo
do i = lower, upper
x = float(i)
type 2, x, x, x
2 format (' Real: ', f, 1x, b33.32, 1x, z12.8)
enddo
end
It produces:
integer: 0 0 0
integer: 1 1 1
integer: 2 10 2
integer: 3 11 3
integer: 4 100 4
integer: 5 101 5
integer: 6 110 6
integer: 7 111 7
integer: 8 1000 8
Real: 0.0000000 00000000000000000000000000000000 00000000
Real: 1.0000000 00111111100000000000000000000000 3F800000
Real: 2.0000000 01000000000000000000000000000000 40000000
Real: 3.0000000 01000000010000000000000000000000 40400000
Real: 4.0000000 01000000100000000000000000000000 40800000
Real: 5.0000000 01000000101000000000000000000000 40A00000
Real: 6.0000000 01000000110000000000000000000000 40C00000
Real: 7.0000000 01000000111000000000000000000000 40E00000
Real: 8.0000000 01000001000000000000000000000000 41000000
The integers form a nice progression of bits (look at the
"b" formatted column). If we look at the reals,
using "b" or "z" format, we see a similar
pattern. Notice that zero is the same for both integer and
real (although for a real we CAN represent -0.0). Look at
2.0000000. There is only a single bit set! And it's way up
in the exponent field. How can this be?
Normally we would observe that a real number (IEEE) comprises
a sign (high, left) bit, an exponent (8 bits for single precision
-- real), and a fraction (the remaining, rightmost 23 bits).
When we "normalize" any real number, the fraction
gets shifted so that the high bit is "1" and the
exponent adjusted accordingly. But if the high bit is always
"1", we can elect to just discard it to save space
(and add precision), and generally this is done. So the fraction
is really the rightmost 23 bits PLUS a "hidden"
bit of 1. For a number like 2.0, which is exactly 2.**1, the
fraction is 10000000000000000000000 before we toss the hidden
bit and, hence, is 00000000000000000000000 afterwards! If
you look carefully, you will observe that 2.0, 4.0, and 8.0
all have a zero fraction (rightmost 23 bits). But 3.0 whose
fraction starts out as 11000000000000000000000 becomes 10000000000000000000000
after dropping the (high) hidden bit. And, of course, there
are also appropriate exponent bits to the far left (perhaps
discussed in more detail in a later article).
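To make the three fields easier to see, here is a small sketch that pulls them apart with the standard TRANSFER and IBITS intrinsics. The module and function names are made up for illustration, and the code assumes a 32-bit IEEE single-precision REAL (selected here with the REAL32 kind):

```fortran
module ieee_fields
  use, intrinsic :: iso_fortran_env, only: int32, real32
  implicit none
contains
  ! Sign: the single high (leftmost) bit, bit 31.
  integer function sign_bit(x)
    real(real32), intent(in) :: x
    sign_bit = ibits(transfer(x, 0_int32), 31, 1)
  end function sign_bit

  ! Exponent: the next 8 bits, returned exactly as stored.
  integer function stored_exponent(x)
    real(real32), intent(in) :: x
    stored_exponent = ibits(transfer(x, 0_int32), 23, 8)
  end function stored_exponent

  ! Fraction: the rightmost 23 bits (the hidden bit is not stored).
  integer(int32) function fraction_bits(x)
    real(real32), intent(in) :: x
    fraction_bits = ibits(transfer(x, 0_int32), 0, 23)
  end function fraction_bits
end module ieee_fields

program decode_real
  use ieee_fields
  implicit none
  real(real32) :: x

  x = 3.0_real32
  print '(a,i0)',     'sign     = ', sign_bit(x)         ! 0
  print '(a,i0)',     'exponent = ', stored_exponent(x)  ! 128
  print '(a,b23.23)', 'fraction = ', fraction_bits(x)    ! 10000000000000000000000
end program decode_real
```

For 3.0 (hexadecimal 40400000) the stored fraction is a 1 followed by 22 zeros, matching the description above. The stored exponent of 128 reflects the IEEE single-precision bias of 127, so it stands for 2**1.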
Notice that in these real numbers there are quite a few zeros
in the fraction (rightmost 23 bits). ALL small integer values
will look like this! For example, let's take 42, which as
an integer in binary is 101010 (2**5 + 2**3 + 2**1). The fraction
before tossing the hidden bit would be 10101000000000000000000
and afterwards is just 01010000000000000000000, so there are
lots of zeros (still) to the right. This is a good indication
that we are dealing with an "exact" value (not proof,
but it happens a lot).
Let's try another program to look at the small negative powers
of 2:
integer, parameter :: lower = 0
integer, parameter :: upper = 10
x = 1.
do i = lower, upper
x = x/2.0
type 2, x, x, x
2 format(' Real: ', f25.20, 1x, b, 1x, z)
enddo
end
Notice that we used a very "wide" format -- f25.20
so that we can get a better look at the "full" result
(all of the nonzero digits). This is a VERY useful trick...
The result is:
Real: 0.50000000000000000000 111111000000000000000000000000 3F000000
Real: 0.25000000000000000000 111110100000000000000000000000 3E800000
Real: 0.12500000000000000000 111110000000000000000000000000 3E000000
Real: 0.06250000000000000000 111101100000000000000000000000 3D800000
Real: 0.03125000000000000000 111101000000000000000000000000 3D000000
Real: 0.01562500000000000000 111100100000000000000000000000 3C800000
Real: 0.00781250000000000000 111100000000000000000000000000 3C000000
Real: 0.00390625000000000000 111011100000000000000000000000 3B800000
Real: 0.00195312500000000000 111011000000000000000000000000 3B000000
Real: 0.00097656250000000000 111010100000000000000000000000 3A800000
Real: 0.00048828125000000000 111010000000000000000000000000 3A000000
So these are the first few negative powers of two. Just like
the positive powers from the first example, these all have
a zero fraction (after tossing the hidden bit). Notice that
the actual values in f format all end in "5". And
the 5 keeps moving to the next column. This means that ANY
fractional sum will also end in 5. The consequence is that
if you provide a fraction whose last nonzero digit is NOT
5 (like 0.000276000000) it CANNOT be exactly represented as
the sum of any negative powers of two! This is a VERY important
point. You say, "So what." Well, this means that
lots of "common" numbers are not exactly representable,
like 0.10000000 and 0.200000000000000, although 0.500 IS exactly
representable. And while some fractions ending in 5 CAN be
represented, many cannot. Consider 5.0 divided by powers of
10.:
do i = 1,10
x = 5.0/(10.**i)
type 1, x, x
1 format (1x, f40.30, 1x, b)
enddo
end
which produces:
0.500000000000000000000000000000 111111000000000000000000000000
0.050000000745058059692382812500 111101010011001100110011001101
0.004999999888241291046142578125 111011101000111101011100001010
0.000500000023748725652694702148 111010000000110001001001101111
0.000049999998736893758177757263 111000010100011011011100010111
0.000004999999873689375817775726 110110101001111100010110101100
0.000000499999998737621353939176 110101000001100011011110111101
0.000000050000000584304871154018 110011010101101011111110010101
0.000000004999999969612645145389 110001101010111100110001110111
0.000000000499999985859034268287 110000000010010111000001011111
Hey, only that first one is EXACT! Notice that the others,
while "close" to .05, .005, .0005, etc., are not
EXACTLY .05, .005, .0005, etc. Some are a little bigger, some
smaller (popularly called "nines disease"). In fact,
with the exception of 0.500, all the others CANNOT be exactly
represented as sums of powers of 2! Observe, however, that
only when a wide format is used is this apparent. With a smaller
format width, most of these will look just fine due to rounding!
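Another quick check, offered as a sketch rather than a proof, is to compare the single-precision and double-precision conversions of the same literal. If the value were exactly representable in single precision, both conversions would produce the same number, so a mismatch shows that the single-precision value is only an approximation. This assumes the default REAL is 32 bits (that is, the program is not built with /real_size:64):

```fortran
program exact_check
  implicit none
  integer, parameter :: dp = kind(1.0d0)

  ! Left side: literal converted to single, then widened to double.
  ! Right side: the same literal converted directly to double.
  print *, real(0.5,  dp) == 0.5_dp    ! T: 0.5 is a power of two
  print *, real(0.1,  dp) == 0.1_dp    ! F: 0.1 is not exact in single
  print *, real(0.05, dp) == 0.05_dp   ! F: ends in 5, but still not exact
end program exact_check
```

A result of T does not by itself prove exactness, so the wide-format trick shown above remains the more direct way to look at the stored value.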
We are finally at the point where we can decide which numbers
are EXACTLY representable:
- 1.0: Yes (any small integer is fine)
- -12.5: Yes, small integer + negative power of 2 (.5)
- .1234: No, last fractional digit is not 5
- 0.567: No, last fractional digit is not 5
- -7.00: Yes, small integer
- 3.14159265: Cannot easily tell (last fractional digit is 5) [Is actually NOT representable]
- 0.000030517578125: Cannot easily tell (last fractional digit is 5) [IS actually representable]
- 1000000000000000000.: Maybe (small integer, for some value of "small")
To decide the last 3 values, just try the following:
type 1, 3.14159265
type 1, 0.000030517578125
type 1, 1000000000000000000.
type 1, 1000000000000000000.D0
1 format(f60.30)
end
and observe:
3.141592741012573242187500000000
0.000030517578125000000000000000
999999984306749440.000000000000000000000000000000
1000000000000000000.000000000000000000000000000000
We see that the closest representable real number to 3.14159265
is actually 3.14159274...; 0.000030517578125 CAN be represented
exactly (it is, in fact, a power of 2); and while 1000000000000000000.
cannot be represented as a real number (can you explain this
more precisely?), it CAN be represented as a double-precision
number (more than twice as many fraction bits). Once again,
notice the use of an even wider format to help get a better
look at the numbers! Keep in mind that for a statement like:
type 1, 3.14159265
the Fortran compiler and runtime library will do a "double
conversion." The compiler will convert the string 3.14159265
into a real value, and the runtime system will then convert
back to a string (under format control) to produce 3.141592741012573242187500000000.
Neither of these conversions is easy, but thankfully the Fortran
compiler and runtime library perform all of this heavy lifting!
As a quiz for next time, consider the following program:
i = 1000000013
x = i
type 1, i, x
1 format(1x,i,1x,f20.5)
end
It gives:
1000000013 1000000000.00000
Try to figure out where the "unlucky 13" went!?
Why does it come back if we use /real_size:64? Look for the
answers in Part II of this article in a future newsletter
issue.
Steve
Re: Visual Fortran Newsletter Articles
April 2001
Win32 Corner - ShellExecute
Steve Lionel Visual Fortran Engineering
Win32 Corner is a new feature of the newsletter that illustrates how to use Win32 API routines to do commonly requested tasks.
The ShellExecute API routine is handy for opening a web page, or any document using its natural editing tool. It's equivalent to right clicking on a file and selecting Open - or you can also choose the default action (whatever is listed first), Print or Edit. I've found it most useful for opening a web page with the user's default browser.
Open shellexecute.f90 (attached) and reference the numbered comments (!!1, etc.) below:
1. ShellExecute is part of the Shell API and is defined in module SHELL32. You could also USE IFWIN.
2. The hWnd argument is the handle of the owner's window. In a Console Application, NULL is the thing to use, but in a Windows Application you might want the main window, and in QuickWin, use GETHWNDQQ(QWIN$FRAMEWINDOW).
3. lpOperation (referred to as lpVerb in newer versions of the MS documentation) is a C-string that says what to do with the file. "open" is what you'll want most often, but you could also specify "edit" or "print". If the argument is null, then the "default action" is used.
4. lpFile is the thing we want to open. It could be an ordinary file, or a URL. Note the NUL-termination to make it a C-string.
5. If we were opening (running) an executable file, command parameters would go here.
6. You can specify a default directory if you want. NULL_CHARACTER passes the equivalent of a C NULL here.
7. nShowCmd specifies how you want the window to appear. SW_SHOWNORMAL is the standard behavior, but you could also specify minimized, maximized and whether or not to hide the active window.
8. If ShellExecute returns a value greater than 32, it succeeded; otherwise an error occurred. Note that ShellExecute returns immediately - it does not wait for the opened application to finish.
Try building and running shellexecute.f90 as a "Fortran Console Application". Enter a favorite URL, such as http://www.intel.com/, or the path to a file on your system, then watch it open!
For more information on ShellExecute, look it up in the Visual MSDN index.
Steve