Doctor Fortran in "It Takes All KINDs"

So this is retirement? As I noted earlier, I may no longer be working for Intel, but I do intend to stay active in the Fortran and Intel development communities. While I am back in the forum answering questions, it is liberating knowing that I am not responsible for making sure every question (and bug report) gets answered. I recently learned a wonderfully appropriate Polish saying, "Nie mój cyrk, nie moje małpy", which translates literally to "Not my circus, not my monkeys", or more colloquially, "not my problem". I plan to apply this a lot.

As has often been the case in Doctor Fortran posts, I get the ideas for these by seeing what sorts of problems and questions often arise about Fortran. Recently there was an extended thread in the comp.lang.fortran newsgroup about Fortran KIND values, started by someone learning Fortran who had a basic misunderstanding of how they worked. Most of the advice in the thread was good, but there were some folk who, while very knowledgeable and whose expertise I admire, offered a suggestion I thought flat-out wrong. Thus begat this post. But first, some history is needed.

In the beginning, there was FORTRAN - FORmula TRANslator for the IBM 704. The original FORTRAN, as well as its successor FORTRAN II, was almost entirely lacking in the ability to specify what kind of numbers you were calculating with. The 704 had "fixed point" registers, which we know as integers, and "floating point" registers (reals). Just one size of each. The early FORTRAN language didn't even have a way to explicitly declare whether a variable (or in the case of FORTRAN II, a function) was real or integer - that was indicated by the choice of first letter of the variable's name. Yes, this is where the implicit typing rule of "letters I through N being integer, everything else being real" came from. (FORTRAN II also gave meaning to the last letter of a name being F, which made it a function, and to the first letter being X, which made the function integer.)

Next came FORTRAN IV, also known as FORTRAN 66. F66 introduced the notion of explicit typing through the REAL and INTEGER declaration statements. (F66 also introduced COMPLEX and LOGICAL, concepts that had not been in the earlier languages.) While there was still only one kind of integer (and logical or complex), there were two kinds of reals, reflecting newer computer architecture. The larger one was called DOUBLE PRECISION. In addition to being able to declare variables as either REAL or DOUBLE PRECISION, you could indicate which kind a real constant was by the use of an E exponent letter for REAL or a D exponent letter for DOUBLE PRECISION - for example, 3.1415926535897D0.

What the language didn't specify, and really still doesn't, was the precision, range and in-memory format of integer and real values. Different computer architectures had different designs for these. Register sizes could be 8, 12, 16, 30, 32, 36, 48, 60 or 64 bits, some CDC machines had ones-complement integers (allowing minus zero for an integer!), the PDP-11 line had several different floating point formats depending on the series, the IBM 360/370 had "hex normalization" for its floating point, etc. As many computers offered a choice of integer and real sizes, programmers naturally wanted to be able to specify these in their programs. Thus came about the common extension of adding *n to the INTEGER or REAL keyword where n was the size in bytes of the value, for example, REAL*8 or INTEGER*2. So popular were these extensions that many programmers thought (and even today many think) that this syntax is standard Fortran; it isn't!

Next we come to FORTRAN 77, but nothing really changed here in the way that variables were typed. F77 did add CHARACTER (with the *n syntax), but that's it for typing. This was the era of the shift of the mainstream computer architectures to 32-bit (from smaller or larger sizes). Since FORTRAN (still in upper case) still assumed only one, unspecified size of integer (and two unspecified sizes of real), and the default sizes changed as architectures did, the use of syntax such as INTEGER*2 (as well as compiler options such as -i8) became even more entrenched, making code portability problematic.

Fortran 90 (mixed-case!) finally addressed the problem of different kinds of integer and real (also logical, complex and character, but I'm not going to spend much time on those). Recognizing that the widely-used *n syntax served only to specify size in bytes and not any other properties, the language adopted the notion of "kind numbers" that identified a particular variation of a type but did not specify it directly. Kind numbers were to be positive integers, but other than that an implementation was free to choose whatever set of kind numbers it wanted. The syntax for using these in declarations was the "kind type parameter" specified by the optional keyword KIND, for example: INTEGER(KIND=3) or REAL(17). For literals, the kind follows an underscore and can be an integer literal or a named (PARAMETER) constant, for example: 486_4 or 3.1415926535897_DP.

Now that Fortran admitted that there could be multiple kinds for a type, it introduced the concept of "default" kinds (default integer, default real, etc.). The default kinds were implementation dependent, though up through Fortran 2008 a compiler was required to support only one integer kind and two real kinds. (That's still true in Fortran 2015, but there's an added requirement that at least one integer kind support 18 decimal digits.) If you write a literal constant without a kind specifier, you get the default kind. Since FORTRAN 77 didn't have this concept, many compilers gave you "free" extra precision or range if you typed a constant with more digits than the default could hold. For example, one might see:

DOUBLE PRECISION PI
PI = 3.1415926535897

and the compiler would evaluate the constant as if it were double precision. Fortran 90 put a stop to that, though - the number here is default real (typically "single precision") and you might end up with a much less accurate value. If the programmer had added D0 at the end, that would have been ok, but the Fortran 90 way is to use kind specifiers.
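To see the effect for yourself, here is a minimal sketch that assigns the same digits to three double precision variables using three different literal forms. The program and variable names are my own, and dp is an illustrative named constant obtained with KIND(0.0D0). The size of the error depends on what your compiler uses for default real; where that is IEEE single precision, expect the first value to be off by something on the order of 1E-7.

```fortran
program literal_kinds
  implicit none
  ! dp is an illustrative name for the kind of double precision
  integer, parameter :: dp = kind(0.0d0)
  double precision :: pi_default, pi_d_exponent, pi_kind_suffix

  ! No exponent letter or kind suffix: this literal is default real,
  ! so it is rounded to default real before being assigned.
  pi_default = 3.1415926535897

  ! D exponent letter: a double precision literal (the older spelling)
  pi_d_exponent = 3.1415926535897D0

  ! Kind suffix: the Fortran 90 spelling of the same thing
  pi_kind_suffix = 3.1415926535897_dp

  print *, 'default real literal: ', pi_default
  print *, 'D0 literal:           ', pi_d_exponent
  print *, 'kind-suffixed literal:', pi_kind_suffix
  print *, 'error from default:   ', abs(pi_d_exponent - pi_default)
end program literal_kinds
```

Some compilers can warn about the first assignment if you ask them to, which is worth turning on when modernizing old code.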

While most vendors decided to map the old *n syntax to kind numbers, so that real kinds of 4 and 8 were common, not all did. One popular implementation chose sequential kind numbers starting at 1 (1,2,3, etc.). Obviously, using explicit kind numbers this way was still not portable. The language provided two intrinsic functions, SELECTED_INT_KIND and SELECTED_REAL_KIND, to allow you to, finally, specify what characteristics you wanted from your numeric type. Instead of saying "I want a 32-bit integer", you would instead ask for an integer kind that can represent values of n significant decimal digits, or a real with a particular minimum precision or range. The recommended way of using these is to define, perhaps in a module, named constants for the various kinds you want to use, and then use these constants in declarations and constants. For example:

integer, parameter :: k_small_int = SELECTED_INT_KIND(5) ! 5 decimal digits
integer, parameter :: k_big_int = SELECTED_INT_KIND(12)
integer, parameter :: k_extra_real = SELECTED_REAL_KIND (10,200) ! 10 digits, 10**200 range
integer(k_small_int) :: tiny_int
real(k_extra_real) :: lotsa_digits = 2.780486193E98_k_extra_real

Doing it this way makes you completely independent of the compiler's scheme for assigning kind numbers and makes sure that you get the precision and range your application needs, in a portable fashion. Notice there's no mention of bits or bytes here, but there is also no mention of the underlying capabilities of the type. What if you specifically wanted an IEEE floating type? There are architectures where both IEEE and non-IEEE floating types are available, such as the HP (formerly Compaq formerly DEC) Alpha. In this case you can use IEEE_SELECTED_REAL_KIND from intrinsic module IEEE_ARITHMETIC to get an IEEE floating kind. And what if there is no supported kind that meets the requirements? In that case the intrinsics return a negative number which will (usually, depending on context) trigger a compile-time error.
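Putting those pieces together, here is a sketch of the "kinds in a module" approach, with an IEEE request and a deliberately unreasonable request added. All of the names (kinds_sketch, k_ieee_real, k_wishful and so on) are my own inventions for illustration. I am assuming a compiler that has a kind satisfying the first three requests, which is true of the mainstream ones; k_wishful is only printed and never used in a declaration, so the program still compiles where no such kind exists.

```fortran
module kinds_sketch
  use, intrinsic :: ieee_arithmetic, only: ieee_selected_real_kind
  implicit none
  private
  public :: k_small_int, k_extra_real, k_ieee_real, k_wishful

  ! At least 5 decimal digits of integer range
  integer, parameter :: k_small_int = selected_int_kind(5)
  ! At least 10 digits of precision and a decimal exponent range of 200
  integer, parameter :: k_extra_real = selected_real_kind(10, 200)
  ! Same requirements, but the kind must be an IEEE format
  integer, parameter :: k_ieee_real = ieee_selected_real_kind(10, 200)
  ! 100 digits of precision: expected to be unavailable on most compilers
  integer, parameter :: k_wishful = selected_real_kind(100)
end module kinds_sketch

program use_kinds
  use kinds_sketch
  implicit none
  integer(k_small_int) :: n_items
  real(k_extra_real)   :: x
  real(k_ieee_real)    :: y

  n_items = 12345_k_small_int
  x = 2.780486193E98_k_extra_real
  y = 2.780486193E98_k_ieee_real

  print *, 'n_items =', n_items, ' digits available:', range(n_items)
  print *, 'x =', x, ' precision:', precision(x), ' range:', range(x)
  print *, 'y =', y, ' precision:', precision(y), ' range:', range(y)

  ! A negative value means no kind met the request. Using it in a
  ! declaration would be rejected at compile time.
  if (k_wishful < 0) then
    print *, 'no real kind with 100 digits; intrinsic returned', k_wishful
  else
    print *, 'kind with 100 digits is', k_wishful
  end if
end program use_kinds
```

The rest of the application then uses only the named constants, so changing a requirement means editing one line in the module.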

If you want to ask what the kind of some entity is, the KIND intrinsic returns it. So if you want the kind of double precision, say KIND(0.0D0).
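A small sketch of KIND in use, alongside the PRECISION and RANGE inquiry intrinsics that tell you what a kind actually gives you. The names are illustrative, and the kind numbers printed will differ between compilers, which is rather the point.

```fortran
program ask_kind
  implicit none
  ! dp is an illustrative name; its value is whatever this compiler
  ! uses as the kind number for double precision.
  integer, parameter :: dp = kind(0.0d0)
  real(dp) :: x

  x = 1.0_dp / 3.0_dp

  print *, 'default integer kind: ', kind(0)
  print *, 'default real kind:    ', kind(0.0)
  print *, 'double precision kind:', kind(0.0d0)
  print *, 'kind of x:            ', kind(x)
  print *, 'x has', precision(x), 'decimal digits and exponent range', range(x)
end program ask_kind
```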

There is one interesting aspect of the new kind numbers that trips up some programmers, and that relates to the COMPLEX type. If you wanted double precision complex (note that DOUBLE COMPLEX was never standard), you might write COMPLEX*16 to indicate a COMPLEX that occupies 16 bytes. But in a Fortran 90 (or later) compiler where, just for illustration's sake, the kind of double precision was 8, you would use COMPLEX(KIND=8) instead, because the kind of a complex value is the kind of its real and imaginary parts, not a count of the bytes the whole value occupies. (Or really, you would use some PARAMETER constant with the proper double precision kind value.)
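Here is a sketch showing that a complex variable and its parts report the same kind, while the storage of the whole is twice that of one part. The names are illustrative, and STORAGE_SIZE is a Fortran 2008 intrinsic, so I am assuming a reasonably current compiler.

```fortran
program complex_kind
  implicit none
  ! dp is an illustrative name for the kind of double precision
  integer, parameter :: dp = kind(0.0d0)
  complex(kind=dp) :: z

  z = (1.0_dp, -2.0_dp)

  ! The kind of a complex is the kind of each of its two parts
  print *, 'kind of z:              ', kind(z)
  print *, 'kind of imaginary part: ', kind(aimag(z))

  ! The whole value takes twice the storage of one part
  print *, 'bits in z:              ', storage_size(z)
  print *, 'bits in imaginary part: ', storage_size(aimag(z))
end program complex_kind
```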

A brief mention of the C interoperability features: intrinsic module ISO_C_BINDING declares constants for Fortran types that are interoperable with C types, for example C_FLOAT and C_INT. Use these if you're declaring variables and interfaces interoperable with C.
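As a sketch of what that looks like, the following declares an interface to the C library function abs (prototype int abs(int)) and calls it. The module name and the Fortran-side name c_abs are my own; the only things taken from ISO_C_BINDING are the C_INT and C_FLOAT kind constants. I am assuming the C runtime library is linked by default, as it is with the compilers I know of.

```fortran
module c_abs_sketch
  use, intrinsic :: iso_c_binding, only: c_int
  implicit none
  interface
    ! Matches the C prototype: int abs(int);
    function c_abs(i) bind(c, name='abs') result(r)
      import :: c_int
      integer(c_int), value :: i   ! C passes this argument by value
      integer(c_int) :: r
    end function c_abs
  end interface
end module c_abs_sketch

program call_c
  use, intrinsic :: iso_c_binding, only: c_int, c_float
  use c_abs_sketch, only: c_abs
  implicit none
  integer(c_int) :: n
  real(c_float)  :: f

  n = -42_c_int
  f = 1.5_c_float

  print *, 'abs from C:', c_abs(n)
  print *, 'kind for C int:', c_int, ' kind for C float:', c_float
  print *, 'f =', f
end program call_c
```

Here the kind is chosen to match what the C compiler does, which is exactly what you want at a language boundary and not something SELECTED_REAL_KIND can promise.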

And now we get to the point of contention I mentioned at the start of this post... Fortran 2008 extended intrinsic module ISO_FORTRAN_ENV to include named constants INT8, INT16, INT32, INT64, REAL32, REAL64 and REAL128, whose values are the kind numbers of the integer and real types that occupy the stated number of bits. More than one response in that newsgroup thread recommended using these instead of hard-coding integer and real kinds. In my view, this is little better than the old *n extension in that it tells you that a type fits in that many bits, but nothing else about it. As an example, there's one compiler where REAL128 is stored in 128 bits but is really the 80-bit "extended precision" real used in the old x86 floating point stack registers. If you use these you might think you're using a portable feature, but really you're not, and you may get bitten when the kind you get doesn't have the capabilities you need. My advice is to not use those constants unless you truly understand what they do and do not provide. Most applications should use the SELECTED_xxx_KIND intrinsics as I described above. If you want to call your kind DP to indicate "double precision", that's fine - just use SELECTED_REAL_KIND to get it.
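If you want to see how the two approaches compare on your own compiler, here is a sketch that prints the size-based constants next to kinds requested by capability. The names dp and qp and the specific digit and range requests are my choices for illustration (15 digits with range 307 is what IEEE double provides; 33 digits with range 4931 is what IEEE quad provides). Nothing is declared with REAL128 or qp, since either may be negative on a compiler that lacks such a kind.

```fortran
program bits_vs_properties
  use, intrinsic :: iso_fortran_env, only: real64, real128
  implicit none
  ! Kinds requested by what they must be able to represent
  integer, parameter :: dp = selected_real_kind(15, 307)
  integer, parameter :: qp = selected_real_kind(33, 4931)

  print *, 'REAL64  =', real64,  '  selected_real_kind(15,307)  =', dp
  print *, 'REAL128 =', real128, '  selected_real_kind(33,4931) =', qp

  ! What the dp kind actually delivers on this compiler
  print *, 'dp precision:', precision(1.0_dp), ' range:', range(1.0_dp), &
           ' bits:', storage_size(1.0_dp)

  ! If these differ, a 128-bit real exists but it is not one with
  ! 33 digits of precision (or no such kind exists at all).
  if (real128 /= qp) then
    print *, 'REAL128 and the 33-digit request name different kinds here'
  end if
  if (qp < 0) print *, 'no kind offers 33 digits on this compiler'
end program bits_vs_properties
```

On many compilers the two columns will agree, and that agreement is what lulls people into treating the size-based names as a statement about precision.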

If you have questions about the specific issues I discuss here, feel free to ask in the comments but I might not see the question for a while. If you have general questions or want to report a problem with the Intel compiler, please do that in the user forums.
