fseek causes problem with stream files

Hi all,

I have been migrating some old code that has been through the migration mill a few times (Microsoft -> DEC -> Compaq -> Intel), so it's a bit of a mixed bag in its use of extensions. The code does a fair amount of reading and writing of data to a binary file, and I've replaced most of these operations with binary stream I/O. In doing so I encountered a problem where a file opened with form='binary', access='stream' would sometimes fail with an end-of-file condition during a read. I tracked the problem down and have replicated it in the piece of code at the end. In this cut-down version, the initial data is written to the file with a dummy integer at the start. After some processing, the dummy integer is rewritten with its true value using pos=1, and that works fine. In the final stage, however, fseek is used to reposition the file before the integer is updated, and the subsequent read fails with an end-of-file condition.

So my question is whether this is a bug in the Intel portability library or is there something that says that you can't use portability functions on stream files? I do hope that it's not the latter as it'll save me a lot of rewriting of code.

Many thanks
John Paine

      program test_stream

      use ifport

      implicit none

      integer i,iRet
      integer idata
      integer ndata

      integer iz(2048)

c open the stream data file

      idata=1234
      open(idata,file='test_stream.dat',form='binary',access='stream',status='unknown')

c initialise the data

      do i=1,2048
        iz(i)=0
      end do

c write it out to the data file with a dummy count at the start of the file

      ndata=0
      write(idata)ndata
      write(idata)(iz(i),i=1,2048)

c do some calculations to work out the actual count to be written at the start of the file

      ndata=2048

c write out the new count

      write(idata,pos=1)ndata

c read the data following the count value to check that the file has not been truncated by the write

      read(idata,iostat=iRet)(iz(i),i=1,2048)
      if(iRet.ne.0) then
        write(*,'(a)')' Bad read after write using pos=1'
        stop
      endif

c got here ok, so the file wasn't truncated, so now use fseek to reposition to the start of the file

      iRet=fseek(idata,0,0)

c write out the new count

      write(idata,pos=1)ndata

c read the data following the count value to check that the file has not been truncated by the write

      read(idata,iostat=iRet)(iz(i),i=1,2048)
      if(iRet.ne.0) then
        write(*,'(a)')' Bad read after using fseek followed by a write using pos=1'
        stop
      endif

c close and delete the file if we get to this part of the code as everything worked fine

      close(idata,status='delete')

      end


Mixing I/O runtimes often causes problems. You are attempting to use fseek and FORM='binary', which are extensions to standard Fortran, with stream files.

Why not use rewind(idata) instead of fseek(idata,0,0) and write(idata) instead of write(idata,pos=1)?

For the Windows environment, you can abandon the (ancient) Fortran file routines and call the Win32 API routines directly, which is much more efficient and also makes for simpler, more productive programming. Here is a handle-based example that addresses your problem:

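RECURSIVE SUBROUTINE Set_File_Pointer (ihandl, offset, truncate)
    USE IFWIN    ! assumed here to supply HANDLE, NULL, FILE_BEGIN, SetFilePointer and SetEndOfFile
    IMPLICIT NONE
    INTEGER(HANDLE), INTENT(IN)     :: ihandl      ! Win32 file handle
    INTEGER, INTENT(IN)             :: offset      ! byte offset from the start of the file
    LOGICAL, INTENT(IN), OPTIONAL   :: truncate    ! if present and .TRUE., end the file at the new position
    INTEGER                         :: rslt

    ! Move the file pointer; a negative offset is clamped to zero.
    rslt = SetFilePointer (ihandl, MAX0(offset,0), NULL, FILE_BEGIN)

    ! Truncate only when the caller actually asks for it.
    IF (PRESENT(truncate)) THEN
        IF (truncate) rslt = SetEndOfFile (ihandl)
    END IF
END SUBROUTINE Set_File_Pointer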

Thanks for the responses.

I only showed the fseek to the start of the file as an illustration, so rewind (while it would probably work in the example) doesn't resolve the problem of the clash between stream files and portability library functions.

But Paul's suggestion to use the direct call to the Win32 API routine doesn't affect later writes to the stream file. So it looks like I can simply replace the compatibility library calls with the API calls and that should solve my problem.

Many thanks
John

I'd be a little cautious about the workaround of calling directly into the Windows API for this situation - that's broadening the number of sub-systems that your program has to directly interact with, and the overlying libraries may not be expecting things like the operating system file position to be changing underneath them.  As mecej4 notes, "form='binary', access='stream'" is a bit of a curious monster too - a mix of an extension and a standard language feature that essentially do the same thing as far as I know (?) - did you mean form='unformatted', access='stream'?

Given use of stream access, why do you need to use fseek at all?  Can't you just supply the appropriate pos specifier?  If you want to reposition a stream file without actually transferring data then just use a data transfer statement with an empty io-list, e.g. "write (idata,pos=x)".  That will let you stick within the standard language, which leaves less room for vendor quibbling around whether something is supported or a bug and also makes your life easier when someone tells you that your program needs to run on Linux tomorrow.
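A minimal sketch of the pos-only approach described above, using only standard stream I/O and no portability library calls. The program name, file name and unit number are made up for illustration, and form='unformatted' is assumed in place of form='binary'.

```fortran
program stream_pos_sketch
  implicit none
  integer, parameter :: n = 2048
  integer, parameter :: u = 10          ! arbitrary unit number for this sketch
  integer :: i, ios, ndata, data_pos
  integer :: iz(n)

  open(u, file='stream_pos_sketch.dat', form='unformatted', &
       access='stream', status='replace')

  ! Write a placeholder count, note where the data starts, then write the data.
  ndata = 0
  write(u) ndata
  inquire(unit=u, pos=data_pos)         ! position of the next byte (1-based)
  iz = (/ (i, i = 1, n) /)
  write(u) iz

  ! Rewrite the count in place; pos=1 is the first byte of the file.
  ndata = n
  write(u, pos=1) ndata

  ! Reposition only: a data transfer statement with an empty io-list.
  ! This sets the position explicitly, wherever the previous statement left it.
  write(u, pos=data_pos)

  ! Read the data back to check that it is still there.
  iz = 0
  read(u, iostat=ios) iz
  if (ios /= 0) then
     print '(a)', 'read failed after repositioning with pos='
  else if (iz(1) == 1 .and. iz(n) == n) then
     print '(a)', 'data intact after rewriting the count'
  else
     print '(a)', 'data read back but values differ'
  end if

  close(u, status='delete')
end program stream_pos_sketch
```

Recording the position with inquire(pos=) avoids hard-coding the size of the count, so the same pattern should carry over if the header grows.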

I can't reproduce a problem using a current compiler version.  Which version are you using?

I'm a bit uncomfortable mixing FORM='BINARY' with ACCESS='STREAM' - the latter is a Fortran 2003 feature meant to be combined with FORM='UNFORMATTED' (in this case), but it seems to work ok here.

Steve - Intel Developer Support

Ian, thanks for the comments. I do agree about the problems with interactions as I've encountered many of them over the 30 years I've been working on this code. If I had the time, a full rewrite using standard code would be the best thing to do, but who has time to do that when an easier path can be found and there are more interesting new things to be done? So if the direct API calls don't clash with the use of stream files, that is probably the one I'll go with.

The 'binary' part was a hangover from DVF/CVF usage as I thought that the 'unformatted' option would write out a record length value to the file when it did the unformatted write. I just checked the output and found that doesn't appear to be the case for stream file output, so I can certainly drop that extension (Which begs the question: "just what is the difference between 'binary stream' and 'unformatted stream'?" Steve's comment seems to suggest that it's not a question that is easy to answer).
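One way to repeat the check described above is to write the same integer to an unformatted stream file and to an unformatted sequential file, then compare the sizes reported by inquire. This is only a sketch: the file names and unit numbers are arbitrary, and SIZE= reports file storage units, which are assumed here to be bytes.

```fortran
program stream_size_sketch
  implicit none
  integer :: ndata, size_stream, size_seq

  ndata = 2048

  ! Unformatted stream: expected to hold only the bytes of ndata.
  open(10, file='size_stream.dat', form='unformatted', &
       access='stream', status='replace')
  write(10) ndata
  flush(10)                              ! make sure the data has reached the file
  inquire(unit=10, size=size_stream)
  close(10, status='delete')

  ! Unformatted sequential: the compiler normally adds record-length markers.
  open(11, file='size_seq.dat', form='unformatted', &
       access='sequential', status='replace')
  write(11) ndata
  flush(11)
  inquire(unit=11, size=size_seq)
  close(11, status='delete')

  print '(a,i0)', 'stream file size:     ', size_stream
  print '(a,i0)', 'sequential file size: ', size_seq
end program stream_size_sketch
```

If the stream size matches the storage size of one default integer and the sequential size is larger, the extra bytes are the record markers that stream access leaves out.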

Steve, I was using XE_2011: package ID: w_fcompxe_2011.7.258 with VS2010 running on Vista 64. I just updated to XE_2013: package ID: w_fcompxe_2013.1.119, but encountered exactly the same behaviour: writing to the unformatted stream file after doing an fseek caused the file to be truncated. Using the win32 API call directly did not result in a truncate after the write. I'm updating to VS2012 to see if that has any impact.

The system I'm using is pretty bloated as it's had all sorts of software installed and uninstalled, so I'll test out the code on a new machine once VS2012 installs (as it needs Win7).

I do change a few flags in the configuration for the program:

132 columns
Use Bytes as RECL
/fpe:0

but all else is standard as far as I can tell. The command line looks like this:

/nologo /debug:full /Od /extend_source:132 /assume:byterecl /fpe:0 /module:"x64\Debug\\" /object:"x64\Debug\\" /Fd"x64\Debug\vc100.pdb" /traceback /check:bounds /libs:static /threads /dbglibs /c

PS I have just tested the code running under VS2012 on a Win7 machine and it still truncates the file on a write after an fseek.

John,
.
This might not be what you want to hear, but I have emulated the system-dependent stream I/O using standard-conforming Fortran direct access, fixed-length record files (reclen = 64 kbyte) as buffering for variable-length record, word-addressable access. It is a surprisingly concise solution.
My experience has been that the performance penalty for Fortran direct access files in comparison to other system-based access is not significant for large records, and the buffering offers some performance improvement for small records. Windows 7 also offers a significant performance improvement over XP with its improved buffering management.
Direct access files are rewritable and so do not exhibit the problem you are identifying.
The documentation of Record Types shows that stream access requires sequential file organisation only, which could imply that rewriting is not available, although why would you then provide word addressing; perhaps for reading only?
.
John
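The direct-access idea above might be sketched as follows. This is not the poster's implementation, only an illustration of word addressing on top of fixed-length records; all names are hypothetical, there is no caching of records between calls, and the 64 KB record size assumes 4-byte default integers.

```fortran
program direct_access_sketch
  implicit none
  integer, parameter :: words_per_rec = 16384   ! 64 KB if default integers are 4 bytes
  integer, parameter :: u = 20                  ! arbitrary unit number
  integer :: buffer(words_per_rec)              ! one-record buffer shared by the helpers
  integer :: rl, val

  inquire(iolength=rl) buffer                   ! record length in the compiler's RECL units
  open(u, file='direct_access_sketch.dat', form='unformatted', &
       access='direct', recl=rl, status='replace')

  ! Create two zero-filled records so every word in them can be addressed.
  buffer = 0
  write(u, rec=1) buffer
  write(u, rec=2) buffer

  ! Rewrite word 5 of record 2 in place, then read it back.
  call put_word(words_per_rec + 5, 42)
  call get_word(words_per_rec + 5, val)
  print '(a,i0)', 'value read back: ', val

  close(u, status='delete')

contains

  ! Map a 1-based word address to a record number and an offset in that record.
  subroutine locate(iword, irec, ioff)
    integer, intent(in)  :: iword
    integer, intent(out) :: irec, ioff
    irec = (iword - 1) / words_per_rec + 1
    ioff = mod(iword - 1, words_per_rec) + 1
  end subroutine locate

  ! Read the containing record, change one word, and write the record back.
  subroutine put_word(iword, newval)
    integer, intent(in) :: iword, newval
    integer :: irec, ioff
    call locate(iword, irec, ioff)
    read(u, rec=irec) buffer
    buffer(ioff) = newval
    write(u, rec=irec) buffer
  end subroutine put_word

  ! Read the containing record and return one word from it.
  subroutine get_word(iword, val_out)
    integer, intent(in)  :: iword
    integer, intent(out) :: val_out
    integer :: irec, ioff
    call locate(iword, irec, ioff)
    read(u, rec=irec) buffer
    val_out = buffer(ioff)
  end subroutine get_word

end program direct_access_sketch
```

Using inquire(iolength=) for the record length sidesteps the question of whether RECL is counted in bytes or words, which is what the /assume:byterecl option mentioned earlier controls for Intel Fortran.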

Hi John,

I understand the thrust of your comment as I have implemented something along the same lines for reading large DXF files. The details are probably different, but it was driven by a need to process the files quickly and the usual formatted reads of the time (probably about 10 years ago) weren't up to the task (especially as the files were sometimes in Unix format, so reading them with standard code wasn't easy).

But I am reluctant to go the same route for the current code I'm working on as the use of the Win32 API calls to replace the existing portability function calls looks like it will do the job with minimal changes required to the existing code. I realise this is short-termism, but most of the code is pretty robust and has survived that way through quite a few changes in compiler vendors. My usual strategy when changing vendor is to do the work required to change any system dependencies and then move on to the real work. In this case I am finally migrating the code from CVF to Intel to get 64-bit. But the main reason for the work is that I have to migrate the user interface from VB6 to VB.Net. Compared to the change to .Net, the Fortran problems are trivial!

Thanks
John
