Which is faster to write Binary file or ASCII file?

Hi,

I performed a very simple test and the results surprised me. First I wrote a very large number of values (double precision numbers) to a text file. Then I wrote the same data to a binary file. I measured the processing time for each case and, surprisingly, writing to the text file was faster. Is that correct, or am I doing something wrong?

Thanks!

Binary file: 

CALL CPU_TIME(Time1)
OPEN (1, FILE='Cyrus_In.bin', STATUS='UNKNOWN', ACCESS='STREAM')   ! I want to use stream access.
DO 200 I = 1, NDOF
   WRITE (1) Number(I)
200 CONTINUE
CLOSE (1)
CALL CPU_TIME(Time2)
Time3 = Time2 - Time1

Text file:

CALL CPU_TIME(Time1)
OPEN (1, FILE='Cyrus_In.txt', STATUS='UNKNOWN')
DO 270 I = 1, NDOF
   WRITE (1, 2760) Number(I)
270 CONTINUE
2760 FORMAT (E25.15)
CLOSE (1)
CALL CPU_TIME(Time2)
Time3 = Time2 - Time1



mecej4:

Quote:

I am doing something wrong?

Yes: (i) do not use stream access unless you need it; (ii) write as much in a single WRITE statement as you can. In the example above, replace the DO loop by WRITE(1)Number(1:NDOF).

mecej4, thanks for your quick and useful reply. I replaced the DO loop with a single WRITE statement and it reduced the run time. But what if I have a more complicated DO loop like this:

DO 275 I = 1, N1
   DO 274 J = 1, N2
      IF (S(I,J) .NE. 0) THEN
         WRITE (1) I
         WRITE (1) J+I-1
         WRITE (1) S(I,J)
      ENDIF
274 CONTINUE
275 CONTINUE

 How can I make this more efficient? 
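
One possible approach, following the advice above to put as much as possible into a single WRITE, is to gather the nonzero entries into arrays first and then write each array in one statement. This is only a sketch under assumptions: the array sizes, the sample data and the file name triplets.bin are made up for illustration, and S is assumed to be double precision. Note that the file layout changes: it holds a count, then all row indices, then all column indices, then all values, instead of interleaved I, J+I-1, S(I,J) triples, so whatever reads the file has to be changed to match.

```fortran
program write_triplets
  implicit none
  integer, parameter :: n1 = 4, n2 = 4            ! hypothetical sizes
  double precision   :: s(n1, n2)                 ! assumed type of S
  integer, allocatable          :: rows(:), cols(:)
  double precision, allocatable :: vals(:)
  integer :: i, j, k, nnz

  ! Made-up sample data: mostly zeros with a few nonzero entries
  s = 0.0d0
  s(1,1) = 1.5d0
  s(2,3) = -2.0d0
  s(4,2) = 7.25d0

  ! Count the nonzeros so the buffers can be sized exactly
  nnz = count(s /= 0.0d0)
  allocate (rows(nnz), cols(nnz), vals(nnz))

  ! Same loop order and same index arithmetic as the original loop,
  ! but storing into memory instead of issuing three WRITEs per entry
  k = 0
  do i = 1, n1
     do j = 1, n2
        if (s(i,j) /= 0.0d0) then
           k = k + 1
           rows(k) = i
           cols(k) = j + i - 1
           vals(k) = s(i,j)
        end if
     end do
  end do

  ! Two WRITE statements in total, regardless of the number of nonzeros
  open (unit=11, file='triplets.bin', status='replace', &
        form='unformatted', access='stream')
  write (11) nnz
  write (11) rows, cols, vals
  close (11)
end program write_triplets
```

Writing the count first lets a reader allocate its arrays before reading the data.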

You could try the following changes for the binary file. I'd expect that, unless NDOF is very large, Time3 = 0.

 CALL CPU_TIME (Time1) 
!
  OPEN ( UNIT=11, FILE='Cyrus_In.bin', STATUS='UNKNOWN',    &
         FORM='UNFORMATTED', ACCESS='SEQUENTIAL', IOSTAT=iostat) 
!
  WRITE (11) NDOF
  WRITE (11) Number(1:NDOF)
  CLOSE (11)
 !
  CALL CPU_TIME (Time2)
  Time3 = Time2 - Time1
 

Could I ask a question that has puzzled me for a long time?
This forum is "intel-visual-fortran-compiler-for-windows"; that's WINDOWS.

Why doesn't it support Windows file formats when using cut and paste?

Some of us are Windows users

Does it really matter how much CPU time is consumed by these writes?  What about elapsed time, which might be much greater?  What device are you writing to? Did you use one of the methods to set buffered_io, as you would do for best performance?

Tim,

With Windows 7, default buffered I/O has been significantly improved in comparison to XP.
Do the available methods to set buffered_io still provide much of a performance improvement?

I agree with your comment on CPU time. SYSTEM_CLOCK would be a more relevant choice, although its precision might be a problem. Elapse_time might help.

John

Attachment: elapse-time.f90 (1.78 KB)

Paul Curtis:

It's much faster to ditch the fortran and use the WinAPI functions directly. As the example makes clear, there is zero intervening code or formatting or anything other than a block transfer directly from memory:


IF (WriteFile (ihandl,          &  ! file handle
               loc_pointer,     &  ! address of data
               nbytes,          &  ! byte count to write
               LOC(nact),       &  ! actual bytes written
               NULL_OVERLAPPED) == 0) THEN
   ! Error writing file
END IF

Thanks for comments

Does it really matter how much CPU time is consumed by these writes?  What about elapsed time

TimP, this is the first time that I am trying to measure processing time and I have no experience with it. So you mean CPU time is not the processing time and I have to use elapsed time? If the CPU time for the ASCII format is less than for the binary format, then I think the elapsed time must be less as well. Am I right? Can you give me an example for elapsed time?
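
A minimal sketch of measuring elapsed (wall-clock) time with the standard SYSTEM_CLOCK intrinsic, next to CPU_TIME for comparison. The array size, the random data and the file name elapsed_demo.bin are assumptions made for illustration only. The two numbers can differ, because time spent waiting for the disk counts as elapsed time but not necessarily as CPU time.

```fortran
program elapsed_demo
  implicit none
  integer, parameter :: ndof = 1000000                ! hypothetical size
  integer, parameter :: i8 = selected_int_kind(18)    ! 64-bit counter for finer resolution
  double precision, allocatable :: number(:)
  integer(i8) :: c0, c1, rate
  real        :: t0, t1

  allocate (number(ndof))
  call random_number(number)                          ! made-up test data

  call system_clock(c0, rate)                         ! start count and ticks per second
  call cpu_time(t0)

  open (unit=11, file='elapsed_demo.bin', status='replace', &
        form='unformatted', access='sequential')
  write (11) ndof
  write (11) number
  close (11)

  call cpu_time(t1)
  call system_clock(c1)                               ! end count

  ! Elapsed seconds = tick difference divided by ticks per second
  print '(a,f10.4,a)', 'Elapsed time: ', dble(c1 - c0) / dble(rate), ' s'
  print '(a,f10.4,a)', 'CPU time:     ', t1 - t0, ' s'
end program elapsed_demo
```

For short runs, repeating the measurement several times gives a more reliable picture than a single timing.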

Did you use one of the methods to set buffered_io, as you would do for best performance?

How can I do that? Please give me an example. Thanks!

app4619:

Quote:

Paul Curtis wrote:
It's much faster to ditch the fortran and use the WinAPI functions directly....

Interesting. Is it much faster? I guess I would also (having read a little) need ReadFile, CreateFile and CloseHandle. Do you have the correct Fortran interfaces for these routines? I checked the standard includes and didn't seem to find them.

jimdempseyatthecove:

Before you go the distance to implement WinAPI, I suggest you implement the other Fortran suggestions first. Make some test runs, then determine whether you should change your focus from I/O time to compute time improvements. Your focus should be on making the overall program run in the shortest amount of time.

Jim Dempsey

www.quickthreadprogramming.com

"With Windows 7, default buffered I/O has been significantly improved in comparison to XP.
Do the available methods to set buffered_io still provide much of a performance improvement?"

ifort flushes each record by default when performing record-oriented writes. You will not get the advantage of Win7 buffering unless you set buffered_io. So it's possible these options could make more difference than before Win7. There are several such options, some applying to all file units (not to stdin/stdout), such as the compile option /assume:buffered_io or the equivalent environment variable, as well as the "buffered" keyword for OPEN and equivalent environment variables. When these are set, data goes out to the physical device as the buffers fill, or when the unit is closed or flushed.
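
As a rough sketch of the OPEN keyword mentioned above: BUFFERED='YES' is an Intel Fortran extension rather than standard Fortran, so other compilers may reject it. The array size, the random data and the file name buffered_demo.bin are assumptions for illustration. The loop deliberately writes many small records, which is the case where buffering would be expected to matter most.

```fortran
program buffered_demo
  implicit none
  integer, parameter :: ndof = 100000              ! hypothetical size
  double precision, allocatable :: number(:)
  integer :: i, ios

  allocate (number(ndof))
  call random_number(number)                       ! made-up test data

  ! BUFFERED='YES' is Intel-specific (non-standard): records are collected
  ! in a runtime buffer instead of being flushed after every WRITE
  open (unit=11, file='buffered_demo.bin', status='replace', &
        form='unformatted', access='sequential',             &
        buffered='yes', iostat=ios)
  if (ios /= 0) stop 'could not open buffered_demo.bin'

  ! One small record per WRITE
  do i = 1, ndof
     write (11) number(i)
  end do

  close (11)                                       ! remaining buffered data is written here
end program buffered_demo
```

Compiling the same loop with /assume:buffered_io and without the BUFFERED keyword should apply the same behaviour to all units, as described above. Timing both variants on the target machine is the only way to know whether it helps.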

Paul,

I would be interested to see the comparison tests that support your claims. About 2 years ago I tested a range of options for direct access file structures. The file sizes ranged from 0.1 GB to 8 GB, with records of about 64 KB.
The elapsed time results showed no significant difference between the I/O library alternatives; the big changes came from moving from XP to Win 7, from increasing installed memory, and from switching from HDD to SSD.
I found that staying with standard conforming Fortran was the best approach.
I think that one of the reasons is that, given the optimisation that is available in most Fortran I/O, the dominant time usage comes from the O/S management of disk transfers and buffering, which is outside of the fortran libraries. 
I'm sure that the selection of file and record size could change the result.

I'd also expect that ifort's buffered I/O options are less effective now with Win 7 and Win 8 than they might have been with Win NT or XP, although I've never tested these alternatives. What has changed in the last 20 years is that CPU speeds have increased at a faster rate than disk speeds, although SSDs have changed this somewhat.

If you have an example that shows differently, I would be interested to see it.

John

Paul Curtis:

John,
I have not bothered with test scenarios comparing standard fortran file i/o with the WinAPI version, although that would not be difficult. I started using this approach a long time ago and have found that it is not only faster, but provides much more versatility. REWIND and BACKSPACE made sense with magtape (and I'm way old enough to have been programming back then), but have become more than a bit antique (risible, in fact).

When IVF compiles for Windows, Fortran statements relating to I/O and memory allocation will be realized as sets of WinAPI calls; that's as atomic as one can get in Windows programming. It's not too difficult to see how Fortran's formatted and record-oriented syntax would be resolved into fundamental API calls. But skipping all that and using the API calls directly cannot fail to be more efficient than having the compiler do the job. This approach also enables Fortran I/O as a direct block-move of memory to or from any file, port or pipe structure that has a Windows handle, and it is completely independent of the format in which data is represented in that memory block.

A set of sample routines illustrating WinAPI file i/o is attached.

Attachment: win32-f90-filesubs.f90 (10.67 KB)

The test I performed was on Windows 7. I did the same test on my XP computer and got completely different results! On XP the binary file was faster than the text file, while on Windows 7 the text file was faster.

I read the comments, but since I am not specialized in computers I cannot fully understand them! Is there a simple explanation for that? Is there a simple way to make binary faster on Windows 7?

I am now using Elapsed time and also did the suggested changes.  

Steve Lionel (Intel):

Look at the BUFFERED option for OPEN. I will comment that writing lots of small unformatted records is going to be less efficient than writing fewer, larger records.

Steve

Vahid,

This is the first case where I have heard that XP performs better than Windows 7 for fortran file I/O. This is not my experience.

John
