How can I write to binary file faster?

vahid s.

I have the following code for writing to a binary file:

CALL system_clock(Time1, rate)
OPEN (1, FILE='Test.bin', STATUS='UNKNOWN', ACCESS='STREAM')
DO 275 I=1,NDOF
   DO 274 J=1,UBW
      ! Only non-zero entries are written, as three separate records
      IF (S(I,J).NE.0) THEN
         WRITE (1) I
         WRITE (1) J+I-1
         WRITE (1) (S(I,J))
      ENDIF
274 CONTINUE
275 CONTINUE
CLOSE(1)
CALL system_clock(Time2)
print *, "elapsed time: ", real(Time2-Time1) / real(rate)

I know that using fewer WRITE statements makes it faster, so inside the loop I now use the following code, which is faster:

IF (S(I,J).NE.0) THEN 
WRITE (1) I, J+I-1,  (S(I,J)) 
ENDIF

Is there any way to get rid of the loop (since it is time consuming), or any other change that would make the code more efficient?

Please note that the output must keep the order I, J+I-1, S(I,J) (non-zero values only). Also, since I am using a C++ program to read the binary file, I have to use stream access.

Any suggestions are greatly appreciated. 

John Campbell

I converted your code sample into a test of a number of file OPEN options:
 OPEN ( UNIT=11, FILE='Test1.bin', STATUS='UNKNOWN', ACCESS='STREAM')
 OPEN ( UNIT=11, FILE='Test2.bin', STATUS='UNKNOWN', FORM='UNFORMATTED')     ! 3 writes
 OPEN ( UNIT=11, FILE='Test3.bin', STATUS='UNKNOWN', FORM='UNFORMATTED')     ! 1 write
 OPEN ( UNIT=11, FILE='Test4.bin', STATUS='UNKNOWN', ACCESS='DIRECT', RECL=rec_len, FORM='UNFORMATTED')
 OPEN ( UNIT=11, FILE='Test5.txt', STATUS='UNKNOWN')
 OPEN ( UNIT=11, FILE='Test6.bin', STATUS='UNKNOWN', FORM='UNFORMATTED', BUFFERED='YES')

I ran them on Win7_64 with an SSD drive and got very poor write times using:
Intel(R) Visual Fortran Intel(R) 64 Compiler XE for applications running on Intel(R) 64, Version 12.1.5.344 Build 20120612

For most cases the elapsed time was very poor, although BUFFERED='YES' was reasonable:
 elapsed time: 0 Define S      0.1090000   
 elapsed time: 1 stream         27.78400   
 elapsed time: 2 unformatted    28.08000   
 elapsed time: 3 unformatted    9.719000   
 elapsed time: 4 direct         9.796000   
 elapsed time: 5 formatted      12.21500   
 elapsed time: 6 buffered       1.435000   

I tried an alternative compiler for some of the options and got:
Program entered 
 elapsed time: 0 Define S        0.192800   
 elapsed time: 1 transparent      1.16340   
 elapsed time: 2 unformatted      1.22750   
 elapsed time: 3 unformatted     0.732800   
 elapsed time: 4 direct           8.56580   
 elapsed time: 5 formatted        2.02770   

Quite a surprise. Perhaps I've missed something in the Intel OPEN?

Both compilers were slow with DIRECT, so I tried a single TYPE variable for the I/O list, but that did not help either. Again a surprise.
I would have expected option 4 to be the best, then option 3 (i.e. 6), although I typically use DIRECT with much bigger records.

The code is attached, so if anyone has an alternative OPEN or WRITE, let us know.

John

ps: I tried DIRECT and BUFFERED and got 1.061 seconds. Thankfully that makes sense.
Why is BUFFERED='YES' not the default ? It should not be too hard to test when a subsequent READ or REWIND occurs, although wouldn't they use the buffer also ?

Attachments: stream-test.zip (2.62 KB)
Quba

On my PC (without an SSD) I obtained slightly better results than in case 6 with

 OPEN ( UNIT=11, FILE='Test8.bin', STATUS='UNKNOWN',RECL=rec_len, FORM='BINARY', BUFFERED='YES')

John Campbell

Too many things to check !

I corrected the ifort compile options in the attached batch file I was using, and introduced BUFFERED=NO/YES for all cases and now the results are more as expected. Again BUFFERED='YES' should be used and probably should be the default, unless I am missing something else. The latest attachment gives the revised results.

 Tests for BUFFERED=NO
     elapsed time: 1 stream           24.14700
     elapsed time: 2 unformatted      24.64700
     elapsed time: 3 unformatted      8.704000
     elapsed time: 4 direct/buffer    8.595000
     elapsed time: 5 formatted        21.45000
  Tests for BUFFERED=YES
     elapsed time: 1 stream          0.8730000
     elapsed time: 2 unformatted      1.139000
     elapsed time: 3 unformatted     0.7180000
     elapsed time: 4 direct/buffer   0.4050000
     elapsed time: 5 formatted        1.810000   

In summary, the expected result is:
binary is better than text (if the program to read the file is generated by the same compiler)
fewer big records are better than lots of small ones
access='DIRECT' is best if fixed length records can be adopted.
BUFFERED='YES' should be selected (although this is non-standard Fortran)

One of the goals of Fortran 90/95 was to improve portability of standard conforming code and a significant focus was file I/O. Defaulting BUFFERED='NO' is probably one step backwards.
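As a rough illustration of the last two points, a minimal sketch that combines a single WRITE per record with BUFFERED='YES' on a stream file could look like the following. BUFFERED= is an Intel Fortran extension, so this is not expected to compile with other compilers; the file name and loop data are invented for the example.

```fortran
program buffered_stream
    implicit none
    integer :: u, i

    ! BUFFERED= is an Intel Fortran extension, not standard Fortran.
    ! File name and record contents are placeholders for illustration.
    open (newunit=u, file='Test_buffered.bin', status='replace', &
          access='stream', form='unformatted', buffered='yes')

    do i = 1, 1000
        ! One WRITE per record instead of three separate WRITE statements
        write (u) i, i + 1, real(i, kind=8)
    end do

    close (u)
end program buffered_stream
```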

John

ps : as this is a "intel-visual-fortran-compiler-for-windows" note windows; why can't cut and paste accept windows format text ?
This appears to be the most active of the Intel software development forums, but we continue to be subjected to a forum environment which tells us we should be Linux C users.

Attachments: streamb.zip (2.01 KB)
Paul Curtis

You might try separating your sorting loop from the output activity, and then use the much faster direct API calls to write the file (see my post to your previous forum thread).

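TYPE triplet
    INTEGER :: i
    INTEGER :: ij
    INTEGER :: sij      ! declare with the same type as S
END TYPE triplet
TYPE(triplet), ALLOCATABLE   :: output(:)
INTEGER         :: k
INTEGER(HANDLE) :: ihandl
! Worst case: every entry of S is non-zero
ALLOCATE(output(ndof*ubw))
! Pass 1: collect the non-zero entries in memory
k = 0
DO i = 1, ndof
    DO j = 1, ubw
        IF (s(i,j) /= 0) THEN
            k = k + 1
            output(k)%i   = i
            output(k)%ij  = i+j-1
            output(k)%sij = s(i,j)
        END IF
    END DO
END DO
! Pass 2: write the first k records in one call.
! open_the_file, rw_file and close_file are not defined in this post.
ihandl = open_the_file ('test.bin'c, 'W') 
CALL rw_file ('W', ihandl, k*SIZEOF(output(1)), LOC(output(1)))
CALL close_file (ihandl)
DEALLOCATE (output)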
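
For readers who do not have those API wrappers, the same "collect first, write once" idea can be sketched in standard Fortran with a single stream WRITE. This is only an illustration under stated assumptions: the type name, component names, array sizes, sample data and file name are invented, S is assumed to be double precision, and BIND(C) is used so that the record layout is meant to match a C/C++ struct of two int and one double. The exact byte layout of a derived type in unformatted output is still processor dependent, so it should be checked against the reader.

```fortran
program write_triplets
    use, intrinsic :: iso_c_binding, only: c_int, c_double
    implicit none

    ! Assumed record layout: two C ints and one C double.
    type, bind(c) :: triplet
        integer(c_int) :: row
        integer(c_int) :: col
        real(c_double) :: val
    end type triplet

    integer, parameter :: ndof = 4, ubw = 3   ! small illustrative sizes
    real(c_double) :: s(ndof, ubw)
    type(triplet), allocatable :: buf(:)
    integer :: i, j, k, u

    ! Sample data: a few non-zero entries
    s = 0.0_c_double
    s(1,1) = 2.0_c_double
    s(2,3) = -1.5_c_double
    s(4,2) = 7.0_c_double

    ! Size the buffer to the exact number of non-zero entries
    allocate (buf(count(s /= 0.0_c_double)))

    ! Collect the non-zero entries in the required order
    k = 0
    do i = 1, ndof
        do j = 1, ubw
            if (s(i,j) /= 0.0_c_double) then
                k = k + 1
                buf(k) = triplet(i, i + j - 1, s(i,j))
            end if
        end do
    end do

    ! One WRITE statement transfers all records
    open (newunit=u, file='triplets.bin', access='stream', &
          form='unformatted', status='replace', action='write')
    write (u) buf(1:k)
    close (u)

    print *, 'records written: ', k
end program write_triplets
```

The loop is still there, but it only touches memory; the file is written in one transfer, which is the part the timings in this thread suggest matters most.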

vahid s.

Setting  BUFFERED to YES is a key point. Thanks John Campbell!

John Campbell

You can thank Steve for recommending BUFFERED. It certainly has some effect.

If you are going to read the information back into the same program, binary is the best approach.
However, if you are going to read the information into another program, I would recommend using text. I have found that intermediate text files are much more convenient, as they are easier to read or import into Excel, especially if you need to check the contents.

The examples above show that there is not a big time penalty for this. The attached example is of a simple .csv format with maximum precision and minimum size. If you run it you should see a minimal penalty for the .csv formatting.
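
The attachment itself is not reproduced here, but a minimal sketch of writing the same triplets as .csv might look like this. The array sizes, sample values, file name and edit descriptors are assumptions made for the example, not taken from the attached file; S is assumed to be double precision.

```fortran
program write_csv
    implicit none
    integer, parameter :: dp = kind(1.0d0)
    integer, parameter :: ndof = 3, ubw = 2   ! small illustrative sizes
    real(dp) :: s(ndof, ubw)
    integer :: i, j, u

    ! Sample data: two non-zero entries
    s = 0.0_dp
    s(1,1) = 1.0_dp / 3.0_dp
    s(3,2) = -2.5e-7_dp

    open (newunit=u, file='triplets.csv', status='replace', action='write')
    do i = 1, ndof
        do j = 1, ubw
            if (s(i,j) /= 0.0_dp) then
                ! I0 writes integers without padding;
                ! ES24.16E3 writes 17 significant digits for the real value
                write (u, '(I0,",",I0,",",ES24.16E3)') i, i + j - 1, s(i,j)
            end if
        end do
    end do
    close (u)
end program write_csv
```

Seventeen significant digits are enough to read a double precision value back without loss, at the cost of a longer line than a shorter format would give.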

John

Attachments: stream-text.f90 (1.24 KB)
Tim Prince

The default choice of buffering for some cases of direct access was changed with the xe2013 compilers.  Now it's necessary to set buffered='yes' or one of the other alternatives such as /assume buffered_io if you want the faster performance.

I guess the behavior without buffered_io is a hold-over from the time when there was no Fortran standard FLUSH, so this was the only portable way to make each record visible outside the program immediately after a write.
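
With the FLUSH statement from Fortran 2003, a program can ask for buffered data to be pushed out at points of its own choosing instead of after every record. A minimal sketch, with an invented file name and checkpoint interval; how soon the data becomes visible to other programs remains processor dependent:

```fortran
program flush_demo
    implicit none
    integer :: u, i

    open (newunit=u, file='progress.bin', access='stream', &
          form='unformatted', status='replace', action='write')

    do i = 1, 100
        write (u) i
        ! Flush only at chosen checkpoints, not after every record
        if (mod(i, 25) == 0) flush (u)
    end do

    close (u)
end program flush_demo
```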

Steve Lionel (Intel)

It's a bit more complicated than that. The way buffering was done changed, and that change helped some and hurt others. The release notes have a discussion of this. But buffering was not the default before.

Steve
