Serious memory leak when writing data to a binary file

Hi, all:

I'm having a serious memory leak problem with the WRITE statement in the Intel Visual Fortran compiler, version 12. As the following simple code demonstrates, I'm trying to write about 400 MB of binary data to a file in direct-access, unformatted form. When I compile the code and run the program, the Performance tab in Windows Task Manager shows system memory usage increasing by about 400 MB. After a couple of minutes, the memory usage decreases and returns to the level it was at before the program ran. So I called it a memory leak. This behavior is not acceptable in my real application, because I usually write out several gigabytes of data; this uses up my system's physical memory and my application becomes very slow.

It seems that when writing to the file, Windows first caches the data in memory, keeps it for a while, and then releases it. How can I disable this caching behavior in the OPEN/WRITE statements? Or is this a compiler bug in IVF?

Thanks in advance.

Ben

program WriteMemoryLeak

integer :: ios
      
character(36) :: fil

integer, dimension(512) :: p
integer :: k, i
    fil='IntelWriteTest.dat'      
      
    open( 8, file=fil(:len_trim(fil)), access='direct', form='unformatted', recl=kind( 1. )*512, status='unknown', buffered='NO', iostat=ios ) ! DEVWARN: PORTABILITY -- again, this may not be a Good Thing

    do m = 1, 200000
       p=m
       ! write( 8, rec=4+m-1, iostat=i ) p
       call writefile(m, p, 512)
    enddo

end program WriteMemoryLeak

subroutine writefile(k, p, n)
integer :: k, n
integer, dimension(n) :: p

integer :: i

       write( 8, rec=4+k-1, iostat=i ) p

end subroutine writefile

Steve Lionel (Intel):

I tried your program with the 12.1 compiler and did not see any memory growth while the program ran - either for the process itself or for the system as a whole.  I will comment that your multiplying the RECL by KIND(1.) will make the file four times larger than it should be, since, by default, RECL= is in 4-byte units.  Either remove the multiply or compile with /assume:byterecl.  I would also comment that using KIND this way is very non-portable, since KIND numbers do not necessarily correspond to element size.  You could use the SIZEOF() intrinsic extension.  We support the F2008 C_SIZEOF function from ISO_C_BINDING in our next release coming soon.
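A portable alternative to KIND- or SIZEOF-based arithmetic is the standard INQUIRE (IOLENGTH=) form, which returns the record length in the same units that RECL= expects, whatever the byterecl setting. A minimal sketch; the program name, the unit number and the file name recl_demo.dat are made up for illustration:

```fortran
program recl_demo
  implicit none
  integer, dimension(512) :: p
  integer :: rl, ios

  p = 0

  ! Ask the compiler how long one record holding p is, in the
  ! processor-dependent units that RECL= uses.
  inquire (iolength=rl) p

  ! Hypothetical scratch file, deleted again on close.
  open (unit=8, file='recl_demo.dat', access='direct', form='unformatted', &
        recl=rl, status='replace', iostat=ios)
  if (ios /= 0) stop 'open failed'

  write (8, rec=1, iostat=ios) p
  if (ios /= 0) stop 'write failed'

  close (8, status='delete')
  print *, 'record length in RECL units:', rl
end program recl_demo
```

Because the length comes from the I/O list itself, the same source should give a correctly sized file with or without /assume:byterecl.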

If Windows is holding on to memory, there's not much we can do about it, but I don't see such an effect when I try it.

Steve

Hi, Steve:

Thanks for your reply. I compiled my program with IVF 12.1.5.344 and saw the serious memory leak. It's very strange but good to hear that you do not have the memory leak. If possible could you please email me the whole solution file in a zip file? I think that some setting of the project is causing this memory leak. And I would like to try your solution file. My email address is pinnacleman98@gmail.com.

Another thought is that it could be Windows 7 that causes this problem. What operating system are you using?
BTW, we already noticed the portability issue with kind(1.0). We kept it there for historical reasons.

Thanks,

Ben

Steve Lionel (Intel):

I did not use a solution - I just compiled your source from the command line, using 12.1.5.344 and Windows 7 and no switches. I saw no growth at all in process or memory use during the run of the program.

The KIND is not just portability - it will make your file 4 times bigger than it should be, unless you enabled the "byterecl" option.

Steve

Hi, Steve:

Thanks. So what command did you use to compile my program? Can you post it here? If you did not see the memory leak but we do, then there must be something different in how we use IVF, given the same IVF version and Windows 7. I would like to first duplicate what you are doing on my machine and then figure out what's wrong in my project settings.

We do enable the "byterecl" option in our project setting and the size of our output data file is correct.

Thanks,

Ben

Hi, Steve:

I figured out how to use the command line to compile my demo code. But I still see memory usage increase by about 400 MB in the Performance tab in Windows Task Manager. I don't know why.

Ben

Steve Lionel (Intel):

I just used "ifort" with no options. Tried it again with /assume:byterecl and I see that the memory usage does go up a bit less than 400MB the first time I run it, but that's not what I'd call a leak as it just goes there and stays while the program is running.  If there was a leak I'd expect the memory usage to keep growing as the program runs.

When I run the program a second or third time, I don't see the system memory use increasing. I also note that Windows claims the process is using just a bit over 1MB, which definitely lets out anything in the compiler or run-time library.  I guess Windows is allocating some internal memory for the operations - don't really know.

Steve

Ben,

I tried and modified your example but did not see any memory leakage. I put a pause in your program so I could see the process in the Processes tab of Task Manager. It runs very quickly!
You might like some of the alternative OPEN options I used.

program WriteMemoryLeak

  character(36) :: fil
  integer, dimension(512) :: p
  integer :: k, i, ios
!
    fil='Mem_Leak.log'      
!      
!    open ( 8, file=fil(:len_trim(fil)), access='direct', form='unformatted', recl=kind( 1. )*512,    &
!           status='unknown', buffered='NO', iostat=ios ) ! DEVWARN: PORTABILITY -- again, this may not be a Good Thing
!
    open ( unit     = 8,              &
           file     = fil,            &
           access   = 'direct',       &
           form     = 'unformatted',  &
           recl     = size(p),        &
           status   = 'unknown',      &
           buffered = 'NO',           &
           iostat   = ios ) 
    write (*,*) 'Opening ', trim(fil),' status =',ios
!
    do m = 1, 200000
       p = m
       ! write( 8, rec=4+m-1, iostat=i ) p
       call writefile (m, p, 512)
    end do
!
    write (*,*) 'end of test : to leave process live in Task Manager'
    read  (*,*) m
!
end program WriteMemoryLeak

subroutine writefile (k, p, n)
!
  integer :: k, n
  integer, dimension(n) :: p

  integer :: i

    write ( 8, rec=4+k-1, iostat=i ) p

end subroutine writefile

Steven:

It seems that you finally saw what I meant by the 'memory leak'. I forgot to mention one thing. If you run the program many times with a very short interval between runs, you only see a one-time memory increase. But if you wait long enough after the previous run has finished, maybe a couple of minutes (monitor the Performance window until the memory drops back to the level before the run), and then run the program again, you will see the memory leak again. In other words, if you leave a long interval between runs, you will see the memory leak every time you run the program.

Now I think that this problem is related to the file caching feature of Windows. In C++, I can write the code to disable file caching and flush the data directly to the binary file. But in Fortran, we do not have such control over file caching. We have to rely on the IVF compiler to provide this capability in the OPEN/WRITE statements.

Ben

David White:

Can you periodically use the FLUSH statement to cause the data to be written from the cache to disk?

David

Steve Lionel (Intel):

How do you do it in C++?  I will repeat that this is not a leak - it is simply using more memory than you would like.  A leak indicates memory allocated and then lost track of.

Steve

Steve:

I agree that it is not a memory leak. But our customers reported it as a memory leak because it is our software product that used up all their memory when they wrote out a few gigabytes of data.

Here is the link on Microsoft MSDN that discusses file caching and how to disable it: http://msdn.microsoft.com/en-us/library/windows/desktop/aa364218%28v=vs..... Basically, when you open the file for access, you can set a flag to turn file caching off for that particular file. I think that when Intel implemented the OPEN statement in IVF, you might have overlooked that feature. I hope that you can add this capability as soon as possible if it has not been implemented. Right now, we are stuck with this "memory leak" problem.

Thanks,

Ben

David:

I tried both FLUSH(8) and call FLUSH(8); both failed to solve the problem. So I don't know what is really causing this problem.

Ben

Steve Lionel (Intel):

You can use a USEROPEN routine to set that flag for when the file is opened.  Look up USEROPEN in the documentation.

Steve

Steve:

I tried USEROPEN as you suggested and set the FILE_FLAG_WRITE_THROUGH flag. The problem is still there, and writing is much slower, which is expected since there is no caching anymore. But the strange thing is that memory usage still slowly increases as writing to the file continues. It seems that Windows is still caching the data. I'm really puzzled by this problem. Any new ideas on what's happening?

BTW, has any other customer reported or noticed this problem before?

Ben

Steve Lionel (Intel):

I have not seen other reports of this problem.  In the tests I did, memory usage was steady throughout the program - indeed, I saw available pages actually go up a bit during the run.

Steve

This memory leakage is disk caching of file I/O. It has become much more noticeable with Windows 7. I've only noticed a significant improvement in performance when the OS uses this approach. I do not know how to turn it off, nor would I want to. Closing the file would flush the buffers and possibly reduce the buffering by the OS. The OS is estimating that this memory is best used for this purpose.

John

Hi, all:

Thank you all for your kind replies. The problem is solved! The trick is to set the FILE_FLAG_NO_BUFFERING flag in the USEROPEN function. Note that setting the FILE_FLAG_WRITE_THROUGH flag does not work. I noticed that writing is a little slower than normal, but not significantly. I'm much less concerned about the writing speed than about the memory usage. I also found a very useful memory monitoring tool called RAMMAP.exe, which can be downloaded from Microsoft: http://technet.microsoft.com/en-us/sysinternals/ff700229.aspx. This tool shows all cached memory for each individual file, plus a lot of other information.

But now I'm having a calling convention problem in my real application. I think I can figure that out by myself. Here is the final working code that does not have the 'memory leak'.

program WriteMemoryLeak  
      
      character(36) :: fil  
      integer, dimension(512) :: p  
      integer :: k, i, ios, recl_len  
      
    EXTERNAL UOPEN
      
        fil='WriteMemoryLeak.dat'
        recl_len = size(p)
        open ( unit     = 8,              &  
               file     = fil,            &  
               access   = 'direct',       &  
               form     = 'unformatted',  &  
               recl     = recl_len,        &  
               status   = 'unknown',      &  
               buffered = 'NO',           &  
               iostat   = ios, useropen=UOPEN )   
        do m = 1, 200000  
           p = m
            
           call writefile (m, p, 512)  
        end do  
    !  
        write (*,*) 'end of test : to leave process live in Task Manager'  
        close(8)
    !  
    end program WriteMemoryLeak  
      
    subroutine writefile (k, p, n)  
    !  
      integer :: k, n  
      integer, dimension(n) :: p  
      
      integer :: i  
      logical :: commit_result
      
        write ( 8, rec=4+k-1, iostat=i ) p  

      
    end subroutine writefile  
    
    INTEGER FUNCTION UOPEN( FILENAME,      &
                           DESIRED_ACCESS, &
                           SHARE_MODE,     &
                           A_NULL,         &
                           CREATE_DISP,    &
                           FLAGS_ATTR,     &
                           B_NULL,         &
                           UNIT,           &
                           FLEN )
    !DEC$ ATTRIBUTES C, ALIAS:'_UOPEN' :: UOPEN
    !DEC$ ATTRIBUTES REFERENCE :: FILENAME
    !DEC$ ATTRIBUTES REFERENCE :: DESIRED_ACCESS
    !DEC$ ATTRIBUTES REFERENCE :: SHARE_MODE
    !DEC$ ATTRIBUTES REFERENCE :: CREATE_DISP
    !DEC$ ATTRIBUTES REFERENCE :: FLAGS_ATTR
    !DEC$ ATTRIBUTES REFERENCE :: UNIT
    USE IFWIN
    IMPLICIT INTEGER (A-Z)
    CHARACTER*(FLEN) FILENAME
    TYPE(T_SECURITY_ATTRIBUTES), POINTER :: NULL_SEC_ATTR
    
    ! Set the FILE_FLAG_NO_BUFFERING bit in the flag attributes passed to CreateFile( ).
    ! FLAGS_ATTR = FLAGS_ATTR + FILE_FLAG_WRITE_THROUGH   ! tried first; does not work
    FLAGS_ATTR = FLAGS_ATTR + FILE_FLAG_NO_BUFFERING
    ! Do the CreateFile( ) call and return the status to the Fortran rtl
    STS = CreateFile( FILENAME,             &
                      DESIRED_ACCESS,       &
                      SHARE_MODE,           &
                      NULL_SEC_ATTR,        &
                      CREATE_DISP,          &
                      FLAGS_ATTR,           &
                      0 )
     UOPEN = STS  
     RETURN   
     END

I am surprised that you need to resort to this approach. I have done some testing of file I/O performance on XP, XP_64 and Win7_64, with HDD and SSD. My experience has been that Win 7 will utilise free memory much better (and take much more) than XP does. However, I have not seen this caching adversely affect other processes.
You might need to find out what hardware profile your user is comparing. Certainly more memory or an SSD can significantly improve disk performance and, in turn, responsiveness.
My impression has been that Win7 does a much better job of buffering I/O. I would not expect turning this off is the best solution. This post should be about "Useful memory leak when writing to a binary file".
There must be more to your user's perceived problem.

John

Hi Ben, I'm a little bit late since the problem is already solved... Congratulations!

Quoting intel@breault.com: ...Thanks to you all for your kind replies. The problem is solved! The trick is to pass/set the FILE_FLAG_NO_BUFFERING flag in the USEROPEN function. Note that passing/setting the FILE_FLAG_WRITE_THROUGH flag does not work. I noticed that the writing speed is a little bit slower compared to normal writing, but not significantly...

This is absolutely expected: tricks like writing huge amounts of data without caching always affect performance.

I simply wanted to join the "Not-A-Memory-Leak" group; your problem really didn't look like a memory leak.

Please take a look at the screenshot. It is an example of a very fast memory leak followed by an application crash, after which Windows released all the allocated memory:

Best regards,
Sergey

Steve Lionel (Intel):

You mentioned calling convention problems.  You must make sure that your USEROPEN routine uses the default (C) convention, and not STDCALL, even if the rest of the program uses STDCALL.  This applies to any kind of "callback" routine - the calling convention must match the expected default. (In the case of Windows API routines, of course, that's STDCALL.) For USEROPEN and IMSL passed functions, always use the ifort default convention.

Steve

I have tried the same code for opening a large direct access file but without complete success. UOPEN seems to work if I leave out the statement:

FLAGS_ATTR = FLAGS_ATTR + FILE_FLAG_NO_BUFFERING

But with that statement included, the first READ from the file gives me IOSTAT=39 (severe error on read).

I am using Composer 2013.2.149.

How large is the file? Error code 39 corresponds to the system error "The disk is full" (on Windows platforms), which is very strange. Please provide more details.

iliyapolak:

>>>Also I found a very useful memory monitoring tool called RAMMAP.exe, which can be downloaded from Microsoft, http://technet.microsoft.com/en-us/sysinternals/ff700229.aspx. This tool shows all cached memory for each individual file plus a lot of other information.>>>

I am a bit late, and I am glad that your problem has been solved. I wanted to recommend M. Russinovich's RamMap tool for tracking memory consumption in the system. For finding a memory leak you can use the umdh tool from the Windows debugging tools. UMDH will also give you the call stack of the offending thread(s) and an option to track memory usage as a function of time. A more advanced option is a WinDbg invasive break, where you put breakpoints on the heap freeing and destroying routines.

Link://blogs.msdn.com/b/ntdebugging/archive/2012/04/26/troubleshooting-memory-leaks-with-just-a-dump.aspx

Sergey,

I wish it was that simple. The file is less than 10MBytes on a disk with 350GBytes free.

Chris

app4619:

Where is the file located? W7 puts restrictions and does some strange things with some folders e.g. "Program Files"

The file is on my D: drive which is all user file space. The system (including Program Files) is on C:

Why do you want to turn buffering off?
I would use standard Fortran direct access I/O and FLUSH when required.

John

It appears that Windows allows the memory used by buffering to grow to fill the machine so that eventually the system becomes unresponsive. The odd thing is that the total memory in use increases but the memory used by the application doesn't. "intel@breault.com" reports that using USEROPEN fixes the problem for him. I am asking for more information so I can discover why it does not work for me in very similar circumstances.

Found in the small print of CreateFile() a restriction that the record length should be an integer multiple of the volume sector size. So, having changed my code to use a slightly larger record size which satisfies that criterion, I can now switch buffering off and memory does not grow.
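The rounding involved can be sketched as follows. This is only an illustration: the 4096-byte sector size and the 1000-integer record are assumed values, not figures from this thread. On Windows the real sector size can be read from the bytes-per-sector value returned by the Win32 GetDiskFreeSpace function. The result is in bytes, so it matches RECL= directly only when compiling with /assume:byterecl; otherwise it has to be divided by 4.

```fortran
program sector_aligned_recl
  implicit none
  ! Assumed values, for illustration only.
  integer, parameter :: sector_bytes  = 4096      ! assumed volume sector size
  integer, parameter :: payload_bytes = 1000 * 4  ! hypothetical record: 1000 4-byte integers
  integer :: recl_bytes

  ! Round the payload up to the next whole multiple of the sector size
  ! (integer division truncates, so adding sector_bytes - 1 rounds up).
  recl_bytes = ((payload_bytes + sector_bytes - 1) / sector_bytes) * sector_bytes

  print *, 'payload bytes:       ', payload_bytes
  print *, 'padded record bytes: ', recl_bytes
end program sector_aligned_recl
```

For comparison, the records in the original example are 512 four-byte integers, or 2048 bytes, which would already satisfy the rule on a volume with 512-byte sectors.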

>>...It appears that Windows allows the memory used by buffering to grow to fill the machine so that eventually
>>the system becomes unresponsive...

Chris,

There are no unknowns here; you need to check the Virtual Memory (VM) settings using the System applet in Control Panel. In one of my VM tests I was able to allocate ~1.95 GB of memory on a computer with 128 MB of physical memory (32-bit Windows 2000 Professional). It is a special system that makes it possible to simulate an embedded environment with limited system resources.

Sergey,

I think you might have missed the point of this story. No memory is being explicitly allocated. The size of the process does not grow. But with a standard OPEN (i.e. without the USEROPEN option) the total system memory usage grows until the system becomes unresponsive.

After implementing USEROPEN with FILE_FLAG_NO_BUFFERING, I no longer see any increase in the total system memory usage.

Chris

jimdempseyatthecove:

Too much candy gives one a bellyache.

www.quickthreadprogramming.com

I think the problem could be a clash between the operating system's disk caching and ifort's buffering.
I did tests on another compiler, comparing the buffering performance between XP and Win 7. With Win 7, there was a significant increase in the amount of (unused) memory being allocated to disk caching, and at times it appeared to be taking too much memory, especially for files larger than 2 GB. However, the net performance was significantly improved. There is a sweet spot for disk caching when the active scope of the file being used is less than the total memory installed. (When first comparing Win 7 to XP, it was also my impression that too much memory was being taken for caching, which is the memory leakage claim of this post.)
There could be a conflict between the operating system caching and ifort's BUFFERED='YES'. However, I would again expect that there is a net improvement in performance.
My earlier testing (on another compiler) showed that standard Fortran direct access files performed just as well as the direct system routines that have been suggested in this discussion. Direct access fixed length record files should not pose an efficiency problem.
I have also been surprised by the poor performance of ifort's BUFFERED='NO' and I am surprised this is the default.
Perhaps Intel could check if there is a clash between the OS disk caching in Win7 (and Win8, which I have not tested) and BUFFERED='YES'. Having multiple layers of buffering can be counterproductive.

John

I could not get UOPEN to work, but I did test Fortran I/O with and without BUFFERED='YES'.
Someone might like to add the UOPEN option to the attached test and supply their run time results.
When running Task Manager with this test, the memory usage for disk caching is clearly evident, but it is only using the vacant memory pool.
The attached test was run on Win7, with 12 GB memory and a 128 GB SSD.
The elapsed times for the different options are a bit mixed; not as I would have expected.
Writing with BUFFERED='YES' and a 1.6 GB size appears to be surprisingly slow.
There is a run time difference between CLOSE and CLOSE (status='delete').

I ran alternatives of:
- with and without BUFFERED='YES'
- 400mb and 1,600mb file size
- rewriting an existing file or a new file
It would be good to see the UOPEN option performance. Probably requires a change to the compile and run batch file.

If you are getting a problem with caching and 400 MB files, how much memory is installed?

John

ps: I have attached an updated version of the test, to allow easier modification of the UOPEN option, if someone can help.


iliyapolak:

>>>It appears that Windows allows the memory used by buffering to grow to fill the machine so that eventually
>>the system becomes unresponsive.

By buffering do you mean using a cache manager to store in memory recent file I/O operations?

iliyapolak:

For troubleshooting you can use the cacheset utility. For more advanced usage it is possible to use a kernel debugger (in order to confirm what is responsible for the large memory allocation) and put breakpoints on the memory manager allocation routines. One such routine is MmAdjustWorkingSetSize, which is responsible for trimming the working set. I think that perfmon counters like AsyncCopyRead and LazyWriter need to be monitored when your Fortran application performs disk I/O operations.
