Big Size of Matrix error in Fortran

allexberg

Hi everybody,
I'm not a computer expert, but while working on my civil engineering thesis I have run into a serious problem with my program, apparently caused by the size of my matrices. I'm working with meshes and using Fortran to analyze the entire mesh, including nodes and elements, so I have a number of variables, vectors, and matrices. The biggest matrix has around 61,000 x 61,000 elements. Every time I build the program, the following message appears, with the error highlighted:

--------------------Configuration: Ali-0 - Win32 Debug--------------------
Compiling Fortran...
D:\Thesis\Fortran programming\Copy of shin yokohama fortran\Ali-0.for
D:\Thesis\Fortran programming\Copy of shin yokohama fortran\Ali-0.for(25) : Error: A common block or variable may not exceed 2147483647 bytes
& nfix(5000),r1(61000),sk(61000,61000)
----------------------------------^
Error executing df.exe.

Ali-0.exe - 1 error(s), 0 warning(s)

I know the reason is that the size of my matrix in bytes exceeds the maximum allowed for a single variable. I have no choice but to use this matrix. I've searched the web for causes and solutions, and it seems that this error is related to RAM and the operating system as well.
I have two systems: one with 32-bit Windows and 2 GB of RAM, the other with 64-bit Windows and 6 GB of RAM.
Could anybody help me resolve this problem?

mecej4

Do a "back of the envelope" calculation. The array sk requires about 15 GB in single precision and twice that in double precision. These requirements are not reasonable for the hardware available to you.
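
The estimate above can be reproduced with a few lines of Fortran. This is only a sketch; the module and function names (`mem_estimate`, `dense_bytes`, `to_gib`) are made up for illustration. The one detail that matters is doing the arithmetic in 64-bit integers, because 61,000 x 61,000 x 4 overflows a default 32-bit integer.

```fortran
module mem_estimate
   use iso_fortran_env, only: int64, real64
   implicit none
contains
   ! Bytes needed for a dense nrow x ncol matrix with the given element size.
   pure function dense_bytes(nrow, ncol, bytes_per_element) result(nbytes)
      integer, intent(in) :: nrow, ncol, bytes_per_element
      integer(int64) :: nbytes
      ! Convert every factor to 64 bits before multiplying to avoid overflow.
      nbytes = int(nrow, int64) * int(ncol, int64) * int(bytes_per_element, int64)
   end function dense_bytes

   ! Express a byte count in units of 2**30 bytes.
   pure function to_gib(nbytes) result(gib)
      integer(int64), intent(in) :: nbytes
      real(real64) :: gib
      gib = real(nbytes, real64) / 1024.0_real64**3
   end function to_gib
end module mem_estimate
```

For example, `dense_bytes(61000, 61000, 4)` gives 14,884,000,000 bytes and `dense_bytes(61000, 61000, 8)` gives twice that; both are far above the 2,147,483,647-byte limit quoted in the compiler error.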

It is probably true that you do not need dense matrices since matrices arising from discretization of differential equations are sparse. However, you will have to rewrite your programs to exploit that sparseness. MKL provides linear algebra routines that are suited to sparse matrix manipulations.
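
To make "exploit that sparseness" concrete, here is a minimal sketch of compressed sparse row (CSR) storage with a matrix-vector product, written in plain Fortran. It does not use MKL, and the names (`csr_sketch`, `csr_matvec`, `val`, `col`, `rowptr`) are illustrative assumptions, not taken from the original program. Only nonzero entries are stored, so memory grows with the number of nonzeros rather than with n squared.

```fortran
module csr_sketch
   implicit none
contains
   ! y = A*x for a square matrix A stored in CSR form:
   !   val(k)    : nonzero values, stored row by row
   !   col(k)    : column index of val(k)
   !   rowptr(i) : position in val/col where row i starts;
   !               rowptr(n+1) = number of nonzeros + 1
   subroutine csr_matvec(n, rowptr, col, val, x, y)
      integer, intent(in)  :: n
      integer, intent(in)  :: rowptr(n+1)
      integer, intent(in)  :: col(:)
      real(8), intent(in)  :: val(:)
      real(8), intent(in)  :: x(n)
      real(8), intent(out) :: y(n)
      integer :: i, k
      do i = 1, n
         y(i) = 0.0d0
         ! Visit only the stored (nonzero) entries of row i.
         do k = rowptr(i), rowptr(i+1) - 1
            y(i) = y(i) + val(k) * x(col(k))
         end do
      end do
   end subroutine csr_matvec
end module csr_sketch
```

As a rough guide, and assuming a regular grid of 8-node brick elements with 3 degrees of freedom per node, each row has at most 81 nonzeros, so 61,000 rows would need on the order of 5 million stored values instead of 3.7 billion.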

allexberg

Thank you so much, mecej4, for your comment.
Even if I exploit the sparseness of the SK(61000, 61000) matrix above, it could only be condensed to a matrix of SK(61000, 30500), which still needs lots of bytes.
I've heard there is another solution using "Gfortran" as well as the Math Kernel Library (MKL). I'm completely new to both and literally have no idea what they are or how I could use them in my case.
Do you know which one is better to stick with, I mean, which one is more straightforward to use in my case?
Can you please explain a little more about how I can use MKL, or is there a simple set of instructions showing how to use MKL with Fortran?
Thanks

mecej4

Instead of relying on dubious rumours, seek guidance from a knowledgeable source.

> even if I want to use spareness of the above SK(61000, 61000) matrix, it could be densed to a matrix of SK(61000, 30500) which is still needs lots of bytes.

There is no perceptible basis for that statement, and there are many situations where we know that statement not to be true.

> Can you please explain a little bit more about how can I use MKL, or is there any simple instruction which shows me how to use MKL with fortran??

Take courses on numerical mathematics/numerical analysis/computing.

jimdempseyatthecove

"common block or variable may not exceed 2147483647 bytes"

If the array is in COMMON or module, make the array allocatable and perform the allocation in an init subroutine you call at the start of the program.

If the variable is in subroutine/function scope use the heap arrays option.

(this size of allocation will only work in x64 app)
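
A minimal sketch of the first suggestion, assuming the array currently lives in a COMMON block or module with fixed bounds. The module name `big_arrays` and the routine name `init_big_arrays` are hypothetical; only `sk` comes from the original program.

```fortran
module big_arrays
   implicit none
   ! Replaces the fixed-size declaration sk(61000,61000).
   real, allocatable :: sk(:,:)
contains
   ! Allocate sk at run time. stat is 0 on success and nonzero on failure.
   ! The contents of sk are undefined until the caller assigns them.
   subroutine init_big_arrays(nrow, ncol, stat)
      integer, intent(in)  :: nrow, ncol
      integer, intent(out) :: stat
      if (allocated(sk)) deallocate(sk)
      allocate(sk(nrow, ncol), stat=stat)
   end subroutine init_big_arrays
end module big_arrays
```

Any routine that needs the array then adds `use big_arrays`, and the main program calls `init_big_arrays` once at startup and checks `stat` before going on.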

Jim Dempsey

www.quickthreadprogramming.com
Sergey Kostrov
Quoting allexberg ...The biggest matrix has around 61,000x61,000 arrays...

[SergeyK] For a single-precision 61,000x61,000 matrix you need

61,000*61,000*4 = 14,884,000,000 bytes = 13.86GB of memory

--------------------Configuration: Ali-0 - Win32 Debug--------------------

[SergeyK] It is impossible to allocate 13.86GB of memory for one application on a 32-bit Windows platform.

...
...\Ali-0.for(25) : Error: A common block or variable may not exceed 2147483647 bytes
...

I have 2 system 1- Window 32 bit 2GB ram, other Windows 64 bit, 6.00 GB.

[SergeyK] You need to use a 64-bit platform with a large amount of physical memory, for example 16GB. You could try to increase the virtual memory size to 16GB on the 6GB platform, but the performance of your application will be significantly affected.

Since your matrix is very large, I would recommend using a double-precision data type to improve the accuracy of your calculations. In that case a 64-bit platform with 32GB of physical memory has to be used.

Best regards,
Sergey

allexberg

Thanks, Jim Dempsey and Sergey Kostrov, for your helpful comments.

Sergey Kostrov, you said I have to increase virtual memory to 16 GB. I checked: the space available for virtual memory on my platform seems to be 521GB. (I found it under:
Control Panel\System\Advanced system settings\Advanced\Performance Settings -> Advanced tab -> Virtual memory.) I then set it to 64GB; it had been only 6GB.
To test the new setting, I wrote a small, simple program with one large allocatable array, the same sk(61000,61000) matrix as before. Running this program with 64GB of virtual memory, I got another warning:
".exe has triggered a breakpoint ".
This happens exactly at the allocation line of my code.
When I decrease the size of my array, everything is okay. The problem is obviously due to the big size of my array again.

While reading related Intel forum threads and other sources, I found suggestions involving stack and heap arrays, as well as increasing the stack size.
According to the Intel posts, the default stack size is around 1MB.
If I want to increase it:
1- How much would I have to increase the stack size in my case (for the sk(61000,61000) matrix)?
2- What is the maximum stack size, if there is one, for example on a 64-bit Windows platform?

Tim Prince

The Win32 default thread stack size is 1MB, but you didn't give any indication of why you mention it here. This is distinct from the stack size you set with /link /stack: or with editbin, which has a much larger default. Intel libraries, such as those you get with MKL, boost the default thread stack size to 2MB for Win32 and 4MB for x64, and provide function calls and environment variables to control it. All of this is probably irrelevant to the subject you have been discussing.

anthonyrichards

I think you need to rethink the size of your arrays, which appears to be directly related to the 'fineness' of your mesh.
You are clearly asking for far more than your system can cater for without rolling data in and out from disk.

Do you really need the fineness that requires 61000 divisions?

Are the effects you are modelling likely to have a range covering the whole width of your mesh?

If the effects are likely to be confined to a local area, then you should produce local meshes using far fewer points.

If the range of the effects you are modelling reach over the whole physical space spanned by your mesh (i.e. have long 'wavelengths'), then I suggest reducing the mesh fineness, as variations over small regions are likely to be small and could eventually be filled in using interpolation from a coarser mesh.

jimdempseyatthecove

Is your build configuration Win32 or x64?
If Win32 then you are building a 32-bit application (which will run on an x64 platform but with the 2/3 GB memory restrictions).

To convert to 64-bit in VS click on the pull-down arrow on the platform (Win32), then New, then you should get or find x64, also choose to import settings from Win32.

Rebuild and run your test program.

Jim Dempsey

www.quickthreadprogramming.com
Sergey Kostrov
Quoting allexberg ...After running this simple program with 64GB vitual memory, I got another warning like:
".exe has triggered a breakpoint ".
This happens exactly at allocation line of my code.

[SergeyK] I understand that it still can't allocate a memory block for the 61,000x61,000xSizeof(your-data-type) matrix.
My question is: did you try to execute a 32-bit program on the 64-bit platform?

When I decrease the size of my array, everything is okay. The problem is obviously due to the big size of my array again.

[SergeyK] Please give exact numbers. Is it for a 32-bit application or a 64-bit application?

During reading in related Intel forums and other sources; there were some suggestion about involving with stack and heap arrays concepts,
as well as increase stack size.
According to the Intel posts, default stack size is around 1MB.

If I want to increase it
1- Is there any idea how much do I have to increase stack size for my case (In the case of sk(61000,61000) matrix) ?

[SergeyK] Let me answer as generically as possible:

it has to be greater than 61,000x61,000xSizeof(your-data-type)

2- What is the maximum stack limit if there is, for example for windoms 64 platform?

[SergeyK] I think a couple of TBs (it could be something like 0xFFFFFFFFFFFFFFE0).

You need to build a 64-bit version of your program, and I would recommend using memory from the heap instead of from the stack. Please post updates on your progress.

Best regards,
Sergey

John Campbell

My understanding of this problem is if you use ALLOCATE in a 64-bit version, then there should be no need to consider the heap or stack size.
My preference is to declare the array as allocatable inside a module, such as:

Module Big_Array
real*8, allocatable, dimension(:,:) :: sk
end module Big_Array

If the size of array "sk" is significantly larger than the available physical memory, but less than the size of pagefile.sys, then expect a long wait for the program to run.
You will benefit from many of the old programming techniques that suited paged memory in the 70's and 80's, as demonstrated by the run-time difference between the two loops below. Fortran stores arrays column by column, so the second loop, where the first index varies fastest, walks through memory sequentially and pages far less.

do i = 1,n
   do j = 1,n
      sk(i,j) = 0
   end do
end do

compared to

do j = 1,n
   do i = 1,n
      sk(i,j) = 0
   end do
end do

With n = 61,000, my estimate is this is a 28 gb array.
You will probably find that any approach that considers either sparsity, symmetry or the banded nature of the array will significantly reduce both the storage demand and the run time. Look for these savings.
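
As an illustration of the symmetric, banded option, here is a minimal sketch of half-band storage, in which only the diagonal and the entries to its right within the bandwidth are kept. The module and routine names (`band_sketch`, `band_get`, `band_add`) are assumptions for illustration; the layout, with the diagonal in column 1, is one common convention and may not match what your own solver expects.

```fortran
module band_sketch
   implicit none
contains
   ! Value of A(i,j) for a symmetric matrix whose upper half-band is stored as
   ! band(row, offset), with offset = column - row + 1 (offset 1 = diagonal).
   pure function band_get(band, i, j) result(a)
      real(8), intent(in) :: band(:,:)
      integer, intent(in) :: i, j
      real(8) :: a
      integer :: row, offset
      row = min(i, j)                  ! symmetry: A(i,j) = A(j,i)
      offset = abs(j - i) + 1
      if (offset <= size(band, 2)) then
         a = band(row, offset)
      else
         a = 0.0d0                     ! outside the band
      end if
   end function band_get

   ! Add v to A(i,j) during assembly. Entries below the diagonal are skipped
   ! because symmetry implies them; the caller must stay inside the band.
   subroutine band_add(band, i, j, v)
      real(8), intent(inout) :: band(:,:)
      integer, intent(in) :: i, j
      real(8), intent(in) :: v
      if (j < i) return
      band(i, j - i + 1) = band(i, j - i + 1) + v
   end subroutine band_add
end module band_sketch
```

With a half-bandwidth of, say, 6,000 (an example value), a 61,000-row matrix needs 61,000 x 6,000 x 8 bytes, about 2.9 GB in double precision, instead of roughly 30 GB for the dense array. That is still above 2 GB, so it has to be allocated dynamically in an x64 build.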

John

jimdempseyatthecove

For what it's worth: yesterday I changed the properties on my Windows 7 x64 from 16GB to 128GB. Then I ran a test program

real(8), allocatable :: skd(:,:)
real, allocatable :: sk(:,:)
...
allocate(skd(61000,61000), STAT=i)
(success)
do i=1,61000
skd(i,i) = 0.0
end do
(success)
allocate(sk(61000,61000), STAT=i)
(fail)
??? I do not know why this failed. I think the error code was 47, but I am not sure.

Because the working part above only fills the diagonal of skd, the virtual memory commit remained under 16GB. So I changed the above to

do j=1,6100
do i=1,61000
skd(i,j) = 0.0
end do
end do

When page file started expanding, it never quit (after 2 hours runtime). I had to reboot my system.
I have not re-run the test since.

I've felt Windows Virtual Memory system was severely lacking when configured for applications with memory requirements many times larger than physical memory.

BTW I've written two virtual memory operating systems myself, and other than following John's advice of efficiently sequencing your loop controls, you can get relatively good performance out of an application much larger than physical memory.

Jim Dempsey

www.quickthreadprogramming.com
Sergey Kostrov
Quoting jimdempseyatthecove For what its worth. Yesterday I changed the properties on my Windows 7 x64 from 16GB to 128GB. Then ran a test program
...
I do not know why this failed, I think the error code was 47...

Jim,

Do you think it was returned by the GetLastError Win32 API function? I'm interested to see a screenshot, if possible.

By the way, I just checked MSDN, and for the GetLastError function the code 47 is not listed at all.

I wonder if the code 47 is some internal error code of the Fortran compiler?

Best regards,
Sergey

John Campbell

Jim,

Like you, I have also felt that Windows Virtual Memory (WVM) is severely lacking.
However, it is very difficult to provide an objective test.
I developed most of my understanding of virtual memory using Prime and Vax computers. I considered Prime's virtual memory to be superior to any other I have used (certainly more flexible than Vax), although the expectation we have of disk response for paging has changed a lot in 20 years.
From what I can remember, the paging space in 1980 was 2 platters of a 300mb CDC drive or about 40 MB, which is a lot less than the 128 GB you are testing today.
My present approach is to never use paging while the program is running and make sure the memory footprint of the program is less than 80% of the physical memory installed.
The "out of core" algorithms I use to address larger problems are far superior to WVM.

For those few lazy times when you want a quick solution and rely on WVM, it is a bit of a worry to hear it doesn't work. I experienced a similar problem with Win2k when resorting to WVM years ago, when installed memory was much less. (Even then, it was quicker to develop an out-of-core approach than to solve the problems of Windows paging.)

It would be good to know the reason that it has failed this time, as these few bad examples can lead to a view of WVM that may not be valid. Past experience was that identifying the real error was very elusive, so I resorted to a solution I knew would lead to an answer.

Alex should be looking at a solution that can run in the available physical memory.

John

allexberg

I appreciate all your help.
>Tim: "Win32 default thread stack size is 1MB, but you didn't give any indication why you mention that here. This is distinct from the stack size which you set in /link /stack: or by editbin, which has a much larger default. Intel libraries, such as you get with MKL, boost the thread stack size default to 2MB for win32 and 4MB for X64, and provide function calls and environment variables to control it. All probably irrevelant to the subject you have been discussing."

Thanks Tim for your advice......

>anthonyrichards: "Do you really need the fineness that requires 61000 divisions?
Are the effects you are modelling likely to have a range covering the whole width of your mesh?"

Yeah; actually, the Sk matrix is just one of my matrices, though the biggest. Even if I shrink my mesh, it will still be large enough to trigger the error again, and unfortunately it is not my only array: I have similar arrays and matrices in my program. This is a 3D mesh of a soil medium for tunnelling simulation.

>Jim Dempsey:"Is your build configuration Win32 or x64?

If Win32 then you are building a 32-bit application (which will run on an x64 platform but with the 2/3 GB memory restrictions). To convert to 64-bit in VS ......."

>SergeyK: "Please, give exact numbers. Is it for a 32-bit application or 64-bit application?"

I am using Visual Studio 2008. My build configuration was Win32 (and at the moment I am using my 64-bit Windows system). I checked it under Project Properties\Platform\Win32. Next, I changed it to x64 from the same window, via Configuration Manager, and changed the Active solution platform to x64.

>[SergeyK] "Let me answer in as generic as possible way, it has be greater than 61,000x61,000xSizeof(your-data-type):"


I changed the active platform to x64 and that went without problems; again, I had increased my virtual memory to nearly 100 GB. I was able to run the program about halfway. Then, all of a sudden, another error appeared. The error in the console window was:
forrtl: severe (157): Program Exception - access violation
The message in the pop-up window (Microsoft Visual Studio) was:

unhandled exception at 0x000000014000da49 in(name of my file.exe): 0xc0000005: Access violation writing location 0x0000000000000004

I checked the program itself and found no problem with the allocation, so I think it is again related to memory.
Any thoughts on this?

It is worth mentioning that my program has many variables, arrays, and matrices, but most of them are small. The biggest one was the SK matrix I mentioned, which I managed to shrink to Sk(55000,8000). But I think that as the program goes on, memory use keeps increasing until no memory is left. So I made my big arrays allocatable, including Sk, but still got the above message. I'm not sure whether this error is caused by memory or by an internal error in the program!
Thanks.

Les Neilson

It is likely that you have an array bound being exceeded.
In Project -> Properties -> Fortran -> Run-time I suggest that you turn on "Generate Traceback Information" and "Check Array and String Bounds".
The traceback will hopefully tell you which routine is failing and the line number, and if it is a bounds error, the array and the value of the subscript.
Once you have fixed the problem you can turn the checks off again.
(Personally I prefer to leave traceback on, so if a client should ever get one of these errors I can get more info of where the problem lies. YMMV)

Les

allexberg

Thanks, Les.
Both of the options you mentioned were already "on" by default when I checked them.

jimdempseyatthecove

John,

>>My present approach is to never use paging while the program is running and make sure the memory footprint of the program is less than 80% of the physical memory installed

Assume for the moment you have an application data requirement that is several/many times that of physical memory.

Solution 1:

Rewrite code to use file I/O

Solution 2:

Rewrite code to assure array indexes (in do/for loops) are ordered favorably for virtual memory access

Note, Solution-2 may require no changes, had you originally organized your loops in a paging favorable manner (which by the way is also cache favorable w/rt TLB).

When some rewrite for Solution 2 is necessary, it generally requires little effort.

Should access be totally random, then Solution 1 is not a solution either.

Jim Dempsey

www.quickthreadprogramming.com
jimdempseyatthecove

>>Access violation writing location 0x0000000000000004

Writing into first 4KB of memory is generally indicative of a NULL reference/pointer.
This can be due to:

incorrect calling (args do not match between caller/callee)
failing to allocate
calling library (C) functions with incorrect calling parameters
writing to a stack-based array outside its bounds, trashing a pointer/reference/array descriptor

In a Debug build, you can usually find the location in your source where the error occurred (Call Stack). When the error occurs in a library function, the call stack sometimes does not expose the caller as source+line.

Jim Dempsey

www.quickthreadprogramming.com
allexberg

Jim,
I changed the active platform to x64 and increased my virtual memory to 100 GB. I was able to run the program about halfway. Then, all of a sudden, another error happened. The error in the console window was:
forrtl: severe (157): Program Exception - access violation

and in the pop-up window:

unhandled exception at 0x000000014000da49 in (name of my file.exe): 0xc0000005: Access violation writing location 0x0000000000000004

Here are parts of the main program and of the subroutine. The error occurs in subroutine formk, at the line marked "(Error occurs here)".
_______________________________________________________________

c main program
implicit none
integer::np,ne,nb,ndf,ncn,nld,nmat,nszf,nstep
integer::imat(16250),nbc(3900),nfix(3900)
real::ort(10,2),r1(55250),x(8),y(8),z(8)
real::detj,coord(8,3),dn(3,8),ajm(3,3),ajmi(3,3),estifm(24,24)
real::shapef(8),bm(6,24),db(6,24),dnxy(3,8),dmat(6,6),s(3,8)
integer::i,j,k,l,n,m,ii,jj,kk,nband,istep
real::anu,comm,d11,d12,d44
real,allocatable::cord(:,:)
integer,allocatable::nop(:,:)
real,allocatable::sk(:,:)
.
.
.
call formk(nszf,ne,nb,nbc,nfix,ncn,nop,ndf,cord,imat,ort,estifm,sk,x,y,z)
.
.
.
end

c subroutine
subroutine formk (nszf,ne,nb,nbc,nfix,ncn,nop,ndf,cord,imat,ort, estifm,sk,x,y,z)
implicit none
integer::ne,nb,ndf,ncn,nld,nszf,AllocateStatus3
integer::imat(16250),nbc(3900),nfix(3900)
real::ort(10,2),estifm(24,24),bm(6,24),x(8),y(8),z(8)
integer::i,j,k,l,n,m,ii,jj,kk,nband,nrowb,ncolb,ncol,nx,nr,icon
real,allocatable::cord(:,:)
integer,allocatable::nop(:,:)
real,allocatable::sk(:,:)
.
.
.
nband=6000
ALLOCATE (sk(nszf,nband), STAT = AllocateStatus3)
IF(AllocateStatus3 /= 0) STOP "*** Not enough memory ***"
.
.
.
do 500 n=1,nb !(Error occurs here)
nx=10**(ndf-1)
i=nbc(n)
nrowb=(i-1)*ndf
do 490 m=1,ndf
nrowb=nrowb+1
icon=nfix(n)/nx
if(icon)450,450,420
420 sk(nrowb,1)=1.0d0
do 430 j=2,nband
sk(nrowb,j)=0.0d0
nr=nrowb+1-j
if(nr)430,430,425
425 sk(nr,j)=0.0d0
430 continue
nfix(n)=nfix(n)-nx*icon
450 nx=nx/10
490 continue
500 continue
return
end

_______________________________________________________________
In subroutine formk, when execution reaches loop 500, the n=1 iteration is okay and the loop goes on; on the next iteration, just before n=2, I get the error:

unhandled exception at 0x000000014000da49 in (name of my file.exe): 0xc0000005: Access violation writing location 0x0000000000000004.

I've checked the program several times, but can't find the problem!
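
One thing stands out in the listing above, although nobody in the thread has confirmed it as the cause: `formk` declares `cord`, `nop`, and `sk` as allocatable dummy arguments, yet it appears to be called as an external subroutine. The Fortran standard requires an explicit interface for any procedure with allocatable dummy arguments; without one, the caller and callee can disagree about how the array is passed, which is one plausible way to end up writing through a near-null address. A minimal sketch of the usual remedy, placing the routine in a module, follows. The module name `stiffness_mod`, the `stat` argument, and the reduced argument list are assumptions for illustration.

```fortran
module stiffness_mod
   implicit none
contains
   ! As a module procedure, formk has an explicit interface, so the
   ! allocatable dummy argument sk is passed correctly from any caller
   ! that contains "use stiffness_mod".
   subroutine formk(nszf, nband, sk, stat)
      integer, intent(in)  :: nszf, nband
      real, allocatable, intent(inout) :: sk(:,:)
      integer, intent(out) :: stat
      if (allocated(sk)) deallocate(sk)
      allocate(sk(nszf, nband), stat=stat)
      if (stat /= 0) return      ! let the caller report the failure
      sk = 0.0
   end subroutine formk
end module stiffness_mod
```

The main program would then place `use stiffness_mod` before `implicit none`, declare its own `real, allocatable :: sk(:,:)`, and call `formk(nszf, nband, sk, stat)`.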

Sergey Kostrov
Quoting allexberg ...
>SergeyK: "Please, give exact numbers. Is it for a 32-bit application or 64-bit application?"

I am using Visual Studio 2008. My build ConfigurationI was Win32 (and, at the moment, I was using my windows 64bit system). So, yeah,ckecked it from Project Property\platform\win32. Next step I changed it to 64x from the same window\configuration managment\.. and changed Active solution platformto 64x.
...

I'm ready to implement for you a very simple 64-bit application in C (source code will be provided) that will allocate as much memory as possible from the heap and on the stack.

Are you interested in spending some time on it to test the memory allocation capabilities of your 64-bit Windows platform?

Once again, everything will be very simple, and I think it could help you understand how to solve your main problem.

Best regards,
Sergey

allexberg

Sergey,
I appreciate your assistance.
As you know, I'm working with Fortran, not C.
But anyway, yes, I'm interested in learning more about memory allocation and the way it works.
How do you implement it?

Thanks
Alex

Sergey Kostrov

Hi Alex,

I'll do it over the weekend, and I truly believe that it could help you understand the allocation of huge memory blocks. Your case is very interesting!

The test case will be implemented in C (as small as possible), and I remember that you program in Fortran.

Best regards,
Sergey

Steve Lionel (Intel)

As others have noted, you need to use dynamic allocation to exceed 2GB on Windows, even 64-bit Windows. Please see Memory Limits for Applications on Windows for more information.

Steve
Sergey Kostrov
Quoting Steve Lionel (Intel) As others have noted, you need to use dynamic allocation to exceed 2GB on Windows, even 64-bit Windows. Please see Memory Limits for Applications on Windows for more information.

A quote from the article:

...The practical limit is about 1.75GB due to space used by Windows itself...

The limit could be pushed up to 1.95GB for small utilities that are very clean in terms of DLL dependencies. A dependency list in that case looks like:

kernel32.dll
msvcrt.dll

or

kernel32.dll
user32.dll
msvcrt.dll

Sergey Kostrov

Hi everybody,

Alex has a very challenging case: he needs to process a 61,000 x 61,000 matrix. I've implemented a very simple and configurable test application (MemTestApp) that allows allocating different amounts of memory.
Take into account that a 61,000 x 61,000 matrix of a double-precision data type needs at least ~28GB of memory.

Steve Lionel (Intel) tested the application on his 64-bit Windows platform, and the application allocated 8GB of memory.

I wonder if somebody else could try the test application on a 64-bit Windows platform?

A zip archive with sources is enclosed. This is a Visual Studio 2005 project, and the test application can be built for the Win32 and x64 platforms.

Please take a look at the Readme.txt file, since it has some release notes.

Best regards,
Sergey

Attachments:

MemTestAppv2.zip (6.1 KB)
Les Neilson

Running (the debug x64 version) on Vista Ultimate 64bit OS with 8Gb RAM and 8Gb Virtual Memory

with TEST_MEMALLOC_1
The allocations worked until "Failed to Allocate 13.86GB"

with TEST_MEMALLOC_3 and _MXSIZE = 16384
I got stack overflow on first allocation 0.25Gb

Les

Sergey Kostrov
Thank you, Les!

Quoting Les Neilson Running (the debug x64 version) on Vista Ultimate 64bit OS with 8Gb RAM and 8Gb Virtual Memory

with TEST_MEMALLOC_1
The allocationsworked until "Failed to Allocate 13.86GB"

[SergeyK] Did you try to increase the size of the VM to 16GB, 32GB, or 64GB?

with TEST_MEMALLOC_3 and _MXSIZE = 16384
I got stack overflow on first allocation 0.25Gb

[SergeyK] Did you try to uncomment the _TEST_HEAPSTACK_RESERVECOMMIT_VALUES_03 macro?
( It is in the CM1 section. )

There are two responses already (from Steve and Les), and I can see that the test application is working.

Alex, do you have any updates on your progress?

Best regards,
Sergey

Les Neilson

Sergey
No to both questions.
I just had time to do a quick run through before I had to resume my normal day.
I will try later.

Les

John Campbell

I thought that a simple Fortran example of allocating large arrays might help.
This example progressively allocates larger arrays and reports the status. By running it alongside Task Manager, you may be able to identify the use of both physical and virtual memory.
If you change the loop parameters on line 3 (?) you can achieve different results, depending on the size of physical memory installed and the virtual memory allowed by your page file setting.

Virtual memory can be changed by running Control Panel > SYSTEM
selecting
Advanced system settings
> Advanced
> Performance Settings...
> Advanced
> Virtual memory Change...

The program is:

! simple program to progressively allocate large arrays
!
      do k = 31000, 61000, 5000
         call sk_test (k)
      end do
      end

      subroutine sk_test (n)
!
      real,   allocatable, dimension(:,:) :: sk1, sk2, sk3, sk4
      integer stat, n, m
!
      write (*,1001) ' Performing memory test for n = ',n
1001  format (//a,i0/)
!
      m = n/4
!
      allocate ( sk1(m,n), stat=stat)
      call use_sk ('sk1', sk1, m, n, stat, 1)
!
      allocate ( sk2(m,n), stat=stat)
      call use_sk ('sk2', sk2, m, n, stat, 2)
!
      allocate ( sk3(m,n), stat=stat)
      call use_sk ('sk3', sk3, m, n, stat, 3)
!
      allocate ( sk4(m,n), stat=stat)
      call use_sk ('sk4', sk4, m, n, stat, 4)
!
      end

      subroutine use_sk (sk_name, ski, n, m, stat, k)
!
      character sk_name*(*)
      integer   n, m, stat,k,  i,j
      real      ski(n,m)
      integer*8 :: total = 0
      integer*8 :: bytes, addr
      real*8    :: gb
      character looks*15
      external  looks
!
      write (*,1001) sk_name, n, m, stat
1001  format ('Array allocation : ',a,'(',i0,',',i0,') allocated : status = ',i0)
      if (stat /= 0) then
         write (*,*) ' error status on allocation'
         return
      end if
!
      addr  = loc(ski)
!     ask for a 64-bit count so arrays with more than 2**31-1 elements do not overflow
      bytes = size (ski, kind=8)
      bytes = bytes * 4
      write (*,*) 'Allocation size    = ', looks(bytes)
      write (*,*) 'Allocation address = ', looks(addr)
!     first index varies fastest, so memory is touched sequentially
      do j = 1,m
        do i = 1,n
           ski(i,j) = i+j
        end do
      end do
!
      if (k==1) total = 0
      total = total + bytes
      gb    = total / 1024./1024./1024.
      write (*,1002) 'Ski has been initialised',looks(total),' bytes          = ', gb,' gb'
1002  format (a,a,a,f0.3,a)
      end

      character*15 function looks (bytes)
!     formats a byte count with thousands separators, e.g. ' 14,884,000,000'
      integer*8 bytes
      integer   i, j
      character aa*12, bb*15
      write (aa,fmt='(i12)') bytes
      bb = ' '
      do i = 3,12,3
         j = (i/3)*4
         if (aa(i-2:i) /= ' ') then
            bb(j-3:j-1) = aa(i-2:i)
!           no comma after the last group; bb(16:16) would be out of bounds
            if (i < 12) bb(j:j) = ','
         end if
      end do
      looks = bb
      end

Hopefully this is a simple example of using large amounts of virtual memory. The maximum size that can be used is a combination of the amount of physical and virtual (paged) memory available. I successfully ran 13.9gb of array with 12gb of memory and 12gb of paging.

If you start with do loop parameters on line 3 that fit your physical memory and then change to ones that are bigger than either physical or virtual memory, you will notice a significant slowdown and hopefully not a crash. Larger sizes again should lead to a crash. I tested this on XP_64 and I'd expect that Win7_64 would perform better.
I changed line 3 to "do k = 31000, 66000, 5000" and it slowed down.
You need to distinguish between the program or system crashing and very slow paging performance caused by the large size of the disk transfers (in my case, 12gb at about 30mb per second on HDD).
Hopefully this will demonstrate my earlier recommendation of not using more memory than is physically installed.

John

Sergey Kostrov
Quoting John Campbell ...Hopefully this is a simple example of using large amounts of virtual memory. The maximum size that can be used is a combination of the amount of physical and virtual(paged) memory available. I successfully ran 13.9gb of array with 12gb of memory and 12gb of paging.

[SergeyK] Hi John, Thank you! Did you try to increase the VM size to 32GB, or 64GB, for example?

...I tested this on XP_64 and I'd expect that Win7_64 would perform better...

[SergeyK] The VM managers on both platforms are highly optimized. I successfully ran a test case where the ratio between VM and PM (physical memory) was 16.

By the way, I looked at MSDN and it clearly says that a 64-bit process could use up to 8TB of memory.

Best regards,
Sergey

John Campbell

Sergey,

I was hoping that the example I provided could demonstrate how poor a solution virtual memory is, especially with a hard disk drive. Much better options are to either buy more memory or rewrite the solution to minimise (package) the disk I/O in a more efficient way.
At present I am using 2 PCs. One has an HDD, and running VM on it is a waste of time. The other has an SSD, but the disk capacity is not enough to significantly increase the VM size. Neither of these PCs was configured with the expectation of a 64gb program footprint, so running VM to 64gb is not a good solution.
With memory prices now less than $10 / gb, if you need to run a problem of that size, buy more memory.
You are right to identify that large memory size programs can be used, although, having devoted a lot of my programming effort over many years to managing the available memory, it looks too easy to just scale up the problem size.

John

Sergey Kostrov
Quoting John Campbell Sergey,

I was hoping that the example I provided could demonstrate how using virtual memory is such a poor solution option,
expecially with a hard disk drive. Much better options are to either buy more memory...

I completely agree with you. In my case, for 64-bit software development a system with 32GB or more of RAM will be considered.

Best regards,
Sergey

Imagen de allexberg

Dear Sergey;
I'm realy sorry for not being able to answer sooner than this. Was too busy!
Any way, TRULY appreciateyou and all others forthe constructive comments.

Iworked a bit more on your program for allocation.
So,my virtual memory had set to 100 GB,while workingon 64 bit platform. I was able to allocate up to 64GB on "TEST_MEMLLOC_01"of your program.

64-bit Windows platform
Allocating memory with 'malloc' CRT-function - 1D
Succesfully Allocated 64GB
Press ESC to Exit...

A friend of mine told me that I had to invoke the program from its original location, ..../64x/Release/MemTestApp.exe
Doing that gave no error, so it seems that all of the 64GB was allocated in virtual memory.

Imagen de Sergey Kostrov
Quoting allexberg ...
So, my virtual memory had set to 100 GB, while working on 64 bit platform. I was able to allocate up to 64GB...

Hi Alex,

Thank you for the update! So, your system is configured and you now need to do similar test(s) with a Fortran application.
I think you're getting closer to resolving your problem.
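A similar test in Fortran could look like the minimal sketch below. It is not taken from MemTestApp; the module name `alloc_check` and the helpers `matrix_bytes` and `try_allocate` are hypothetical names chosen for illustration. It assumes a 64-bit build and a compiler that supports the Fortran 2008 `storage_size` intrinsic with a `kind` argument. Note that a successful `allocate` only reserves address space; pages are not necessarily backed by RAM or the paging file until they are written.

```fortran
module alloc_check
  use, intrinsic :: iso_fortran_env, only: int64, real32
  implicit none
contains
  ! Bytes needed for an n x n single-precision matrix.
  pure function matrix_bytes(n) result(bytes)
    integer(int64), intent(in) :: n
    integer(int64) :: bytes
    bytes = n * n * (storage_size(1.0_real32, kind=int64) / 8_int64)
  end function matrix_bytes

  ! Try to allocate an n x n matrix on the heap.
  ! Returns the stat value from allocate (0 means success).
  function try_allocate(n) result(ierr)
    integer(int64), intent(in) :: n
    integer :: ierr
    real(real32), allocatable :: sk(:,:)
    allocate(sk(n, n), stat=ierr)
    if (ierr == 0) deallocate(sk)
  end function try_allocate
end module alloc_check

program alloc_test
  use alloc_check
  implicit none
  integer(int64), parameter :: n = 61000_int64  ! matrix order quoted in the thread
  print '(a,i0,a)', 'Requesting ', matrix_bytes(n), ' bytes'
  if (try_allocate(n) == 0) then
    print '(a)', 'Allocation succeeded'
  else
    print '(a)', 'Allocation failed'
  end if
end program alloc_test
```

Using an allocatable array with `stat=` avoids the compile-time size check that a fixed-size declaration triggers, and it lets the program report a failure instead of aborting.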

Best regards,
Sergey

Imagen de Sergey Kostrov

Quoting allexberg ...
So, my virtual memory had set to 100 GB, while working on 64 bit platform. I was able to allocate up to 64GB
...

Hi Alex,

I wonder if you could post a screenshot of the Windows Task Manager ( property page 'Performance' )
taken after the allocation of 64GB was done. I'm very interested to see it. Also, how long does it take to allocate 64GB?

Best regards,
Sergey

Imagen de John Campbell

Alex,

I'm not sure if you used the Fortran example I provided. If you are able, could you change the code line:
do k = 31000, 61000, 5000
to a larger size, say
do k = 40000, 130000, 10000

It would be interesting to see how the program performs as it approaches the amount of physical memory installed and how it then continues, as it approaches the 64 gb.

Changing the order of the do loops in use_sk would also be interesting.
Change to:
   do i = 1,n
      do j = 1,m
         ski(i,j) = i+j
      end do
   end do
This should have a significant run-time penalty as the problem approaches the physical memory limit.
I don't know if ifort would optimise the loops to improve the virtual memory performance?
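The effect of loop order can be tried on a small matrix with a sketch like the one below. It is not the attached ski2.f95; the module `fill_order`, its two subroutines, and the 4000x4000 size are assumptions made for illustration, and it assumes the original use_sk ran with j as the outer loop. Fortran stores arrays column by column, so the inner loop over i touches consecutive memory, while the inner loop over j jumps a whole column length on every step. An optimising compiler may interchange the loops, so the measured difference can vary.

```fortran
module fill_order
  use, intrinsic :: iso_fortran_env, only: int64, real32
  implicit none
contains
  ! Inner loop over the first index: consecutive memory in Fortran.
  subroutine fill_column_major(ski)
    real(real32), intent(out) :: ski(:,:)
    integer :: i, j
    do j = 1, size(ski, 2)
      do i = 1, size(ski, 1)
        ski(i, j) = i + j
      end do
    end do
  end subroutine fill_column_major

  ! Inner loop over the second index: each step strides by one column.
  subroutine fill_row_major(ski)
    real(real32), intent(out) :: ski(:,:)
    integer :: i, j
    do i = 1, size(ski, 1)
      do j = 1, size(ski, 2)
        ski(i, j) = i + j
      end do
    end do
  end subroutine fill_row_major
end module fill_order

program loop_order
  use fill_order
  implicit none
  integer, parameter :: n = 4000, m = 4000  ! assumed sizes that fit in RAM
  real(real32), allocatable :: ski(:,:)
  integer(int64) :: t0, t1, t2, rate

  allocate(ski(n, m))
  ski = 0.0_real32               ! touch every page before timing
  call system_clock(t0, rate)
  call fill_column_major(ski)
  call system_clock(t1)
  call fill_row_major(ski)
  call system_clock(t2)
  print '(a,f8.3,a)', 'j outer, i inner: ', real(t1 - t0) / real(rate), ' s'
  print '(a,f8.3,a)', 'i outer, j inner: ', real(t2 - t1) / real(rate), ' s'
  print '(a,f8.1)', 'check value ski(n,m) = ', ski(n, m)
end program loop_order
```

Once the matrix no longer fits in physical memory, the strided version is likely to touch a different page on almost every store, which is where the paging penalty described above would come from.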

My latest PC has some good and bad features.
The good: it has a solid state drive, so paging is much faster.
The bad: the drive is only 128 GB, so I can't configure a 64 GB paging file and a 12 GB hibernate file.
If you are planning to have a large paging file, you need to make sure there is sufficient disk space.

Hope the examples have helped you understand the benefits and limits of virtual memory.

John

Attachments:

Attachment: ski2.f95 (4.17 KB)
Imagen de Sergey Kostrov
Quoting Sergey Kostrov ...A zip-archive with sources is enclosed. This is a Visual Studio 2005 project and the test application could be
built for Win32 and x64 platforms...

Version 3 of 'MemTestApp', which can be used to test allocation of large blocks of memory, is attached.

I finally tested the application on 64-bit Windows 7, and the application passed all tests.

Here are some technical details:

     OS : Windows 7 64-bit Home Premium
     CPU: AMD ( 4 cores )
     RAM: 6GB
      VM: 96GB initial size  128GB maximum size
     APP: MemTestApp64.exe  Release configuration

     Attempts to allocate memory:
      0.25GB - Allocated
      0.50GB - Allocated
      1.GB   - Allocated
      2.GB   - Allocated
      4.GB   - Allocated
      8.GB   - Allocated
     14.GB   - Allocated ( Target for a 61000x61000 Single-Precision matrix )
     16.GB   - Allocated
     28.GB   - Allocated ( Target for a 61000x61000 Double-Precision matrix )
     32.GB   - Allocated
     64.GB   - Allocated

Attachments:

Attachment: MemTestAppV3.zip (6.15 KB)
Imagen de Sergey Kostrov

This is a follow-up. Please take a look at these 'Matrix Size-to-Memory Required' tables:

     Matrix Size          Memory Required

     Single-Precision
     131072x131072        64GB
      92681x 92681        32GB
      65536x 65536        16GB

and

     Matrix Size          Memory Required

     Double-Precision
      92681x 92681        64GB
      65536x 65536        32GB
      46340x 46340        16GB
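The figures in these tables follow from order x order x bytes per element, with GB read as 2^30 bytes. The sketch below recomputes them; the module name `matrix_memory` and the function `matrix_bytes` are hypothetical names, and 4 and 8 bytes per element are assumed for single and double precision. The 61000 order from the original question is included for comparison.

```fortran
module matrix_memory
  use, intrinsic :: iso_fortran_env, only: int64, real64
  implicit none
  integer(int64), parameter :: gib = 1024_int64**3  ! 2^30 bytes
contains
  ! Bytes needed for a dense n x n matrix with elem_bytes per element.
  pure function matrix_bytes(n, elem_bytes) result(bytes)
    integer(int64), intent(in) :: n, elem_bytes
    integer(int64) :: bytes
    bytes = n * n * elem_bytes
  end function matrix_bytes
end module matrix_memory

program size_table
  use matrix_memory
  implicit none
  integer(int64), parameter :: orders(5) = &
    [46340_int64, 61000_int64, 65536_int64, 92681_int64, 131072_int64]
  integer :: k

  print '(a)', '   order   single (GiB)   double (GiB)'
  do k = 1, size(orders)
    print '(i8,2f15.2)', orders(k), &
      real(matrix_bytes(orders(k), 4_int64), real64) / real(gib, real64), &
      real(matrix_bytes(orders(k), 8_int64), real64) / real(gib, real64)
  end do
end program size_table
```

The arithmetic is done in 64-bit integers because the byte counts overflow a default 32-bit integer well before these sizes are reached.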

Imagen de Sergey Kostrov

This is a follow up.

Just verified how MemTestApp64.exe works on a system with Windows 7 Professional and Intel CPU ( 16GB of physical memory / VM: 96GB initial size \ 128GB maximum size ):

0.25GB - Allocated
0.50GB - Allocated
1.GB - Allocated
2.GB - Allocated
4.GB - Allocated
8.GB - Allocated
14.GB - Allocated ( Target for a 61000x61000 Single-Precision matrix )
16.GB - Allocated
28.GB - Allocated ( Target for a 61000x61000 Double-Precision matrix )
32.GB - Allocated
64.GB - Allocated

A screenshot will be provided for cases 4GB, 8GB, 16GB, 32GB and 64GB.

Imagen de Sergey Kostrov

Here is a screenshot of the Windows Resource Monitor for cases 4GB, 8GB, 16GB, 32GB and 64GB:

Imagen de Sergey Kostrov

Here is a screenshot of the Windows Resource Monitor for cases 16GB, 32GB, 64GB, and extreme cases 128GB, 160GB and 192GB:

A project with updated version 4 of 'MemTestApp' will be enclosed.

Imagen de Sergey Kostrov

>>...A project with updated version 4 of 'MemTestApp' will be enclosed...

Here it is. Thanks for using it and let me know if you have any questions.

Best regards,
Sergey

Attachments:

Attachment: memtestappv4.zip (6.64 KB)
Imagen de Sergey Kostrov

This is a follow up on a statement made by John Campbell on Mon, 07/02/2012 at 19:55

Hi John,

>>...
>>Also changing the order of the do loop in use_sk would be interesting.
>>change to
>>do i = 1,n
>>do j = 1,m
>>ski(i,j) = i+j
>>end do
>>end do
>>This should have a significant run time penalty as it approaches the physical memory limit.
>>I don't know if ifort would optimise the loops to improve the virtual memory performance?

In extreme cases, when the data doesn't fit into memory, even a magic "Nano-Optimization" by some compiler ( Fortran, C++, etc ) won't help. The virtual memory manager of the OS will be so busy swapping memory blocks that the overall performance of the application will be significantly affected. The only solution is to add more physical memory ( DIMMs are cheap now ).

Imagen de Sergey Kostrov

Here are a couple of notes related to allocation of memory on the stack ( also known as automatic allocation ):

- For the 'float' ( single-precision ) floating-point data type, the maximum size of the matrix is ~23170x23170

- For the 'double' ( double-precision ) floating-point data type, the maximum size of the matrix is ~16383x16383

- When the size of the matrix exceeds the 2GB limit, on 32-bit or 64-bit platforms (!), a C/C++ compiler or a Fortran compiler can fail to compile the sources. For example, the MS C++ compiler fails with Error C2148: Total size of array must not exceed 0x7FFFFFFF bytes, or with Error C1126: Automatic allocation exceeds 2G. Ifort fails with the Error: A common block or variable may not exceed 2147483647 bytes

0x7FFFFFFF = 2147483647 bytes
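The two maximum sizes above can be recovered from the 2147483647-byte limit as the largest n with n x n x element size not exceeding it. A minimal sketch follows; the module `size_limit` and the function `largest_order` are hypothetical names, and 4 and 8 bytes per element are assumed.

```fortran
module size_limit
  use, intrinsic :: iso_fortran_env, only: int64, real64
  implicit none
  integer(int64), parameter :: max_bytes = 2147483647_int64  ! 0x7FFFFFFF
contains
  ! Largest n such that n*n*elem_bytes <= limit.
  pure function largest_order(limit, elem_bytes) result(n)
    integer(int64), intent(in) :: limit, elem_bytes
    integer(int64) :: n
    ! Floating-point estimate of the square root.
    n = int(sqrt(real(limit / elem_bytes, real64)), int64)
    ! Correct any rounding with exact integer checks.
    do while (n * n * elem_bytes > limit)
      n = n - 1
    end do
    do while ((n + 1) * (n + 1) * elem_bytes <= limit)
      n = n + 1
    end do
  end function largest_order
end module size_limit

program stack_limit
  use size_limit
  implicit none
  print '(a,i0)', 'single precision: n = ', largest_order(max_bytes, 4_int64)
  print '(a,i0)', 'double precision: n = ', largest_order(max_bytes, 8_int64)
end program stack_limit
```

This gives 23170 for single precision and 16383 for double precision. A 16384x16384 double-precision matrix needs exactly 2147483648 bytes, one byte over the limit.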
