"Insufficient virtual memory" Error

"Insufficient virtual memory" Error

Posted by pourmatin85

Dear guys;

I'm trying to run an FEM code of mine. The code works well up to a certain number of input elements (10,000). However, when I try to run it with 11,000 elements, this error appears:
forrtl: severe (41): insufficient virtual memory

Although my code is still very unoptimized, since I use a regular (dense) solver (DGETRS) instead of a sparse solver, the cluster still has 4 GB of free memory when I run the code with 10,000 elements.

Is there any trick that I'm missing?
Regards
Hossein

Posted by ArturGuzik

Is this an x64 cluster?

A.

Posted by pourmatin85

Yes, it is.

Posted by draceswbell.net

I have most commonly seen this when a program runs past the 2 GB limit of the standard Fortran memory model. Sometimes a simple cure is the -mcmodel=medium option. If you are using allocatable arrays, verify that their sizes are correct. If you are using OpenMP and have large private arrays, these can also use up memory rather quickly, since each thread allocates its own copy concurrently.
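
As a rough illustration of how quickly a dense solver approaches that limit, the sketch below computes the storage for an n-by-n double-precision matrix, which is what a dense LAPACK solve such as DGETRS operates on. The thread does not say how many unknowns each element contributes, so the values of n are hypothetical and `dense_matrix_bytes` is an illustrative helper, not part of any library.

```python
def dense_matrix_bytes(n, bytes_per_entry=8):
    """Bytes needed to store an n-by-n dense matrix of double-precision reals."""
    return n * n * bytes_per_entry


GIB = 2**30

# Hypothetical numbers of unknowns (not taken from the thread).
for n in (10_000, 16_384, 20_000):
    print(f"n = {n:>6}: {dense_matrix_bytes(n) / GIB:.2f} GiB")
```

Under these assumptions the matrix alone reaches exactly 2 GiB at n = 16,384, before counting any work arrays or a second copy of the matrix.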

Posted by pourmatin85

Thanks for your reply. I checked "-mcmodel=" and it turned out that it's for Linux only, is that right? However, I'm working on Windows and I don't use OpenMP!

Any other suggestions?

Regards
Hossein

Posted by pourmatin85

Finally, I managed to use the CSR format for the sparse matrix, and now I'm using the MKL PARDISO sparse solver to solve my Ax=b equation. However, the problem with big matrices still remains!
When I run the program on the x64 platform with 21,000 elements, execution stops at the first call to PARDISO (phase 11) with this error:
program exception - access violation

And when I run it on the win32 platform, execution just stops at the same place, but with no error!

Any ideas?

Posted by ArturGuzik

Most probably, some allocatable component has been deallocated or has not been allocated yet. In any case, it looks like a programming issue.

A.

Posted by pourmatin85

Then why does it work for a smaller number of input elements?

Posted by ArturGuzik

That is not proof that the code is correct. Following all your posts, I have the impression that the code may have some issues with allocating or matching array sizes. However, it's just an impression, a guess. In all your posts you report that the code doesn't work with 11,000, then 21,000, etc. Is that only a coincidence, or is it something to do with odd sizes (numbers of elements)?

The strange part is the x64/IA32 difference. Can you describe it in more detail? What are the link lines (libs) in both configurations? Sometimes this kind of error has its source in mixed library interfaces being linked.

A.

Posted by pourmatin85

for x64: mkl_intel_lp64.lib mkl_intel_thread.lib mkl_core.lib libiomp5md.lib
for IA32: mkl_solver.lib mkl_intel_c.lib mkl_intel_thread.lib mkl_core.lib libiomp5md.lib

The numbers were just an example, I'm pretty sure it's not about being odd or even.

Posted by ArturGuzik

Well, it looks strange. The linker settings are fine, and you should have no problem with MKL itself (for the IA32 setting you can omit mkl_solver.lib, as it's part of mkl_core.lib).
I saw your other post (on the OOC solver), and I believe you don't need the OOC version to manage a 20,000-element model (I assume that by a 20,000 x 20,000 matrix you mean a model with 20,000 elements or DOFs, correct?). I was using the DSS solver (from MKL) to solve my own FEM models with 40,000+ elements on a 2 GB RAM IA32 WinXP system without problems. The sparse CRS format eliminates the problems with large (original) matrix bandwidth and the need for node reordering.
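
To make the compressed row format concrete, here is a minimal sketch that converts a small dense matrix into three 1-based arrays (values, column indices, row pointers), the layout Fortran callers typically pass to the MKL sparse solvers. The helper name `dense_to_csr` and the 4 x 4 test matrix are illustrative assumptions, not code from the thread, and the sketch stores the full matrix rather than only one triangle.

```python
def dense_to_csr(dense):
    """Convert a dense matrix (list of rows) to 1-based CSR arrays."""
    values, columns, row_ptr = [], [], [1]
    for row in dense:
        for j, v in enumerate(row, start=1):
            if v != 0.0:
                values.append(v)   # nonzero entry
                columns.append(j)  # its 1-based column index
        # Position in `values` where the next row starts (1-based).
        row_ptr.append(len(values) + 1)
    return values, columns, row_ptr


# Hypothetical tridiagonal matrix, similar in shape to a 1D stiffness matrix.
k = [[ 2.0, -1.0,  0.0,  0.0],
     [-1.0,  2.0, -1.0,  0.0],
     [ 0.0, -1.0,  2.0, -1.0],
     [ 0.0,  0.0, -1.0,  2.0]]

values, columns, row_ptr = dense_to_csr(k)
print(values)
print(columns)
print(row_ptr)
```

Only the 10 nonzeros are stored instead of all 16 entries; for a large FEM matrix with a few dozen nonzeros per row, the saving grows roughly in proportion to n.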

The things to do/check would be:

(1) if the code is complex, make sure no (original) matrix is left unnecessarily allocated
(2) check the input (say by printing, if debug mode is not an option) before the call to MKL, and make sure you can access all array elements
(3) insert IMPLICIT NONE statements in all routines
(4) use explicit interfaces
(5) set /check:uninit and /Qtrapuv to try to catch errors in the code.

A.
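
Point (2) above can be sketched as a structural check on the sparse arrays before they are handed to the solver. This is a minimal sketch assuming 1-based CSR arrays; the function name `check_csr` and the particular checks (array lengths, index range, increasing column order within a row) are assumptions about what is worth verifying, not an official MKL routine.

```python
def check_csr(n, values, columns, row_ptr):
    """Return a list of problems found in 1-based CSR arrays of an n-by-n matrix."""
    if len(row_ptr) != n + 1:
        return [f"row_ptr has {len(row_ptr)} entries, expected {n + 1}"]
    if row_ptr[0] != 1:
        return ["row_ptr must start at 1 for 1-based indexing"]
    if len(values) != len(columns) or row_ptr[-1] - 1 != len(values):
        return ["row_ptr, values and columns disagree on the number of nonzeros"]

    problems = []
    for i in range(n):
        start, end = row_ptr[i] - 1, row_ptr[i + 1] - 1
        if end < start:
            problems.append(f"row {i + 1}: row_ptr decreases")
            continue
        cols = columns[start:end]
        if any(c < 1 or c > n for c in cols):
            problems.append(f"row {i + 1}: column index outside 1..{n}")
        if any(b <= a for a, b in zip(cols, cols[1:])):
            problems.append(f"row {i + 1}: column indices not strictly increasing")
    return problems


# Hypothetical 3 x 3 examples: one consistent, one with an out-of-range column.
good = check_csr(3, [4.0, 1.0, 1.0, 3.0, 2.0], [1, 2, 1, 2, 3], [1, 3, 5, 6])
bad = check_csr(3, [4.0, 1.0, 1.0, 3.0, 2.0], [1, 5, 1, 2, 3], [1, 3, 5, 6])
print(good)
print(bad)
```

An out-of-range column index or an inconsistent row pointer is the kind of input that can make a solver read outside its arrays, which would be consistent with an access violation that only appears for some model sizes.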

Posted by pourmatin85

Thanks a lot, A.
My problem is solved. Actually, my stiffness matrix had some zero diagonal elements!

Posted by ArturGuzik

Glad to hear that.

Keep in mind that the solver has an option (I believe) to report pivot information. You should also verify that your compressed matrix has non-zero diagonal elements. If it doesn't, it becomes apparent that your stiffness matrix is ...not a stiffness matrix.
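
A diagonal check of that kind could look like the following minimal sketch, again assuming 1-based CSR arrays. The function name `rows_with_bad_diagonal`, the tolerance parameter, and the 3 x 3 example are hypothetical.

```python
def rows_with_bad_diagonal(n, values, columns, row_ptr, tol=0.0):
    """Return the 1-based rows whose diagonal entry is missing or |a_ii| <= tol."""
    bad = []
    for i in range(1, n + 1):
        start, end = row_ptr[i - 1] - 1, row_ptr[i] - 1
        diag = None
        for c, v in zip(columns[start:end], values[start:end]):
            if c == i:  # found the stored diagonal entry of row i
                diag = v
                break
        if diag is None or abs(diag) <= tol:
            bad.append(i)
    return bad


# Hypothetical 3 x 3 matrix in which row 2 stores no diagonal entry at all.
values = [4.0, 1.0, 1.0, 1.0, 3.0]
columns = [1, 2, 1, 2, 3]
row_ptr = [1, 3, 4, 6]

print(rows_with_bad_diagonal(3, values, columns, row_ptr))
```

Running such a check right after assembly points at the offending degrees of freedom, which in an FEM model often turn out to be unconnected nodes or missing boundary conditions.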

A.
