Very large structs GCC ok, ICC seg fault, simple example.

We have a very large structure (~10 GB). GCC does not seem to have a problem with it, but ICC is seg faulting. I've boiled it down to simple assignment statements. Any ideas? Files attached. If we don't allocate but declare the structure, it works fine. Are there any problems with the Intel compiler allocating large amounts of memory? We've tried moving things around in the structure definition, but it continues to seg fault.

icc -mcmodel=medium -shared-intel -g  test.c -o test
gcc -mcmodel=medium -g test.c -o test

icc: Version 12.1.5.339 Build 20120612
gcc: gcc (GCC) 4.4.6 20120305 (Red Hat 4.4.6-4)

GCC output:
Allocate the grid struct
sizeof of grids 9.393871GB
Address of grids->G2.rain.count 0x7f5d05143190
Address of grids->G2.d0.count 0x7f5d61d35190
Address of grids->G2.Nt.count 0x7f5d96d2d190
Address of grids->G2.dBZm.count 0x7f5f079a0190
location: 0
G2.dBZ.stdev[ii][jj][kk][ll][mm] = 2.001000
Address of grids->G2.dBZm.count 0x7f5f079a0190
assigned G2.dBZm.count
assigend G2.dBZm.mean
assigned G2.dBZm.stdev
Address of G2.d0.count 0x7f5d61d35190
Going to assign d0.count
assigned G2.d0.count
Going to assign d0.mean
assigned G2.d0.mean
Going to assign d0.stdev
assigned G2.d0.stdev
Successful write

Intel ICC:
Allocate the grid struct
sizeof of grids 9.393871GB
Address of grids->G2.rain.count 0x7fc9e070f190
Address of grids->G2.d0.count 0x7fca3d301190
Address of grids->G2.Nt.count 0x7fca722f9190
Address of grids->G2.dBZm.count 0x7fcbe2f6c190
location: 0
G2.dBZ.stdev[ii][jj][kk][ll][mm] = 2.001000
Address of grids->G2.dBZm.count 0x7fcbe2f6c190
assigned G2.dBZm.count
assigend G2.dBZm.mean
assigned G2.dBZm.stdev
Address of G2.d0.count 0x7fca3d301190
Going to assign d0.count
Segmentation fault

Attachments:
test.c (text/x-csrc, 1.98 KB)
tk-3dpr-hdf5.h (text/x-chdr, 10.29 KB)

Did you set maximum stack limit?

If you want to avoid generated code bloat, you may require -O1 or -Os.

Hi everybody,

>>... If we don't allocate but declare the structure it works fine...

I understood the problem as follows:

- On a 64-bit Linux system with at least 16GB of physical memory, the application ( compiled with ICC ) crashes if a ~10GB memory block is dynamically allocated with the CRT function malloc.

- On the same system the application works as expected if the memory block is allocated from the stack.

Please confirm that you're on a 64-bit system.

Note: Just in case please check Virtual Memory settings. For example, two values are usually used, that is, Minimal Size and Maximum Size. So, in your case the Minimal Size has to be greater than 10GB and a recommended value is 16GB. Then, set Maximum Size to 24GB. But, I'm not sure that it will fix the problem with Intel C++ compiler. Anyway, it makes sense to try.

There is a bug in the code. Processing continues even if memory for the structure is not allocated and the grids pointer is NULL:

...
printf( "Allocate the grid struct\n" );
grids = ( L3DPR_GRIDS * )malloc( sizeof( L3DPR_GRIDS ) );
if( grids == NULL )
{
printf( "Error return from malloc\n" );
return ( int )-1; // Added by SergeyK: Exit if the memory is Not allocated
}
...
printf( "Address of grids->G2.rain.count %p\n",grids->G2.rain.count );
...

Note: I could verify your code with the Intel C++ compiler for Windows. Please confirm if you're interested in that.

Have you run a simple test to verify whether sizeof returns a 32-bit or a 64-bit value?

printf("%d\n", (int)sizeof(sizeof(L3DPR_GRIDS)));

Jim Dempsey

 

www.quickthreadprogramming.com

Thanks for the responses. I did add the exit if malloc failed, but malloc always succeeded, as the printf was never executed.
I also added a printf of the sizeof the pointer, and it is 8 bytes. Still the same situation: GCC works, ICC fails. New code and output attached. As far as I can tell this is pretty basic, and I have no clue why the Intel compiler is doing this. I also added output from an older 11.1 version of ICC (Version 11.1    Build 20100806 Package ID: l_cproc_p_11.1.073). Check out the addresses of some of the elements of the structure (they are the same address!!). This makes no sense to me.

Info on machine:
cputime      unlimited
filesize     unlimited
datasize     unlimited
stacksize    unlimited
coredumpsize unlimited
memoryuse    unlimited
vmemoryuse   unlimited
descriptors  1024
memorylocked unlimited
maxproc      266240

Linux  2.6.32-220.17.1.el6.621g0000.x86_64 #1 SMP Wed May 16 19:27:42 EDT 2012 x86_64 x86_64 x86_64 GNU/Linux

GCC output:
Allocate the grid struct
sizeof of grids 9.393871GB
sizeof pointer to grids 8
Address of grids->G2.rain.count 0x2b8154cad190
Address of grids->G2.d0.count 0x2b81b189f190
Address of grids->G2.Nt.count 0x2b81e6897190
Address of grids->G2.dBZm.count 0x2b835750a190
location: 0
G2.dBZ.stdev[ii][jj][kk][ll][mm] = 2.001000
Address of grids->G2.dBZm.count 0x2b835750a190
assigned G2.dBZm.count
assigend G2.dBZm.mean
assigned G2.dBZm.stdev
Address of G2.d0.count 0x2b81b189f190
Going to assign d0.count
assigned G2.d0.count
Going to assign d0.mean
assigned G2.d0.mean
Going to assign d0.stdev
assigned G2.d0.stdev
Successful write

ICC output: Version 12.1.5.339 Build 20120612
Allocate the grid struct
sizeof of grids 9.393871GB
sizeof pointer to grids 8
Address of grids->G2.rain.count 0x2b3a9a790190
Address of grids->G2.d0.count 0x2b3a6d022010
Address of grids->G2.Nt.count 0x2b3a6d022010
Address of grids->G2.dBZm.count 0x2b3a6d022010
location: 0
G2.dBZ.stdev[ii][jj][kk][ll][mm] = 2.001000
Address of grids->G2.dBZm.count 0x2b3a6d022010
assigned G2.dBZm.count
assigend G2.dBZm.mean
assigned G2.dBZm.stdev
Address of G2.d0.count 0x2b3a6d022010
Going to assign d0.count
Segmentation fault (core dumped)

Version 11.1    Build 20100806 Package ID: l_cproc_p_11.1.073
Allocate the grid struct
sizeof of grids 9.393871GB
sizeof pointer to grids 8
Address of grids->G2.rain.count 0x7fd3e3e2b190
Address of grids->G2.d0.count 0x7fd3b66bd010
Address of grids->G2.Nt.count 0x7fd3b66bd010
Address of grids->G2.dBZm.count 0x7fd3b66bd010
location: 0
G2.dBZ.stdev[ii][jj][kk][ll][mm] = 2.001000
Address of grids->G2.dBZm.count 0x7fd3b66bd010
assigned G2.dBZm.count
assigend G2.dBZm.mean
assigned G2.dBZm.stdev
Address of G2.d0.count 0x7fd3b66bd010
Going to assign d0.count
Segmentation fault

Attachments: 

test.c (text/x-csrc, 2.06 KB)
tk-3dpr-hdf5.h (text/x-chdr, 10.29 KB)

>>... I did add the exit if malloc failed but it always did succeed as the printf was never executed...

I will do verification on a 64-bit Windows 7 Professional system, and I hope that Intel software engineers will be able to verify the test case on 64-bit Linux. I'll post my results as soon as they are ready.

>>I also added the printf of the sizeof pointer and it is 8 bytes.

It's not the size of a pointer, "sizeof(void*)". It's the size of the value returned by sizeof, "sizeof(sizeof(void*))".

sizeof has a return type of size_t. I've occasionally seen size_t defined/typedefed as "unsigned int" as opposed to a pointer-sized unsigned type such as "uintptr_t", and on x64 platforms the default "int" is 32-bit.

Jim Dempsey

www.quickthreadprogramming.com

Program was run with stack limit set to unlimited with no effect. My machine has MemTotal:     132089148 kB and I get the same result (I work with jkwi). Thanks!

Added:
printf("sizeof sizeof L3DPR_GRIDS %d\n", (int)sizeof(sizeof(L3DPR_GRIDS)));
printf("sizeof pointer to grids %d\n",(int)sizeof(grids));

Seems upload of files is changing case of filenames to all lower letters.

ICC Version 12.1.5.339 Build 20120612
Allocate the grid struct
sizeof of grids 9.393871GB
sizeof sizeof L3DPR_GRIDS 8
sizeof pointer to grids 8
Address of grids->G2.rain.count 0x7fa223056190
Address of grids->G2.d0.count 0x7fa27fc48190
Address of grids->G2.Nt.count 0x7fa2b4c40190
Address of grids->G2.dBZm.count 0x7fa4258b3190
location: 0
G2.dBZ.stdev[ii][jj][kk][ll][mm] = 2.001000
Address of grids->G2.dBZm.count 0x7fa4258b3190
assigned G2.dBZm.count
assigend G2.dBZm.mean
assigned G2.dBZm.stdev
Address of G2.d0.count 0x7fa27fc48190
Going to assign d0.count
Segmentation fault

Attachments: 

test.c (text/x-csrc, 2.13 KB)

Hi everybody,

Thanks for the test case; it really shows what different C++ compilers can do!

>>...
>>Address of G2.d0.count 0x7fa27fc48190
>>Going to assign d0.count
>>Segmentation fault
>>...

1. I reproduced the same problem on an attempt to access the ...G2.d0.count struct member on Windows 7 Professional ( 64-bit / 16GB physical memory / 64GB virtual memory ) with Intel Parallel Studio XE 2013 ( Initial Release ).

2. I will post all my results and a test project for Visual Studio 2008 Professional Edition.

3. The CRT function malloc worked well and allocated more than 9GB of memory ( Release Configuration of the test application ).

>>...
>>...I did add the exit if malloc failed but it always did succeed as the printf was never executed.
>>...

4. The pointer wasn't initialized to NULL when it was declared. Strictly speaking, CRT functions like malloc or calloc return NULL when they fail to allocate a block of memory, so the assignment itself sets the pointer to NULL and the check is valid either way; initializing at declaration is a defensive habit. Here are two examples:

[ Without initialization to NULL ]
...
int *piData;
piData = ( int * )malloc( 8192 * sizeof( int ) );
if( piData == NULL )
return;
...

[ With initialization to NULL ]
...
int *piData = NULL;
piData = ( int * )malloc( 8192 * sizeof( int ) );
if( piData == NULL )
return;
...

Here are compilation results for Intel C++ compiler:

[ Intel C++ compiler - x64 - DEBUG ]

------ Build started: Project: MemTestApp, Configuration: Debug x64 ------
Compiling with Intel(R) C++ Compiler XE 13.0.0.089 [Intel(R) 64]... (Intel C++ Environment)
Stdafx.cpp
Compiling with Intel(R) C++ Compiler XE 13.0.0.089 [Intel(R) 64]... (Intel C++ Environment)
MemTestApp.cpp
Compiling manifest to resources... (Microsoft VC++ Environment)
Microsoft (R) Windows (R) Resource Compiler Version 6.1.6723.1
Copyright (C) Microsoft Corporation. All rights reserved.
Linking... (Intel C++ Environment)
xilink: executing 'link'
Embedding manifest... (Microsoft VC++ Environment)
Microsoft (R) Windows (R) Resource Compiler Version 6.1.6723.1
Copyright (C) Microsoft Corporation. All rights reserved.
xilink: executing 'link'
MemTestApp - 0 error(s), 0 warning(s), 0 remark(s)

[ Intel C++ compiler - x64 - RELEASE ]

------ Build started: Project: MemTestApp, Configuration: Release x64 ------
Compiling with Intel(R) C++ Compiler XE 13.0.0.089 [Intel(R) 64]... (Intel C++ Environment)
Stdafx.cpp
Compiling with Intel(R) C++ Compiler XE 13.0.0.089 [Intel(R) 64]... (Intel C++ Environment)
MemTestApp.cpp
Linking... (Intel C++ Environment)
xilink: executing 'link'
Embedding manifest... (Microsoft VC++ Environment)
MemTestApp - 0 error(s), 0 warning(s), 0 remark(s)

[ Intel C++ compiler - Win32 - DEBUG ]

------ Build started: Project: MemTestApp, Configuration: Debug Win32 ------
Compiling with Intel(R) C++ Compiler XE 13.0.0.089 [IA-32]... (Intel C++ Environment)
Stdafx.cpp
Compiling with Intel(R) C++ Compiler XE 13.0.0.089 [IA-32]... (Intel C++ Environment)
MemTestApp.cpp
C:\WuTemp\MemTestApp\MemTestApp.cpp(56) (col. 2): internal error: 04010002_1071
compilation aborted for .\MemTestApp.cpp (code 4)
MemTestApp - 1 error(s), 0 warning(s), 0 remark(s)

[ Intel C++ compiler - Win32 - RELEASE ]

------ Build started: Project: MemTestApp, Configuration: Release Win32 ------
Compiling with Intel(R) C++ Compiler XE 13.0.0.089 [IA-32]... (Intel C++ Environment)
Stdafx.cpp
Compiling with Intel(R) C++ Compiler XE 13.0.0.089 [IA-32]... (Intel C++ Environment)
MemTestApp.cpp
C:\WuTemp\MemTestApp\MemTestApp.cpp(56) (col. 2): internal error: 04010002_1071
compilation aborted for .\MemTestApp.cpp (code 4)
MemTestApp - 1 error(s), 0 warning(s), 0 remark(s)

Here are compilation results for Microsoft C++ compiler ( Unfortunately, it couldn't compile sources for all configurations ):

[ Microsoft C++ compiler - x64 - RELEASE ]

------ Build started: Project: MemTestApp, Configuration: Release x64 ------
Compiling...
cl : Command line warning D9035 : option 'Wp64' has been deprecated and will be removed in a future release
Stdafx.cpp
c:\wutemp\memtestapp\Tk-3dpr-hdf5.h(484) : error C2089: '' : 'struct' too large
c:\wutemp\memtestapp\Tk-3dpr-hdf5.h(485) : error C2089: '' : 'struct' too large
c:\wutemp\memtestapp\Tk-3dpr-hdf5.h(486) : error C2089: '' : 'struct' too large
c:\wutemp\memtestapp\Tk-3dpr-hdf5.h(487) : error C2089: '' : 'struct' too large
c:\wutemp\memtestapp\Tk-3dpr-hdf5.h(488) : error C2089: '' : 'struct' too large
c:\wutemp\memtestapp\Tk-3dpr-hdf5.h(489) : error C2089: '' : 'struct' too large
c:\wutemp\memtestapp\Tk-3dpr-hdf5.h(490) : error C2089: '' : 'struct' too large
c:\wutemp\memtestapp\Tk-3dpr-hdf5.h(491) : error C2089: '' : 'struct' too large
c:\wutemp\memtestapp\Tk-3dpr-hdf5.h(492) : error C2089: '' : 'struct' too large
c:\wutemp\memtestapp\Tk-3dpr-hdf5.h(493) : error C2089: '' : 'struct' too large
c:\wutemp\memtestapp\Tk-3dpr-hdf5.h(494) : error C2089: '' : 'struct' too large
c:\wutemp\memtestapp\Tk-3dpr-hdf5.h(495) : error C2089: '' : 'struct' too large
c:\wutemp\memtestapp\Tk-3dpr-hdf5.h(496) : error C2089: '' : 'struct' too large
c:\wutemp\memtestapp\Tk-3dpr-hdf5.h(497) : error C2089: '' : 'struct' too large
c:\wutemp\memtestapp\Tk-3dpr-hdf5.h(498) : error C2089: '' : 'struct' too large
c:\wutemp\memtestapp\Tk-3dpr-hdf5.h(542) : error C2089: '' : 'struct' too large
MemTestApp - 16 error(s), 1 warning(s)

[ Microsoft C++ compiler - x64 - DEBUG ]

------ Build started: Project: MemTestApp, Configuration: Debug x64 ------
Compiling...
cl : Command line warning D9035 : option 'Wp64' has been deprecated and will be removed in a future release
Stdafx.cpp
c:\wutemp\memtestapp\Tk-3dpr-hdf5.h(484) : error C2089: '' : 'struct' too large
c:\wutemp\memtestapp\Tk-3dpr-hdf5.h(485) : error C2089: '' : 'struct' too large
c:\wutemp\memtestapp\Tk-3dpr-hdf5.h(486) : error C2089: '' : 'struct' too large
c:\wutemp\memtestapp\Tk-3dpr-hdf5.h(487) : error C2089: '' : 'struct' too large
c:\wutemp\memtestapp\Tk-3dpr-hdf5.h(488) : error C2089: '' : 'struct' too large
c:\wutemp\memtestapp\Tk-3dpr-hdf5.h(489) : error C2089: '' : 'struct' too large
c:\wutemp\memtestapp\Tk-3dpr-hdf5.h(490) : error C2089: '' : 'struct' too large
c:\wutemp\memtestapp\Tk-3dpr-hdf5.h(491) : error C2089: '' : 'struct' too large
c:\wutemp\memtestapp\Tk-3dpr-hdf5.h(492) : error C2089: '' : 'struct' too large
c:\wutemp\memtestapp\Tk-3dpr-hdf5.h(493) : error C2089: '' : 'struct' too large
c:\wutemp\memtestapp\Tk-3dpr-hdf5.h(494) : error C2089: '' : 'struct' too large
c:\wutemp\memtestapp\Tk-3dpr-hdf5.h(495) : error C2089: '' : 'struct' too large
c:\wutemp\memtestapp\Tk-3dpr-hdf5.h(496) : error C2089: '' : 'struct' too large
c:\wutemp\memtestapp\Tk-3dpr-hdf5.h(497) : error C2089: '' : 'struct' too large
c:\wutemp\memtestapp\Tk-3dpr-hdf5.h(498) : error C2089: '' : 'struct' too large
c:\wutemp\memtestapp\Tk-3dpr-hdf5.h(542) : error C2089: '' : 'struct' too large
MemTestApp - 16 error(s), 1 warning(s)

Note: The same errors are displayed for Win32 configurations.

Here is Output of the test application:

64-bit Windows platform
Allocate the grid struct
sizeof of grids 9.393871GB
Address of grids->G2.rain.count 000000016C9EE1C0
Address of grids->G2.d0.count 00000001C95E01C0
Address of grids->G2.Nt.count 00000001FE5D81C0
Address of grids->G2.dBZm.count 000000036F24B1C0
location: 0
G2.dBZ.stdev[ii][jj][kk][ll][mm] = 2.001000
Address of grids->G2.dBZm.count 000000036F24B1C0
assigned G2.dBZm.count
assigend G2.dBZm.mean
assigned G2.dBZm.stdev
Address of G2.d0.count 00000001C95E01C0
Successful write
Press ESC to Exit...

Note: I commented out the following blocks because something goes wrong with accessing d0 struct members:
...
// printf( "Going to assign d0.count\n" );
// grids->G2.d0.count[ii][jj][kk][ll][mm] = ( __int32 )6; // Access Violation if uncommented
// printf( "assigned G2.d0.count\n" );

// printf( "Going to assign d0.mean\n" );
// grids->G2.d0.mean[ii][jj][kk][ll][mm] = five + plusVal; // Access Violation if uncommented
// printf( "assigned G2.d0.mean\n" );

// printf( "Going to assign d0.stdev\n" );
// grids->G2.d0.stdev[ii][jj][kk][ll][mm] = six + plusVal; // Access Violation if uncommented
// printf( "assigned G2.d0.stdev\n" );
...

Here is a test project for Visual Studio 2008 Professional Edition ( Intel C++ compiler is set ). Please let me know if you have any questions.

PS: This is a really good test case.

Attachments: 

memtestapp.zip (application/zip, 5.87 KB)

Thank you for the problem report and all your comments.  I will investigate this issue and get back to you.

--mark

Thanks for verifying this and all the additional info Sergey. Hopefully the project info will help Intel.

We will be waiting for a response from Intel.

Thanks again!!

Hi Sergey,

I compiled the code samples with Intel C++ 13.1 and gcc-4.6.3, then debugged them with the gdb, idb, and Allinea DDT debuggers.

I also analysed them with Inspector XE 2013 and Advisor XE 2013. The tools report a memory leak at line 62. I think you are right: it's a coding error, not a compiler error.

best regards

Franz Bernasek

>>... i think you are right its a codeing error not a compiler error...

I think this is a compiler error and Intel engineers should confirm if it is not. I don't see any problems with the test case.

Hi Franz,

>>...i analyse with inspector XE 2013, advisor XE 2013, the tools report memory leak at line 62...

Thank you very much for the note. Please find attached an updated test case with a small fix. I added a call to the CRT function free at the end of the main function:

...
if( grids != NULL )
{
free( grids );
grids = NULL;
}
...

Best regards,
Sergey

Attachments: 

memtestapp.u1.zip (application/zip, 5.9 KB)

This is a short follow up regarding the problem with Microsoft C++ compiler...

>>...
>>------ Build started: Project: MemTestApp, Configuration: Release x64 ------
>>Compiling...
>>cl : Command line warning D9035 : option 'Wp64' has been deprecated and will be removed in a future release
>>Stdafx.cpp
>>c:\wutemp\memtestapp\Tk-3dpr-hdf5.h(484) : error C2089: '' : 'struct' too large
>>...

I reported the problem to Microsoft and this is a response:
...
...Visual C++ does not handle structures larger than 2GB. This appears to be a known issue...
...

Hi Sergey,

I debugged the sample ( the icc executable ) with gdb and checked it with valgrind. The segmentation fault occurs at:

0x400d85 <main+1649>:   movl $0x6,(%rax)

I made a dump with objdump and attach the listing. I'm not a PC assembler specialist ( I'm better with mainframe assembler ), but I hope the listing helps a little.

best regards

Franz Bernasek

Sergey

now the zip file

best regards

Franz Bernasek

Attachments: 

dump.zip (application/zip, 5.57 KB)

Thanks.

>>...the segmentation error is on:
>>...
>>0x400d85 <main+1649>: movl $0x6,(%rax)
>>...

and in the code this is:
...
grids->G2.d0.count[ii][jj][kk][ll][mm] = ( __int32 )6;
...
Let's wait for a response from Mark-sabahi (Intel) since he is already investigating the problem.

I detected another problem with the Intel C++ compiler when the struct was declared for automatic allocation. Please take a look at the compilation outputs:

------ Build started: Project: MemTestApp, Configuration: Release x64 ------
Compiling with Intel(R) C++ Compiler XE 13.0.0.089 [Intel(R) 64]... (Intel C++ Environment)
Stdafx.cpp
Compiling with Intel(R) C++ Compiler XE 13.0.0.089 [Intel(R) 64]... (Intel C++ Environment)
MemTestApp.cpp
Linking... (Intel C++ Environment)
catastrophic error: Local variable size exceeds supported maximum
xilink: error #10014: problem during multi-file optimization compilation (code 1)
xilink: error #10014: problem during multi-file optimization compilation (code 1)
MemTestApp - 3 error(s), 0 warning(s), 0 remark(s)

------ Build started: Project: MemTestApp, Configuration: Debug x64 ------
Compiling with Intel(R) C++ Compiler XE 13.0.0.089 [Intel(R) 64]... (Intel C++ Environment)
Stdafx.cpp
Compiling with Intel(R) C++ Compiler XE 13.0.0.089 [Intel(R) 64]... (Intel C++ Environment)
MemTestApp.cpp
catastrophic error: Local variable size exceeds supported maximum
compilation aborted for .\MemTestApp.cpp (code 1)
MemTestApp - 1 error(s), 0 warning(s), 0 remark(s)

Note: Updated test project ( update 2 ) is attached.

Attachments: 

memtestapp.u2.zip (application/zip, 6.03 KB)

FYI. Here is a similar test program using the same structure but in Fortran. Seems to work as expected.
I know this Fortran is using extensions and not strict F95/2003 but we are wedded to some legacy code for now.

ifort: Version 11.1    Build 20100806 Package ID: l_cprof_p_11.1.073

ifort -mcmodel=medium -shared-intel -fpp test.f90

Output:

Address of grids                  601820
 Size of grids            10086592896
Address of grids.G2.rain.count                2DD6F9A0
Address of grids.G2.d0.count                B255B9A0
Address of grids.G2.Nt.count                E75539A0
Address of grids.G2.dBZm.count                8A9619A0
 write some values to grid
 grids.G2.dBZ.stdev(ii,jj,kk,ll,mm) =    2.010000    
 Going to assign d0.count
Address of d0.count               B255B9A0
 Assigned d0.count
 grids.G2.d0.count =            6
 Successful write

Attachments: 

test.f90 (application/octet-stream, 1.23 KB)
tk-3dpr-hdf5.h (text/x-chdr, 12.54 KB)

>>...Here is a similar test program using the same structure but in Fortran. Seems to work as expected...

Do you want me to verify the new test under Windows 7 Professional 64-bit?

>>>>c:\wutemp\memtestapp\Tk-3dpr-hdf5.h(484) : error C2089: '' : 'struct' too large
>>>>...
>>
>>I reported the problem to Microsoft and this is a response:
>>...
>>...Visual C++ does not handle structures larger than 2GB. This appears to be a known issue...
>>...

By the way, the problem has not been fixed for three years ( it was first reported in February 2010 ).

Thank you for the test case. I reproduced the problem and filed a report on this issue.  I will let you know as soon as I get an update from the compiler development team.

 

 

>>... I reproduced the problem and filed a report on this issue. I will let you know as soon as I get an update from
>>the compiler development team...

It is good news, Mark. Thanks.

I think limitations on the size of structures ( especially for automatic allocation ) should be removed when an application is built for a 64-bit platform. I would also suggest a dynamic limit that depends on the total amount of memory ( physical + virtual ) available on the system.

  Mark and Sergey,

  Thank you very much for confirming the issue. I've passed this on to our Admins and they have gone through Premier Support as well.

  We have other test codes that use upwards of 20GB arrays with no issues, reading and writing to all array elements.
  Is the issue related to the size of the structure or the complexity of the structure with many arrays of mixed data types?

  Please give us any indication if there is a short vs. long term fix.

  We have machines with up to 128GB/256GB of physical memory and for us using these large structures is the most efficient way to consolidate our data. The data in the structure ends up in an internally compressed HDF5 file.  In addition, we have applications and libraries written in C, Fortran and mixed language that need to access these large structures.

  Thanks.

>>Is the issue related to the size of the structure or the complexity of the structure with many
>>arrays of mixed data types?

As we can see, there is an issue with the Intel C++ compiler. It appears to be related to the size of the parent structure when it is greater than ~4GB, not to the complexity of the structure declarations.

>>We have machines with up to 128GB/256GB of physical memory and for us using these large structures is
>>the most efficient way to consolidate our data.

Absolutely agree since that approach is very simple. I hope that your comments about computers with 128GB/256GB of physical memory will be taken into account because we live in times of GigaBytes and TeraBytes, not MegaBytes.

As a workaround for the Intel C++ compiler, I would suggest splitting the primary structure into a couple of smaller structures; a simple test could verify whether the workaround works.

Using: icc --version
icc (ICC) 12.1.3 20120212

A similar program to jkwi's with the structure allocated on the stack and my ulimit set to unlimited works. Program attached and output is:

./test_noloop_static
location: 0
G2.dBZ.stdev[ii][jj][kk][ll][mm] = 2.001000
Address of grids.G2.dBZm.count 0x7fffb7f81c40
assigned G2.dBZm.count
assigned G2.dBZm.mean
assigned G2.dBZm.stdev
Address of G2.d0.count 0x7ffe12316c40
Going to assign d0.count
assigned G2.d0.count
Going to assign d0.mean
assigned G2.d0.mean
Going to assign d0.stdev
assigned G2.d0.stdev
Successful write

Attachments: 

test-noloop-static.c (text/x-csrc, 1.5 KB)

Mark, please try to do a very simple test ( for 32-bit and 64-bit platforms ) and here are source codes:

#include <stdio.h>

typedef struct tagSmallDataSet
{
int iData[4][4][4][4][4][4][4][4][4][4];
} SmallDataSet;

int main( void )
{
SmallDataSet sds;

sds.iData[0][0][0][0][0][0][0][0][0][0] = 777;

printf( "%d\n", ( int )sds.iData[0][0][0][0][0][0][0][0][0][0] );
return 0;
}

Note: Added '.iData' and now it looks like: 'sds.iData[0]...'. Sorry about it.

>>Using: icc --version
>>icc (ICC) 12.1.3 20120212
>>
>>A similar program to jkwi's with the structure allocated on the stack and my ulimit set to unlimited works.
>>...

Unfortunately I don't have a chance to verify it with a 64-bit Intel C++ compiler version 12.1.3. In my tests on a 64-bit Windows platform I used version 13.0.0.089.

Sergey,

>>"Mark, please try to do a very simple test ( for 32-bit and 64-bit platforms ) and here are source codes:"

Your test case compiles and runs fine.

Thanks,
--mark

>>>>"Mark, please try to do a very simple test ( for 32-bit and 64-bit platforms ) and here are source codes:"
>>
>>Your test case compiles and runs fine.

Mark,

Which version of the Intel C++ compiler do you use? I received a private email from another IDZ user who confirms the problem with the latest test case. In essence, GBs of memory are not needed to reproduce the problem with an N-dimensional structure ( where N is greater than 5 ).

I will also run that simplest test with Intel C++ compiler version 8.1.038 and all the other C/C++ compilers I have.

Attached is a new test project for Visual Studio 2008 Professional Edition. Intel C++ compiler XE 2013.0.0.089 compiles it, but the executable fails ( an exception is thrown ); take a look at the sources.

Attachments: 

memtestapp.u3.zip (application/zip, 6.32 KB)

Hi everybody,

The following compilation error happens when a 26-D struct ( size is 67108864 bytes ) is declared as static, and I would consider it an expected error ( a test with a 25-D struct worked ):

...>icl Test.cpp

Intel(R) C++ Compiler XE for applications running on IA-32, Version 12.1.3.300 Build 20120130
Copyright (C) 1985-2012 Intel Corporation. All rights reserved.

Test.cpp
Test.cpp(8) (col. 9): catastrophic error: out of memory

compilation aborted for Test.cpp (code 4)

You need to use switches that control Heap or Stack commit and reserved values used by the linker (!).

#include "stdio.h"

typedef struct tagDataSet
{
// 2^26 = 67108864 - Default limit for 32-bit Intel C++ compiler
// 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26
__int8 iData[2][2][2][2][2][2][2][2][2][2][2][2][2][2][2][2][2][2][2][2][2][2][2][2][2][2];
} DataSet;

DataSet ds = { 0x0 };

int main( void )
{
ds.iData[0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0] = 77;
printf( "%d\n", ds.iData[0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0] );
return ( int )0;
}

Note 1: In case of automatic allocation, use the /F compiler switch, which sets the stack reserve amount passed to the linker.
Note 2: In case of static allocation of a large struct, a heap value ( as large as possible ) has to be set for the linker. Please do your own verifications and tests.

I will do one more test ( with a test case from the previous post ) on a 64-bit Windows 7 Professional with Intel C++ compiler XE 2013 version 2013.0.0.089 and post results.

Hi Sergey,

I compiled your sample with the Parallel Studio XE 2013 for Linux compiler under openSUSE 12.2 ( 64-bit, Linux kernel 3.4.30 ):

icc (ICC) 13.1.0 20130121
Copyright (C) 1985-2013 Intel Corporation.  All rights reserved.

linux-cuda:~ # icpc --version
icpc (ICC) 13.1.0 20130121
Copyright (C) 1985-2013 Intel Corporation.  All rights reserved.

No problem: it works with no compile error and no link error, and the executable runs and displays the result 77.

#include "stdio.h"

typedef struct tagDataSet
{
// 2^26 = 67108864 - Default limit for 32-bit Intel C++ compiler
// 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26
__int8 iData[2][2][2][2][2][2][2][2][2][2][2][2][2][2][2][2][2][2][2][2][2][2][2][2][2][2];
} DataSet;

DataSet ds = { 0x0 };

int main( void )
{
ds.iData[0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0] = 77;
printf( "%d\n", ds.iData[0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0] );
return ( int )0;
}

best regards

Franz

Thank you, Franz. As I promised, here are the results of two tests with Intel C++ compiler XE 2013 on Windows 7 Professional 64-bit:

[ TEST 1 - Intel C++ compiler - 32-bit ] - No problems

..>icl.exe Test.cpp

Intel(R) C++ Compiler XE for applications running on IA-32, Version 13.0.0.089 Build 20120731
Copyright (C) 1985-2012 Intel Corporation. All rights reserved.

Test.cpp
Microsoft (R) Incremental Linker Version 9.00.30729.01
Copyright (C) Microsoft Corporation. All rights reserved.

-out:Test.exe
Test.obj

[ Test Output ]
77

[ TEST 2 - Intel C++ compiler - 64-bit ] - No problems

..>icl.exe Test.cpp

Intel(R) C++ Intel(R) 64 Compiler XE for applications running on Intel(R) 64, Version 13.0.0.089 Build 20120731
Copyright (C) 1985-2012 Intel Corporation. All rights reserved.

Test.cpp
Microsoft (R) Incremental Linker Version 9.00.30729.01
Copyright (C) Microsoft Corporation. All rights reserved.

-out:Test.exe
Test.obj

[ Test Output ]
77

Attachments: 

output.32bit.txt (text/plain, 432 bytes)
output.64bit.txt (text/plain, 450 bytes)
test.cpp (text/x-c++src, 1.26 KB)

Mark,

I'm not concerned about Intel(R) C++ Compiler XE ( 32-bit ) Version 12.1.3.300 Build 20120130 because it was just additional verification. Please keep everybody informed regarding status of the original problem with ~9GB structure.

Best regards,
Sergey

Hi Sergey, I compiled this sample under openSUSE 12.2 Linux 64-bit with Parallel Studio XE 2013 ( compiler icc/icpc 13.1 ).

No problems and no compile errors; the executable runs normally and outputs the result 77.

#include "stdio.h"

typedef struct tagDataSet
{
// 2^26 = 67108864 - Default limit for 32-bit Intel C++ compiler
// 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26
__int8 iData[2][2][2][2][2][2][2][2][2][2][2][2][2][2][2][2][2][2][2][2][2][2][2][2][2][2];
} DataSet;

DataSet ds = { 0x0 };

int main( void )
{
ds.iData[0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0] = 77;
printf( "%d\n", ds.iData[0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0][0] );
return ( int )0;
}

best regards

Franz

The original problem reported has been resolved. We will let you know when a compiler update containing the fix is available. 

Thanks,
--mark

That is very good news, and thank you for the update.

The C++ compiler issue was resolved. The latest Composer XE 2013 Update 5 contains the fix. The Fortran compiler issue has been fixed in the 14.0 compiler, scheduled for release in September (schedule subject to change).

-mark

Mark,

Is it possible this was "partially fixed"?  I got to this thread via Google, but in my case I have structs of structs of structs (they add up to ~16GB).

I'll work on a reproducer but I basically have a top-level struct with 4 instances of a second-level struct (which take ~4GB each).  I can cleanly access fields in the first instance.  All three of the other instances seg fault.

 

I am using ICC 14.0.1 20131008 on Linux (Ubuntu 13.10)

The following code fails on ICC 14.0.1 20131008 (Linux) depending on whether the #include of stdlib.h is commented out.  (It seg faults if the include is commented out.)

Does that make any sense???  I only figured that out because GCC complained of "incompatible implicit declaration of built-in function 'malloc'"

Thanks.

 

#include <stdio.h>
//#include <stdlib.h>

typedef struct _bar
{
    char doug[4L*1024L*1024L*1024L];
}bar;

typedef struct _foo
{
    bar bob[4];
}foo;

int main()
{
    foo* foo0=(foo*)malloc(sizeof(foo));
    foo0->bob[0].doug[0]=1;
    foo0->bob[1].doug[0]=1;
    printf("Hello world\n");
}

The problem seems to still be present in the Intel compiler 16.0. It is demonstrated by the following program:

#include <iostream>                                                                 
#include <complex>                                                                  
using namespace std;                                                                
                                                                                    
#define N 32768                                                                     
#define M 16384                                                                     
typedef float data[2][N][N];                                                        
typedef complex<float> data_complex[2][M][M];                                       
                                                                                    
int main(int argc, char **argv)                                                     
{                                                                                   
    data d1;                                                                        
    cout << "&d1[0][0][0]: " << &d1[0][0][0] << endl;                               
    cout << "&d1[1][0][0]: " << &d1[1][0][0] << endl;                               
                                                                                    
    data_complex d2;                                                                
    cout << "&d2[0][0][0]: " << &d2[0][0][0] << endl;                               
    cout << "&d2[1][0][0]: " << &d2[1][0][0] << endl;                               
                                                                                                                                                                      
    return 0;                                                            
}                  

compiled with 'g++' and 'icpc', with or without '-mcmodel=large'.

Hello,

I came across this thread and would like to have a quick look.

Matthias, could you please provide me the following information:

$ icpc -v     [lower case 'v']
...

$ icpc -V   [upper case 'V']
...

$ g++ -v    [lower case 'v']
...

...and the exact output of the failing compiler?

It's also important to know whether your system is 32-bit or 64-bit. I ran your case on a 32-bit system with 15.0.4 (which I happened to have at hand; I do not currently have 16.0 for IA-32) and got a proper error:

$ icpc -mcmodel=large large.cpp
icpc: command line warning #10148: option '-mcmodel=large' not supported
large.cpp(7): error: array is too large
  typedef float data[2][N][N];
                ^

large.cpp(8): error: array is too large
  typedef complex<float> data_complex[2][M][M];
                         ^

compilation aborted for large.cpp (code 2)

I need to understand your configuration before doing more analysis.

Thank you & best regards,

Georg Zitzlsberger
