The memory manager cannot access sufficient memory to initialize; exiting

I detected a problem with 'scalable_allocator'; a complete test-case ( ~110 code lines ) will be provided.
The 'scalable_allocator' also fails to allocate memory for the last block ( see Tests 4 and 5 ).

[EDITED] Please see Post #3 for updated descriptions of these problems:

http://software.intel.com/en-us/forums/showpost.php?p=191121

Here are the results of a stress test-case ( 32-bit / Release configuration ) for a preliminary review:

>> Test 1 <<

Number of Memory Blocks: 4192
Size of Memory Block : 32768 bytes
Total Amount of Memory : 0.13 GB

[ CRT malloc ] All memory blocks are allocated - 31 ticks
[ CRT free ] All memory blocks are released - 16 ticks
Press ENTER to continue...

[ TBB scalable_allocator ] All memory blocks are allocated - 31 ticks
[ TBB deallocate ] All memory blocks are released - 0 ticks
Press ENTER to exit...

>> Test 2 <<

Number of Memory Blocks: 8192
Size of Memory Block : 32768 bytes
Total Amount of Memory : 0.25 GB

[ CRT malloc ] All memory blocks are allocated - 63 ticks
[ CRT free ] All memory blocks are released - 15 ticks
Press ENTER to continue...

[ TBB scalable_allocator ] All memory blocks are allocated - 46 ticks
[ TBB deallocate ] All memory blocks are released - 0 ticks
Press ENTER to exit...

>> Test 3 <<

Number of Memory Blocks: 16384
Size of Memory Block : 32768 bytes
Total Amount of Memory : 0.50 GB

[ CRT malloc ] All memory blocks are allocated - 141 ticks
[ CRT free ] All memory blocks are released - 31 ticks
Press ENTER to continue...

[ TBB scalable_allocator ] All memory blocks are allocated - 94 ticks
[ TBB deallocate ] All memory blocks are released - 16 ticks
Press ENTER to exit...

>> Test 4 <<

Number of Memory Blocks: 32768
Size of Memory Block : 32768 bytes
Total Amount of Memory : 1.00 GB

[ CRT malloc ] All memory blocks are allocated - 406 ticks
[ CRT free ] All memory blocks are released - 78 ticks
Press ENTER to continue...

[ TBB scalable_allocator ] All memory blocks are allocated - 94 ticks
[ TBB deallocate ] All memory blocks are released - 16 ticks
Error: [ TBB scalable_allocator ] Failed to allocate a memory block #32768 - SysError: 0
Press ENTER to exit...

>> Test 5 <<

Number of Memory Blocks: 49152
Size of Memory Block : 32768 bytes
Total Amount of Memory : 1.50 GB

[ CRT malloc ] All memory blocks are allocated - 609 ticks
[ CRT free ] All memory blocks are released - 94 ticks
Press ENTER to continue...

[ TBB scalable_allocator ] All memory blocks are allocated - 62 ticks
[ TBB deallocate ] All memory blocks are released - 16 ticks
Error: [ TBB scalable_allocator ] Failed to allocate a memory block #49152 - SysError: 0
Press ENTER to exit...

>> Test 6 <<

Number of Memory Blocks: 65536
Size of Memory Block : 32768 bytes
Total Amount of Memory : 2.00 GB

[ CRT malloc ] All memory blocks are allocated - 1328 ticks
[ CRT free ] All memory blocks are released - 234 ticks
Press ENTER to continue...
The memory manager cannot access sufficient memory to initialize; exiting

Could you please provide us with the reproducer?

Quoting Alexandr Konovalov (Intel): Could you please provide us with the reproducer?

A short answer: Yes.

( ...in a couple of minutes... )

Hi everybody,

Quoting Sergey Kostrov: I detected a problem with 'scalable_allocator'; a complete test-case ( ~110 code lines ) will be provided.
The 'scalable_allocator' also fails to allocate memory for the last block ( see Tests 4 and 5 )...

I decided to change definitions of these two problems:

>> Problem #1 <<

TBB 'scalable_allocator' doesn't outperform CRT 'malloc' when an application needs to
allocate more than ~1.54GB of memory in total ( not as one large block! )

>> Problem #2 <<

TBB 'scalable_allocator' fails completely after ~1.97GB of memory was allocated and
then released (!) by CRT 'malloc'. An application exits with a TBB error message:

The memory manager cannot access sufficient memory to initialize; exiting

I'd like to note that ~1.97GB of memory is currently needed for one of my algorithms on a 32-bit Windows platform.

Best regards,
Sergey

I delayed the release of the test-case because I detected one issue ( not TBB related ). I needed some time to change the code and
to complete re-testing. So, here is the source code:

// Stress Tests for CRT 'malloc' and TBB 'scalable_allocator'

{

///*

	#if ( defined ( _WIN32_MSC ) || defined ( _WIN32_ICC ) )
	#define _TEST_CRTMALLOC									// Configuration Macros for Tests

	#define _TEST_TBBSCALABLEALLOCATOR

	/*
	Notes:

	//	#define _TEST_CRTMALLOC							// Case 1
	//	#define _TEST_TBBSCALABLEALLOCATOR

		#define _TEST_CRTMALLOC							// Case 2
	//	#define _TEST_TBBSCALABLEALLOCATOR

	//	#define _TEST_CRTMALLOC							// Case 3
		#define _TEST_TBBSCALABLEALLOCATOR

		#define _TEST_CRTMALLOC							// Case 4
		#define _TEST_TBBSCALABLEALLOCATOR
	*/
	// Attention: Results are for the Case 4 ( Win32 / Release configuration )

	#define _SIZE_OF_MEMBLOCK			 8192

//	const RTint _NUM_OF_MEMORYBLOCKS =  4192;				// 0.13GB - CRT malloc: OK, TBB scalable_allocator: OK
//	const RTint _NUM_OF_MEMORYBLOCKS =  8192;				// 0.25GB - CRT malloc: OK, TBB scalable_allocator: OK
//	const RTint _NUM_OF_MEMORYBLOCKS = 16384;				// 0.50GB - CRT malloc: OK, TBB scalable_allocator: OK
//	const RTint _NUM_OF_MEMORYBLOCKS = 32768;				// 1.00GB - CRT malloc: OK, TBB scalable_allocator: OK
//	const RTint _NUM_OF_MEMORYBLOCKS = 65536 - 16384;		// 1.50GB - CRT malloc: OK, TBB scalable_allocator: OK
//	const RTint _NUM_OF_MEMORYBLOCKS = 65536 -  8192;		// 1.75GB - CRT malloc: OK, TBB scalable_allocator: Failed on 6996 mem blocks
	const RTint _NUM_OF_MEMORYBLOCKS = 65536;				// 2.00GB - CRT malloc: Failed on 929 mem blocks,
															//          TBB scalable_allocator: "The memory manager cannot access
															//          sufficient memory to initialize; exiting"

	/*
	Notes:

	These results are for the Cases 3 and 4 ( Win32 / Release configuration )

	Total amount of memory that could be allocated with CRT 'malloc'             ~1.97GB
	Total amount of memory that could be allocated with TBB 'scalable_allocator' ~1.54GB ( ~0.43GB less )
	*/
	#if ( defined ( _TEST_CRTMALLOC ) || defined ( _TEST_TBBSCALABLEALLOCATOR ) )
	CrtPrintf( RTU("Number of Memory Blocks: %ld\n"), ( RTint )_NUM_OF_MEMORYBLOCKS );
	CrtPrintf( RTU("Size of Memory Block   : %ld bytes\n"), ( RTint )( _SIZE_OF_MEMBLOCK * sizeof( RTfloat ) ) );
	CrtPrintf( RTU("Total Amount of Memory : %.2f GB\n\n"),
			   ( RTfloat )( _SIZE_OF_MEMBLOCK * sizeof( RTfloat ) * _NUM_OF_MEMORYBLOCKS ) / 1024 / 1024 / 1024 );
	#endif
	RTfloat *pfData[ _NUM_OF_MEMORYBLOCKS ] = { RTnull };
	RTbool bErrorM = RTfalse;

	RTbool bErrorS = RTfalse;

	RTuint uiNumOfMemBlocksNotAllocatedM = 0U;

	RTuint uiNumOfMemBlocksNotAllocatedS = 0U;

	RTuint uiSysErrorM = 0U;

	RTuint uiSysErrorS = 0U;
	RTint t;
	while( RTtrue )

	{

		#ifdef _TEST_CRTMALLOC
		// Case 1 - CRT malloc

		g_uiTicksStart = SysGetTickCount();

		for( t = 0; t < _NUM_OF_MEMORYBLOCKS; t++ )

		{

			pfData[t] = ( RTfloat * )CrtMalloc( _SIZE_OF_MEMBLOCK * sizeof( RTfloat ) );

			if( pfData[t] == RTnull )

			{

				uiNumOfMemBlocksNotAllocatedM += 1;

			//	uiSysErrorM = SysGetLastError();

				bErrorM = RTtrue;

				continue;

			}

		}

		g_uiTicksEnd = SysGetTickCount();

		CrtPrintf( RTU("[ CRT malloc             ] All memory blocks are allocated - %4ld ticks\n"),
				   ( RTint )( g_uiTicksEnd - g_uiTicksStart ) );

		g_uiTicksStart = SysGetTickCount();

		for( t = 0; t < _NUM_OF_MEMORYBLOCKS; t++ )

		{

			if( pfData[t] != RTnull )

			{

				CrtFree( pfData[t] );

				pfData[t] = RTnull;

			}

		}

		g_uiTicksEnd = SysGetTickCount();

		CrtPrintf( RTU("[ CRT free               ] All memory blocks are released  - %4ld ticks\n"),
				   ( RTint )( g_uiTicksEnd - g_uiTicksStart ) );
		if( bErrorM == RTtrue )
			CrtPrintf( RTU("[ CRT malloc             ] Failed to allocate %5ld memory blocks\n"),
					   uiNumOfMemBlocksNotAllocatedM );
		CrtPrintf( RTU("Press ENTER to continue...\n") );

		CrtGetChar();
		#endif
		#ifdef _TEST_TBBSCALABLEALLOCATOR
		// Case 2 - TBB scalable_allocator

		g_uiTicksStart = SysGetTickCount();

		for( t = 0; t < _NUM_OF_MEMORYBLOCKS; t++ )

		{

			pfData[t] = tbb::scalable_allocator< RTfloat >().allocate( _SIZE_OF_MEMBLOCK );

			if( pfData[t] == RTnull )

			{

				uiNumOfMemBlocksNotAllocatedS += 1;

			//	uiSysErrorS = SysGetLastError();

				bErrorS = RTtrue;

				continue;

			}

		}

		g_uiTicksEnd = SysGetTickCount();

		CrtPrintf( RTU("[ TBB scalable_allocator ] All memory blocks are allocated - %4ld ticks\n"),
				   ( RTint )( g_uiTicksEnd - g_uiTicksStart ) );

		g_uiTicksStart = SysGetTickCount();

		for( t = 0; t < _NUM_OF_MEMORYBLOCKS; t++ )

		{

			if( pfData[t] != RTnull )

			{

				tbb::scalable_allocator< RTfloat >().deallocate( pfData[t], _SIZE_OF_MEMBLOCK );

				pfData[t] = RTnull;

			}

		}

		g_uiTicksEnd = SysGetTickCount();

		CrtPrintf( RTU("[ TBB deallocate         ] All memory blocks are released  - %4ld ticks\n"),
				   ( RTint )( g_uiTicksEnd - g_uiTicksStart ) );
		if( bErrorS == RTtrue )
			CrtPrintf( RTU("Error: [ TBB scalable_allocator ] Failed to allocate %5ld memory blocks\n"),
					   uiNumOfMemBlocksNotAllocatedS );
		CrtPrintf( RTU("Press ENTER to exit...\n") );

		CrtGetChar();
		#endif
		break;

	}
	#endif

//*/

}
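
The ~1.97GB and ~1.54GB totals quoted in the notes can be cross-checked against the failed-block counts in the comments of the test-case ( 929 of 65536 blocks for CRT 'malloc', 6996 of 57344 blocks for TBB 'scalable_allocator' ). Below is a minimal sketch of that arithmetic. It assumes 4-byte floats, and the helper name 'allocated_gb' is made up for illustration; it is not part of the test-case.

```cpp
#include <cassert>
#include <cstddef>

// Size of one block in the test-case: 8192 floats, assumed to be 4 bytes each.
const std::size_t kBlockBytes = 8192u * 4u;

// Converts "requested blocks minus failed blocks" into GB ( 1 GB = 2^30 bytes ).
double allocated_gb( std::size_t requested, std::size_t failed )
{
	assert( failed <= requested );
	// Convert to double before multiplying so a 32-bit size_t cannot overflow.
	const double bytes = static_cast< double >( requested - failed ) * kBlockBytes;
	return bytes / ( 1024.0 * 1024.0 * 1024.0 );
}
```

allocated_gb( 65536, 929 ) comes out at roughly 1.97 and allocated_gb( 57344, 6996 ) at roughly 1.54, which agrees with the two totals in the notes.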


Some small modifications to the source code of the test will be needed, such as:

CrtPrintf -> _tprintf or printf
CrtMalloc -> malloc
CrtFree -> free
CrtGetChar -> _gettchar or getchar
SysGetTickCount -> GetTickCount

RTtrue -> true or TRUE
RTfalse -> false or FALSE
RTnull -> NULL

RTbool -> bool or BOOL
RTint -> int
RTuint -> unsigned int
RTfloat -> float

RTU -> _T

or use a set of macros, like:

...
#define CrtPrintf _tprintf
...

Two global variables 'g_uiTicksStart' and 'g_uiTicksEnd' are declared as follows:

...
RTuint g_uiTicksStart = 0U;
RTuint g_uiTicksEnd = 0U;
...
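
For anyone who wants to build the test-case outside of the original framework, the mappings above could be collected into one small header. This is only a sketch under a few assumptions: narrow strings are used ( so RTU expands to its argument instead of _T ), SysGetTickCount is replaced by a std::chrono stand-in instead of GetTickCount so that it builds on non-Windows compilers too, and 'AllocBlock' is a hypothetical helper that is not in the original code.

```cpp
#include <cassert>
#include <chrono>
#include <cstdio>
#include <cstdlib>

// Type aliases for the RT* names used in the test-case.
typedef bool         RTbool;
typedef int          RTint;
typedef unsigned int RTuint;
typedef float        RTfloat;

#define RTtrue   true
#define RTfalse  false
#define RTnull   NULL
#define RTU( s ) s               // assumption: narrow strings instead of _T

#define CrtPrintf   printf
#define CrtMalloc   malloc
#define CrtFree     free
#define CrtGetChar  getchar

// Stand-in for GetTickCount ( an assumption ): milliseconds from a monotonic clock.
inline RTuint SysGetTickCount()
{
	using namespace std::chrono;
	return static_cast< RTuint >(
		duration_cast< milliseconds >( steady_clock::now().time_since_epoch() ).count() );
}

// The two globals used by the test-case for timing.
RTuint g_uiTicksStart = 0U;
RTuint g_uiTicksEnd = 0U;

// Hypothetical helper: allocates one block of floats through the aliases,
// the same way the loop body of the test-case does.
inline RTfloat *AllocBlock( RTuint uiNumOfFloats )
{
	assert( uiNumOfFloats > 0U );
	return ( RTfloat * )CrtMalloc( uiNumOfFloats * sizeof( RTfloat ) );
}
```

With such a header included first, the body of the test-case should compile unchanged apart from the TBB include and the enclosing function.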

Here are updated test results for the Case 4 when both configuration macros are defined:

...
#define _TEST_CRTMALLOC
#define _TEST_TBBSCALABLEALLOCATOR
...

>> Test 1 <<

     Number of Memory Blocks: 4192

     Size of Memory Block   : 32768 bytes

     Total Amount of Memory : 0.13 GB
     [ CRT malloc             ] All memory blocks are allocated -   32 ticks

     [ CRT free               ] All memory blocks are released  -   15 ticks

     Press ENTER to continue...
     [ TBB scalable_allocator ] All memory blocks are allocated -   31 ticks

     [ TBB deallocate         ] All memory blocks are released  -    0 ticks

     Press ENTER to exit...

>> Test 2 <<

     Number of Memory Blocks: 8192

     Size of Memory Block   : 32768 bytes

     Total Amount of Memory : 0.25 GB
     [ CRT malloc             ] All memory blocks are allocated -   62 ticks

     [ CRT free               ] All memory blocks are released  -   16 ticks

     Press ENTER to continue...
     [ TBB scalable_allocator ] All memory blocks are allocated -   47 ticks

     [ TBB deallocate         ] All memory blocks are released  -    0 ticks

     Press ENTER to exit...

>> Test 3 <<

     Number of Memory Blocks: 16384

     Size of Memory Block   : 32768 bytes

     Total Amount of Memory : 0.50 GB
     [ CRT malloc             ] All memory blocks are allocated -  141 ticks

     [ CRT free               ] All memory blocks are released  -   31 ticks

     Press ENTER to continue...
     [ TBB scalable_allocator ] All memory blocks are allocated -   94 ticks

     [ TBB deallocate         ] All memory blocks are released  -    0 ticks

     Press ENTER to exit...

>> Test 4 <<

     Number of Memory Blocks: 32768

     Size of Memory Block   : 32768 bytes

     Total Amount of Memory : 1.00 GB
     [ CRT malloc             ] All memory blocks are allocated -  328 ticks

     [ CRT free               ] All memory blocks are released  -   78 ticks

     Press ENTER to continue...
     [ TBB scalable_allocator ] All memory blocks are allocated -  750 ticks

     [ TBB deallocate         ] All memory blocks are released  -   16 ticks

     Error: [ TBB scalable_allocator ] Failed to allocate 15144 memory blocks

     Press ENTER to exit...

>> Test 5 <<

     Number of Memory Blocks: 49152

     Size of Memory Block   : 32768 bytes

     Total Amount of Memory : 1.50 GB
     [ CRT malloc             ] All memory blocks are allocated -  594 ticks

     [ CRT free               ] All memory blocks are released  -  109 ticks

     Press ENTER to continue...
     [ TBB scalable_allocator ] All memory blocks are allocated - 1594 ticks

     [ TBB deallocate         ] All memory blocks are released  -    0 ticks

     Error: [ TBB scalable_allocator ] Failed to allocate 37891 memory blocks

     Press ENTER to exit...

>> Test 6 <<

     Number of Memory Blocks: 57344

     Size of Memory Block   : 32768 bytes

     Total Amount of Memory : 1.75 GB
     [ CRT malloc             ] All memory blocks are allocated -  703 ticks

     [ CRT free               ] All memory blocks are released  -  156 ticks

     Press ENTER to continue...
     [ TBB scalable_allocator ] All memory blocks are allocated - 2016 ticks

     [ TBB deallocate         ] All memory blocks are released  -    5 ticks

     Error: [ TBB scalable_allocator ] Failed to allocate 52547 memory blocks

     Press ENTER to exit...

>> Test 7 <<

     Number of Memory Blocks: 65536

     Size of Memory Block   : 32768 bytes

     Total Amount of Memory : 2.00 GB
     [ CRT malloc             ] All memory blocks are allocated -  891 ticks

     [ CRT free               ] All memory blocks are released  -  234 ticks

     [ CRT malloc             ] Failed to allocate   929 memory blocks

     Press ENTER to continue...
     The memory manager cannot access sufficient memory to initialize; exiting

My Development Environment:

OS : Windows XP 32-bit SP3
IDE: Visual Studio 2005 SP1
TBB: Version 4 Update 3
TBB: Version 4 Update 1

Here is a screenshot of the Windows Task Manager:

Alexandr,

I decided to stress-test the CRT 'malloc' function again. I wanted to understand whether it would experience a problem
similar to TBB 'scalable_allocator'. Here is the output for 3 tests with the CRT 'malloc' & 'free' functions executed one after another:

 ...
     Number of Memory Blocks: 65536

     Size of Memory Block   : 32768 bytes

     Total Amount of Memory : 2.00 GB
     // Sub-Test #1

     [ CRT malloc             ] All memory blocks are allocated -  890 ticks

     [ CRT free               ] All memory blocks are released  -  235 ticks

     [ CRT malloc             ] Failed to allocate   929 memory blocks

     Press ENTER to continue...
     // Sub-Test #2

     [ CRT malloc             ] All memory blocks are allocated -  687 ticks

     [ CRT free               ] All memory blocks are released  -  235 ticks

     [ CRT malloc             ] Failed to allocate   931 memory blocks

     Press ENTER to continue...
     // Sub-Test #3

     [ CRT malloc             ] All memory blocks are allocated -  688 ticks

     [ CRT free               ] All memory blocks are released  -  250 ticks

     [ CRT malloc             ] Failed to allocate   929 memory blocks

     Press ENTER to continue...

     ...

As you can see, CRT 'malloc' worked well: it allocated all the memory available to a 32-bit test application in 'Sub-Test #2' after that memory was released in 'Sub-Test #1', and again in 'Sub-Test #3' after it was released in 'Sub-Test #2'.

A screenshot is enclosed:

'Sub-Test #2' and 'Sub-Test #3' allocated all available memory about 1.3 times faster than 'Sub-Test #1' ( 687-688 ticks versus 890 ticks ), which is expected.

Note: The bars in the graph differ because the Windows Task Manager was lagging when rendering graphics during the test.

Here are the results of a modified test with the CRT 'malloc' function ( tested 4 times ) followed by TBB 'scalable_allocator':

     ...

     Number of Memory Blocks: 65536

     Size of Memory Block   : 32768 bytes

     Total Amount of Memory : 2.00 GB
     // Sub-Test #1

     [ CRT malloc             ] All memory blocks are allocated - 1546 ticks

     [ CRT free               ] All memory blocks are released  -  250 ticks

     [ CRT malloc             ] Failed to allocate   929 memory blocks

     Press ENTER to continue...
     // Sub-Test #2

     [ CRT malloc             ] All memory blocks are allocated -  672 ticks

     [ CRT free               ] All memory blocks are released  -  234 ticks

     [ CRT malloc             ] Failed to allocate   931 memory blocks

     Press ENTER to continue...
     // Sub-Test #3

     [ CRT malloc             ] All memory blocks are allocated -  672 ticks

     [ CRT free               ] All memory blocks are released  -  235 ticks

     [ CRT malloc             ] Failed to allocate   929 memory blocks

     Press ENTER to continue...
     // Sub-Test #4

     [ CRT malloc             ] All memory blocks are allocated -  672 ticks

     [ CRT free               ] All memory blocks are released  -  234 ticks

     [ CRT malloc             ] Failed to allocate   931 memory blocks

     Press ENTER to continue...
     // Sub-Test #5 - TBB 'scalable_allocator'

     The memory manager cannot access sufficient memory to initialize; exiting

     ...


A screenshot is enclosed:

Sergey,

Thank you for the report! I was able to reproduce the issue locally, and I believe it is a third-party problem: it seems that the allocator from Microsoft Visual Studio fails to defragment memory once it has hit an out-of-memory condition. As a result, after the system allocator has failed, and even though it then releases all of its memory, a subsequent allocation of 2MB via malloc or VirtualAlloc fails. Such an allocation is how the TBB allocator obtains memory to work with.

We are thinking about possible workarounds.

[cpp]#include
#include

const size_t SZ = 8192*4;
const size_t NUM_OF_BLOCKS = 2*1024LU*1024*1024/SZ; // 65536
void *ptrs[NUM_OF_BLOCKS];

int main()
{
for (size_t i = 0; i
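
The listing above is cut off in the thread, and the code below is not a reconstruction of it. It is only a minimal sketch of the scenario described in this post: allocate blocks with malloc until it fails, release everything, then try a single 2MB allocation. The function name 'probe_after_release' and its parameters are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Allocates up to max_blocks blocks of block_size bytes with malloc, stopping at
// the first failure, then frees all of them and tries one allocation of
// probe_size bytes. Returns true if the probe allocation succeeds and stores
// the number of blocks that were obtained in *allocated.
bool probe_after_release( std::size_t block_size, std::size_t max_blocks,
                          std::size_t probe_size, std::size_t *allocated )
{
	assert( allocated != NULL );

	std::vector< void * > ptrs;
	ptrs.reserve( max_blocks );

	for( std::size_t i = 0; i < max_blocks; i++ )
	{
		void *p = std::malloc( block_size );
		if( p == NULL )
			break;                      // address space exhausted
		ptrs.push_back( p );
	}
	*allocated = ptrs.size();

	for( std::size_t i = 0; i < ptrs.size(); i++ )
		std::free( ptrs[i] );

	// The probe stands for the 2MB request mentioned in the post.
	void *probe = std::malloc( probe_size );
	const bool ok = ( probe != NULL );
	std::free( probe );
	return ok;
}
```

With the values from the listing this would be called as probe_after_release( 8192 * 4, 65536, 2 * 1024 * 1024, &n ). The exhaustion only happens in a 32-bit process; in a 64-bit build all 65536 blocks are likely to be allocated and the probe is likely to succeed.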

Quoting Alexandr Konovalov (Intel): Thank you for the report! I was able to reproduce the issue locally, and I believe it is a third-party problem...

Thank you, Alexandr! Did you do the investigation with the latest version of TBB, v4 Update 5? Please confirm.

I'll do my own investigation because I believe that there is a problem with TBB. I'll report my results as soon as
the investigation is completed.

I'd like to note that I'm still using TBB v4 Update 3.

Best regards,
Sergey

Quoting Sergey Kostrov: Thank you, Alexandr! Did you do the investigation with the latest version of TBB, v4 Update 5? Please confirm.

The attached isolated test case, which I created based on your code, does not depend on TBB or the TBB allocator; only the system malloc/free are used.

Quoting Alexandr Konovalov (Intel): Thank you for the report! I was able to reproduce the issue locally, and I believe it is a third-party problem.

[SergeyK] Alexander, did you debug tbbmalloc_debug.dll? Since my investigation is already in progress,
I hold a neutral position and I won't blame either side until the investigation is completed.
Please take a look at my next posts.

Quoting Alexandr Konovalov (Intel): I.e., it seems allocator from Microsoft Visual Studio failed to de-fragment memory when it got out of memory condition.

[SergeyK] The "allocator" from Microsoft Visual Studio is not responsible for defragmentation of memory on any Windows platform.
That is the responsibility of the Virtual Memory Manager ( VMM ). Please take a look at the MSDN topic:
'The Virtual-Memory Manager in Windows NT'
So, first of all, about your test-case. You modified / simplified the 2nd version that I posted ( see Post #4 ):

http://software.intel.com/en-us/forums/showpost.php?p=191122

At the beginning, processing stopped at the first error: there were 'break' statements inside all 'for' loops.
As you can see, I have since replaced all 'break' statements with 'continue' statements. This makes it possible to see
how many memory blocks 'malloc' or 'scalable_allocator' could not allocate.

Your "isolated" test-case reproduces all my numbers but in a different way ( see Post #6 ):

http://software.intel.com/en-us/forums/showpost.php?p=191124

The source code of your modified test-case is provided below.

Here is Alexander's modified test-case ( see Post #12 ):

http://software.intel.com/en-us/forums/showpost.php?p=191610

...

// Sub-Test 10 - Stress Tests for CRT 'malloc'
const size_t _SIZE_OF_MEMBLOCK = ( 8192 * sizeof( RTfloat ) );

const size_t _NUM_OF_MEMORYBLOCKS = 65536;
RTfloat *pfData[ _NUM_OF_MEMORYBLOCKS ] = { RTnull };
// Sub-Test 1

CrtPrintf( RTU("Sub-Test 1\n") );

for( size_t i = 0; i < _NUM_OF_MEMORYBLOCKS; i++ )

{

	pfData[i] = ( RTfloat * )CrtMalloc( _SIZE_OF_MEMBLOCK );

	if( pfData[i] == RTnull )

	{

		CrtPrintf( RTU("Allocated %ld Memory Blocks. Not Allocated %ld Memory Blocks\n"),
				   i, ( _NUM_OF_MEMORYBLOCKS - i ) );

		break;

	}

}

for( size_t i = 0; i < _NUM_OF_MEMORYBLOCKS; i++ )

{

	if( pfData[i] == RTnull )

		break;

	CrtFree( pfData[i] );

}

CrtPrintf( RTU("All Memory Released - Press ENTER to continue...\n") );

CrtGetChar();
// Sub-Test 2

CrtPrintf( RTU("Sub-Test 2\n") );

for( size_t i = 0; i < _NUM_OF_MEMORYBLOCKS; i++ )

{

	pfData[i] = ( RTfloat * )CrtMalloc( _SIZE_OF_MEMBLOCK );

	if( pfData[i] == RTnull )

	{

		CrtPrintf( RTU("Allocated %ld Memory Blocks. Not Allocated %ld Memory Blocks\n"),
				   i, ( _NUM_OF_MEMORYBLOCKS - i ) );

		break;

	}

}

for( size_t i = 0; i < _NUM_OF_MEMORYBLOCKS; i++ )

{

	if( pfData[i] == RTnull )

		break;

	CrtFree( pfData[i] );

}

CrtPrintf( RTU("All Memory Released - Press ENTER to continue...\n") );

CrtGetChar();
// Sub-Test 3

CrtPrintf( RTU("Sub-Test 3\n") );

for( size_t i = 0; i < _NUM_OF_MEMORYBLOCKS; i++ )

{

	pfData[i] = ( RTfloat * )CrtMalloc( _SIZE_OF_MEMBLOCK );

	if( pfData[i] == RTnull )

	{

		CrtPrintf( RTU("Allocated %ld Memory Blocks. Not Allocated %ld Memory Blocks\n"),
				   i, ( _NUM_OF_MEMORYBLOCKS - i ) );

		break;

	}

}

for( size_t i = 0; i < _NUM_OF_MEMORYBLOCKS; i++ )

{

	if( pfData[i] == RTnull )

		break;

	CrtFree( pfData[i] );

}

CrtPrintf( RTU("All Memory Released - Press ENTER to exit...\n") );

CrtGetChar();

...


Here is a screenshot:

Once again, 'malloc' was able to allocate all available memory ( ~1.97GB ) three times on a 32-bit Windows platform:

allocated -> released
allocated -> released
allocated -> released

without any errors.

Now, you mentioned some "workaround" in one of your posts. What did you mean?

Here is a small update for my test-case ( see Post #4 ):

...

	const RTint _NUM_OF_MEMORYBLOCKS =     1;				//   32KB ( 32768 B )
//	const RTint _NUM_OF_MEMORYBLOCKS =    16;				//  0.5MB
//	const RTint _NUM_OF_MEMORYBLOCKS =    32;				//    1MB
//	const RTint _NUM_OF_MEMORYBLOCKS =    64;				//    2MB
//	const RTint _NUM_OF_MEMORYBLOCKS =   128;				//    4MB
//	const RTint _NUM_OF_MEMORYBLOCKS =   256;				//    8MB
//	const RTint _NUM_OF_MEMORYBLOCKS =   512;				//   16MB
//	const RTint _NUM_OF_MEMORYBLOCKS =  1024;				//   32MB
//	const RTint _NUM_OF_MEMORYBLOCKS =  2048;				//   64MB
//	const RTint _NUM_OF_MEMORYBLOCKS =  4192;				//  131MB
//	const RTint _NUM_OF_MEMORYBLOCKS =  8192;				//  256MB
//	const RTint _NUM_OF_MEMORYBLOCKS = 16384;				//  512MB
//	const RTint _NUM_OF_MEMORYBLOCKS = 32768;				// 1.00GB
//	const RTint _NUM_OF_MEMORYBLOCKS = 65536 - 16384;		// 1.50GB - OK
//	const RTint _NUM_OF_MEMORYBLOCKS = 65536 -  8192;		// 1.75GB - OK
//	const RTint _NUM_OF_MEMORYBLOCKS = 65536;				// 2.00GB

...

Please take it into account:

Quoting Sergey Kostrov: Here is a small update for my test-case ( see Post #4 ):

...
const RTint _NUM_OF_MEMORYBLOCKS = 1; // 32768 B

If that case is selected, then the total amount of memory allocated is displayed as '0.00 GB'. Please ignore this: 32768 bytes is
about 0.00003 GB, so it is only a formatting issue of the '%.2f' specifier of the 'printf' CRT function used in the test-case.

This is how it looks:
...
Number of Memory Blocks: 1
Size of Memory Block : 32768 bytes
Total Amount of Memory : 0.00 GB

[ CRT malloc ] All memory blocks are allocated - 0 ticks
[ CRT free ] All memory blocks are released - 0 ticks
Press ENTER to continue...
...
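
The '0.00 GB' line appears because the value is always printed in GB with two decimals. One way to avoid it, shown here only as a sketch, is to pick the unit from the size of the value. The function name 'format_bytes' is made up for illustration and is not part of the test-case.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Formats a byte count using the largest unit ( B, KB, MB, GB ) that keeps
// the printed value at 1 or above, with two decimals.
std::string format_bytes( double bytes )
{
	assert( bytes >= 0.0 );

	static const char *units[] = { "B", "KB", "MB", "GB" };
	int u = 0;
	while( bytes >= 1024.0 && u < 3 )
	{
		bytes /= 1024.0;
		u++;
	}

	char buf[ 64 ];
	std::snprintf( buf, sizeof( buf ), "%.2f %s", bytes, units[ u ] );
	return std::string( buf );
}
```

For a single 32768-byte block this yields "32.00 KB", and for 65536 such blocks it yields "2.00 GB".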

Quoting Sergey Kostrov

Once again, 'malloc' was able to allocate all available memory ( ~1.97GB ) three times on a 32-bit Windows platform:

allocated -> released
allocated -> released
allocated -> released

without any errors.

Now, you mentioned some "workaround" in one of your posts. What did you mean?

Hello Sergey, why are the memory spikes different when the same amount of memory is allocated and the same memory limit is shown? It looks strange. --Vladimir

Quoting Sergey Kostrov: Now, you mentioned some "workaround" in one of your posts. What did you mean?

I mean that when the system allocator's heap becomes fragmented and we can't allocate an object of the size we want (say, 2MB), we can still allocate a smaller object that might be enough for the TBB allocator to bootstrap and to perform some limited operations. I don't know whether that is useful for real-world applications or not.
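
To make the idea concrete, here is a minimal sketch of such a fallback: request the preferred size first and halve the request after every failure. It is not taken from the TBB sources. The names 'Region' and 'acquire_with_fallback' and the malloc-like 'alloc' hook are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Result of a bootstrap attempt: the memory obtained and its size in bytes.
struct Region
{
	void       *ptr;
	std::size_t bytes;
};

// Tries to obtain 'preferred' bytes and halves the request after every failure,
// giving up once the request would drop below 'minimum'.
// 'alloc' is any malloc-like function ( a hypothetical hook, not a TBB interface ).
Region acquire_with_fallback( std::size_t preferred, std::size_t minimum,
                              void *( *alloc )( std::size_t ) )
{
	assert( alloc != NULL && minimum > 0 && minimum <= preferred );

	Region r = { NULL, 0 };
	for( std::size_t size = preferred; size >= minimum; size /= 2 )
	{
		void *p = alloc( size );
		if( p != NULL )
		{
			r.ptr = p;
			r.bytes = size;
			break;
		}
	}
	return r;                           // ptr stays NULL if every attempt failed
}
```

A call such as acquire_with_fallback( 2 * 1024 * 1024, 64 * 1024, std::malloc ) would try 2MB, then 1MB, and so on down to 64KB. Whether a smaller region is enough for a real allocator to bootstrap is exactly the open question raised in this post.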

Hi Vladimir,

Quoting Vladimir Polin (Intel): Why are the memory spikes different when the same amount of memory is allocated and the same
memory limit is shown?
It looks strange...

[SergeyK] Absolutely agree.
I'm not 100% sure, but it looks like a rendering issue of the Windows Task Manager. Please take a look at
my primary test-case for the problem and you will see that there is no pause between the sub-test that allocates
memory blocks and the sub-test that releases them. I would add a call to the 'Sleep' Win32 API function
with a delay of at least 1 second.

In pseudo-code it would look like:

...
//Allocation of memory blocks
...

::Sleep( 1000 ); // Delay to allow the Windows Task Manager to display a graph properly

//Release of memory blocks
...

Best regards,
Sergey

Thank you, Alexander.

Quoting Alexandr Konovalov (Intel): I mean that when the system allocator's heap becomes fragmented and we can't allocate an object of the size we want (say, 2MB),
we can still allocate a smaller object that might be enough for the TBB allocator to bootstrap and to perform some limited operations. I don't know whether that is useful for real-world applications or not.

It is really hard to say what could be done in a real-world application with a 2MB memory buffer. It looks too small.
However, 2MB / sizeof( float ) = 524288 single-precision floating-point values in some array.

Quoting Alexandr Konovalov (Intel): ...I mean that when the system allocator's heap becomes fragmented and we can't allocate an object of the size we want...

Alexandr, this has to be verified on your side. So far, I can't confirm it. I have some results from my investigation and I'll submit 3 small reports.

Best regards,
Sergey

>> Test #1 - 1 memory block is allocated <<

...
Number of Memory Blocks: 1
Size of Memory Block : 32768 bytes
Total Amount of Memory : 0.00 GB

[ CRT malloc ] All memory blocks are allocated - 0 ticks
[ CRT free ] All memory blocks are released - 0 ticks
Press ENTER to continue...

DEBUG: scalable_allocator::allocate - 8192 elements - sizeof( type ) - 4
DEBUG: scalable_malloc - 32768 bytes
DEBUG: internalMalloc - 32768 bytes
DEBUG: doInitialization
DEBUG: initMemoryManager
DEBUG: ScalableMalloc Pool Granularity ( for VirtualAlloc use ) - 65536 bytes
DEBUG: ExtMemoryPool::init
DEBUG: ExtMemoryPool::initTLS
DEBUG: TLSKey::TLSKey
DEBUG: TlsAlloc OK - Index=15
DEBUG: Backend::bootstrap
DEBUG: Backend::addNewRegion - Raw size - 2097152 bytes
DEBUG: Backend::getRawMem - 2097152 bytes
DEBUG: getRawMemory - 2097152 bytes
DEBUG: MapMemory - 2097152 bytes
DEBUG: pureSignal
DEBUG: initBackRefMaster - BackRefMaster::bytes - 65536 bytes
DEBUG: initBackRefMaster - leaves - 4
DEBUG: initBackRefMaster - BackRefBlock::bytes - 16384 bytes
DEBUG: masterSize = ( BackRefMaster::bytes + leaves * BackRefBlock::bytes )
DEBUG: initBackRefMaster - masterSize - 131072 bytes
DEBUG: getRawMemory - 131072 bytes
DEBUG: MapMemory - 131072 bytes
DEBUG: MapMemory - VirtualAlloc - System Error - 0
DEBUG: ThreadId::init
[ TBB scalable_allocator ] All memory blocks are allocated - 0 ticks
DEBUG: internalFree
DEBUG: Backend::freeRawMem
[ TBB deallocate ] All memory blocks are released - 0 ticks
DEBUG: ExtMemoryPool::release16KBCaches
DEBUG: basic_tls::destroy
...

>> Test #2 - 65536 memory blocks are allocated <<

...
Number of Memory Blocks: 65536
Size of Memory Block : 32768 bytes
Total Amount of Memory : 2.00 GB

[ CRT malloc ] All memory blocks are allocated - 1344 ticks
[ CRT free ] All memory blocks are released - 250 ticks
[ CRT malloc ] Failed to allocate 929 memory blocks
Press ENTER to continue...

DEBUG: scalable_allocator::allocate - 8192 elements - sizeof( type ) - 4
DEBUG: scalable_malloc - 32768 bytes
DEBUG: internalMalloc - 32768 bytes
DEBUG: doInitialization
DEBUG: initMemoryManager
DEBUG: ScalableMalloc Pool Granularity ( for VirtualAlloc use ) - 65536 bytes
DEBUG: ExtMemoryPool::init
DEBUG: ExtMemoryPool::initTLS
DEBUG: TLSKey::TLSKey
DEBUG: TlsAlloc OK - Index=15
DEBUG: Backend::bootstrap
DEBUG: Backend::addNewRegion - Raw size - 2097152 bytes
DEBUG: Backend::getRawMem - 2097152 bytes
DEBUG: getRawMemory - 2097152 bytes
DEBUG: MapMemory - 2097152 bytes
DEBUG: MapMemory - VirtualAlloc - System Error - 8
The memory manager cannot access sufficient memory to initialize; exiting
DEBUG: basic_tls::destroy
...

Note: System Error 8: Not enough storage is available to process this command ( ERROR_NOT_ENOUGH_MEMORY )

>> Test #3 - 1024 memory blocks are allocated <<

...
Number of Memory Blocks: 1024
Size of Memory Block : 32768 bytes
Total Amount of Memory : 0.03 GB
...
DEBUG: scalable_malloc - 32768 bytes
DEBUG: doInitialization
DEBUG: initMemoryManager
DEBUG: ScalableMalloc Pool Granularity ( for VirtualAlloc use ) - 65536 bytes
DEBUG: ExtMemoryPool::init
DEBUG: ExtMemoryPool::initTLS
DEBUG: TLSKey::TLSKey
DEBUG: TlsAlloc OK - Index=15
DEBUG: Backend::bootstrap
DEBUG: Backend::addNewRegion - Raw size - 2097152 bytes
DEBUG: Backend::getRawMem - 2097152 bytes
DEBUG: getRawMemory - 2097152 bytes
DEBUG: MapMemory - 2097152 bytes
DEBUG: MapMemory - VirtualAlloc - 2097152 bytes - System Error - 0
DEBUG: initBackRefMaster - BackRefMaster::bytes - 65536 bytes
DEBUG: initBackRefMaster - leaves - 4
DEBUG: initBackRefMaster - BackRefBlock::bytes - 16384 bytes
DEBUG: masterSize = ( BackRefMaster::bytes + leaves * BackRefBlock::bytes )
DEBUG: initBackRefMaster - masterSize - 131072 bytes
DEBUG: getRawMemory - 131072 bytes
DEBUG: MapMemory - 131072 bytes
DEBUG: MapMemory - VirtualAlloc - 131072 bytes - System Error - 0
DEBUG: scalable_malloc - 32768 bytes
...
DEBUG: scalable_malloc - 32768 bytes
DEBUG: Backend::addNewRegion - Raw size - 4194304 bytes
DEBUG: Backend::getRawMem - 4194304 bytes
DEBUG: getRawMemory - 4194304 bytes
DEBUG: MapMemory - 4194304 bytes
DEBUG: MapMemory - VirtualAlloc - 4194304 bytes - System Error - 203
DEBUG: Backend::addNewRegion - Raw size - 4194304 bytes
DEBUG: Backend::getRawMem - 4194304 bytes
DEBUG: getRawMemory - 4194304 bytes
DEBUG: MapMemory - 4194304 bytes
DEBUG: MapMemory - VirtualAlloc - 4194304 bytes - System Error - 203
DEBUG: Backend::addNewRegion - Raw size - 4194304 bytes
DEBUG: Backend::getRawMem - 4194304 bytes
DEBUG: getRawMemory - 4194304 bytes
DEBUG: MapMemory - 4194304 bytes
DEBUG: MapMemory - VirtualAlloc - 4194304 bytes - System Error - 203
DEBUG: Backend::addNewRegion - Raw size - 4194304 bytes
DEBUG: Backend::getRawMem - 4194304 bytes
DEBUG: getRawMemory - 4194304 bytes
DEBUG: MapMemory - 4194304 bytes
DEBUG: MapMemory - VirtualAlloc - 4194304 bytes - System Error - 203
DEBUG: scalable_malloc - 32768 bytes
...

Note: System Error 203: The system could not find the environment option that was entered ( ERROR_ENVVAR_NOT_FOUND )

I'd like to note that System Error 203 is returned by the 'GetLastError' Win32 API function, and it looks very strange.
I wouldn't blame the 'VirtualAlloc' function, because something else seems to be wrong in the 'Backend::addNewRegion' method.

This is the updated version of the 'MapMemory' function from the 'MapMemory.h' header file:

void * MapMemory( size_t bytes )
{
     printf( "DEBUG: MapMemory - %ld bytes\n", bytes );
//  /* Is VirtualAlloc thread safe? */
//  return VirtualAlloc(NULL, bytes, (MEM_RESERVE | MEM_COMMIT | MEM_TOP_DOWN), PAGE_READWRITE);

     DWORD dwLastError = 0;
     void *pvMemory = ::VirtualAlloc( NULL, bytes, ( MEM_RESERVE | MEM_COMMIT | MEM_TOP_DOWN ), PAGE_READWRITE );
     dwLastError = ::GetLastError();
     printf( "DEBUG: MapMemory - VirtualAlloc - %ld bytes - System Error - %ld\n", bytes, dwLastError );
//   return VirtualAlloc(NULL, bytes, (MEM_RESERVE | MEM_COMMIT | MEM_TOP_DOWN), PAGE_READWRITE);

     return ( void * )pvMemory;
}
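
One possible explanation for the 203, offered as an assumption and not a confirmed diagnosis: in the 'MapMemory' version above, 'GetLastError' is read after every call, including the successful ones. The 'VirtualAlloc' documentation only directs callers to 'GetLastError' when the function returns NULL, so after a successful call the value may simply be left over from an earlier, unrelated API call on the same thread. A minimal sketch of logging that reads the error code only on failure follows. The names 'MapResult', 'map_checked' and 'MapMemoryChecked' are hypothetical and are not part of TBB, and a C++11 compiler is assumed.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical result type: the mapped pointer plus an error code that is
// non-zero only when the mapping call actually failed.
struct MapResult {
    void*         ptr;
    unsigned long error;
};

// Calls alloc(bytes) and consults last_error() only if alloc returned null.
// The allocator and the error reader are passed in as callables so the
// logic can be exercised without the Win32 API.
template <typename Alloc, typename LastError>
MapResult map_checked(std::size_t bytes, Alloc alloc, LastError last_error) {
    assert(bytes > 0);
    void* p = alloc(bytes);
    unsigned long err =
        (p == nullptr) ? static_cast<unsigned long>(last_error()) : 0UL;
    return MapResult{p, err};
}

#ifdef _WIN32
#include <windows.h>

// Windows wiring with the same flags as the MapMemory function above.
inline MapResult MapMemoryChecked(std::size_t bytes) {
    return map_checked(
        bytes,
        [](std::size_t n) -> void* {
            return ::VirtualAlloc(NULL, n,
                                  MEM_RESERVE | MEM_COMMIT | MEM_TOP_DOWN,
                                  PAGE_READWRITE);
        },
        []() -> unsigned long { return ::GetLastError(); });
}
#endif
```

With this shape, a stale last-error value never reaches the log for a successful allocation, so any non-zero code that is printed belongs to a call that really returned NULL.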

Quoting TBB sources...
/* Is VirtualAlloc thread safe? */
return VirtualAlloc(NULL, bytes, (MEM_RESERVE | MEM_COMMIT | MEM_TOP_DOWN), PAGE_READWRITE);
...

I noticed that question in the 'MapMemory' function of TBB, and I haven't found anything that says the 'VirtualAlloc' Win32 API function is not thread safe.

Best regards,
Sergey

Quoting Sergey Kostrov...
DEBUG: MapMemory - VirtualAlloc - 4194304 bytes - System Error - 203
...
I'd like to note that System Error 203 is returned from 'GetLastError' Win32 API function and it looks very strange.
I wouldn't blame a 'VirtualAlloc' function...

I will post a test-case ( without TBB ) for the 'VirtualAlloc' Win32 API function soon. The test-case is very simple, and I couldn't
reproduce the '203' system error with it.

I also would like to get some information from TBB software developers on the current state of your investigation. Thank you in advance.

Best regards,
Sergey

Quoting Sergey Kostrov...
DEBUG: MapMemory - VirtualAlloc - 4194304 bytes - System Error - 203
...
I'd like to note that System Error 203 is returned from 'GetLastError' Win32 API function and it looks very strange...

I wanted to reproduce the '203' system error with the 'VirtualAlloc' Win32 API function, but all my attempts failed.

After a series of tests I can say that the last system error is '0' when a memory block is allocated, or '8' when
a memory block can't be allocated. This suggests that something goes wrong in the TbbMalloc library when scalable_allocator is used.

Here is a summary of my investigation:

When the size of the memory block is 16K, VirtualAlloc cannot allocate more than 32415 memory blocks, or 521,304KB ( ~0.50GB ) of memory

When the size of the memory block is 32K, VirtualAlloc cannot allocate more than 32415 memory blocks, or 1,039,944KB ( ~0.99GB ) of memory

When the size of the memory block is 64K, VirtualAlloc cannot allocate more than 32415 memory blocks, or 2,077,224KB ( ~1.98GB ) of memory

So, in my development environment on a 32-bit Windows XP with the Microsoft C++ compiler ( VS 2005 Professional Edition ), the maximum number of blocks
that VirtualAlloc could allocate is 32415.
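
A possible reason why the limit is the same block count for all three block sizes, offered as an assumption based on documented 'VirtualAlloc' behaviour and not verified on this machine: when the address argument is NULL, every reservation starts on an allocation-granularity boundary, which is 64KB on Windows ( the same 65536 bytes that the debug log above reports as the pool granularity ). A 16K or 32K request would then still occupy a whole 64KB slot of address space. The sketch below works through that arithmetic; 'round_up_to_granularity' and 'reserved_address_space' are hypothetical helpers.

```cpp
#include <cassert>
#include <cstdint>

// Allocation granularity assumed for VirtualAlloc on Windows ( 64KB ).
const std::uint64_t kAllocationGranularity = 65536;

// Rounds one request up to a whole number of granularity-sized slots.
std::uint64_t round_up_to_granularity(std::uint64_t bytes) {
    assert(bytes > 0);
    return ((bytes + kAllocationGranularity - 1) / kAllocationGranularity)
           * kAllocationGranularity;
}

// Address space occupied by `blocks` separate requests of `bytes` each.
std::uint64_t reserved_address_space(std::uint64_t blocks, std::uint64_t bytes) {
    return blocks * round_up_to_granularity(bytes);
}
```

Under that assumption, reserved_address_space( 32415, 16384 ), reserved_address_space( 32415, 32768 ) and reserved_address_space( 32415, 65536 ) all evaluate to 2,124,349,440 bytes ( ~1.98GB ), which is close to the 2GB of user address space that a 32-bit process gets by default.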

Take into account that this magic number will be different in your development environment, because it depends on:

- Version of the 32-bit / 64-bit Windows OS
- C++ compiler
- Configuration of your project ( Release or Debug )
- Optimization options ( Disabled, Full, Custom, etc. )
- Linker settings:

  - How many dependent DLLs are mapped into the address space of your test application
  - The values set in the Heap and Stack ( Reserve \ Commit ) fields under Configuration Properties -> Linker -> System:

    Heap Reserve Size = 0
    Heap Commit Size = 0
    Stack Reserve Size = 0
    Stack Commit Size = 0

However, the results should be consistent.

Quoting Sergey Kostrov... a test-case ( without TBB ) for the 'VirtualAlloc' Win32 API function...

//  #define _TEST_CASE_1                                    // Only one macro has to be defined
//  #define _TEST_CASE_2
    #define _TEST_CASE_3

    #ifdef _TEST_CASE_1                                     // Test-Case 1
    #define _NUM_OF_ELEMENTS            4096                // Number of elements
//  const RTint _NUM_OF_MEMORYBLOCKS =     1;               //   16KB - OK
//  const RTint _NUM_OF_MEMORYBLOCKS =    16;               // 0.25MB - OK
//  const RTint _NUM_OF_MEMORYBLOCKS =    32;               // 0.50MB - OK
//  const RTint _NUM_OF_MEMORYBLOCKS =    64;               //    1MB - OK
//  const RTint _NUM_OF_MEMORYBLOCKS =   128;               //    2MB - OK
//  const RTint _NUM_OF_MEMORYBLOCKS =   256;               //    4MB - OK
//  const RTint _NUM_OF_MEMORYBLOCKS =   512;               //    8MB - OK
//  const RTint _NUM_OF_MEMORYBLOCKS =  1024;               //   16MB - OK
//  const RTint _NUM_OF_MEMORYBLOCKS =  2048;               //   32MB - OK
//  const RTint _NUM_OF_MEMORYBLOCKS =  4192;               //   64MB - OK
//  const RTint _NUM_OF_MEMORYBLOCKS =  8192;               //  128MB - OK
//  const RTint _NUM_OF_MEMORYBLOCKS = 16384;               //  256MB - OK
//  const RTint _NUM_OF_MEMORYBLOCKS = 32768;               // 0.50GB -   353 memory blocks not allocated
//  const RTint _NUM_OF_MEMORYBLOCKS = 65536 - 16384;       // 0.75GB - 16737 memory blocks not allocated
//  const RTint _NUM_OF_MEMORYBLOCKS = 65536 -  8192;       // 0.88GB - 24929 memory blocks not allocated
    const RTint _NUM_OF_MEMORYBLOCKS = 65536;               // 1.00GB - 33121 memory blocks not allocated
//  When the size of the memory block is 16K, the last 4 cases cannot allocate more than 32415 memory blocks, or
//  521,304KB ( ~0.50GB ) of memory
    #endif

    #ifdef _TEST_CASE_2                                     // Test-Case 2
    #define _NUM_OF_ELEMENTS            8192                // Number of elements
//  const RTint _NUM_OF_MEMORYBLOCKS =     1;               //   32KB - OK
//  const RTint _NUM_OF_MEMORYBLOCKS =    16;               //  0.5MB - OK
//  const RTint _NUM_OF_MEMORYBLOCKS =    32;               //    1MB - OK
//  const RTint _NUM_OF_MEMORYBLOCKS =    64;               //    2MB - OK
//  const RTint _NUM_OF_MEMORYBLOCKS =   128;               //    4MB - OK
//  const RTint _NUM_OF_MEMORYBLOCKS =   256;               //    8MB - OK
//  const RTint _NUM_OF_MEMORYBLOCKS =   512;               //   16MB - OK
//  const RTint _NUM_OF_MEMORYBLOCKS =  1024;               //   32MB - OK
//  const RTint _NUM_OF_MEMORYBLOCKS =  2048;               //   64MB - OK
//  const RTint _NUM_OF_MEMORYBLOCKS =  4192;               //  128MB - OK
//  const RTint _NUM_OF_MEMORYBLOCKS =  8192;               //  256MB - OK
//  const RTint _NUM_OF_MEMORYBLOCKS = 16384;               //  512MB - OK
//  const RTint _NUM_OF_MEMORYBLOCKS = 32768;               // 1.00GB -   353 memory blocks not allocated
//  const RTint _NUM_OF_MEMORYBLOCKS = 65536 - 16384;       // 1.50GB - 16737 memory blocks not allocated
//  const RTint _NUM_OF_MEMORYBLOCKS = 65536 -  8192;       // 1.75GB - 24929 memory blocks not allocated
    const RTint _NUM_OF_MEMORYBLOCKS = 65536;               // 2.00GB - 33121 memory blocks not allocated
//  When the size of the memory block is 32K, the last 4 cases cannot allocate more than 32415 memory blocks, or
//  1,039,944KB ( ~0.99GB ) of memory
    #endif

    #ifdef _TEST_CASE_3                                     // Test-Case 3
    #define _NUM_OF_ELEMENTS           16384                // Number of elements
//  const RTint _NUM_OF_MEMORYBLOCKS =     1;               //   64KB - OK
//  const RTint _NUM_OF_MEMORYBLOCKS =    16;               //    1MB - OK
//  const RTint _NUM_OF_MEMORYBLOCKS =    32;               //    2MB - OK
//  const RTint _NUM_OF_MEMORYBLOCKS =    64;               //    4MB - OK
//  const RTint _NUM_OF_MEMORYBLOCKS =   128;               //    8MB - OK
//  const RTint _NUM_OF_MEMORYBLOCKS =   256;               //   16MB - OK
//  const RTint _NUM_OF_MEMORYBLOCKS =   512;               //   32MB - OK
//  const RTint _NUM_OF_MEMORYBLOCKS =  1024;               //   64MB - OK
//  const RTint _NUM_OF_MEMORYBLOCKS =  2048;               //  128MB - OK
//  const RTint _NUM_OF_MEMORYBLOCKS =  4192;               //  256MB - OK
//  const RTint _NUM_OF_MEMORYBLOCKS =  8192;               //  512MB - OK
//  const RTint _NUM_OF_MEMORYBLOCKS = 16384;               // 1.00GB - OK
//  const RTint _NUM_OF_MEMORYBLOCKS = 32768;               // 2.00GB -   353 memory blocks not allocated
//  const RTint _NUM_OF_MEMORYBLOCKS = 65536 - 16384;       // 3.00GB - 16737 memory blocks not allocated
//  const RTint _NUM_OF_MEMORYBLOCKS = 65536 -  8192;       // 3.50GB - 24929 memory blocks not allocated
    const RTint _NUM_OF_MEMORYBLOCKS = 65536;               // 4.00GB - 33121 memory blocks not allocated
//  When the size of the memory block is 64K, the last 4 cases cannot allocate more than 32415 memory blocks, or
//  2,077,224KB ( ~1.98GB ) of memory
    #endif

    RTuint uiBytesToAllocate = _NUM_OF_ELEMENTS * sizeof( RTfloat );
    RTvoid *pvMemory = RTnull;
    DWORD dwLastError = 0;
    RTuint uiNumOfMemBlocksNotAllocated = 0U;

    CrtPrintf( RTU("Number of Memory Blocks         : %ld\n"), ( RTint )_NUM_OF_MEMORYBLOCKS );
    CrtPrintf( RTU("Size of Memory Block            : %ld bytes\n"), ( RTuint )uiBytesToAllocate );
    CrtPrintf( RTU("Total Amount of Memory requested: %.3f GB\n\n"),
               ( ( RTfloat )uiBytesToAllocate * ( RTfloat )_NUM_OF_MEMORYBLOCKS ) / 1024.0f / 1024.0f / 1024.0f );

    for( RTint i = 0; i < _NUM_OF_MEMORYBLOCKS; i++ )
    {
        pvMemory = ::VirtualAlloc( NULL, ( SIZE_T )uiBytesToAllocate,
                                   ( MEM_RESERVE | MEM_COMMIT | MEM_TOP_DOWN ), PAGE_READWRITE );
        if( pvMemory == RTnull )
        {
            dwLastError = ::GetLastError();

            uiNumOfMemBlocksNotAllocated += 1;

            CrtPrintf( RTU("[ VirtualAlloc ] %ld bytes NOT allocated - Block #: %5ld - System Error - %ld\n"),
                       uiBytesToAllocate, ( i+1 ), dwLastError );
        }
    }

    if( uiNumOfMemBlocksNotAllocated != 0U )
    {
        CrtPrintf( RTU("Error: [ VirtualAlloc ] Failed to allocate %5ld memory blocks\n"),
                   uiNumOfMemBlocksNotAllocated );
    }

    CrtPrintf( RTU("Press ENTER to exit...\n") );
    CrtGetChar();
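
The 'not allocated' counts in the comments of the test-case can be cross-checked against the 32415 ceiling: every one of them equals the number of requested blocks minus 32415, independent of the block size. A small sketch of that check follows; 'blocks_not_allocated' is a hypothetical helper, and the ceiling is the value observed in my runs and not a general constant.

```cpp
#include <cassert>

// Ceiling observed in the runs above ( 32-bit Windows XP, VS 2005 build ).
const long kObservedMaxBlocks = 32415;

// Number of requests expected to fail when `requested` blocks are asked for
// and at most `max_blocks` of them can succeed.
long blocks_not_allocated(long requested, long max_blocks = kObservedMaxBlocks) {
    assert(requested >= 0 && max_blocks >= 0);
    return requested > max_blocks ? requested - max_blocks : 0;
}
```

For example, blocks_not_allocated( 32768 ) gives 353 and blocks_not_allocated( 65536 ) gives 33121, which matches the comments for all three test-cases.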

Here is a screenshot for the Test-Case 3:

( for the last 7 cases... )

Quoting Sergey Kostrov... in my development environment on a 32-bit Windows XP with the Microsoft C++ compiler ( VS 2005 Professional Edition ), the maximum number of blocks
that VirtualAlloc could allocate is 32415.

Take into account that this magic number will be different in your development environment, because it depends on:

- Version of a 32-bit / 64-bit Windows OS
- C++ compiler

Here is an example: the same test-case built with the 32-bit MinGW C++ compiler allocated 32,634 memory blocks, or 2,090,836KB / 1.994GB of memory,
on a 32-bit Windows XP.
