Aligning with new: for a beginner

llevrel's picture

Hi,

I would like to align an array of class instances that's dynamically allocated as follows:

Myclass *foo=new Myclass[N];

I'll pad the class so that its instances are the size of a cache line. Now I obviously want the array elements to be cache aligned.

How do I achieve this? I'm rather new to C++, so don't be afraid of giving too much detail!

Thanks in advance

Georg Zitzlsberger (Intel)'s picture
Best Reply

Hello,

there are two basic ways to achieve what you want but with different levels of complexity:

1. Aligning whole objects usually makes less sense from a high-level view. Most likely you have some data members (e.g. arrays) that you want aligned, rather than the object itself. For that you can use the __declspec(align(…)) or __attribute__((aligned(…))) attributes when declaring the data members. Example:


class A {

  private:

    int some_member;

    __declspec(align(32)) int data[...];

 };

2. If you really know the layout of the object and want to enforce a certain alignment for its base address, you can use the "placement new" operator. The memory area you pass to it can be allocated up front with _mm_malloc(...), which lets you specify the alignment. Alternatively, call standard "malloc" and enforce the correct alignment manually.

Best regards,

Georg Zitzlsberger
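
A minimal sketch of option 2 using plain malloc with manual alignment, then placement new for each element. The names Myclass, AlignedArray, make_aligned_array and destroy_aligned_array are hypothetical, and 64 bytes is only an assumed cache-line size; this is an illustration, not code from the thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <new>      // placement new

// Hypothetical stand-in for the poster's class, padded to 64 bytes.
struct Myclass {
    int  value;
    char pad[64 - sizeof(int)];
    Myclass() : value(42) {}
};
static_assert(sizeof(Myclass) == 64, "padding should fill one assumed cache line");

// Keeps the pointer malloc returned (needed for free) next to the aligned view.
struct AlignedArray {
    void*       raw;
    Myclass*    items;
    std::size_t count;
};

// alignment must be a power of two.
AlignedArray make_aligned_array(std::size_t count, std::size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    AlignedArray a = { nullptr, nullptr, 0 };
    // Over-allocate so that an aligned start address always fits in the block.
    a.raw = std::malloc(count * sizeof(Myclass) + alignment - 1);
    if (a.raw == nullptr) return a;

    const std::uintptr_t mask    = alignment - 1;
    const std::uintptr_t addr    = reinterpret_cast<std::uintptr_t>(a.raw);
    const std::uintptr_t aligned = (addr + mask) & ~mask;   // round up
    char* base = static_cast<char*>(a.raw) + (aligned - addr);

    // Placement new constructs each element in place without allocating.
    for (std::size_t i = 0; i < count; ++i) {
        Myclass* p = new (base + i * sizeof(Myclass)) Myclass();
        if (i == 0) a.items = p;
    }
    a.count = count;
    return a;
}

void destroy_aligned_array(AlignedArray& a) {
    // Objects built with placement new need explicit destructor calls.
    for (std::size_t i = a.count; i > 0; --i) a.items[i - 1].~Myclass();
    std::free(a.raw);
    a.raw = nullptr;
    a.items = nullptr;
    a.count = 0;
}
```

Because sizeof(Myclass) is a multiple of the alignment, every element starts on an aligned address once the first one does. The same pattern applies with _mm_malloc / _mm_free, in which case the manual rounding step is not needed.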

llevrel's picture

Thank you very much (sorry for not responding earlier, for some reason I don't receive email notifications of follow-ups in my subscribed forum threads).

Now, suppose I declare:
class A {
private:
__declspec(align(32)) int first_member;
double other_member;
};

and define:
A* array=new A[...];

Will each element array[i] be 32-byte aligned?

Sergey Kostrov's picture

>>...Will each element array[i] be 32-byte aligned?
.
I think not. Don't forget that Microsoft-compatible C++ compilers have an option /Zp[nn] that sets the packing alignment for the members of a class / struct. The default value is 8-byte alignment. MSDN also recommends '...You should not use this option unless you have specific alignment requirements...'.

llevrel's picture

Thanks.

I did some tests (using 128-byte alignment instead of 32), and the result is rather strange: elements array[i] are not 128-byte aligned, but are 128-byte spaced! That is, class A has been padded to 128 bytes: long(&(array[i].first_member))%128 equals 16 for any i.

On the other hand, static allocation (A foo[10];) gives the expected result: long(&(array[i].first_member))%128 equals 0 for any i.

Since my goal is avoiding false sharing between array elements, this will do the trick if my class is smaller than 112 bytes (128-16) before padding.
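
A small sketch of how the "spaced but not aligned" observation can be checked. It uses C++11 alignas as a stand-in for __declspec(align(128)), and std::uintptr_t instead of a cast to long, because long is only 32 bits wide on 64-bit Windows. The helper names offset_within and spacing are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Mirrors the class from the test: the first member requests 128-byte
// alignment, which also rounds sizeof(A) up to a multiple of 128.
struct A {
    alignas(128) int first_member;
    double other_member;
};

// Offset of an address inside its alignment block; 0 means "aligned".
std::size_t offset_within(const void* p, std::size_t alignment) {
    assert(alignment != 0);
    return static_cast<std::size_t>(reinterpret_cast<std::uintptr_t>(p) % alignment);
}

// Distance in bytes between two consecutive array elements.
std::ptrdiff_t spacing(const A* array) {
    return reinterpret_cast<const char*>(&array[1]) -
           reinterpret_cast<const char*>(&array[0]);
}
```

The spacing is always sizeof(A), so it is 128 however the array was allocated. The offset of element 0 depends on the allocator: before C++17, plain new is not required to honor an alignment larger than the fundamental one, which is consistent with the constant offset of 16 reported above.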

Sergey Kostrov's picture

>>...That is, class A has been padded to 128 bytes...
.
That is an absolutely expected result, and I'm glad to see that static allocation ( on the stack ) resolves the problem. By the way, the C++ operator 'new' allocates a memory block from the heap.

Sergey Kostrov's picture

This is a test with a small C code...
.


void main( void )

{

     printf("Hello New Edit Controln");

}

llevrel's picture

Quote:

Sergey Kostrov wrote:I'm glad to see that static allocation ( on the stack ) resolves the problem.

Well, not exactly. Static allocation works as expected in the tests, but it's not an option in my case (I'm not using dynamic allocation for fun ;-) ).
However, as I stated, the 16-byte offset introduced by "new" will do no harm, as long as my class uses less than 112 bytes.
Sergey Kostrov's picture

Quote:

Sergey Kostrov wrote:

This is a test with a small C code...
.


void main( void )

{

     printf("Hello New Edit Controln");

}


.
A backslash before 'n' in the 'printf' function was deleted. Is it a feature or a bug of the New Edit Control?
Sergey Kostrov's picture

>>...Well, not exactly. Static allocation works as expected in the tests, but it's not an option in my case...
.
Did you try to allocate memory for your data with the 'alloca' or '_alloca' CRT-function, since it allocates memory on the stack? The function is also very fast.

llevrel's picture

Quote:

Sergey Kostrov wrote:
Did you try to allocate memory for your data with the 'alloca' or '_alloca' CRT-function, since it allocates memory on the stack? The function is also very fast.

Here I need "long lasting" memory. But thanks anyway for the pointer, I didn't know alloca.
Sergey Kostrov's picture

There is also '_aligned_malloc' CRT-function. Please take a look.

Sergey Kostrov's picture

>>...I need "long lasting" memory...
.
Do you know that malloc-like CRT-functions ( malloc, calloc, alloca, etc ) could be used to create an instance of a C++ object? However, there is one problem in that case because a constructor won't be called. Let me know if you need some examples on how to do it.

llevrel's picture

>> some examples on how to do it.

Do you mean "how to call the constructor"? Then yes, I'd like an example.

Igor Levicki's picture

You cannot use __declspec(align) to align dynamically allocated class objects. You need to write placement new and delete operators which will allocate and free aligned memory. Search the forum; I explained it and gave examples a long time ago.

-- Regards, Igor Levicki If you find my post helpful, please rate it and/or select it as a best answer where applies. Thank you.
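
A minimal sketch of class-level new / delete operators that allocate and free aligned memory. The helper names AlignedAlloc, AlignedFree and IsAligned are hypothetical, 64 bytes is an assumed cache-line size, and the platform split (_aligned_malloc on Windows, posix_memalign elsewhere) is one possible choice rather than the code from the earlier forum posts. In a new-expression the compiler first calls operator new to obtain storage and then runs the constructor on it, so swapping the allocator does not skip construction.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>
#include <stdlib.h>   // posix_memalign, free
#if defined(_WIN32)
#include <malloc.h>   // _aligned_malloc, _aligned_free
#endif

// Hypothetical wrappers around the platform's aligned allocator.
inline void* AlignedAlloc(std::size_t size, std::size_t alignment) {
    assert(alignment >= sizeof(void*) && (alignment & (alignment - 1)) == 0);
#if defined(_WIN32)
    return _aligned_malloc(size, alignment);
#else
    void* p = nullptr;
    return posix_memalign(&p, alignment, size) == 0 ? p : nullptr;
#endif
}

inline void AlignedFree(void* p) {
#if defined(_WIN32)
    _aligned_free(p);
#else
    free(p);
#endif
}

inline bool IsAligned(const void* p, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// alignas(64) rounds sizeof(Myclass) up to 64, so consecutive array
// elements never share an (assumed) 64-byte cache line.
struct alignas(64) Myclass {
    int value;
    Myclass() : value(7) {}

    static void* operator new(std::size_t size)      { return Allocate(size); }
    static void* operator new[](std::size_t size)    { return Allocate(size); }
    static void  operator delete(void* p) noexcept   { AlignedFree(p); }
    static void  operator delete[](void* p) noexcept { AlignedFree(p); }

private:
    static void* Allocate(std::size_t size) {
        void* p = AlignedAlloc(size, 64);
        if (p == nullptr) throw std::bad_alloc();
        return p;
    }
};
```

With these operators in place, the original `Myclass *foo = new Myclass[N];` and `delete[] foo;` stay unchanged at the call site. One caveat: for classes with a non-trivial destructor, some compilers store an element count in front of the array, so it is worth checking IsAligned(&foo[0], 64) on the compiler actually used.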
Sergey Kostrov's picture

>>...Do you mean "how to call the constructor"? Then yes, I'd like an example.
.
I don't call a C++ object constructor explicitly. I call a special method of the class that simulates a constructor. I'll post an example later.
.
Best regards,
Sergey

Sergey Kostrov's picture

>>...You need to write placement new and delete operators which will allocate and free aligned memory...
.
I did a quick investigation and I see that a C++ operator '::operator new( nSize )' has to be called in order to construct an object properly ( with Virtual Table initialized ).
Usually it looks like:
.


...

void * PASCAL CSomeObject::operator new( size_t nSize )

{

     return ::operator new( nSize );

}

...


.
and if you replace it with:
.

...

void * PASCAL CSomeObject::operator new( size_t nSize )

{

     return ::_aligned_malloc( nSize, 32 );   // 32-byte alignment chosen for illustration

}

...


.
an object won't be constructed completely and Virtual Table won't be initialized.

Sergey Kostrov's picture

>>>>...Do you mean "how to call the constructor"? Then yes, I'd like an example.
>>.
>>I don't call a C++ object constructor explicitly. I call a special method of the class that simulates a constructor. I'll post an example later.
.


class CSomeObject

{

public:

   CSomeObject()

   {

      Init();

   };

   ~CSomeObject(){};
   void Init(){};

};

...

int main( void )

{

   CSomeObject *pSO = NULL;

   pSO = ( CSomeObject * )malloc( sizeof( CSomeObject ) );

   if( pSO != NULL )

   {

      pSO->Init();   // simulate the constructor only after the allocation succeeded

      free( pSO );

      pSO = NULL;

   }

   return 0;

}


.
Take into account that creating / deleting a C++ object with the CRT-functions 'malloc' / 'free' has lots of limitations. It should be used only in very simple cases, when the C++ object is simple and doesn't have any virtual functions ( including a virtual destructor ). However, it is a very fast way to create thousands of small objects when there are time constraints ( for example in a real-time environment ).
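
For comparison, a minimal sketch of how placement new addresses the "how to call the constructor" question for memory obtained from malloc. The class below is a hypothetical variant of CSomeObject with a virtual function added, to show that the real constructor (and with it the virtual table setup) runs; CreateWithMalloc and DestroyWithFree are made-up helper names.

```cpp
#include <cassert>
#include <cstdlib>
#include <new>      // placement new

class CSomeObject {
public:
    CSomeObject() : m_value(1) {}
    virtual ~CSomeObject() {}
    virtual int Value() const { return m_value; }
private:
    int m_value;
};

// Allocate raw memory with malloc, then construct the object inside it.
CSomeObject* CreateWithMalloc() {
    void* raw = std::malloc(sizeof(CSomeObject));
    if (raw == nullptr) return nullptr;
    CSomeObject* obj = new (raw) CSomeObject();   // runs the real constructor
    assert(static_cast<void*>(obj) == raw);       // same address, so free(obj) is valid
    return obj;
}

// Destroy in two steps: explicit destructor call, then release the memory.
void DestroyWithFree(CSomeObject* p) {
    if (p == nullptr) return;
    p->~CSomeObject();
    std::free(p);
}
```

The allocation cost is the same as with the Init() approach, since placement new itself allocates nothing; it only constructs the object at the given address.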

Sergey Kostrov's picture

Why does the new editor add extra lines when I "embed" C++ code? Take a look at the previous post at lines 2, 4, 6, etc, in my C++ example. Also, the editor changes the font to a smaller one.

Igor Levicki's picture

As I said, write placement new and delete operators and your constructors and destructors will work. Sergey, you are confusing him with unnecessary complexity which does not solve the problem properly. Placement new is described on Wikipedia, just google for it.

Sergey Kostrov's picture

>>...I would like to align an array of class instances that's dynamically allocated as follows:
>>
>>Myclass *foo=new Myclass[N];
.
Why wouldn't you consider an array of aligned pointers to objects of Myclass? Here is some pseudo-code:
.


...

class Myclass

{

public:

     Myclass(){};

     ~Myclass(){};

};

...

align( some boundary ) Myclass *pFoo[ NumberOfElements ] = { NULL };

.

for( i = 0; i < NumberOfElements; i+=1 )

{

     pFoo[i] = ( Myclass * )new Myclass();

}

Sergey Kostrov's picture

>>...Sergey, you are confusing him with unnecessary complexity...
.
Sorry if I over-complicated the case.
Best regards,
Sergey

llevrel's picture

Quote:

Sergey Kostrov wrote:Why wouldn't you consider an array of aligned pointers to objects of Myclass?

The objects wouldn't be contiguous, and accessing N objects requires dereferencing N pointers, which is inefficient.
Thanks for your interest.

Thanks to Igor as well.
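
A short sketch of the difference between the two layouts discussed here. Myclass is a hypothetical stand-in, and SumContiguous / SumIndirect are made-up names used only to contrast the access patterns.

```cpp
#include <cassert>
#include <cstddef>

struct Myclass {
    int value;
    Myclass() : value(1) {}
};

// Contiguous layout (new Myclass[N]): one allocation, and the address of
// element i is computed directly from the base pointer.
long SumContiguous(const Myclass* base, std::size_t n) {
    assert(base != nullptr || n == 0);
    long sum = 0;
    for (std::size_t i = 0; i < n; ++i)
        sum += base[i].value;
    return sum;
}

// Array of pointers (new Myclass() inside a loop): each access loads a
// pointer first and then the object, which may live anywhere on the heap.
long SumIndirect(const Myclass* const* ptrs, std::size_t n) {
    assert(ptrs != nullptr || n == 0);
    long sum = 0;
    for (std::size_t i = 0; i < n; ++i)
        sum += ptrs[i]->value;
    return sum;
}
```

Both functions return the same result; the second one simply pays for one extra memory load per element and gives no guarantee about where consecutive objects are placed.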

Sergey Kostrov's picture

>>...The objects wouldn't be contiguous...
.
Consider a very simple example with a built-in data type 'int':


...

	int *piData = NULL;
	piData = new int[32];
	for( int i = 0; i < 32; i++ )

	     piData[i] = 0x67452301;

...


.
and this is how the allocated and initialized memory looks in a VS 2005 debugger:
.
0x01A7E8B8 01 23 45 67 01 23 45 67 01 23
0x01A7E8C2 45 67 01 23 45 67 01 23 45 67
0x01A7E8CC 01 23 45 67 01 23 45 67 01 23
0x01A7E8D6 45 67 01 23 45 67 01 23 45 67
0x01A7E8E0 01 23 45 67 01 23 45 67 01 23
0x01A7E8EA 45 67 01 23 45 67 01 23 45 67
0x01A7E8F4 01 23 45 67 01 23 45 67 01 23
0x01A7E8FE 45 67 01 23 45 67 01 23 45 67
0x01A7E908 01 23 45 67 01 23 45 67 01 23
0x01A7E912 45 67 01 23 45 67 01 23 45 67
0x01A7E91C 01 23 45 67 01 23 45 67 01 23
0x01A7E926 45 67 01 23 45 67 01 23 45 67
0x01A7E930 01 23 45 67 01 23 45 67 00 00
.
Very large chunks of memory allocated by C++ operators ( new / new[] ) or memory allocation CRT-functions ( malloc / calloc / _alloca ) could be fragmented; however, that would be done by the Virtual Memory Manager of the OS.

Sergey Kostrov's picture

Please follow a new thread on 'Intel C++ compiler' forum:
.
Forum topic: 'Aligned C++ objects created with a 'new' C++ operator'
.
Web-link: http://software.intel.com/en-us/forums/topic/328882

llevrel's picture

Quote:

Consider a very simple example with a built-in data type 'int':

...

	int *piData = NULL;
	piData = new int[32];
	for( int i = 0; i < 32; i++ )

	     piData[i] = 0x67452301;

...


This is absolutely not the same as your previous comment where "new" is inside the loop!
Sergey Kostrov's picture

>>...The objects wouldn't be contiguous...
.
Do you have a test-case that reproduces the problem? It would be nice to investigate it. Thanks in advance.

Sergey Kostrov's picture

>>>>...The objects wouldn't be contiguous...
>>.
>>Do you have a test-case that reproduces the problem? It would be nice to investigate it...
.
I've investigated the issue and reproduced it. I'll post the results of my investigation a little later.

Sergey Kostrov's picture

>>...The objects wouldn't be contiguous...

In order to verify it I've made a small modification to the source code of the CAlignedObject class:


class CAlignedObject

{

public:

	CAlignedObject( void )

	{

		for( RTint i = 0; i < 5; i++ )

			m_iData[i] = 0x77777777;

	};

...

public:

	RTint m_iData[5];

};
int main( void )

{

	RTint iSizeOfAO = sizeof( CAlignedObject );
	// Case 1

	CAlignedObject *pAO[2] = { NULL, NULL };

	pAO[0] = new( 32 ) CAlignedObject();

	pAO[1] = new( 32 ) CAlignedObject();
	// Case 2

	CAlignedObject *pAO1 = NULL;

	CAlignedObject *pAO2 = NULL;

	pAO1 = new( 32 ) CAlignedObject();

	pAO2 = new( 32 ) CAlignedObject();
	// Case 3

	CAlignedObject *pAO3 = NULL;

	pAO3 = new CAlignedObject[2]();
	return ( int )0;

}

Sergey Kostrov's picture

Three simple tests were done and here are the results ( from a Memory window of VS 2005 ):
.
[ Test-Case 1 ]
...
0x02124440 77 77 77 77 77 77 77 77 77 77
0x0212444A 77 77 77 77 77 77 77 77 77 77
0x02124454 cd cd cd cd cd cd cd cd cd cd
0x0212445E cd cd cd cd cd cd cd cd cd cd
0x02124468 cd cd cd fd fd fd fd ab ab ab
0x02124472 ab ab ab ab ab fe 00 00 00 00
0x0212447C 00 00 00 00 ff 00 0f 00 ee 04
0x02124486 ee 00 b0 a1 12 02 78 01 12 02
0x02124486 19 00 10 44 12 02 00 00 00 00
0x02124490 98 33 60 00 9d 0f 00 00 3b 00
0x0212449A 00 00 01 00 00 00 55 00 00 00
0x021244A4 fd fd fd fd cd cd cd cd cd cd
0x021244AE cd cd cd cd cd cd cd cd cd cd
0x021244B8 a8 44 12 02 ed ed ed ed 77 77
0x021244C2 77 77 77 77 77 77 77 77 77 77
0x021244CC 77 77 77 77 77 77 77 77 cd cd
0x021244D6 cd cd cd cd cd cd cd cd cd cd
0x021244E0 cd cd cd fd fd fd fd ab ab ab
0x021244EA ab ab ab ab ab fe 00 00 00 00
0x021244F4 00 00 00 00 f0 00 0f 00 ee 04
...
.
[ Test-Case 2 ]
...
0x02124540 77 77 77 77 77 77 77 77 77 77
0x0212454A 77 77 77 77 77 77 77 77 77 77
0x02124554 cd cd cd cd cd cd cd fd fd fd
0x0212455E fd ab ab ab ab ab ab ab ab fe
0x02124568 00 00 00 00 00 00 00 00 0f 00
0x02124572 0f 00 46 07 19 00 00 45 12 02
0x0212457C 00 00 00 00 98 33 60 00 9d 0f
0x02124586 00 00 3b 00 00 00 01 00 00 00
0x02124590 57 00 00 00 fd fd fd fd 98 45
0x0212459A 12 02 ed ed ed ed 77 77 77 77
0x021245A4 77 77 77 77 77 77 77 77 77 77
0x021245AE 77 77 77 77 77 77 cd cd cd cd
0x021245B8 cd cd cd cd cd cd cd cd cd cd
0x021245C2 cd cd cd cd cd cd cd cd cd cd
0x021245CC cd cd cd cd cd cd cd fd fd fd
0x021245D6 fd ab ab ab ab ab ab ab ab fe
...
.
[ Test-Case 3 ]
...
0x0039FF2C 77 77 77 77 77 77 77 77 77 77    // Is that what you need?
0x0039FF36 77 77 77 77 77 77 77 77 77 77
0x0039FF40 77 77 77 77 77 77 77 77 77 77
0x0039FF4A 77 77 77 77 77 77 77 77 77 77
0x0039FF54 fd fd fd fd ab ab ab ab ab ab
0x0039FF5E ab ab 00 00 00 00 00 00 00 00
0x0039FF68 13 00 0d 00 ee 14 ee 00 10 02
0x0039FF72 39 00 10 02 39 00 ee fe ee fe
0x0039FF7C ee fe ee fe ee fe ee fe ee fe
0x0039FF86 ee fe ee fe ee fe ee fe ee fe
0x0039FF90 ee fe ee fe ee fe ee fe ee fe
0x0039FF9A ee fe ee fe ee fe ee fe ee fe
0x0039FFA4 ee fe ee fe ee fe ee fe ee fe
0x0039FFAE ee fe ee fe ee fe ee fe ee fe
0x0039FFB8 ee fe ee fe ee fe ee fe ee fe
0x0039FFC2 ee fe ee fe ee fe ee fe ee fe
...

llevrel's picture

Quote:

Sergey Kostrov wrote:[ Test-Case 3 ] Is that what you need?

Yes, it is.
Sergey Kostrov's picture

>>Forum topic: 'Aligned C++ objects created with a 'new' C++ operator'
>>.
>>Web-link: http://software.intel.com/en-us/forums/topic/328882
.
Did you have a chance to look at it? You could combine Jim's code and mine in order to create your own solution.
