Complex Matrix alignment

Sorry, I erroneously posted this on the HPC forum, but I think this is the right section.

Dear Intel developers,

I'm trying to cache-align an array of structs defined as follows:

struct complex_32 {
    float32 r;
    float32 i;
};
typedef struct complex_32 complex32;
static complex32  **traces;
traces = (complex32 **)malloc( *num_elems * sizeof(complex32 *));
for (i = 0; i < *num_elems; i++) 
    traces[i] = (complex32 *)malloc( *num_samples * sizeof(complex32));

I want to align to 16 bytes. What is the right syntax using __declspec(align(16))?

Currently, using _mm_malloc instead of malloc, the code crashes on the first _mm_load_ps intrinsic.

Thanks in advance for the help.

__declspec affects only variables created on the stack or in global scope. When you allocate memory using malloc(), it has no clue what the memory is for, so __declspec does not affect it.

Normally, though, malloc returns pointers to blocks that are 16-byte aligned.

Using _mm_malloc in place of your first malloc, where you allocate the array of POINTERS, is not a good idea. But the second malloc can probably be replaced. I cannot say for sure because I haven't found _mm_malloc in my help files.
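For illustration, here is a minimal sketch of that suggestion: the pointer table stays with plain malloc, and only the row buffers that _mm_load_ps reads come from _mm_malloc. The helper names alloc_traces and free_traces are made up for this example, float stands in for the float32 type from the question, and an x86 compiler that provides <xmmintrin.h> is assumed.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <xmmintrin.h>  /* _mm_malloc, _mm_free, _mm_load_ps (x86 with SSE assumed) */

/* float stands in for the float32 type used in the question. */
typedef struct { float r; float i; } complex32;

/* Hypothetical helper: pointer table from malloc, each row 16-byte aligned. */
static complex32 **alloc_traces(size_t num_elems, size_t num_samples)
{
    complex32 **t = (complex32 **)malloc(num_elems * sizeof *t);
    if (t == NULL)
        return NULL;

    for (size_t i = 0; i < num_elems; i++) {
        t[i] = (complex32 *)_mm_malloc(num_samples * sizeof **t, 16);
        if (t[i] == NULL) {
            /* Roll back the rows allocated so far. */
            while (i > 0)
                _mm_free(t[--i]);
            free(t);
            return NULL;
        }
        /* Sanity check: the row really starts on a 16-byte boundary. */
        assert(((uintptr_t)t[i] & 15u) == 0);
    }
    return t;
}

/* Hypothetical helper: releases what alloc_traces created. */
static void free_traces(complex32 **t, size_t num_elems)
{
    if (t == NULL)
        return;
    for (size_t i = 0; i < num_elems; i++)
        _mm_free(t[i]);  /* memory from _mm_malloc must go to _mm_free, not free() */
    free(t);
}
```

With rows allocated this way, _mm_load_ps((const float *)traces[i]) reads from an aligned address. An offset into a row is aligned only if it is a multiple of two complex32 elements (16 bytes).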

Hi archie, thanks for the reply. My matrix has global scope. You said that malloc returns pointers that are already aligned. So when do I have to use _mm_malloc instead of standard malloc?

It's not a question of scope; it's about how you create an object of a class (structure). If you create the object as a variable, __declspec will work. For example:

complex_32 _i_ = { 0.0, 1.0 };

will use __declspec, because the compiler knows the type of the object to be created.

When you use void* malloc(size_t), the function cannot know what it is allocating memory for. It just allocates it.

As for _mm_malloc, as I said, I know nothing about it, so I just made a wild guess based on its name.
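To make the syntax concrete, here is a small sketch of the attribute applied to variables. ALIGN16, samples, is_aligned16 and aligned_variables_ok are names invented for this example, float stands in for float32, and the GCC/Clang branch is only there so the snippet also builds outside Microsoft-compatible compilers.

```c
#include <assert.h>
#include <stdint.h>

/* ALIGN16 is a hypothetical convenience macro, not something from this thread. */
#if defined(_MSC_VER)
#  define ALIGN16 __declspec(align(16))          /* Microsoft-compatible spelling */
#else
#  define ALIGN16 __attribute__((aligned(16)))   /* GCC/Clang spelling */
#endif

struct complex_32 {
    float r;  /* float stands in for float32 */
    float i;
};
typedef struct complex_32 complex32;

/* Global array: the compiler places its storage on a 16-byte boundary. */
static ALIGN16 complex32 samples[4];

static int is_aligned16(const void *p)
{
    return ((uintptr_t)p & 15u) == 0;
}

/* Returns 1 when both the global and the stack variable are 16-byte aligned. */
static int aligned_variables_ok(void)
{
    ALIGN16 complex32 local[2] = { { 0.0f, 1.0f }, { 2.0f, 3.0f } };
    assert(is_aligned16(local));
    return is_aligned16(samples) && is_aligned16(local);
}
```

The attribute only controls where the compiler places these variables. It says nothing about memory obtained at run time through a pointer such as traces.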


Beware that you can't count on malloc returning 16-byte-aligned memory; see, for example, this thread:

http://software.intel.com/en-us/forums/showthread.php?t=74181&o=d&s=lr

I would never blindly rely on any assumptions about malloc. I only said that it tends to return aligned pointers, and I understand that this can change at any time. Anyway, it's trivial to request 16 bytes more than necessary, check the alignment, and adjust the pointer if necessary.
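One possible way to write that over-allocation trick is sketched below. The aligned_block struct and the two helper names are hypothetical. The point is that the raw pointer has to be kept, because free() only accepts the address malloc originally returned.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical wrapper: keeps the raw pointer so it can be handed back to free(). */
typedef struct {
    void *raw;      /* what malloc returned; the only pointer free() accepts */
    void *aligned;  /* first 16-byte boundary at or after raw */
} aligned_block;

static aligned_block alloc_aligned16(size_t bytes)
{
    aligned_block b = { NULL, NULL };

    if (bytes > SIZE_MAX - 16)
        return b;                       /* size would overflow */

    b.raw = malloc(bytes + 16);         /* 16 spare bytes, as suggested above */
    if (b.raw != NULL) {
        /* Round the address up to the next multiple of 16. */
        uintptr_t addr = ((uintptr_t)b.raw + 15u) & ~(uintptr_t)15u;
        assert(addr % 16 == 0);
        b.aligned = (void *)addr;
    }
    return b;
}

static void free_aligned16(aligned_block *b)
{
    free(b->raw);                       /* never pass b->aligned to free() */
    b->raw = NULL;
    b->aligned = NULL;
}
```

Use b.aligned for the data and free_aligned16(&b) for cleanup. Where _mm_malloc and _mm_free are available, they do the same bookkeeping for you.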
