Running into some issues with Composer XE 13 Beta code generation for AVX.
Specifically, in x86 release builds the compiler is not 32-byte-aligning function-scope AVX data on the stack. This causes an illegal instruction when the program executes.
This issue does not occur:-
* When using the Microsoft C++ compiler for x86 or x64 (though they have their own issues...)
* When using the Intel C++ compiler for x64, provided AVX code generation is switched on in the options.
Compiler version: 2013_beta_0.060
OS: Windows 8 release preview
CPU: Sandy Bridge i5-2500
Compiler command line options (excerpt - removed some include paths):-
/GS- /Qftz /W3 /QxAVX /Gy /Zc:wchar_t /Zi /Ox /Ob1 /fp:fast /D "__INTEL_COMPILER=1300" /Zc:forScope /GR /arch:AVX /Gd /Oy /Oi /MT /EHsc /nologo /FAs /Ot
Excerpt from generated code:-
5518AD0D vmovdqu ymmword ptr [esp+550h],ymm5
5518AD16 vmovaps ymm0,ymmword ptr [esp+530h]
5518AD1F vmovaps ymm1,ymmword ptr [esp+550h]
Note that esp is 32-byte aligned at this point, but the offsets being generated (530h, 550h etc.) are multiples of 16, not of 32, so the resulting addresses are only 16-byte aligned.
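To make the arithmetic explicit, here is a minimal sketch that checks the two offsets from the disassembly against a 32-byte boundary. The helper names `misalignment` and `check_reported_offsets` are made up for illustration and are not part of any compiler or library API.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: number of bytes by which `offset` overshoots the
// previous `alignment` boundary (alignment is assumed to be a power of two).
constexpr std::uint32_t misalignment(std::uint32_t offset, std::uint32_t alignment) {
    return offset & (alignment - 1);
}

// Offsets taken from the disassembly excerpt: both are multiples of 16
// but sit 16 bytes past a 32-byte boundary.
static_assert(misalignment(0x530, 16) == 0,  "0x530 is a multiple of 16");
static_assert(misalignment(0x530, 32) == 16, "0x530 is not a multiple of 32");
static_assert(misalignment(0x550, 32) == 16, "0x550 is not a multiple of 32");

// Runtime version of the same check, e.g. for a debug build.
inline void check_reported_offsets() {
    assert(misalignment(0x530, 32) == 16);
    assert(misalignment(0x550, 32) == 16);
}
```

So with esp itself on a 32-byte boundary, esp+530h and esp+550h both land 16 bytes off, which is exactly what an aligned 256-bit vmovaps cannot tolerate.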
Also note that the compiler appears to be generating a lot of unaligned loads/stores, even though the data are declared as being aligned.
The local variables are declared as follows:-
// MSVC style alignment
#define vecpre __declspec(align(32))
// AVX typedef for vector type
typedef __m256 vec_float;
// Aligned version
typedef vecpre vec_float align_vec_float vecpost ;
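For anyone trying to reproduce this, a rough sketch of a runtime check on a function-scope local follows. It is an assumption-laden stand-in rather than the real code: `vec8_standin` replaces `align_vec_float` with a plain struct using standard `alignas(32)` so that it builds without AVX flags, and `is_aligned` and `local_is_32_byte_aligned` are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Stand-in for align_vec_float: eight floats with a 32-byte alignment request.
// (Assumption: standard alignas instead of __declspec(align(32)) and __m256.)
struct alignas(32) vec8_standin {
    float lanes[8];
};

// Hypothetical helper: true if `p` sits on an `alignment`-byte boundary.
inline bool is_aligned(const void* p, std::size_t alignment) {
    // Alignment must be a non-zero power of two for the mask trick to work.
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (reinterpret_cast<std::uintptr_t>(p) & (alignment - 1)) == 0;
}

// Checks the address of a function-scope local, the case described above.
inline bool local_is_32_byte_aligned() {
    vec8_standin local = {};
    // Write through a volatile pointer so the stack slot is actually used.
    volatile float* sink = local.lanes;
    sink[0] = 1.0f;
    return is_aligned(&local, 32);
}
```

If the compiler honours the alignment request this should return true. Swapping the stand-in for the real `align_vec_float` typedef in the affected x86 release build would show whether the stack slot really ends up on a 16-byte boundary.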
Anyone care to shed any light?