Is AVX enabled?

Ask anyone who uses, plans to use, or simply advocates the compiler intrinsics for SIMD support (MMX, SSE, AVX) why they do so and what makes intrinsics good, and the answer will almost certainly be something like this: "Intrinsics provide a C/C++ language interface to assembly instructions, so that we don't need to deal with assembler".

 Sounds more than good... but unfortunately it is not entirely true. While intrinsics do make CPU-specific enhancements significantly easier to use, they don't entirely eliminate the need for some asm programming. The problem is that intrinsics don't provide a fallback path for systems without the corresponding SIMD support: if an "intrinsics inside" program is executed on such a CPU, it crashes (typically with an illegal-instruction exception). To prevent this, one needs to create a "generic code" path and switch to or from it depending on the host system's SIMD support. And to detect this support reliably, it is necessary to use assembler! No other common solution is available yet... The most burning issue here is detecting Advanced Vector Extensions support: AVX is not widespread yet and requires OS support. While the Intel AVX Programming Reference contains asm pseudocode for AVX support detection, it is not enough. What most developers want is some cut-and-paste code that can be used as is, even without any assembler knowledge. And such code is available - see below.

The asm syntax differs between 32-bit and 64-bit builds, so include the corresponding version in your project. In both cases, call the isAvxSupported() function from your C/C++ code; it returns 1 if AVX is supported and 0 otherwise.

 extern "C" int isAvxSupported();

 AVXsupport = isAvxSupported();   // 1 if AVX is supported, 0 otherwise
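
As a rough illustration of the "generic code" path switch described above, here is a minimal C++ sketch of one possible way to do it: pick a function pointer once, based on the value isAvxSupported() returned. The names SumFn, sumGeneric, sumAvx and selectSum are hypothetical and exist only for this example. sumAvx is a placeholder that forwards to the generic code so that the sketch builds on any machine; a real AVX path would use intrinsics and would normally sit in a source file compiled with AVX enabled.

```cpp
#include <cstddef>

// Signature shared by both code paths (hypothetical name).
using SumFn = float (*)(const float* data, std::size_t n);

// Generic path: plain C++ that runs on any CPU.
float sumGeneric(const float* data, std::size_t n) {
    float total = 0.0f;
    for (std::size_t i = 0; i < n; ++i) {
        total += data[i];
    }
    return total;
}

// Placeholder for the AVX path (assumption: a real version would use AVX
// intrinsics). It forwards to the generic code so the sketch builds anywhere.
float sumAvx(const float* data, std::size_t n) {
    return sumGeneric(data, n);
}

// Choose the implementation once, from the value isAvxSupported() returned.
SumFn selectSum(int avxSupported) {
    return avxSupported ? sumAvx : sumGeneric;
}
```

A caller would then write something like `SumFn sum = selectSum(isAvxSupported());` at startup and use `sum(data, n)` everywhere else, so the detection cost is paid only once.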

 ----------------------------------------cpuid64.asm----------------------------------------------------
[shell]; CPUID Win64
.code

; int isAvxSupported();
isAvxSupported proc
mov r10, rbx ; CPUID overwrites RBX (callee-saved); keep a copy in volatile R10
xor eax, eax
cpuid
cmp eax, 1 ; does CPUID support eax = 1?
jb not_supported
mov eax, 1
cpuid
and ecx, 018000000h ; keep bit 27 (OSXSAVE: OS uses XSAVE/XRSTOR)
cmp ecx, 018000000h ; and bit 28 (AVX supported by CPU); both must be set
jne not_supported
xor ecx, ecx ; XFEATURE_ENABLED_MASK/XCR0 register number = 0
xgetbv ; XFEATURE_ENABLED_MASK register is in edx:eax
and eax, 110b
cmp eax, 110b ; bits 1 and 2: OS saves/restores XMM and YMM state at context switch
jne not_supported
mov eax, 1
mov rbx, r10 ; restore RBX before returning
ret
not_supported:
xor eax, eax
mov rbx, r10 ; restore RBX on this return path too
ret
isAvxSupported endp
END[/shell]
------------------------cpuid32.asm--------------------------
[shell].686p
.xmm
.model FLAT

; CPUID Win32
.code

; int isAvxSupported();
_isAvxSupported proc
push ebx ; CPUID overwrites EBX, which is callee-saved
xor eax, eax
cpuid
cmp eax, 1 ; does CPUID support eax = 1?
jb not_supported
mov eax, 1
cpuid
and ecx, 018000000h ; keep bit 27 (OSXSAVE: OS uses XSAVE/XRSTOR)
cmp ecx, 018000000h ; and bit 28 (AVX supported by CPU); both must be set
jne not_supported
xor ecx, ecx ; XFEATURE_ENABLED_MASK/XCR0 register number = 0
xgetbv ; XFEATURE_ENABLED_MASK register is in edx:eax
and eax, 110b
cmp eax, 110b ; bits 1 and 2: OS saves/restores XMM and YMM state at context switch
jne not_supported
mov eax, 1
pop ebx ; restore EBX before returning
ret
not_supported:
xor eax, eax
pop ebx ; restore EBX on this return path too
ret
_isAvxSupported endp
END[/shell]
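
For readers who do not read asm, the two mask tests in the listings can be restated in C++. This is only a sketch of the bit logic, not a replacement for the detection routine: avxUsable and the two constants are hypothetical names, and the function works on values that have already been read from the hardware (ECX after CPUID with EAX = 1, and EAX after XGETBV with ECX = 0). Since XGETBV may only be executed when the OSXSAVE bit is set, a caller would pass 0 for the second argument when the first test fails.

```cpp
#include <cstdint>

// CPUID leaf 1, ECX: bit 27 = OSXSAVE, bit 28 = AVX.
constexpr std::uint32_t kOsxsaveAndAvx = (1u << 27) | (1u << 28);

// XCR0 (XFEATURE_ENABLED_MASK): bit 1 = XMM state, bit 2 = YMM state.
constexpr std::uint32_t kXmmAndYmmState = (1u << 1) | (1u << 2);

// Hypothetical helper: applies the same two tests as the asm listings.
// cpuid1Ecx is ECX after CPUID with EAX = 1; xcr0Low is EAX after XGETBV
// with ECX = 0 (pass 0 if XGETBV could not be executed).
constexpr bool avxUsable(std::uint32_t cpuid1Ecx, std::uint32_t xcr0Low) {
    return (cpuid1Ecx & kOsxsaveAndAvx) == kOsxsaveAndAvx &&
           (xcr0Low & kXmmAndYmmState) == kXmmAndYmmState;
}

// The constants agree with the literals used in the asm.
static_assert(kOsxsaveAndAvx == 0x18000000u, "same as 018000000h");
static_assert(kXmmAndYmmState == 0x6u, "same as 110b");
```

Both conditions have to hold: the CPU must implement AVX and the OS must have enabled saving of the YMM registers, otherwise their contents would not survive a context switch.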


Comments

This code does not play well with the C calling convention on Win32/Win64 (CPUID stomps on EBX). In the 32-bit version, bracket the function with "push ebx / pop ebx". For the 64-bit version, bracket with "mov r10, rbx / mov rbx, r10". Note that there are two return paths, each of which needs to restore EBX/RBX.


Sean Gies, thanks a lot for this addition. I've tested the code from the post on several 32-bit and 64-bit platforms but haven't seen any problems. However, if they are possible, then your solution is very valuable - I plan to give this post's link to any ISV asking "How to detect AVX correctly".


Victoria, you're welcome. It's one of those problems that can easily go unnoticed if the calling function happens to not use EBX/RBX after isAvxSupported returns. This was the case when I first used it, but then I tried an optimized build and the resulting executable did use RBX afterward, resulting in an invalid memory access.


I can confirm Sean's initial comment. I was getting access violations using the code above when calling isAvxSupported() from 64-bit release code on MSVC until I added the RBX register restore (I was not seeing a problem calling from 64-bit release code on ICC). 32-bit seemed to play nice across the board, but I added the EBX register restore just to be safe.