IPP causes invalid opcode exception at h9_ippsFFTGetSize_C_32fc

We are using IPP version 7.1.1.119 on a 4th-generation (Haswell) Core i7 processor under the INtime (5) operating system.

We are using static linkage (#include <ipp_h9.h> before #include <ipp.h>).

A call to ippsFFTInitAlloc_C_32fc causes an invalid opcode exception. This occurs inside the h9_ippsFFTGetSize_C_32fc function when it tries to execute the les esp,edx instruction.

Note: when IPP is configured for AVX rather than AVX2 (using ipp_g9.h instead of ipp_h9.h), everything works correctly. As it happens, g9_ippsFFTGetSize_C_32fc does not contain that les instruction.

We verified that our processor supports AVX2 (ran the piece of code suggested by Intel for checking this).

Please advise.

Thanks,

Beni Falk

http://software.intel.com/en-us/articles/introduction-to-intel-advanced-vector-extensions says:

"The new instructions are encoded using what Intel calls a VEX prefix, which is a two- or three-byte prefix designed to clean up the complexity of current and future x86/x64 instruction encoding. The two new VEX prefixes are formed from two obsolete 32-bit instructions-Load Pointer Using DS (LDS-0xC4, 3-byte form) and Load Pointer Using ES (LES-0xC5, two-byte form)-which load the DS and ES segment registers in 32-bit mode. In 64-bit mode, opcodes LDS and LES generate an invalid-opcode exception, but under Intel® AVX, these opcodes are repurposed for encoding new instruction prefixes. As a result, the VEX instructions can only be used when running in 64-bit mode. The prefixes allow encoding more registers than previous x86 instructions and are required for accessing the new 256-bit SIMD registers or using the three- and four-operand syntax. As a user, you do not need to worry about this (unless you're writing assemblers or disassemblers)."

I have the following questions:

1. I am currently compiling and running my code in 32-bit mode. Does this mean that I cannot profitably use IPP with AVX and AVX2?

2. As I wrote, when I configured IPP to use AVX (rather than AVX2) the problem did not occur and everything seemed to work correctly. Given the above statement on Intel's site, how could it work? Or does IPP somehow switch the processor to 64-bit mode before performing the operation and switch it back afterwards? Please excuse me in advance if this is a dumb question.

Thanks,

 

Beni Falk

 

Quote:

Beni F. wrote:
As a result, the VEX instructions can only be used when running in 64-bit mode.

This paper is wrong about that (*): you can use AVX and AVX2 in both 32-bit and 64-bit modes. It is working in front of me as I type this.

* I reported it here http://software.intel.com/en-us/forums/topic/279901 18 months ago, but for some reason it has still not been fixed.
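
For readers wondering how a 32-bit CPU tells a VEX prefix apart from the legacy LES/LDS opcodes it reuses: in 32-bit mode, the bytes 0xC4 and 0xC5 are treated as a VEX prefix only when the top two bits of the following byte are both 1. For LES/LDS that pattern would be a ModRM byte naming a register operand, which those instructions never allowed, so no valid legacy code is lost. A disassembler without VEX support will still print the register form, which is how something like "les esp,edx" (opcode 0xC4, ModRM 0xE2) can show up in a debugger. The following is a minimal sketch of that rule in C. The enum and function names are made up for illustration, and it assumes no legacy prefixes precede the first byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names for illustration; not part of IPP or any Intel library. */
typedef enum { ENC_OTHER, ENC_LEGACY_LES_LDS, ENC_VEX } encoding_kind;

/*
 * Classify the first two bytes of an instruction the way a CPU running in
 * 32-bit protected mode would. Assumes code[0] is the first opcode byte,
 * i.e. no legacy prefixes come before it.
 */
encoding_kind classify_c4_c5_32bit(const uint8_t *code, size_t len)
{
    assert(code != NULL);

    if (len < 2 || (code[0] != 0xC4 && code[0] != 0xC5))
        return ENC_OTHER;

    /*
     * Bits 7:6 of the second byte would be ModRM.mod for LES/LDS.
     * mod == 3 would select a register operand, which LES/LDS do not
     * accept, so that bit pattern is decoded as a VEX prefix instead.
     */
    return ((code[1] & 0xC0) == 0xC0) ? ENC_VEX : ENC_LEGACY_LES_LDS;
}
```

With this rule, C4 E2 (what an older disassembler shows as les esp,edx) is classified as VEX, while C4 03 (les eax,[ebx], a memory operand) remains a legacy LES.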

iliyapolak:

It is very strange that this error was not corrected.

iliyapolak:

As bronxzv said, you can use both the AVX and AVX2 instruction sets in protected mode and in long mode. Bear in mind that in 64-bit mode you have 8 additional YMMn registers and 8 additional 64-bit general-purpose registers at your disposal.

My problem is that I am using AVX2 via IPP (rather than via manually crafted assembly code) and IPP crashes (at least while I am working in 32-bit mode).

Is there a way to work around this issue? Does Intel plan to issue a fix for IPP to address it, or does IPP in AVX2 mode mandate 64-bit mode (now and forever)?

Note: our problem occurred when trying to use FFT functions in IPP. I presume that some AVX2 instructions are available in 32-bit mode and some (the ones using VEX prefix) aren't. It is also logical to suppose that there are performance benefits to using some VEX instructions in conjunction with FFT (or else IPP wouldn't use them). Is it such a significant performance boost that Intel would not support using IPP FFT functions in 32-bit mode?

In my opinion the best approach would be for Intel to support both kinds of usage. Just my two cents.

 

iliyapolak:

Can you somehow identify that instruction? Maybe with the help of a debugger.

Quote:

Beni F. wrote:
I presume that some AVX2 instructions are available in 32-bit mode and some (the ones using VEX prefix) aren't.

This is an erroneous assumption. As already explained, the paper at your link is plain wrong about that and unfortunately, as you prove here, very confusing for newcomers to AVX.

BTW, what you describe looks much like a potential bug in IPP. I'd suggest reporting it on the dedicated forum.

@iliyapolak: as I wrote in my original post, the debugger shows the offending instruction as: les esp,edx

@bronxzv:

1. I also approached TenAsys (the vendor of the INtime operating system) with my problem and they wrote to me that AVX2 instructions that use the VEX prefix cannot execute in 32-bit mode. Seems that I am not the only one who got confused.

2. If the VEX instructions can in fact execute in 32-bit mode, why do I get an invalid opcode exception when hitting such an instruction in h9_ippsFFTGetSize_C_32fc?

3. If, as you say, this is a bug in IPP, where can I report it?

Thanks,

Beni Falk

Quote:

Beni F. wrote:
3. If, as you say, this is a bug in IPP, where can I report it?

I said that it looks like a potential bug, you can report it here: http://software.intel.com/en-us/forums/intel-integrated-performance-primitives

OK, thanks.

iliyapolak:

>>>@iliyapolak: as I wrote in my original post, the debugger shows the offending instruction as: les esp,edx>>>

Sorry, I had not seen that.

iliyapolak:

>>>les esp,edx>>>

AFAIK the les instruction was used to set up far pointers.

iliyapolak - please see my post second from the top of this thread. I quoted there from an Intel site where they explain the VEX prefix instructions.

Quote:

Beni F. wrote:
@iliyapolak: as I wrote in my original post, the debugger shows the offending instruction as: les esp,edx

Are you sure that your debugger has proper support for AVX2 instructions? It may be a legitimate crash due to an AVX2 instruction (for example, an instruction that your CPU doesn't support; case in point, TSX instructions on K-series CPUs) that the debugger wrongly reports as a legacy LES.

EDIT: can you tell us the value of the byte right after the leading 0xc5 of this "LES" ?

 

iliyapolak:

Yes, I have read your post. I cannot tell whether the invalid opcode exception was raised when the processor decoded a genuine les esp,edx sequence, or while it was decoding some AVX instruction whose VEX prefix shares the LES opcode byte. In the first case the compiler could be responsible for the fault.

iliyapolak:

>>>are you sure that your debugger supports AVX2 instructions?>>>

Interesting question. IIRC the invalid opcode vector is 0x6, and the CPU should prepare a trap frame in which it saves the address of the faulting instruction.

Quote:

bronxzv wrote:

EDIT: can you tell us the value of the byte right after the leading 0xc5 of this "LES" ?

Quote:

iliyapolak wrote:

Yes, I have read your post. I cannot tell whether the invalid opcode exception was raised when the processor decoded a genuine les esp,edx sequence, or while it was decoding some AVX instruction whose VEX prefix shares the LES opcode byte. In the first case the compiler could be responsible for the fault.

This is a screen capture taken when the exception occurred.

iliyapolak:

Quote:

Itzhak B. wrote:

Quote:

bronxzv wrote:

EDIT: can you tell us the value of the byte right after the leading 0xc5 of this "LES" ?

Quote:

iliyapolak wrote:

Yes, I have read your post. I cannot tell whether the invalid opcode exception was raised when the processor decoded a genuine les esp,edx sequence, or while it was decoding some AVX instruction whose VEX prefix shares the LES opcode byte. In the first case the compiler could be responsible for the fault.

This is a screen capture taken when the exception occurred.

Sorry, I cannot see any attached screenshot.

We are using Microsoft Visual Studio 2008 SP1. It is very likely not aware of the VEX instructions.

Instruction stream bytes where the invalid opcode exception occurred are the following: c4 e2 51 f7 d0 8d 0c d5 47 00 00 00 ... (I don't know where this instruction ends).

About the possibility of our CPU not supporting AVX2 instructions - we ran the code defined at

http://download-software.intel.com/sites/default/files/319433-014.pdf section 2.2.3 and it ran successfully.

Thanks,

Beni

 

Quote:

Beni F. wrote:
Instruction stream bytes where the invalid opcode exception occurred are the following: c4 e2 51 f7 d0 8d 0c d5 47 00 00 00 ... (I don't know where this instruction ends).

thanks, this looks like a 3-byte prefix VEX encoded instruction, I'll try to understand which one it is

btw I see that LES opcode is 0xc4, not 0xc5 as mentioned in the paper, one more error in this damned paper

Quote:

Beni F. wrote:
Instruction stream bytes where the invalid opcode exception occurred are the following: c4 e2 51 f7 d0 8d 0c d5 47 00 00 00 ... (I don't know where this instruction ends).

From the 2nd byte this is the 0x0F38 prefix group, and from the 3rd byte there is an additional 0xF3 prefix. It looks like a 32-bit (since W==0) SARX instruction (*), but I may be wrong; by far the best option would be to use an up-to-date debugger.

SARX is a BMI2 instruction, with a different CPUID feature flag from the AVX2 instructions (AFAIK, please verify), so you must also ensure that your targets (CPU, OS and BIOS) properly support BMI2 before running this code path, and run a fallback path otherwise.

* see "SARX r32a,r/m32,r32b" in the Intel Architecture Instruction Set Extensions Programming Reference, it's at page B-21 in the August 2012 edition I have

Mark Charney (Intel):

Close. It is actually HSW's SHLX instruction in the BMI2 extension. The embedded prefix is VEX.pp=1, which we denote as "66".
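
The fields of the byte sequence posted earlier in the thread (c4 e2 51 f7 d0) can be pulled apart by hand or with a few lines of C. The sketch below is illustrative only: the struct and function names are hypothetical, it handles just the 3-byte VEX form followed by an opcode and a ModRM byte, and the mnemonic lookup covers only opcode F7 in the 0F38 map.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Field names follow the Intel manuals; the struct itself is illustrative. */
typedef struct {
    unsigned map;       /* mmmmm: 1 = 0F, 2 = 0F38, 3 = 0F3A */
    unsigned w;         /* VEX.W */
    unsigned vvvv;      /* extra register operand, already un-inverted */
    unsigned l;         /* VEX.L (0 for the scalar BMI instructions) */
    unsigned pp;        /* implied prefix: 0 = none, 1 = 66, 2 = F3, 3 = F2 */
    unsigned opcode;
    unsigned modrm_mod;
    unsigned modrm_reg;
    unsigned modrm_rm;
} vex3_fields;

/* Returns 1 on success, 0 if the bytes do not start with a 3-byte VEX prefix. */
int decode_vex3(const uint8_t *code, size_t len, vex3_fields *out)
{
    assert(code != NULL && out != NULL);

    if (len < 5 || code[0] != 0xC4)
        return 0;

    out->map       = code[1] & 0x1Fu;
    out->w         = (code[2] >> 7) & 1u;
    out->vvvv      = (~(unsigned)(code[2] >> 3)) & 0xFu; /* stored inverted */
    out->l         = (code[2] >> 2) & 1u;
    out->pp        = code[2] & 3u;
    out->opcode    = code[3];
    out->modrm_mod = code[4] >> 6;
    out->modrm_reg = (code[4] >> 3) & 7u;
    out->modrm_rm  = code[4] & 7u;
    return 1;
}

/*
 * Opcode F7 in map 0F38 with VEX.L = 0 is shared by BEXTR (BMI1) and the
 * three BMI2 shifts; the implied prefix in VEX.pp selects which one.
 */
const char *bmi_f7_mnemonic(const vex3_fields *f)
{
    assert(f != NULL);

    if (f->map != 2 || f->opcode != 0xF7 || f->l != 0)
        return "unknown";

    switch (f->pp) {
    case 0:  return "BEXTR";
    case 1:  return "SHLX";
    case 2:  return "SARX";
    default: return "SHRX";
    }
}
```

For the posted bytes this gives map 0F38, W=0, vvvv=5 (EBP), L=0, pp=1, opcode F7 and a ModRM byte with mod=3, reg=2 (EDX), rm=0 (EAX). That matches SHLX with the implied 66 prefix, i.e. shlx edx, eax, ebp in Intel syntax.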

Quote:

Mark Charney (Intel) wrote:
Close. It is actually HSW's SHLX instruction in the BMI2 extension. The embedded prefix is VEX.pp=1, which we denote as "66".

Thanks, I stand corrected. It looks like I confused it with VEX.pp=2 (0xF3).

sirrida:

Have you checked whether the processor has BMI2 enabled? AVX2 is a different set. As far as I know not all Haswell processors support BMI2 (which is a pity).
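
AVX2 and BMI2 are reported by separate bits of CPUID leaf 7 (sub-leaf 0, register EBX): bit 5 for AVX2, bit 8 for BMI2 and bit 3 for BMI1. A minimal sketch of such a check in C follows. It assumes a GCC or Clang toolchain for the <cpuid.h> helpers (MSVC would need its own intrinsic instead), the function names are hypothetical, and it deliberately does not check OS support for saving YMM state, which AVX and AVX2 additionally require. Treating "AVX2 plus BMI2" as the prerequisite for the h9 code path is an assumption based on the instruction found in this thread, not a documented IPP requirement.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#include <cpuid.h>              /* GCC/Clang helpers; assumption: not MSVC */
#define HAVE_GNU_CPUID 1
#endif

/* Feature bits in CPUID.(EAX=7,ECX=0):EBX. */
#define LEAF7_EBX_BMI1 (UINT32_C(1) << 3)
#define LEAF7_EBX_AVX2 (UINT32_C(1) << 5)
#define LEAF7_EBX_BMI2 (UINT32_C(1) << 8)

/* Returns 1 if every bit in mask is set in the given EBX value. */
int leaf7_has(uint32_t ebx, uint32_t mask)
{
    return (ebx & mask) == mask;
}

/* Hypothetical policy: require both AVX2 and BMI2 before using h9 code. */
int h9_prerequisites_met(uint32_t ebx)
{
    return leaf7_has(ebx, LEAF7_EBX_AVX2 | LEAF7_EBX_BMI2);
}

/*
 * Reads CPUID leaf 7, sub-leaf 0. Returns 1 and stores EBX on success;
 * returns 0 if the leaf is unavailable or the toolchain is unsupported.
 */
int read_leaf7_ebx(uint32_t *ebx_out)
{
    assert(ebx_out != NULL);
    *ebx_out = 0;
#ifdef HAVE_GNU_CPUID
    unsigned int eax, ebx, ecx, edx;

    if (__get_cpuid_max(0, NULL) < 7)
        return 0;

    __cpuid_count(7, 0, eax, ebx, ecx, edx);
    (void)eax; (void)ecx; (void)edx;
    *ebx_out = ebx;
    return 1;
#else
    return 0;
#endif
}
```

A caller would first use read_leaf7_ebx and, only if it succeeds and h9_prerequisites_met returns 1, select the h9 headers or code path; otherwise it would fall back to the g9 (AVX) build that is reported to work above.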

I am going to check it - thanks.

Hi Beni,

One quick question: how did you get this HSW system? Did you purchase it, or did you get it through Intel's development program?

I am trying to find out if you have early systems or not.

We got a COM Express board from Advantech (a SOM-5894).

Thanks,

Beni

 

Hi Beni,

The board's product page does not say which processor you are using:

http://www.advantech.com/products/SOM-5894/mod_0E302D80-4F19-406B-B540-733C6924C3A6.aspx

The board supports embedded Intel® Core™ i7/i5/i3/Celeron® processors. Did you get the processor from Advantech too?

 

 

Yes, we did. We got the processor with the board. It is a Haswell Core i7. According to the datasheet on the site (http://downloadt.advantech.com/ProductFile/PIS/SOM-5894/Product%20-%20Datasheet/SOM-5894_DS(06.21.13)20130624090857.pdf) it must be a Core i7-4700EQ. There is no marking on the processor (at least none that I can see), and the BIOS does not say much.

 
