Intel ISA Extensions

CPUID Signature Values of DisplayFamily_DisplayModel - A new Appendix is Needed in all Intel Manuals

Have you ever tried to find all CPUID Signature Values of DisplayFamily_DisplayModel in Intel Manuals?

It takes some time to find a table with all these codes and I'd like to make a Feature Request:

Please add a new Appendix in all Intel Manuals with CPUID Signature Values of DisplayFamily_DisplayModel. It would be nice to see codes for all Intel CPUs.

Thanks in advance.
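
For readers unfamiliar with the notation, here is a minimal sketch of how a DisplayFamily_DisplayModel value can be derived from the CPUID leaf 01H signature in EAX, following the field layout described for the CPUID instruction. The function names `display_family` and `display_model` are made up for illustration, and 0x000306A9 is only a sample input.

```c
#include <stdint.h>

/* DisplayFamily: the extended family field (bits 27:20) is added
   only when the base family field (bits 11:8) equals 0FH. */
unsigned display_family(uint32_t eax) {
    unsigned family     = (eax >> 8)  & 0xF;
    unsigned ext_family = (eax >> 20) & 0xFF;
    return (family == 0xF) ? family + ext_family : family;
}

/* DisplayModel: the extended model field (bits 19:16) is prepended
   only when the base family is 06H or 0FH. */
unsigned display_model(uint32_t eax) {
    unsigned family    = (eax >> 8)  & 0xF;
    unsigned model     = (eax >> 4)  & 0xF;
    unsigned ext_model = (eax >> 16) & 0xF;
    return (family == 0x6 || family == 0xF) ? (ext_model << 4) + model
                                            : model;
}
```

With the sample signature 0x000306A9 this yields family 06H and model 3AH, which the manuals would write as 06_3AH.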


Latency of a General purpose MOV instruction on Intel CPUs

Hi everybody,

I'd like Intel engineers to confirm that the latency of a general-purpose MOV instruction is 1 clock cycle on all Intel CPUs. For example, I've completed a set of tests on an Intel(R) Pentium(R) 4 CPU 1.60GHz, and my numbers are as follows:

[ Intel C++ compiler - DEBUG ]
Overhead of Assignment: 1.091 clock cycles

Moving/merging __m128d values to __m256d ones


I have some code where at some point, after doing SSE3 computations with __m128d-typed values, I need to:

a) store a __m128d value into one of the two halves of a __m256d value (not cast it!)

b) paste two __m128d values side-by-side into a __m256d value 

Are there any AVX intrinsics to perform these operations? 

Thanks in advance, 
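
One possible approach, sketched under assumptions: `_mm256_insertf128_pd` writes a __m128d into the selected 128-bit half of a __m256d, and `_mm256_castpd128_pd256` followed by an insert into the upper half pastes two __m128d values side by side. The helper names below are hypothetical, the arrays exist only to make the sketch self-contained, and the `target("avx")` attribute assumes GCC or Clang (with other compilers the matching AVX option would be needed instead).

```c
#include <immintrin.h>

/* (b) Paste two 128-bit values side by side: lo -> bits 127:0, hi -> bits 255:128. */
__attribute__((target("avx")))
void paste_halves(const double lo[2], const double hi[2], double out[4]) {
    __m128d vlo = _mm_loadu_pd(lo);
    __m128d vhi = _mm_loadu_pd(hi);
    /* The cast leaves the upper half undefined; the insert then fills it. */
    __m256d v = _mm256_castpd128_pd256(vlo);
    v = _mm256_insertf128_pd(v, vhi, 1);
    _mm256_storeu_pd(out, v);
}

/* (a) Overwrite only the upper half of an existing 256-bit value.
   Passing 0 instead of 1 as the last argument would target the lower half. */
__attribute__((target("avx")))
void replace_high_half(double inout[4], const double half[2]) {
    __m256d v = _mm256_loadu_pd(inout);
    v = _mm256_insertf128_pd(v, _mm_loadu_pd(half), 1);
    _mm256_storeu_pd(inout, v);
}
```

The half selector must be a compile-time constant, so choosing the half at run time would need a branch or a blend.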


8086/8088 core


I am a grad student in EE. I am looking for an 8088/8086 core for my project. The project is to design a bus monitor that can interface with either an 8088 or an 8086. I searched but could not find one. I am looking for a core that has all pins exposed: ALE, RD, WR, DEN, Addr/Data, etc.

Could someone please point me in the right direction?



With _mm256 intrinsics, does it matter whether I pass -xAVX to the compiler?

I wrote some AVX code like this:

    __m256 x0 = _mm256_load_ps(f);
    __m256 y0 = _mm256_load_ps(f+8);
    __m256 z0 = _mm256_add_ps(x0, y0);
    _mm256_store_ps( s, z0);

When I compile this, does it matter whether I use the -xAVX compiler option or not? I am using icpc 2013 on Linux.


64-bit bug in Visual C++? mov R8d,imm not completely defined

The Intel documentation does not specify whether mov R8d, -1 will also zero the high dword of R8 or leave it intact.

Microsoft Visual C++ (2010) translates the C line a = myfunc(par1, par2, 3); into

         mov RCX, par1 ; mov RDX, par2 ; mov R8b, 3 ; call myfunc ; mov qword ptr [a], RAX

If the behaviour is implementation-dependent, some processors may crash.

If the high dword is set to zero when moving to the low dword, why not say so clearly?
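
For what it is worth, the description of general-purpose registers in 64-bit mode in the Intel SDM (Volume 1) says that 32-bit operands generate a 32-bit result that is zero-extended to 64 bits in the destination register, while 8-bit and 16-bit writes leave the upper bits unchanged. The following is a small sketch to observe this, assuming an x86-64 target and GCC/Clang extended inline assembly; the function name is made up for illustration.

```c
#include <stdint.h>

/* Start with all 64 bits set, then write -1 to the low dword only.
   %k0 asks the compiler for the 32-bit name of whichever register holds r. */
uint64_t write_low_dword(void) {
    uint64_t r = 0xFFFFFFFFFFFFFFFFull;
    __asm__ ("movl $-1, %k0" : "+r"(r));
    return r;  /* upper 32 bits read back as zero after the 32-bit write */
}
```

If the upper dword were preserved, the function would return all ones; with zero extension it returns 0x00000000FFFFFFFF.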

Adding consecutive large numbers

I am trying to write some simple inline assembly using AVX instructions, and I have run into a problem when adding large numbers. Here is the code:

__asm__ __volatile__(
"movl $0, %%r9d\n\t"
"movl $4, %%r10d\n\t"
"leal (%%eax, %%r9d, 1), %%edx\n\t"
"vbroadcastss (%%edx), %%ymm0\n\t"
"leal (%%eax, %%r10d, 1), %%edx\n\t"
"vmovups (%%edx), %%ymm1\n\t"
"vaddps %%ymm0, %%ymm1, %%ymm2\n\t"
"vmovups %%ymm2, (%%edx)"
: "=a"(x) : "a"(x));
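
As a point of comparison, here is a rough intrinsics sketch of what the assembly above appears to do: broadcast the float at x and add it to the eight floats that start 4 bytes later, storing the result in place. This is a reading of the snippet, not a confirmed fix for the reported problem. The function name is hypothetical, and the `target("avx")` attribute assumes GCC or Clang. Note that the original forms addresses in 32-bit registers (%eax, %edx), which would truncate pointers in a 64-bit build, and declares no clobbers for %r9, %r10, %edx, the ymm registers, or memory; intrinsics sidestep both issues.

```c
#include <immintrin.h>

/* Adds x[0] to each of x[1] .. x[8], in place. */
__attribute__((target("avx")))
void add_first_to_next8(float *x) {
    __m256 first = _mm256_broadcast_ss(x);   /* x[0] copied into all 8 lanes */
    __m256 next8 = _mm256_loadu_ps(x + 1);   /* unaligned load of x[1..8] */
    _mm256_storeu_ps(x + 1, _mm256_add_ps(first, next8));
}
```

The caller must supply at least nine floats, since the load and store cover x[1] through x[8].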
