Intel ISA Extensions

.S help


I wrote a pure .S asm code file for a C++ file, which means that a.s has the .S pattern of .bss, .data, .text and .globl sections followed by .globl function definitions using instructions. Here is the original file whose .S file is a.s.

To check for errors in the a.s file, I assembled it with "as -o a.o a.s", which doesn't generate any errors. I am using ICC v11.0 on Linux x86_64 with GNU syntax for the .S file.

I have to incorporate this a.s file along with multiple other C/C++ files as a single package.

Opcode semantics


Firstly, I apologize if this is the wrong forum; I could not find a more relevant one.

I'm looking for clarification regarding a statement asserting that there is a 1-cycle difference between the instructions:

0x3B (cmp reg, mem)
0x39 (cmp mem, reg)

As the two are functionally equivalent, I assume it would have to do with the decoding circuit logic, but I would first like clarification on whether this statement holds true at all.
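As a hedged illustration of how the two opcodes relate: 0x39 is CMP r/m32, r32 and 0x3B is CMP r32, r/m32, so for two register operands either opcode can encode the same comparison by swapping the ModRM fields. With a memory operand the two forms subtract in opposite directions, so they set the flags differently and are interchangeable only in the register-register case. The sketch below (all helper names are hypothetical) builds both register-direct encodings of "cmp eax, ecx"; it says nothing about timing.

```c
#include <stdint.h>

/* Register numbers as used in the ModRM byte (32-bit registers). */
enum { REG_EAX = 0, REG_ECX = 1, REG_EDX = 2, REG_EBX = 3 };

/* Build a register-direct ModRM byte: mod = 11b, then the reg and r/m fields. */
static uint8_t modrm_reg_direct(unsigned reg, unsigned rm)
{
    return (uint8_t)(0xC0u | ((reg & 7u) << 3) | (rm & 7u));
}

/* Encode "cmp lhs, rhs" (Intel operand order) with opcode 0x39,
   CMP r/m32, r32: lhs goes in the r/m field, rhs in the reg field. */
static void encode_cmp_39(unsigned lhs, unsigned rhs, uint8_t out[2])
{
    out[0] = 0x39;
    out[1] = modrm_reg_direct(rhs, lhs);
}

/* Encode the same comparison with opcode 0x3B,
   CMP r32, r/m32: lhs goes in the reg field, rhs in the r/m field. */
static void encode_cmp_3b(unsigned lhs, unsigned rhs, uint8_t out[2])
{
    out[0] = 0x3B;
    out[1] = modrm_reg_direct(lhs, rhs);
}
```

Calling both helpers with REG_EAX and REG_ECX yields the byte pairs 39 C8 and 3B C1, two encodings of the same register-register comparison.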

Intel C++: _mm256_set1_ps suboptimal?

I'm in the process of porting a (huge) piece of code from SSE to AVX. Looking at the ASM generated by the compiler (Intel C++ Pro 11.1 build #38 IA32 / Windows), I have just noticed that _mm256_set1_ps spits out this convoluted sequence:

movss xmm0, DWORD PTR [edi+eax*4]
unpcklps xmm0, xmm0
movlhps xmm0, xmm0
vinsertf128 ymm1, ymm0, xmm0, 1

instead of the much simpler:

vbroadcastss ymm0, DWORD PTR [edi+eax*4]

Did I miss something, or is it simply something that should be improved in a forthcoming version of the compiler?
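One possible workaround, sketched here under assumptions, is to request the broadcast load explicitly with the `_mm256_broadcast_ss` intrinsic from `<immintrin.h>`, which takes a pointer to a float and corresponds to vbroadcastss with a memory operand. The function name `broadcast8` is hypothetical, and the `target` attribute is only a GCC/Clang convenience so the file builds without an AVX command-line flag; whether a given compiler version then emits the single instruction would need to be checked in its output.

```c
#include <immintrin.h>

/* Fill out[0..7] with *src using one AVX broadcast load.
   Assumption: on GCC/Clang the target attribute enables AVX for this
   function only; with other compilers, enable AVX on the command line. */
#if defined(__GNUC__) && !defined(__INTEL_COMPILER)
__attribute__((target("avx")))
#endif
static void broadcast8(const float *src, float out[8])
{
    __m256 v = _mm256_broadcast_ss(src); /* vbroadcastss ymm, m32 */
    _mm256_storeu_ps(out, v);            /* unaligned store of all 8 lanes */
}
```

The function must only be called on a processor and operating system that support AVX.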

Deprecate x87 FPU?

Why hasn't the x87 FPU been deprecated? Wouldn't it be better to map those old opcodes to new and improved FPU instructions? I'm no hardware engineer, but it might even be nice if a bit in a control register could determine which opcodes were available. I'm assuming something like this has never been done because it can't be done without adding latency?

CPU Instruction counter register

Hi guys!

Our company is planning to buy VTune, and we have spent sufficient time with the trial version.

The tool is great, but sometimes we don't need all the information that statistical profiling gives.

There should be some CPU register, some MSR, I believe, that counts executed instructions.
Does anybody know how to access it, or can anybody point to some document with the details?

Thanks in advance
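For reference, Intel processors with architectural performance monitoring expose a fixed-function counter for retired instructions (IA32_FIXED_CTR0), but MSRs can only be read from ring 0, so user code normally goes through the operating system. The post does not name an OS; the following is a minimal sketch assuming Linux and its `perf_event_open` system call, with hypothetical helper names. It may fail where the kernel or a virtual machine does not allow hardware counters, in which case the open helper returns -1.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Open a counter for user-space instructions retired by the calling thread.
   Returns a file descriptor, or -1 if the kernel refuses the request. */
static int instr_counter_open(void)
{
    struct perf_event_attr attr;
    memset(&attr, 0, sizeof attr);
    attr.type = PERF_TYPE_HARDWARE;
    attr.size = sizeof attr;
    attr.config = PERF_COUNT_HW_INSTRUCTIONS;
    attr.disabled = 1;        /* start stopped; enabled explicitly below */
    attr.exclude_kernel = 1;  /* count user-space instructions only */
    attr.exclude_hv = 1;
    /* pid = 0 (this thread), cpu = -1 (any CPU), no group, no flags */
    return (int)syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0UL);
}

/* Reset the counter to zero and start counting. Returns 0 on success. */
static int instr_counter_start(int fd)
{
    if (ioctl(fd, PERF_EVENT_IOC_RESET, 0) < 0)
        return -1;
    return ioctl(fd, PERF_EVENT_IOC_ENABLE, 0) < 0 ? -1 : 0;
}

/* Stop counting and read the 64-bit count. Returns 0 on success. */
static int instr_counter_stop(int fd, uint64_t *count)
{
    if (ioctl(fd, PERF_EVENT_IOC_DISABLE, 0) < 0)
        return -1;
    return read(fd, count, sizeof *count) == (ssize_t)sizeof *count ? 0 : -1;
}
```

Typical use is to call `instr_counter_open` once, bracket the code of interest with `instr_counter_start` and `instr_counter_stop`, and `close` the descriptor when done.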

Generating Prefetch Instructions in AVX code...

I've used the Intel 11.1 compiler to generate AVX code. Unfortunately, I find that no software prefetch instructions are issued in that code. With SSE 4.2, software prefetch was used; after switching from SSE 4.2 to AVX, all software prefetches disappeared. Is there a way to get these generated in AVX as they were in SSE 4.2? If so, please let me know. Thanks for any feedback.
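Independently of what the compiler chooses to generate, prefetches can be requested by hand with the `_mm_prefetch` intrinsic from `<xmmintrin.h>`. The sketch below is an illustration only: the function name and the prefetch distance are hypothetical, and a useful distance depends on the loop and the machine.

```c
#include <stddef.h>
#include <xmmintrin.h> /* _mm_prefetch, _MM_HINT_T0 */

/* Hypothetical tuning value: how many elements ahead to prefetch. */
#define PREFETCH_DISTANCE 64

/* Sum an array while issuing one prefetch per 64-byte cache line
   (16 floats), PREFETCH_DISTANCE elements ahead of the current index. */
static float sum_with_prefetch(const float *data, size_t n)
{
    float sum = 0.0f;
    for (size_t i = 0; i < n; i++) {
        if ((i % 16) == 0 && i + PREFETCH_DISTANCE < n)
            _mm_prefetch((const char *)&data[i + PREFETCH_DISTANCE],
                         _MM_HINT_T0);
        sum += data[i];
    }
    return sum;
}
```

The prefetch is only a hint, so the result is the same with or without it; only the memory access timing may change.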

