Intel® AVX and CPU Instructions

P4 stalls for >240 uSec

Background: We have an in-house real-time OS that has evolved from the 386 and 486 through the Pentium and upward as the available PICMG boards have changed.


Problem: We are evaluating a P4 board because our previous Celeron board is going EOL. When we use the COM2 port at 115K baud, it misses characters. I've tracked it down to the input instruction sometimes taking over 240 microseconds. Interrupts are disabled, since we use this feature to download new software versions serially.

SSE 4.1 instructions - DPPS/EXTRACTPS

I have been trying to use the Intel DPPS instruction with either EXTRACTPS or BLENDPS. Essentially I have a loop in which


x1 = dot-product(y1,z1)
x2 = dot-product(y2,z2)
x3 = dot-product(y3,z3)

x4 = x1/(sqrt(x2)*sqrt(x3))

I can compute x1, x2, and x3 with the DPPS instruction and then use EXTRACTPS, so 3 DPPS with 3 EXTRACTPS. It turns out I did not get any improvement in performance. To use fewer EXTRACTPS instructions, I used BLENDPS.

x1_sse = dpps(y1,z1,241)
x2_sse = dpps(y2,z2,242)
x2_sse = blendps(x1_sse,x2_sse, 2);
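
One way to read the pseudocode above, as a rough sketch rather than the poster's actual loop: each DPPS writes its sum to a different lane (masks 0xF1, 0xF2, 0xF4), two BLENDPS merge the sums into one register, and a single store replaces the three EXTRACTPS. The function name `normalized_dot`, the third mask 0xF4, the use of `_mm_sqrt_ps`, and the GCC/Clang `target` attribute (so the file builds without `-msse4.1`) are assumptions, not taken from the post.

```c
#include <smmintrin.h>  /* SSE4.1 intrinsics: _mm_dp_ps, _mm_blend_ps */

/* Computes x1 / (sqrt(x2) * sqrt(x3)), where xi = dot(yi, zi) over 4 floats.
   The target attribute is GCC/Clang-specific (an assumption about the toolchain). */
__attribute__((target("sse4.1")))
float normalized_dot(const float *y1, const float *z1,
                     const float *y2, const float *z2,
                     const float *y3, const float *z3)
{
    /* High nibble 0xF: multiply all four lanes. Low nibble: which lane gets the sum. */
    __m128 x1 = _mm_dp_ps(_mm_loadu_ps(y1), _mm_loadu_ps(z1), 0xF1); /* sum in lane 0 */
    __m128 x2 = _mm_dp_ps(_mm_loadu_ps(y2), _mm_loadu_ps(z2), 0xF2); /* sum in lane 1 */
    __m128 x3 = _mm_dp_ps(_mm_loadu_ps(y3), _mm_loadu_ps(z3), 0xF4); /* sum in lane 2 */

    /* Merge the three sums into one register: [x1, x2, x3, 0]. */
    __m128 packed = _mm_blend_ps(x1, x2, 0x2);
    packed = _mm_blend_ps(packed, x3, 0x4);

    /* One packed square root covers both sqrt(x2) and sqrt(x3). */
    __m128 roots = _mm_sqrt_ps(packed);

    float sums[4], sq[4];
    _mm_storeu_ps(sums, packed);
    _mm_storeu_ps(sq, roots);
    return sums[0] / (sq[1] * sq[2]);
}
```

Whether this beats the three-EXTRACTPS version would have to be measured on the target CPU; the sketch only shows that the lane masks and blends compose as described.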

Detailed info about FTZ & DAZ

Hi there.


When I started programming SSE, I always wondered why there were operations that seemed to do the same thing. After all, MOVAPS, MOVAPD, and MOVDQA should all result in the same thing, loading 128 bits, right?


Then I found more detail about FTZ & DAZ and realized that DAZ forces values to zero (I think) when loading data, or at least before operations (but which ones?). I then realized how bad it would be to load integers using the float versions of the MOVs.
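
As far as I know, FTZ and DAZ are control bits in the MXCSR register (bit 15 and bit 6) that affect SSE floating-point arithmetic, not plain moves such as MOVAPS, which copy bits unchanged. A minimal sketch to check this on your own machine, assuming an x86 CPU that supports DAZ; the function and macro names are made up for illustration:

```c
#include <xmmintrin.h>  /* _mm_getcsr, _mm_setcsr, SSE loads/stores */
#include <stdint.h>
#include <string.h>

#define MXCSR_DAZ 0x0040u  /* bit 6: denormals-are-zero */
#define MXCSR_FTZ 0x8000u  /* bit 15: flush-to-zero */

/* Returns 1 if denormal-looking bit patterns pass unchanged through a
   float-typed load and store while DAZ and FTZ are both set. */
int move_preserves_bits(void)
{
    unsigned int saved = _mm_getcsr();
    _mm_setcsr(saved | MXCSR_DAZ | MXCSR_FTZ);

    uint32_t in[4] = {1u, 2u, 3u, 4u};  /* small integers are denormal float encodings */
    uint32_t out[4];
    __m128 v = _mm_loadu_ps((const float *)in);
    _mm_storeu_ps((float *)out, v);

    _mm_setcsr(saved);  /* restore the caller's MXCSR */
    return memcmp(in, out, sizeof in) == 0;
}

/* Returns 1 if a denormal operand is treated as zero by a multiply under DAZ. */
int arithmetic_sees_zero(void)
{
    unsigned int saved = _mm_getcsr();
    _mm_setcsr(saved | MXCSR_DAZ);

    volatile float tiny = 1e-40f;  /* below FLT_MIN, so denormal */
    volatile float product;        /* volatile keeps the multiply between the MXCSR writes */
    product = _mm_cvtss_f32(_mm_mul_ss(_mm_set_ss(tiny), _mm_set_ss(1.0f)));

    _mm_setcsr(saved);
    return product == 0.0f;
}
```

If both functions return 1, integers loaded through a float-typed move are not altered by DAZ; only arithmetic on them would be.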

Are there any methods to see the contents of MMX and XMM registers?

Recently I ran into a hard problem. My debug DLL worked fine, but my release DLL failed to show the correct image. After some hard work, I found that the problem was a float multiplication producing the wrong result. One of the operands was a constant (actually a const float divided by another const float) and it was simply placed in an XMM register. With the debugger, I confirmed that the other operand is correct, so I want to see the operand in the XMM register, but I don't know how. Can anyone show me the method?
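
Two options come to mind. In gdb, `print $xmm0` or `info registers xmm0` shows a register's lanes, and I believe the Visual Studio Registers window can show SSE registers once that group is enabled from its context menu. Failing that, a rough sketch of dumping a value from code, assuming you can temporarily add a call in the release build; both helper names are hypothetical:

```c
#include <xmmintrin.h>  /* __m128, _mm_storeu_ps */
#include <stdio.h>

/* Hypothetical helper: copy the four float lanes of an XMM value to memory. */
void xmm_to_floats(__m128 v, float out[4])
{
    _mm_storeu_ps(out, v);  /* unaligned store, so out needs no special alignment */
}

/* Hypothetical helper: print the lanes, lowest lane first. */
void print_xmm_ps(const char *label, __m128 v)
{
    float lanes[4];
    xmm_to_floats(v, lanes);
    printf("%s = [%g, %g, %g, %g]\n", label,
           lanes[0], lanes[1], lanes[2], lanes[3]);
}
```

Calling `print_xmm_ps("factor", value)` right before the suspicious multiply would show what the release build actually computed for the constant.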

Where can I find a comparison between the SSE3 and SSE4 instruction sets?

Hi all,


I want to compare the SSE3 instruction set of Intel's Core 2 Duo processor (or Prescott processor) with the SSE4 instruction set of Intel's Penryn processor. I was able to find the details individually, but I couldn't find a comparison between them (and I don't know how to compare them myself). It would be great if anyone could provide the comparison or a place to find such comparisons. I have a short period (less than 24 hours) to do this, so I hope to get a response from you as soon as possible.


Thanks

CPU temperature Pentium 4

Hi,


My first post in this forum! I'm developing a Delphi 7 application used in a machine for inspecting food. We use a standard computer that uses an Intel Pentium 4 2.8 GHz processor.


We have had a few reports that in very hot factories (or where cooling has not been applied correctly) the computer overheats. I'd like to integrate a CPU temperature monitor into the application but can't get it to work.

SSE execution units in Core Duo

I have read in various places that all Intel processors before Core 2 Duo (including Core Duo) have 64-bit floating-point execution units. (I am not talking about the x87 FPU.) Because of this, SSE instructions using 128-bit operands are split in two, with 64 bits handled at a time.

Regarding this, I have the following questions:

a. Is this true?

b. Assuming it is true, doesn't that mean there is no speed advantage to instructions like addpd compared to addsd (as the addpd instruction is split into two anyway)?

Regards
Gautam
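
Whatever the answer on execution-unit width, the two instructions differ in how much work one instruction describes: addpd adds both double lanes, while addsd adds only the low lane and passes the upper lane of the first operand through. A small sketch of that difference using the SSE2 intrinsics; the wrapper names `add_packed` and `add_scalar` are made up for illustration, and it says nothing about timing:

```c
#include <emmintrin.h>  /* SSE2: _mm_add_pd, _mm_add_sd */

/* ADDPD: out[0] = a[0] + b[0], out[1] = a[1] + b[1]. */
void add_packed(const double a[2], const double b[2], double out[2])
{
    _mm_storeu_pd(out, _mm_add_pd(_mm_loadu_pd(a), _mm_loadu_pd(b)));
}

/* ADDSD: out[0] = a[0] + b[0], out[1] = a[1] (upper lane copied from a). */
void add_scalar(const double a[2], const double b[2], double out[2])
{
    _mm_storeu_pd(out, _mm_add_sd(_mm_loadu_pd(a), _mm_loadu_pd(b)));
}
```

Covering both lanes with addsd takes two instructions, so comparing the two fairly means timing a loop of each on the processor in question.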

Software consequences of extending XMM to YMM

The extension of registers to double the size has happened several times in the history of the x86 ISA. Every time registers are extended to a larger size we have the problem with partial register access and false dependencies when legacy instructions write to the lower part of the register.

The solutions to this problem seen hitherto are the following:
