SSE3 critique

Hello,

I am writing a paper about interval arithmetic using SSE2 instructions, which is part of my library for exact real number computations. While working on it, I realized that SSE3 could have been quite helpful had it been designed slightly differently.

My question is: why did Intel prefer to include an addsub instruction rather than a multiplication with one of the arguments negated, i.e. something like

mulpnpd xmm1,xmm2

giving xmm1.1 * xmm2.1 in element 1 and (-xmm1.0) * xmm2.0 in element 0.

With this, the addsubpd instruction would not be needed to compute complex multiplications and divisions.
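To make the comparison concrete, here is a minimal Python sketch of the lane semantics, not real SSE code. Each register is modelled as a tuple `(element 0, element 1)`, with a complex number stored as (real, imaginary). `addsubpd` follows the documented SSE3 behaviour (subtract in element 0, add in element 1); `mulpnpd` is the hypothetical instruction proposed above, and the helper names `complex_mul_addsub` and `complex_mul_mulpn` are made up for illustration.

```python
def mulpd(x, y):
    # Packed multiply: element-wise product of both lanes.
    return (x[0] * y[0], x[1] * y[1])

def addpd(x, y):
    # Packed add: element-wise sum of both lanes.
    return (x[0] + y[0], x[1] + y[1])

def addsubpd(x, y):
    # SSE3 ADDSUBPD: subtract in element 0, add in element 1.
    return (x[0] - y[0], x[1] + y[1])

def mulpnpd(x, y):
    # Hypothetical instruction from this post: negate element 0 of the
    # first operand before multiplying, multiply element 1 unchanged.
    return ((-x[0]) * y[0], x[1] * y[1])

def complex_mul_addsub(z, w):
    # (a + bi)(c + di) using mulpd, mulpd, addsubpd.
    a, b = z
    c, d = w
    t1 = mulpd((a, b), (c, c))   # (ac, bc)
    t2 = mulpd((b, a), (d, d))   # (bd, ad)
    return addsubpd(t1, t2)      # (ac - bd, bc + ad)

def complex_mul_mulpn(z, w):
    # Same product using mulpd, the hypothetical mulpnpd, and a plain addpd.
    a, b = z
    c, d = w
    t1 = mulpd((a, b), (c, c))   # (ac, bc)
    t2 = mulpnpd((b, a), (d, d)) # ((-b)d, ad)
    return addpd(t1, t2)         # (ac + (-b)d, bc + ad)
```

In round-to-nearest both routines return the same pair, so the instruction count is unchanged; the difference only shows up under directed rounding.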

What I believe to be more important, however, is the behavior of Intel's sample SSE3 code for complex multiplication when the rounding mode is set to something other than round-to-nearest. More specifically, the SSE3 complex multiplication code would not compute upper bounds for the product when rounding toward +inf, nor lower bounds when rounding toward -inf, because the product that is subsequently subtracted would be rounded in the wrong direction.

This would not be the case if a mulpn instruction were available instead of addsub, because the negated product would already be rounded in the correct direction before the addition. A mulpn would also be very useful for single or double precision interval arithmetic using the SIMD registers.
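The rounding argument can be checked with a small model. The following Python sketch (Python 3.9+ for `math.nextafter`) does not touch the hardware rounding mode; it emulates rounding toward +inf with exact rational arithmetic, and the function names are illustrative only. It computes the real part ac - bd of a complex product both ways: products rounded up and then subtracted (the addsub pattern), and the negated product rounded up and then added (the proposed mulpn pattern).

```python
from fractions import Fraction
import math

def round_up(x):
    # Emulate rounding toward +inf: smallest double that is >= exact x.
    f = float(x)                      # correctly rounded to nearest
    if Fraction(f) < x:
        f = math.nextafter(f, math.inf)
    return f

def exact_real_part(a, b, c, d):
    # Exact value of ac - bd for double inputs.
    return Fraction(a) * Fraction(c) - Fraction(b) * Fraction(d)

def real_part_addsub(a, b, c, d):
    # addsub pattern: round ac up, round bd up, then subtract (rounded up).
    ac = round_up(Fraction(a) * Fraction(c))
    bd = round_up(Fraction(b) * Fraction(d))
    return round_up(Fraction(ac) - Fraction(bd))

def real_part_mulpn(a, b, c, d):
    # mulpn pattern: round ac up, round (-b)d up, then add (rounded up).
    ac = round_up(Fraction(a) * Fraction(c))
    neg_bd = round_up(Fraction(-b) * Fraction(d))
    return round_up(Fraction(ac) + Fraction(neg_bd))

# Inputs chosen so that bd is not exactly representable.
a = c = 1.0
b = d = 1.0 + 2.0 ** -52

exact = exact_real_part(a, b, c, d)       # -(2**-51 + 2**-104)
via_addsub = real_part_addsub(a, b, c, d)
via_mulpn = real_part_mulpn(a, b, c, d)

print(Fraction(via_addsub) >= exact)      # False: not an upper bound
print(Fraction(via_mulpn) >= exact)       # True: valid upper bound
```

For these inputs the addsub pattern yields -3 * 2**-52, which lies below the exact value, while the mulpn pattern yields -2**-51, which lies above it. The mirror-image failure occurs for lower bounds when rounding toward -inf.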

Does anyone know why Intel preferred addsub to this?

Greetings from Intel Software Network Support. We will check with our engineering contacts and let you know what we find out.

Regards,

Lexi S.
Intel Software Network Support
http://www.intel.com/software/
email: ISN.support@intel.com

Our engineering contacts responded:

The addsub was added for complex arithmetic. It seemed more natural to handle the "-" with an add type of instruction, rather than a mul as described above. Interval arithmetic was not a factor at all in the decision to add this instruction, but significant improvements in math libraries were obtained with these instructions, confirming that they are useful.

We are always looking for new instructions and feedback to make our architectures better suited to our customers' needs. If you would like to write up your request with a bit more detail and send it to us here, we would be glad to forward the information to our architects to consider the request for future architectures. We would also need to know what you want to use it for.

Regards,

Lexi S.
Intel Software Network Support
http://www.intel.com/software/
Contact us

Message Edited by intel.software.network.support on 11-15-2005 11:18 PM

I know this is an old post, but I am curious to hear whether the author has updated his code. SSE4.1 has an instruction, BLENDVPD, which makes conditional selection of double precision values easier.

--
Regards,
Igor Levicki

If you find my post helpful, please rate it and/or select it as a best answer where applicable. Thank you.
