Dear Intel users, I need to perform an intra-register sum many times using intrinsics. For example: x += a + a + a + a, where a should be of type __m128. How can I do that, and which way is fastest? Thanks in advance!
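For reference, here is a minimal sketch of one possible reading of the question: that the goal is a horizontal sum, i.e. adding the four float lanes of a single __m128 register into one scalar. That reading is an assumption, and the helper name hsum_ps is hypothetical (it is not part of any Intel header). The sketch uses only SSE intrinsics from xmmintrin.h.

```c
#include <xmmintrin.h>  /* SSE intrinsics: __m128, _mm_add_ps, ... */

/* Hypothetical helper: returns v[0] + v[1] + v[2] + v[3]. */
static inline float hsum_ps(__m128 v)
{
    /* Copy the upper two lanes down: hi = (v2, v3, v2, v3). */
    __m128 hi = _mm_movehl_ps(v, v);

    /* Pairwise add: lane 0 = v0 + v2, lane 1 = v1 + v3. */
    __m128 sum2 = _mm_add_ps(v, hi);

    /* Broadcast lane 1 so it can be added to lane 0. */
    __m128 lane1 = _mm_shuffle_ps(sum2, sum2, _MM_SHUFFLE(1, 1, 1, 1));

    /* Scalar add in lane 0: (v0 + v2) + (v1 + v3). */
    __m128 total = _mm_add_ss(sum2, lane1);

    /* Extract lane 0 as a float. */
    return _mm_cvtss_f32(total);
}
```

With this, the update from the question would be written as x += hsum_ps(a). If the sum runs inside a loop, it is usually cheaper to keep a __m128 accumulator, update it with _mm_add_ps on every iteration, and call the horizontal sum only once after the loop. Note that this changes the order of the floating-point additions, so the result may differ slightly in the last bits.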