Itanium multimedia intrinsics and the memory stack

I have an application that streams in image data, operates on it, and streams out the result. I am using the intrinsics to expand packed 8-bit values to packed 16-bit values. As soon as I start calling _m_from_int64, the generated assembly shows the values passed to those calls being explicitly stored to the memory stack. So if I have:

m1 = _m_punpckhbw( _m_from_int64(i3), _m_from_int64( (__int64)0));

I see the registers holding m1, i3 and 0 (i.e. r0) being stored to the stack. I am also seeing sequences like:

st8 [r8]=r11 ;;
ld8 r9=[r8]

I have no idea whether this is a related issue...

If I insert dummy instructions and don't make any calls to the multimedia intrinsics, there are no explicit stores to the stack.

Does anyone have any idea what is causing this? I could go ahead and code the whole thing in assembly, but that would be a bit of a hassle. (How I wish ICC supported inline assembly for Itanium :( ).
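
For reference, the expansion that the _m_punpckhbw call against zero is meant to perform can be sketched in portable C. This is only an illustration, assuming the usual MMX definition of punpckhbw (the high four bytes of the first operand interleaved with the high four bytes of the second, so a zero second operand zero-extends each byte to 16 bits). The helper name unpack_hi_bytes_to_words is hypothetical and is not part of any intrinsics header.

```c
#include <stdint.h>

/* Hypothetical helper, not an Intel intrinsic.
 * Zero-extends bytes 4..7 of x into four 16-bit lanes, which is what
 * punpckhbw(x, 0) produces under the standard MMX definition (assumed here). */
static uint64_t unpack_hi_bytes_to_words(uint64_t x)
{
    uint64_t out = 0;
    for (int k = 0; k < 4; ++k) {
        /* Pick byte 4+k of the input... */
        uint64_t byte = (x >> (8 * (4 + k))) & 0xFF;
        /* ...and place it in the low half of 16-bit lane k. */
        out |= byte << (16 * k);
    }
    return out;
}
```

Calling unpack_hi_bytes_to_words(i3) on a few known inputs gives a simple way to check that the intrinsic version still computes the same values while experimenting with the compiler output.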
