I have an application that streams in some image data, operates on it, and streams out the result. I am using the intrinsics to expand packed 8-bit values to packed 16-bit values. As soon as I start calling _m_from_int64, the output assembly shows that the values going into those calls are explicitly stored on the memory stack. So if I have:
m1 = _m_punpckhbw( _m_from_int64(i3), _m_from_int64( (__int64)0));
I see the registers holding m1, i3 and 0 (i.e. r0) being stored to the stack. I am also seeing some sequences like:
st8 [r8]=r11 ;;
I have no idea whether this is a related issue...
If I insert dummy instructions and don't make any calls to the multimedia intrinsics, there are no explicit stores to the stack.
Does anyone have any idea what is causing this? I could go ahead and code the whole thing in assembly, but that would be a bit of a hassle. (How I wish ICC supported inline assembly for Itanium :( )
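For reference, here is a rough portable C sketch of what I expect the unpack-with-zero call above to compute. It assumes _m_punpckhbw follows the usual MMX PUNPCKHBW semantics (interleave the four high-order bytes of the first operand with those of the second), so with a zero second operand each high byte is zero-extended into its own 16-bit lane. The function name unpack_hi_bytes_zero_ext is made up for illustration and is not part of any intrinsic set.

```c
#include <stdint.h>

/* Hypothetical helper: scalar equivalent of
 *   _m_punpckhbw(_m_from_int64(x), _m_from_int64(0))
 * under the assumption of MMX PUNPCKHBW semantics.
 * Bytes 4..7 of x become 16-bit lanes 0..3 of the result,
 * each zero-extended. */
static uint64_t unpack_hi_bytes_zero_ext(uint64_t x)
{
    uint64_t out = 0;
    for (int lane = 0; lane < 4; ++lane) {
        /* Extract byte (4 + lane) from the upper half of x. */
        uint64_t byte = (x >> (32 + 8 * lane)) & 0xFFu;
        /* Place it in the low byte of 16-bit lane 'lane'. */
        out |= byte << (16 * lane);
    }
    return out;
}
```

This only illustrates the data movement. It is not meant as a replacement for the intrinsic, and it does not explain the extra stores to the stack.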