I'm trying to optimize C++ expression template code, and for the compiler to be able to optimize away the expression templates, they need to be inlined. At -O3, the compiler by default gives up on inlining complicated expressions, which leaves some intermediate function calls in place and kills performance. As I understand it, the function-specific directive "__forceinline" and the statement-specific "#pragma forceinline recursive" should force the compiler to inline the function call, but I've tried both and the function calls are still there. (The compiler does read them, because if I intentionally misspell them, I get an error.) Using the compiler option "-inline-forceinline" does work, so the compiler is technically capable of inlining the call, but this of course forces inlining throughout the entire code, which is not usable in practice. Can anyone give any hints as to what might be preventing the calls from being inlined? This is with icpc 11.1. Thanks, /Patrik
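To make the setup concrete, here is a minimal sketch of the kind of code in question. It is not the actual code: the names `et`, `Vec`, `Sum`, `add3` and the `FORCE_INLINE` macro are hypothetical and only illustrate where the two directives would go. The macro falls back to the GCC/Clang `always_inline` attribute so the sketch also builds with other compilers, and the pragma is guarded so that it is only seen by the Intel compiler.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical portability macro: __forceinline on Intel/MSVC,
// always_inline on GCC/Clang, plain inline otherwise.
#if defined(__INTEL_COMPILER) || defined(_MSC_VER)
#  define FORCE_INLINE __forceinline
#elif defined(__GNUC__)
#  define FORCE_INLINE inline __attribute__((always_inline))
#else
#  define FORCE_INLINE inline
#endif

namespace et {

// Expression node for lhs + rhs; nothing is computed until operator[] is called.
template <class L, class R>
struct Sum {
    const L& lhs;
    const R& rhs;
    FORCE_INLINE double operator[](std::size_t i) const { return lhs[i] + rhs[i]; }
};

struct Vec {
    std::vector<double> data;
    explicit Vec(std::size_t n, double v = 0.0) : data(n, v) {}

    FORCE_INLINE double operator[](std::size_t i) const { return data[i]; }
    FORCE_INLINE std::size_t size() const { return data.size(); }

    // Evaluates the whole expression tree element by element.
    // These nested operator[] calls are the ones that must be inlined.
    template <class E>
    FORCE_INLINE Vec& operator=(const E& e) {
        for (std::size_t i = 0; i < data.size(); ++i) data[i] = e[i];
        return *this;
    }
};

// Builds an expression node; found via argument-dependent lookup for et types only.
template <class L, class R>
FORCE_INLINE Sum<L, R> operator+(const L& l, const R& r) {
    return Sum<L, R>{l, r};
}

inline void add3(Vec& out, const Vec& a, const Vec& b, const Vec& c) {
    assert(a.size() == out.size() && b.size() == out.size() && c.size() == out.size());
#if defined(__INTEL_COMPILER)
#pragma forceinline recursive  // statement-specific: applies to the next statement
#endif
    out = a + b + c;
}

}  // namespace et
```

When everything is inlined, the assignment in `add3` should reduce to a single loop computing `a[i] + b[i] + c[i]`; any remaining calls to `Sum::operator[]` in the generated assembly are the intermediate calls described above.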
"force inline" doesn't?