Is any version of ICC capable of software pipelining loops for x86/x64? Currently I'm doing it manually, but the technique has been well known for decades, so I would expect the compiler to support it.
Is there an option to turn it on? I looked through the man page and found it mentioned under IA-64, but not under x86.
My loops consist of SIMD intrinsic functions without any branches other than the loop condition, so this should be amenable to software pipelining.
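
To illustrate what is meant by manual pipelining, here is a minimal sketch rather than the actual code in question. The scale-and-add kernel and the function names `scale_add_plain` and `scale_add_pipelined` are hypothetical, and the sketch assumes SSE and a length that is a multiple of 4. The pipelined version issues the load for the next iteration before computing and storing the current one, with a prologue and an epilogue around the steady-state loop.

```c
#include <assert.h>
#include <stddef.h>
#include <xmmintrin.h>  /* SSE intrinsics */

/* Plain version: load, compute and store all happen in the same iteration. */
static void scale_add_plain(float *dst, const float *src,
                            float a, float b, size_t n)
{
    assert(n % 4 == 0);  /* assumption: no remainder handling in this sketch */
    const __m128 va = _mm_set1_ps(a);
    const __m128 vb = _mm_set1_ps(b);
    for (size_t i = 0; i < n; i += 4) {
        __m128 x = _mm_loadu_ps(src + i);
        __m128 y = _mm_add_ps(_mm_mul_ps(x, va), vb);
        _mm_storeu_ps(dst + i, y);
    }
}

/* Hand-pipelined version: the load runs one iteration ahead of the
   compute/store stage. */
static void scale_add_pipelined(float *dst, const float *src,
                                float a, float b, size_t n)
{
    assert(n % 4 == 0);
    if (n == 0)
        return;

    const __m128 va = _mm_set1_ps(a);
    const __m128 vb = _mm_set1_ps(b);

    /* Prologue: start the first load. */
    __m128 x = _mm_loadu_ps(src);

    /* Steady state: load block i+1, then compute and store block i. */
    size_t i = 0;
    for (; i + 4 < n; i += 4) {
        __m128 next = _mm_loadu_ps(src + i + 4);
        __m128 y = _mm_add_ps(_mm_mul_ps(x, va), vb);
        _mm_storeu_ps(dst + i, y);
        x = next;
    }

    /* Epilogue: compute and store the last block, which was already loaded. */
    _mm_storeu_ps(dst + i, _mm_add_ps(_mm_mul_ps(x, va), vb));
}
```

Both functions produce the same results; the only difference is how the stages of consecutive iterations are overlapped in the source, which is the transformation I would like the compiler to perform on its own.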