>>...cleanup is not needed if I only use lower halves of ymm registers in my AVX code...
Since your code is already entirely AVX (VEX-encoded), then penalties should not occur. However, in a thread:
Forum topic: AVX transition penalties and OS support
Web-link: software.intel.com/en-us/forums/topic/364851
there is a link to a PDF document, Avoiding AVX to SSE Transition Penalties, which describes how this can be verified with VTune or with the Intel Software Development Emulator. Please take a look. If it is critical for your processing, then verification in a disassembler is needed to confirm that there are no legacy SSE instructions.
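As a rough illustration of the disassembler check, here is a minimal Python sketch that scans `objdump -d` style text for instructions that touch xmm registers but lack the `v` mnemonic prefix of VEX-encoded forms. The function name `find_legacy_sse`, the sample listing, and the assumed tab-separated objdump layout are my own assumptions for illustration. It is only a heuristic and not a substitute for VTune or the Intel SDE.

```python
import re

# Any xmm register operand, in AT&T (%xmm0) or Intel (xmm0) syntax.
XMM_OPERAND = re.compile(r"\bxmm\d+\b")


def find_legacy_sse(disassembly):
    """Return instructions that use xmm registers without a 'v' mnemonic.

    Heuristic only (hypothetical helper): assumes `objdump -d` layout,
    i.e. address, raw bytes and instruction text separated by tabs.
    """
    suspects = []
    for line in disassembly.splitlines():
        fields = line.split("\t")
        if len(fields) < 3:
            continue  # labels, blank lines, byte-continuation lines
        instruction = fields[2].strip()
        if not instruction:
            continue
        mnemonic = instruction.split()[0]
        # VEX-encoded forms are printed with a leading 'v' (vmovaps, vaddps, ...).
        if XMM_OPERAND.search(instruction) and not mnemonic.startswith("v"):
            suspects.append(instruction)
    return suspects


# Made-up listing for illustration: one legacy SSE instruction among VEX code.
sample = (
    "  401000:\tc5 f8 28 c1          \tvmovaps %xmm1,%xmm0\n"
    "  401004:\t0f 58 c1             \taddps  %xmm1,%xmm0\n"
    "  401007:\tc5 f8 77             \tvzeroupper\n"
    "  40100a:\tc3                   \tret\n"
)

print(find_legacy_sse(sample))
```

On the sample listing only the `addps` line is reported. In practice you would feed it the real output of your disassembler, and an empty result suggests (but does not prove) that no legacy SSE encodings are present.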




Need to vzeroupper if 128-bit operations are used?
Hi,
It is known that a runtime penalty is incurred when I switch from AVX instructions to SSE unless I use vzeroupper/vzeroall to clear the upper halves of the ymm registers before the switch. Am I correct in assuming that the cleanup is not needed if I only use lower halves of ymm registers in my AVX code (i.e. VEX-encoded SSE code)?