Hi,
It is known that switching from AVX instructions to SSE incurs a runtime penalty unless I use vzeroupper/vzeroall to clear the upper halves of the ymm registers before the switch. Am I correct in assuming that the cleanup is not needed if my AVX code only uses the lower halves of the ymm registers (i.e. VEX-encoded SSE code)?
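
To make the scenario concrete, here is a minimal sketch of the case where I understand the cleanup is required: a 256-bit AVX routine followed by legacy SSE code. The function names (`add8_avx`, `add4_sse`, `mixed_demo`) are made up for illustration, and the `target("avx")` attribute assumes GCC or Clang on x86-64. Whether `add4_sse` is actually emitted with the legacy SSE encoding depends on compiler flags, so treat this as an illustration of the pattern rather than a benchmark.

```c
#include <assert.h>
#include <immintrin.h>

/* Hypothetical helper: 256-bit AVX add. It writes the upper halves of
 * ymm registers, so it issues vzeroupper before returning to code that
 * may use legacy (non-VEX) SSE encodings. */
__attribute__((target("avx")))
static void add8_avx(const float *a, const float *b, float *out)
{
    __m256 va = _mm256_loadu_ps(a);
    __m256 vb = _mm256_loadu_ps(b);
    _mm256_storeu_ps(out, _mm256_add_ps(va, vb));
    _mm256_zeroupper(); /* emits vzeroupper */
}

/* Hypothetical helper: plain 128-bit SSE add. Without AVX enabled for
 * this function, compilers normally emit the legacy SSE encoding. */
static void add4_sse(const float *a, const float *b, float *out)
{
    __m128 va = _mm_loadu_ps(a);
    __m128 vb = _mm_loadu_ps(b);
    _mm_storeu_ps(out, _mm_add_ps(va, vb));
}

/* Runs the AVX routine followed by the SSE routine and checks the sums.
 * Returns the number of elements verified. */
int mixed_demo(void)
{
    float a[8] = {1, 2, 3, 4, 5, 6, 7, 8};
    float b[8] = {10, 10, 10, 10, 10, 10, 10, 10};
    float wide[8];
    float narrow[4];
    int checked = 0;

    add8_avx(a, b, wide);   /* AVX part, ends with vzeroupper */
    add4_sse(a, b, narrow); /* SSE part, runs after the cleanup */

    for (int i = 0; i < 8; ++i) {
        assert(wide[i] == a[i] + b[i]);
        ++checked;
    }
    for (int i = 0; i < 4; ++i) {
        assert(narrow[i] == a[i] + b[i]);
        ++checked;
    }
    return checked;
}
```

My question is about the variant where `add8_avx` would only ever use xmm registers through VEX-encoded instructions, so the upper halves are never written explicitly.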



