Penalty for 256-bit loads and stores with cache line splits

jeremyweek wrote:

Hi,

I was wondering what the penalty, in clock cycles, is for 256-bit loads and stores that split a cache line.

Thanks!

-Jeremy

Tim Prince wrote:

On Sandy Bridge, the compilers avoid the risk of a cache line split on an unaligned AVX-256 access by always splitting it explicitly into two AVX-128 instructions, which are expected to be faster in that case. To measure the penalty yourself, you would have to write intrinsics that force the 256-bit access. Your guess about other AVX CPUs is as good as mine.
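
As a starting point for such a test, here is a rough sketch in C. It is not a benchmark and says nothing about actual cycle counts. The helper names (splits_cache_line, load256_direct, load256_as_two_128) are made up for illustration, and the 64-byte line size is an assumption that holds for Sandy Bridge but should be checked for other CPUs. The two loaders show the single 256-bit access and the two-by-128-bit form that the compilers are described as generating.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines, as on Sandy Bridge. */
#define CACHE_LINE_BYTES 64u

/* Returns 1 if an access of `size` bytes starting at address `addr`
   touches two cache lines, 0 otherwise. */
static inline int splits_cache_line(uintptr_t addr, size_t size)
{
    assert(size > 0);
    return (addr / CACHE_LINE_BYTES) != ((addr + size - 1) / CACHE_LINE_BYTES);
}

#ifdef __AVX__  /* only built when AVX is enabled, e.g. with -mavx */
#include <immintrin.h>

/* One 256-bit unaligned load: splits a line if p sits near its end. */
static inline __m256 load256_direct(const float *p)
{
    return _mm256_loadu_ps(p);
}

/* The same 32 bytes fetched as two 128-bit unaligned loads,
   then merged into one 256-bit register. */
static inline __m256 load256_as_two_128(const float *p)
{
    __m128 lo = _mm_loadu_ps(p);
    __m128 hi = _mm_loadu_ps(p + 4);
    return _mm256_insertf128_ps(_mm256_castps128_ps256(lo), hi, 1);
}
#endif
```

For example, at byte offset 48 within a line, splits_cache_line(48, 32) is 1, while splits_cache_line(48, 16) and splits_cache_line(64, 16) are both 0, so at that offset neither 128-bit half splits. To get numbers in clock cycles you would still need to time both loaders in a loop over a line-aligned buffer, at split and non-split offsets, and make sure the compiler does not hoist the loads out of the loop.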
