Does ICC 14 generate BMI instructions?

Does anyone know if ICC 14 can transform (x >> 12) & 0x3 into _bextr_u32(x, 12, 2)?
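
To make the two forms concrete, here is a minimal sketch in C. The function names (extract_shift_mask, bextr_model, extract_bextr) are hypothetical and only for illustration; bextr_model is a portable approximation of the BEXTR behaviour, and the real intrinsic is used only when the compiler defines __BMI__. It says nothing about what ICC 14 actually emits.

```c
#include <stdint.h>
#if defined(__BMI__)
#include <immintrin.h>   /* declares _bextr_u32 when BMI is enabled */
#endif

/* Shift-and-mask form from the question: extracts bits 12..13 of x. */
static inline uint32_t extract_shift_mask(uint32_t x) {
    return (x >> 12) & 0x3u;
}

/* Portable model of BEXTR (an assumption-level sketch, not the intrinsic):
   take `len` bits of x starting at bit `start`, zero-extended. */
static inline uint32_t bextr_model(uint32_t x, unsigned start, unsigned len) {
    start &= 0xffu;                   /* the instruction uses 8-bit fields */
    len &= 0xffu;
    if (start >= 32 || len == 0) {
        return 0;                     /* nothing left to extract */
    }
    uint32_t shifted = x >> start;
    if (len >= 32) {
        return shifted;               /* avoid an undefined 32-bit shift */
    }
    return shifted & ((1u << len) - 1u);
}

/* Intrinsic form when BMI is available, otherwise the portable model. */
static inline uint32_t extract_bextr(uint32_t x) {
#if defined(__BMI__)
    return _bextr_u32(x, 12, 2);
#else
    return bextr_model(x, 12, 2);
#endif
}
```

Both extract_shift_mask and extract_bextr should return the same value for every x, so inspecting the generated assembly for each is one way to see whether the compiler treats them identically.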

I tried compiling it with icc -mcore-avx2, but the compiler did not make the transformation. How profitable would it be to do so? The shift-and-mask form is 2 instructions with 2 cycles of latency, versus 1 instruction with 2 cycles of latency for the BEXTR form.

Also, is there an analogue of _bextr_u32 for inserting contiguous bits into another word (e.g. a | ((b & 0xff) << 8))?

It seems that such an instruction would need 4 operands, which isn't implemented, but what about one that just fills all the upper bits (e.g. a | (b << 8))?
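
For reference, a minimal sketch of the insertion forms discussed above, again in C with hypothetical function names. The PDEP variant is only one candidate, offered as an assumption rather than a confirmed answer: _pdep_u32 is a BMI2 intrinsic that deposits the low bits of its first argument at the positions set in the mask, so a mask of 0x0000ff00 yields (b & 0xff) << 8. It is used only when the compiler defines __BMI2__.

```c
#include <stdint.h>
#if defined(__BMI2__)
#include <immintrin.h>   /* declares _pdep_u32 when BMI2 is enabled */
#endif

/* Expression from the post: OR the low 8 bits of b into bits 8..15 of a. */
static inline uint32_t or_field_shift_mask(uint32_t a, uint32_t b) {
    return a | ((b & 0xffu) << 8);
}

/* Same result via PDEP where BMI2 is available (one candidate, an assumption). */
static inline uint32_t or_field_pdep(uint32_t a, uint32_t b) {
#if defined(__BMI2__)
    return a | _pdep_u32(b, 0x0000ff00u);
#else
    return a | ((b & 0xffu) << 8);
#endif
}

/* The "fill all the upper bits" variant from the post: no mask on b. */
static inline uint32_t or_upper_bits(uint32_t a, uint32_t b) {
    return a | (b << 8);
}

/* General insert that clears the destination field first.
   Requires pos < 32; len >= 32 is treated as a full-width field. */
static inline uint32_t insert_field(uint32_t a, uint32_t b,
                                    unsigned pos, unsigned len) {
    uint32_t field = (len >= 32) ? 0xffffffffu : ((1u << len) - 1u);
    uint32_t mask = field << pos;
    return (a & ~mask) | ((b << pos) & mask);
}
```

Note that the OR-based forms only match a true insert when the target bits of a are already zero; insert_field takes four inputs (a, b, pos, len), which is the operand count the post refers to.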
