According to Gen8.pdf,
'These units can SIMD execute up to four 32-bit floating-point (or integer) operations, or SIMD execute up to eight 16-bit integer or 16-bit floating-point operations.'
This suggests that INT16 can reach twice the peak throughput of INT32.
The table in Gen8.pdf lists the 32-bit integer throughput of HD Graphics 5300 as 192 IOP/cycle. Does that mean the 16-bit integer throughput is 192 × 2 = 384 IOP/cycle?
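To make the arithmetic explicit, here is a small sketch of the calculation I have in mind. It only uses the numbers quoted above (4 lanes for 32-bit, 8 lanes for 16-bit, 192 IOP/cycle from the table). The function name `scaled_peak_iops` is my own, and the linear scaling with lane count is exactly the assumption I am asking about, not something the document states.

```python
# Figures quoted from Gen8.pdf
INT32_LANES = 4               # 32-bit operations per SIMD unit per issue
INT16_LANES = 8               # 16-bit operations per SIMD unit per issue
INT32_IOPS_PER_CYCLE = 192    # HD Graphics 5300, from the table


def scaled_peak_iops(base_iops, base_lanes, target_lanes):
    """Scale a peak rate by the lane-count ratio.

    ASSUMPTION: peak throughput scales linearly with the number of SIMD
    lanes, i.e. 16-bit operations issue at the same rate as 32-bit ones.
    """
    return base_iops * target_lanes // base_lanes


int16_iops_per_cycle = scaled_peak_iops(
    INT32_IOPS_PER_CYCLE, INT32_LANES, INT16_LANES
)
print(int16_iops_per_cycle)  # 384
```

If the hardware does not issue 16-bit integer operations at full rate on every unit, the real figure would be lower than this estimate.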
Is my understanding right?