I hope Intel engineers will read this and improve the integer SIMD instructions in future CPUs.
1. PSRLB/PSLLB/PSRAB/PSLAB (packed byte shifts) -- they do not exist. How about adding them?
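To show what is missing, here is a minimal sketch of a logical right shift on packed bytes and the workaround it takes today. It is plain C rather than real SIMD: eight bytes sit in a uint64_t, the whole value is shifted, and a mask removes the bits that leak in from the neighbouring byte. With SSE2 the same trick is usually PSRLW followed by PAND with a broadcast byte mask. The name psrlb_model and the restriction to counts 0..7 are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of PSRLB on 8 packed bytes held in a uint64_t:
   logical right shift of every byte by the same count n.
   Assumption: only counts 0..7 are modelled. */
static uint64_t psrlb_model(uint64_t v, unsigned n)
{
    assert(n < 8);
    /* Broadcast (0xFF >> n) to every byte; this mask clears the bits
       that the wide shift drags in from the byte above. */
    uint64_t mask = 0x0101010101010101ULL * (uint64_t)(0xFFu >> n);
    return (v >> n) & mask;
}
```

A real PSRLB would do this in one instruction, with no mask constant to load.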
2. All SIMD shifts -- it is not possible to shift each packed byte/word/dword/qword by a different amount.
- Why did you make the count operand of those instructions another SIMD register if we cannot specify more than one shift value?! A GPR would have been enough for a single count!
PSRLW xmm0, eax would be more useful than PSRLW xmm0, xmm2, not to mention that xmm2 could then hold 8 different shift counts, one for each word in the destination.
Because of someone's mistake, we will never have a proper SIMD shift instruction under this name -- the behavior of PSRLW xmm0, xmm2 can never be changed. If we want a more useful shift we will need another instruction, and the current one will stay forever as dead weight in the x86 instruction set.
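The difference between the two behaviors can be sketched in scalar C. The first function models PSRLW as it exists (one count for all eight words, and a count above 15 zeroes every word). The second is a hypothetical per-word form; its name and the choice to return 0 for counts above 15 are assumptions, not an existing definition.

```c
#include <assert.h>
#include <stdint.h>

/* Existing behavior: a single count applies to all eight words.
   Counts greater than 15 produce all zeros. */
static void psrlw_uniform(uint16_t dst[8], uint64_t count)
{
    assert(dst);
    for (int i = 0; i < 8; i++)
        dst[i] = (count > 15) ? 0 : (uint16_t)(dst[i] >> count);
}

/* Hypothetical behavior: word i is shifted by its own count cnt[i].
   Assumption: an out-of-range count zeroes only that word. */
static void psrlw_per_lane(uint16_t dst[8], const uint16_t cnt[8])
{
    assert(dst && cnt);
    for (int i = 0; i < 8; i++)
        dst[i] = (cnt[i] > 15) ? 0 : (uint16_t)(dst[i] >> cnt[i]);
}
```

With the existing instruction, the per-lane version needs one shift per distinct count plus masking and merging of the results.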
3. SIMD bit manipulation (PBSETB/W/D/Q, PBCLRB/W/D/Q, PBTSTB/W/D/Q) -- it would be nice to have instructions which can set, clear, or test a different bit in each packed byte, word, dword, or qword. For example:
xmm0 = 0 0 0 0 (dword)
xmm1 = 4 3 2 1 (dword)
PBSETD xmm0, xmm1
xmm0 = 0x10 0x08 0x04 0x02 (dword)
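A scalar C sketch of the dword variants, to pin down the intended semantics. All three function names are hypothetical. Two things are assumptions not stated above: bit indices must be 0..31 (out-of-range indices are simply rejected here), and PBTSTD returns an all-ones mask for a set bit, in the style of PCMPEQD. Lane 0 is the rightmost element in the notation of the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical PBSETD: set bit idx[i] in dword i. */
static void pbsetd_model(uint32_t dst[4], const uint32_t idx[4])
{
    for (int i = 0; i < 4; i++) {
        assert(idx[i] < 32);            /* assumption: index in 0..31 */
        dst[i] |= UINT32_C(1) << idx[i];
    }
}

/* Hypothetical PBCLRD: clear bit idx[i] in dword i. */
static void pbclrd_model(uint32_t dst[4], const uint32_t idx[4])
{
    for (int i = 0; i < 4; i++) {
        assert(idx[i] < 32);
        dst[i] &= ~(UINT32_C(1) << idx[i]);
    }
}

/* Hypothetical PBTSTD: out[i] is all ones if bit idx[i] of src[i]
   is set, otherwise zero (assumed result format). */
static void pbtstd_model(const uint32_t src[4], const uint32_t idx[4],
                         uint32_t out[4])
{
    for (int i = 0; i < 4; i++) {
        assert(idx[i] < 32);
        out[i] = ((src[i] >> idx[i]) & 1u) ? UINT32_C(0xFFFFFFFF) : 0;
    }
}
```

Today the set case takes a per-lane variable shift of a vector of ones followed by POR, which runs straight into the limitation from point 2.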
To be continued.