I read somewhere that newer processors include special instructions for small lookup tables. Is there a way to optimize the following simple operation?
float data[10] = {0, ...9};
unsigned int idx[10] = {2,3,5,0,...9}; // Arbitrary permutation of 0..9
float result[10];
result = data[idx]; // pseudocode: result[i] = data[idx[i]] for every i
I have to do this operation often, and it takes quite a bit of time in a 'for' loop. Currently I use:
for (int i = 0; i < 10; i++) result[i] = data[idx[i]];
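To make the operation concrete, here is a minimal self-contained sketch of the same gather written as a function. The name gather_f32 is hypothetical (it is not from any library), and the function assumes that data has at least n elements and that result does not overlap data or idx. The restrict qualifiers only state that no-overlap assumption to the compiler; whether that makes the loop faster depends on the compiler and target.

```c
#include <assert.h>
#include <stddef.h>

/* Gather: result[i] = data[idx[i]] for every i in [0, n).
 * gather_f32 is a hypothetical helper name used for illustration only.
 * Assumption: data holds at least n elements, so every index must be < n. */
static void gather_f32(float *restrict result,
                       const float *restrict data,
                       const unsigned int *restrict idx,
                       size_t n)
{
    for (size_t i = 0; i < n; i++) {
        assert(idx[i] < n);        /* bounds check, active in debug builds */
        result[i] = data[idx[i]];  /* indexed load, sequential store */
    }
}
```

With the arrays from the question this would be called as gather_f32(result, data, idx, 10).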


