BT, BTS and BTC instructions are fast again in Core 2. Could your compiler guys implement those bit test intrinsics? I think the BT instruction should be implemented at least.
The benefits of the intrinsics are quite obvious. Suppose we want to test whether bit i is set in an integer bitmap. We usually do this in C/C++:
if (bitmap & (1 << i))
The problems with the above C test are:
1. more instructions are generated, and
2. register cl is needed, which increases register pressure. More register swap/save instructions are often needed as well, because rcx/ecx is often used as a function parameter.
Another plus for _bit_test(integer, index) is that it reduces code size.
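To make the request concrete, here is a rough sketch of what _bit_test(integer, index) could mean semantically, written in portable C. The name bit_test_u32 is made up for illustration and is not an existing intrinsic. The index is masked to 0..31, which mirrors what BT does with a 32-bit register operand. A compiler is free to turn this into BT, but nothing guarantees it.

```c
#include <stdint.h>

/* Hypothetical helper (not a real intrinsic): returns 1 if bit i of bitmap
   is set, 0 otherwise. The index is taken modulo 32, as BT does for a
   32-bit register operand. */
static inline int bit_test_u32(uint32_t bitmap, unsigned i)
{
    return (int)((bitmap >> (i & 31u)) & 1u);
}
```

Usage would then be if (bit_test_u32(bitmap, i)) instead of if (bitmap & (1 << i)). Shifting the unsigned bitmap also sidesteps 1 << 31, which is undefined in C when int is 32 bits.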
One additional suggestion for the compiler's optimizer:
Sometimes (not always), bitmap & (1 << i) should be compiled as a BT instruction.
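Until the compiler does this on its own, BT can be forced by hand. The sketch below assumes GCC-style extended inline assembly on x86 with AT&T operand order, which is my assumption and not something the compiler in question is confirmed to handle identically. The function name bt_asm_u32 is hypothetical, and other targets fall back to plain C.

```c
#include <stdint.h>

/* Hypothetical helper: tests bit i of bitmap with a BT instruction where
   GCC-style inline assembly on x86 is available (an assumption), and with
   plain C everywhere else. */
static inline int bt_asm_u32(uint32_t bitmap, unsigned i)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    unsigned char carry;
    /* AT&T syntax: btl index, base. BT copies the selected bit into CF,
       and SETC then stores CF as 0 or 1. */
    __asm__("btl %2, %1\n\t"
            "setc %0"
            : "=q"(carry)
            : "r"(bitmap), "r"(i)
            : "cc");
    return carry;
#else
    /* Portable fallback with the same modulo-32 index behaviour. */
    return (int)((bitmap >> (i & 31u)) & 1u);
#endif
}
```

The drawback is that inline assembly is opaque to the optimizer: it cannot fold constants through it or fuse the test with a following branch. That is one more reason a real intrinsic, or automatic pattern matching, would be preferable.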