I have been implementing some algorithms from "The Art of Multiprocessor Programming" in TBB. I have found in this book, however, that it relies sometimes upon the Java volatile keyword. The best way to mimick this functionality in TBB, it seems, is to have a spin-lock wrapped around the variable that would have been volatile in Java.
I have found that the limited size of atomic variables is a problem with even simple algorithms. For instance, if I want to atomically alter two variables I need to pack them into a single atomic. What I have done so far, is use spin-locks around the variables that I would normally want to alter atomically. How well would this scale as the number of threads/cores increases?
Is software transactional memory worth looking at, performance wise? Can it really offer any better performance than spin-locks?
Do I have other alternatives to atomic/spin-locks in writing fast parallel algorithms?