I've made studying of Intel TSX performance - its abort cases and comparison with spin lock. The study with reference to source code is available at http://natsys-lab.blogspot.ru/2013/11/studying-intel-tsx-performance.html .
I see some performance gain for TSX in comparison with spin lock. However I stll have few of questions:
1. I see huge jump of transactional aborts when transaction work set reaches 256 cache lines (Figure 1). This is 16KB which is just only a quarter of L1d cache. The workload is single threaded, running strictly on one CPU core. Also I didn't use HyperThreading. I know that the cache has 8-way associativity, however I used static memory allocations (which are continous in virtual memory and likely continous physically) and it's unlikely that I have too many collisions in cache lines to get only 16KB of available cache memory for the transaction. So how the spike can be explained? Also Figure 3 (dependency of execution time on transaction size) shows strange fluctuation around 256 cache lines. Why execution time firstly quickly raises, then jumps down and after that smoothly increases again? There is no other such fluctuations on the graph.
2. Test with overlapping data has shown very confusing results: TSX performance doesn't show peformance increasing for transactions with small data contention in comparison with transactions which modify the same data set on different CPUs. Probably I'm wrong with my fallback code and TSX transaction always finds the lock acquired. But I did special steps (retry transaction on _XABORT_RETRY, _XABORT_CONFLICT and _XABORT_EXPLICIT with my abort code events) to retry transaction more times instead of fallback to spin lock. In this scenario I'd expect smaller number of aborts for transactions with no data ovelapping in comparison with transactions with tight contention, but abort rates are the same. Why?
3. And finally, I was experimenting with fallback condition a lot and results in the post are the best. How can I reduce number of transaction aborts (especially I'm interested in conflict aborts for non-overlapping or with very small data overlapping transactions) and increase overall transactional concurrency?
Many thank in advance,