More background please


I set up the Intel STM compiler on Windows XP inside the VS2005 IDE, and got the first demo to build by enabling 'OpenMP support' and disabling 'C++ Exceptions' (/EHsc). I also added /Qtm_enabled to the additional options box in the command line section.

The second example didn't build with those settings. The compiler complained about a missing semicolon at line 290, so I gave up on it. To be honest, those macros at the start of example #2 look absolutely mental.

Anyway, what I'd like to say is, "Please tell us what's going on here". I know it's an early release, and the animated cars are inspiring, but tell me more about STM. (I've been reading about it for months.) Do the atomic (__tm_atomic) sections differ semantically from, say, an OpenMP critical section? I'd love to see a second animated graphic to shed some light on a few of the high-level implementation details (and, of course, more written material).

More practically, in the first example I replaced both x->TxnAddOne() and y->TxnAddOne() with xx = xx + 1; The thing is, although it still returns 'passed', I'm not convinced I didn't just get lucky. So my question is: must I use both the __tm_atomic section and a __declspec(tm_callable) method, or is the __tm_atomic section enough on its own?



On the practical question, I think you will learn more about this release if you read the manual more carefully. A __tm_atomic section must be used to mark a transaction, and only tm_callable functions can be called inside a __tm_atomic section. I believe this is why it currently cannot support library functions, none of which are marked tm_callable. Programs should conform to this rule because we don't want a function that may sleep to appear inside an atomic section. I think this is somewhat like the rule that you cannot call a function that may sleep inside an interrupt handler in the Linux kernel. However, for compatibility, I'm sure Intel will solve this problem in some way.

For the concept of Transactional Memory, maybe better references come from the papers related to it. The general idea, as I understand it, is that a thread may block when there is contention for a lock or a critical section; in contrast, when we are using TM, a thread continues regardless of other threads, and rolls back all of the transaction's changes if it detects a conflict. This approach will almost certainly result in performance penalties under certain conditions (maybe when there is heavy contention), but it will bring a clearer programming interface and a much lighter mental burden for parallel programmers. For example, it saves programmers from the chaos of managing locks, which easily cause deadlocks and may still leave data races if used unwisely.
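To make the blocking versus "carry on and redo" contrast concrete, here is a minimal sketch in standard C++. It is only an analogy, not how the Intel STM compiler works internally: the optimistic version uses a compare-and-swap retry loop on a single word, whereas a real transaction can cover many memory locations. The function names run_with_lock and run_optimistic are made up for this illustration.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Pessimistic version: a thread blocks until the lock is free.
long run_with_lock(int threads, int iters) {
    assert(threads > 0 && iters >= 0);
    std::mutex m;
    long counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&m, &counter, iters] {
            for (int i = 0; i < iters; ++i) {
                std::lock_guard<std::mutex> guard(m);  // may wait here
                counter = counter + 1;
            }
        });
    }
    for (std::thread& th : pool) th.join();
    return counter;
}

// Optimistic version (analogy only): read, compute on a private copy,
// and commit only if nobody else changed the value in the meantime.
// On failure the private result is thrown away and the step is redone.
long run_optimistic(int threads, int iters, long* retries_out) {
    assert(threads > 0 && iters >= 0);
    std::atomic<long> counter(0);
    std::atomic<long> retries(0);
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&counter, &retries, iters] {
            for (int i = 0; i < iters; ++i) {
                long seen = counter.load();
                // On failure, compare_exchange_weak reloads `seen`, so
                // `seen + 1` is recomputed from the fresh value. Spurious
                // failures are possible and are counted as retries too.
                while (!counter.compare_exchange_weak(seen, seen + 1)) {
                    retries.fetch_add(1);
                }
            }
        });
    }
    for (std::thread& th : pool) th.join();
    if (retries_out) *retries_out = retries.load();
    return counter.load();
}
```

Both functions return threads * iters, so run_with_lock(4, 10000) and run_optimistic(4, 10000, &retries) each give 40000. The retry count varies from run to run and tends to grow with contention, which is the cost mentioned above.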

This is just my own understanding of transactional memory, and I hope it doesn't mislead you. However, I think the Flash animation could be adjusted to show the advantages of TM more clearly, because the comparison between locks and transactions in it is still a little vague.

You are getting correct behavior. If there is no function call inside the __tm_atomic region, you don't need __declspec(tm_callable). __declspec(tm_callable) should be used for functions called inside __tm_atomic. In this case, if you have

__tm_atomic {
    xx = xx + 1;
}

You will get a correct value for xx.
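For comparison, the two cases discussed in this thread can be put side by side: a plain statement inside __tm_atomic, and a call to a __declspec(tm_callable) method inside __tm_atomic. This is a rough sketch under several assumptions. HAVE_INTEL_STM is a hypothetical macro you would define yourself when building with the STM compiler and /Qtm_enabled, the Counter class is invented for illustration (the demo's real class may look different), and the mutex fallback is only a stand-in so the code compiles on other compilers. It does not have transactional semantics.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// HAVE_INTEL_STM is a hypothetical macro, not defined by any compiler:
// define it yourself when building with the Intel STM compiler.
#ifdef HAVE_INTEL_STM
  #define TM_CALLABLE __declspec(tm_callable)
#else
  // On other compilers the annotation expands to nothing.
  #define TM_CALLABLE
  static std::mutex g_fallback;  // stand-in lock, not a transaction
#endif

// Hypothetical class; the real one behind x->TxnAddOne() may differ.
struct Counter {
    int value;
    Counter() : value(0) {}
    // Called inside __tm_atomic, so it is marked tm_callable.
    TM_CALLABLE void TxnAddOne() { value = value + 1; }
};

int xx = 0;

// Case 1: no function call in the region, so no tm_callable is needed.
void add_one_direct() {
#ifdef HAVE_INTEL_STM
    __tm_atomic {
        xx = xx + 1;
    }
#else
    std::lock_guard<std::mutex> guard(g_fallback);
    xx = xx + 1;
#endif
}

// Case 2: the region calls a method, which must be tm_callable.
void add_one_via_call(Counter* x) {
#ifdef HAVE_INTEL_STM
    __tm_atomic {
        x->TxnAddOne();
    }
#else
    std::lock_guard<std::mutex> guard(g_fallback);
    x->TxnAddOne();
#endif
}

// Runs both cases from several threads and returns the final xx.
int run_both(int threads, int iters, Counter* x) {
    assert(threads > 0 && iters >= 0);
    xx = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([x, iters] {
            for (int i = 0; i < iters; ++i) {
                add_one_direct();
                add_one_via_call(x);
            }
        });
    }
    for (std::thread& th : pool) th.join();
    return xx;
}
```

With the fallback path, run_both(4, 5000, &c) on a fresh Counter c leaves both xx and c.value at 20000. The STM path has not been verified here, so treat the placement of the annotations as an assumption to check against the manual.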

Xinmin Tian (Intel)
