Serialized transactions when using threadprivate directive

Consider the following example:

def.h
#include <omp.h>
#include

# define TRANSACTION __tm_atomic {
# define TRANSACTION_END }
# define TM_CALLABLE __attribute__((tm_callable))
# define TM_PURE __attribute__((tm_pure))
# define TM_UNKNOWN __attribute__((tm_unknown))

TM_CALLABLE
void SetValue(int cnt);

work.c
#include "def.h"

int myvar;
#pragma omp threadprivate(myvar)

TM_CALLABLE
void SetValue(int cnt)
{
myvar = cnt;
}

example.c
#include "def.h"

extern int myvar;
int testvar;
#pragma omp threadprivate(myvar, testvar)

void SetTest(int val)
{
testvar = val;
}

int main()
{
int cnt=0;
int i;

omp_set_num_threads(4);

#pragma omp parallel for
for (i=1; i<4; i++)
{
TRANSACTION
SetValue(cnt);
TRANSACTION_END

TRANSACTION
SetTest(cnt);
TRANSACTION_END
}

return 0;
}

During the compilation I get these messages:

icc -Qtm_enabled -openmp -o example.o -c example.c
example.c(25): (col. 3) remark: OpenMP DEFINED LOOP WAS PARALLELIZED.
icc -Qtm_enabled -openmp -o work.o -c work.c
tx_warning: work.c(9): a non tm_callable/tm_pure function '__kmpc_global_thread_num' called inside __tm_atomic section
tx_warning: work.c(9): a non tm_callable/tm_pure function '__kmpc_threadprivate_cached' called inside __tm_atomic section
icc -Qtm_enabled -openmp -o example example.o work.o

After the program has finished, these are the statistics gathered by the runtime:

TRANSACTION TOTALS

Source is line 32 in function main in example.c

: Min Mean Max Total
Transactions : 3
BytesWritten : 4 4.00 4 12

Source is line 28 in function main in example.c

: Min Mean Max Total
Transactions : 3
SerialTransitions : 1 1.00 1 3
BytesRead : 8 10.67 12 32
BytesWritten : 4 4.00 4 12

GRAND TOTAL (all transactions, all threads)
: Min Mean Max Total
Transactions : 6
SerialTransitions : 0 0.50 1 3
BytesRead : 0 5.33 12 32
BytesWritten : 4 4.00 4 24

It is clear that the transactions that call the function from work.c are serialized.
I just wanted to point this out for future versions. Also, if there is any way to get a fixed version soon, that would be great.


Thank you for your report.

This is a known issue that has already been fixed but, unfortunately, only after the release of the current What-If 2.0 compiler. The problem will not appear in the next version.

For now, you can avoid this problem by using the following code as a workaround:

/* General macro definitions */
#define _V(var) (*_##var##P())

#define V_DEF(type, var) \
type var; \
TM_PURE type* _##var##P(void) { return &var; }

/* Thread-private variable declarations */
V_DEF(int, cnt);
#pragma omp threadprivate(cnt)
#define cnt _V(cnt)

After this you can use 'cnt' as an ordinary variable and everything should work. I don't really like this solution much, but at least it solves the problem. Moreover, the accessor calls will probably be inlined, and during inlining special care is taken to preserve the TM_PURE attribute on the inlined code, so the calls associated with thread-private access are not treated as running inside the transaction and thus the transaction will not get serialized.

Thanks for the reply. There is just one problem with the macro definitions you provided. As written, the compiler will issue an error because the variable is used before it is declared threadprivate.

So this is the correct way:

int cnt;

#pragma omp threadprivate(cnt)

TM_PURE type* _##cnt##P(void) { return &cnt }

#define cnt _V(cnt)

Yes, you're right - I missed that point about uses of cnt. Your code is not absolutely correct either, but you got the idea. The correct way is:

/*General macros*/
#define _VP(var) _##var##P()
#define _V(var) (*_VP(var))
#define V_DEF(type, var) \
TM_PURE type* _VP(var) { return &var; }

/* Variable declaration*/
int cnt;
#pragma omp threadprivate(cnt)
V_DEF(int, cnt);
#define cnt _V(cnt)

or without macros:

int cnt;
#pragma omp threadprivate(cnt)
TM_PURE int* P_cnt_(void) {return &cnt;}
#define cnt (*P_cnt_())

As I said, I don't like this much: 4 lines instead of 2 for each threadprivate variable. But it is just a workaround, and happily it will not be needed in the next TM compiler version.

Regards,
Serge
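For readers who want to try the accessor pattern from the post above outside the STM prototype compiler, here is a minimal sketch. It assumes TM_PURE can be defined as an empty macro when the prototype compiler is not in use, and the names cnt_storage, cnt_addr and write_then_read are hypothetical, chosen only for this illustration. Without OpenMP enabled the threadprivate pragma is ignored and the variable is an ordinary global, so this only demonstrates that the macro redirects every use of 'cnt' through the accessor.

```c
#include <assert.h>

/* Assumption: outside the STM prototype compiler TM_PURE expands to nothing. */
#ifndef TM_PURE
#define TM_PURE
#endif

/* The real storage. It must be declared threadprivate before its first use. */
int cnt_storage;
#pragma omp threadprivate(cnt_storage)

/* Accessor that only returns the address of the thread's copy. */
TM_PURE int *cnt_addr(void) { return &cnt_storage; }

/* From here on, 'cnt' reads and writes go through the accessor. */
#define cnt (*cnt_addr())

/* Hypothetical helper: writes through the macro and reads the value back. */
int write_then_read(int value)
{
    cnt = value;                    /* expands to (*cnt_addr()) = value */
    assert(&cnt == &cnt_storage);   /* the macro resolves to the same storage */
    return cnt;
}
```

Giving the storage a different name from the macro is a deliberate choice in this sketch: it avoids relying on the preprocessor rule that a macro is not re-expanded inside its own expansion, at the cost of one more identifier per variable.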

This code compiles nicely and runs without serializing the transactions, but a bigger problem than writing 4 lines of code instead of 2 is that changes to threadprivate variables are not tracked, which can lead to incorrect execution if a transaction is retried.

As you said we are waiting for the next release, and I hope it's going to be soon :)

Thanks!

Did you verify this?

It is strange: the only action under TM_PURE is retrieving the address of the threadprivate variable, which does not change inside the transaction even upon retry. However, the write to this variable is done in the usual transactional manner and is thus monitored by the TM library and should be rolled back by the library upon retry. This pseudo code should help you understand my point:

Assume cnt is the macro defined above:

/* Original code */
__tm_atomic {
cnt = x;
}

/*Preprocessed code*/
__tm_atomic {
*(_cntP()) = x; /*_cntP() is TM_PURE*/
}

/* Equivalent code with temporary address: close to internal representation in compiler */
__tm_atomic {
int* temp = _cntP();
*temp = x;
}

/* And finally pseudo-code after TM compiler */
TxBegin();
int* temp = _cntP(); /*Nothing is processed: temp is local, _cntP() is TM_PURE */
TxWriteIntValue(temp /*address*/, x /*value*/); /* temp is local, but its dereference is not, so it is instrumented */
TxCommit();

I will check this tomorrow, but at first glance the code should work OK; if it does not, this might indicate some other flaw in the compiler.

Thank you,
Serge.
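The argument in the post above can be illustrated with a toy model. This is only a sketch under stated assumptions: it uses a simple undo log, which may not be how the actual TM library tracks writes, and every name in it (toy_tx_begin, toy_tx_write_int, toy_tx_abort, toy_tx_commit, cnt_storage, cnt_addr) is hypothetical and not part of any real runtime API. It shows that a write made through an address obtained from an uninstrumented accessor can still be undone, because it is the write itself that is logged, not the address lookup.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_LOG_CAPACITY 16

/* One undo-log record: where we wrote and what was there before. */
typedef struct { int *addr; int old_value; } ToyUndoEntry;

static ToyUndoEntry toy_log[TOY_LOG_CAPACITY];
static size_t toy_log_len = 0;

static int cnt_storage = 0;

/* Stands in for the TM_PURE accessor: it is never logged or instrumented. */
static int *cnt_addr(void) { return &cnt_storage; }

static void toy_tx_begin(void) { toy_log_len = 0; }

/* Instrumented write: save the old value, then store the new one. */
static void toy_tx_write_int(int *addr, int value)
{
    assert(toy_log_len < TOY_LOG_CAPACITY);
    toy_log[toy_log_len].addr = addr;
    toy_log[toy_log_len].old_value = *addr;
    toy_log_len++;
    *addr = value;
}

/* Abort: restore old values in reverse order. */
static void toy_tx_abort(void)
{
    while (toy_log_len > 0) {
        toy_log_len--;
        *toy_log[toy_log_len].addr = toy_log[toy_log_len].old_value;
    }
}

static void toy_tx_commit(void) { toy_log_len = 0; }

/* Writes x inside a toy transaction, aborts, and returns what is left. */
int toy_demo_abort(int x)
{
    cnt_storage = 1;
    toy_tx_begin();
    int *temp = cnt_addr();       /* plain call, nothing logged */
    toy_tx_write_int(temp, x);    /* the store itself is logged */
    toy_tx_abort();
    return cnt_storage;
}

/* Same sequence, but the toy transaction commits. */
int toy_demo_commit(int x)
{
    cnt_storage = 1;
    toy_tx_begin();
    int *temp = cnt_addr();
    toy_tx_write_int(temp, x);
    toy_tx_commit();
    return cnt_storage;
}
```

In this model the aborted run leaves the variable at its pre-transaction value and the committed run keeps the new value, which matches the reasoning that only the store needs to be tracked. Whether the prototype compiler actually instruments the store in this way is exactly what the post proposes to verify.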
