RTM: Finding memory address at which transaction was aborted

If a page is marked Copy-on-Write, and I try to write to it inside of a transaction, the transaction aborts.  If I know the address at which it aborted, there is a trivial fix:

int v = addr[0];
__sync_bool_compare_and_swap(addr, v, v); // Force CoW.

And then retry the transaction.  Again, this only works if I know the cacheline on which the transaction was aborted.  Is there a way to find this?  Is it in a performance counter or something?

Three options that I can think of:

a) Prior to issuing each instruction that may cause a transaction abort, write the address that may abort into a memory location that won't abort. This is somewhat the same philosophy as a try/catch. Should the transaction abort, perform __sync_fetch_and_add(loc,0); // RMW

b) Prior to entering the transaction region (and inside your retry code), perform something like

     __sync_fetch_and_add(locA,0); // RMW
     __sync_fetch_and_add(locB,0);
     ...
     ... start transaction (if abort, go back/redo the __sync_fetch_and_add's, then retry transaction)

c) Start the transaction, then if it aborts, perform the __sync_fetch_and_add's of b) and retry the transaction.

You will have to determine if choice a), b) or c) is more efficient.

Jim Dempsey
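
As a rough illustration of the pre-touch step in options b) and c), here is a minimal sketch in C. The helper name pretouch is hypothetical, and the sketch uses the compare-and-swap form from the original question instead of an add of zero, on the assumption that the compiler emits a real locked read-modify-write for it (a compiler might be free to lower an atomic add of zero to a fence plus a plain load, which would not trigger the copy). It only helps when the locations are known before the transaction starts.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: force a write to each listed location without
   changing its value, so that a Copy-on-Write page gets its private copy
   before the transaction starts. */
static void pretouch(int *const *locs, size_t n)
{
    assert(locs != NULL || n == 0);
    for (size_t i = 0; i < n; ++i) {
        int v = *locs[i];
        /* Swap v for v: the stored value is unchanged. Assumption: this
           compiles to a locked RMW, which is what takes the CoW fault. */
        (void)__sync_bool_compare_and_swap(locs[i], v, v);
    }
}
```

The intended use would be to call pretouch(locs, n) before starting the transaction for b), or only from the abort path before retrying for c).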

Hi,

Before writing to a page you can print its address using tsx_printf (its output escapes the transaction, so it also works for transactions that are aborted). You need a processor with the Skylake architecture for that.

Roman

Hi Jim and Roman,

(a) may be a workable possibility, but it won't catch all cases.  Same with tsx_printf (which is very clever, btw; I just read your blog post about it).

Let me provide some context:  I'm developing a compiler that allows the code:

xbegin;
// Do some transactional stuff.
xcommit;

If the transaction aborts, it will simply try again until it succeeds.  However, the "do some transactional stuff" may call out to C or C++, which contains code that the compiler didn't generate and can't hook.  This is expected to be a common scenario in the language.  So it could be walking a graph or some such, and the memory is non-trivial to find outside of the transaction.

Best Reply

There is no perf counter or register that holds the memory access address that caused the abort. I think the best you can do is to retry the transaction body under a global lock instead of using TSX. In the TSX abort handler you can check the abort status (in the EAX register) to see whether the abort was persistent (RETRY bit = 0).

Roman
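
The retry-then-fall-back policy described above could look roughly like this. It is only a sketch under stated assumptions: run_with_fallback, the txn_attempt_fn callback, the TXN_COMMITTED sentinel and the demo stubs are all hypothetical names. The callback stands in for a real _xbegin()/_xend() attempt so that the decision logic can be shown without TSX hardware. ABORT_RETRY_BIT mirrors _XABORT_RETRY from <immintrin.h>. A real implementation would also need the transaction to read the fallback lock so that a lock holder aborts concurrent transactions; that part is omitted here.

```c
#include <assert.h>

/* Bit 1 of the RTM abort status; mirrors _XABORT_RETRY in <immintrin.h>. */
#define ABORT_RETRY_BIT (1u << 1)
/* Hypothetical sentinel returned by an attempt that committed. */
#define TXN_COMMITTED (~0u)

/* Hypothetical callback types: an attempt returns TXN_COMMITTED or an
   abort status; the body is the same work run non-transactionally. */
typedef unsigned (*txn_attempt_fn)(void *arg);
typedef void (*txn_body_fn)(void *arg);

/* Simple global spinlock built on GCC __sync builtins. */
static volatile int fallback_lock = 0;

static void lock_acquire(void)
{
    while (__sync_lock_test_and_set(&fallback_lock, 1)) {
        /* spin until the previous holder releases the lock */
    }
}

static void lock_release(void)
{
    __sync_lock_release(&fallback_lock);
}

/* Returns 1 if an attempt committed, 0 if the body ran under the lock. */
static int run_with_fallback(txn_attempt_fn attempt, txn_body_fn body,
                             void *arg, int max_retries)
{
    assert(attempt != 0 && body != 0);
    for (int i = 0; i < max_retries; ++i) {
        unsigned status = attempt(arg);
        if (status == TXN_COMMITTED)
            return 1;
        if (!(status & ABORT_RETRY_BIT))
            break;  /* RETRY bit clear: treat the abort as persistent */
    }
    lock_acquire();
    body(arg);
    lock_release();
    return 0;
}

/* Demo stubs (hypothetical): stand-ins for real transactional attempts. */
static unsigned attempt_persistent_abort(void *arg) { (void)arg; return 0u; }
static unsigned attempt_commit(void *arg) { ++*(int *)arg; return TXN_COMMITTED; }
static void body_increment(void *arg) { ++*(int *)arg; }
```

With the stubs above, an attempt that always reports a status with the RETRY bit clear goes straight to the locked path after one try, which is the case a write to a Copy-on-Write page would be expected to hit.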

 

Roman,

tsx_printf is an interesting hack. I would have done something different.

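#include <stdio.h>
#include <stdarg.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define TSX_PRINTF_BUF_PADD 1024
#define TSX_PRINTF_BUF_LEN (1024*1024)
char tsx_printf_str[TSX_PRINTF_BUF_LEN+TSX_PRINTF_BUF_PADD];
int  tsx_printf_fill = 0;
bool tsx_printf_wrapped = false;

// Append a formatted message (at most TSX_PRINTF_BUF_PADD-1 chars) to the ring buffer.
int tsx_printf(const char* format, ...)
{
    va_list list;
    va_start(list, format);
    int ret = vsnprintf(tsx_printf_str+tsx_printf_fill, TSX_PRINTF_BUF_PADD, format, list);
    va_end(list);
    if(ret < 0)
        return ret;                       // formatting error: nothing appended
    if(ret >= TSX_PRINTF_BUF_PADD)
        ret = TSX_PRINTF_BUF_PADD - 1;    // output was truncated
    tsx_printf_fill += ret;
    if(tsx_printf_fill >= TSX_PRINTF_BUF_LEN)
    {
        // Wrap: move the bytes that spilled into the padding to the start of the buffer.
        int spill = tsx_printf_fill - TSX_PRINTF_BUF_LEN;
        tsx_printf_wrapped = true;
        memcpy(tsx_printf_str, tsx_printf_str + TSX_PRINTF_BUF_LEN, spill);
        tsx_printf_fill = spill;
    }
    return ret;
}

void tsx_printf_dump(void)
{
    if(tsx_printf_wrapped)
    {
        // The oldest surviving data sits between the fill point and the end of the buffer.
        // Skip the byte at the fill point: vsnprintf's terminating NUL may have overwritten it.
        fwrite(tsx_printf_str + tsx_printf_fill + 1, 1, TSX_PRINTF_BUF_LEN - tsx_printf_fill - 1, stdout);
    }
    // The newest data runs from the start of the buffer up to the fill point.
    fwrite(tsx_printf_str, 1, tsx_printf_fill, stdout);
    tsx_printf_fill = 0;
    tsx_printf_wrapped = false;
}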

Jim Dempsey

Okay, yeah.  I was afraid of that.  Thanks Roman.

Jim,

Changes to your tsx_printf memory buffer will be lost in case of an abort (e.g. if it happens after tsx_printf). Intel Processor Trace records instruction control flow even in aborted transactions. My tsx_printf (mis)uses it, which allows the output data to survive aborts.

Best regards,

Roman

Quote:

jimdempseyatthecove wrote:

a) prior to issuing each instruction that may cause a transaction abort, write the address that may abort into a memory location that won't abort. Somewhat of the same philosophy of a try/catch. Should the transaction abort __sync_fetch_and_add(loc,0); // RMW

Hi Jim,

I've been thinking more about this.  Is there a way to specify memory that I don't want added to the transaction, or is there pre-defined memory that I can use?  Do you have a link with information on how this would work?

Roman, oops. You are right.

William, either all memory access is transactional or not. The instruction trace hack is not backed out. You could potentially use the instruction trace information in your transaction abort handler. This should be documented in a systems programmer manual. Perhaps Roman could provide a link. Converting the address into source code line number would be up to you to figure out.

Jim Dempsey

Hey Jim,

This was what I thought.  I must have misunderstood your first comment.

Thanks!

Setting up processor trace recording and reading the results directly from hardware is only possible in the kernel (ring 0) and not from user space. I am not sure if it is practical for this use case (compiler).

Roman
