Multithreaded Fortran application - data race --> Threads seem to share local variables.

Hello,

We are currently facing a possible data race in a multithreaded (Pthreads) Fortran application built with ifort 12.1, and we cannot track it down to its origin. As the issue does not show up with other compilers (e.g. gfortran), we suspect a compiler issue.

Attached to this post is a small example program that demonstrates the problem.

The interesting part of the code resides in main.f90:

  function walk_worker_thread(arg) bind(c)
    [...]
    type(c_ptr), value :: arg
    integer :: count, my_core_id

    type(t_threaddata), pointer, volatile :: my_threaddata

    integer*8 :: addr1, addr2

    call c_f_pointer(arg, my_threaddata)

    addr1 = transfer(c_loc(my_threaddata), addr1)
    write(*,'(a,1z16)') "initial address", addr1

    my_core_id       = THREAD_UNIQUE_NUMBER
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
    my_threaddata%id = my_core_id

    count = 0
    do while(my_threaddata%id .eq. my_core_id)
       count = count + 1
    end do
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

    addr2 = transfer(c_loc(my_threaddata), addr2)

    write(*,'(a,i8,2(a,i4))') "WRONG STATUS after ", count, &
         " iterations; my core-id:", my_core_id, &
         ", but found ", my_threaddata%id
    write(*,'(a,2z16)') "modified addresses", addr1, addr2
    [...]
end function

As the name suggests, this function is invoked as a thread several times with different argument pointers, as the output after "initial address" shows (here for two threads; detailed output files are included in the attachment):

initial address 10AD060
initial address 10AD064
WRONG STATUS after 26545 iterations; my core-id: 145, but found 146
modified addresses 10AD060 10AD064

Since both threads are called with different arguments, their my_threaddata variables should point to different storage locations. Hence, the code between the comment lines should be an infinite loop, and the "WRONG STATUS" output should never appear.
It appears anyway, and the output even shows what presumably went wrong: the pointer my_threaddata itself is modified by the concurrent thread. Since it is a local variable of the function walk_worker_thread(), this should be impossible in our understanding.

Do you see an obvious mistake, or could it be that c_f_pointer() is not thread-safe?

As we did not trust the transfer() construct for determining the addresses, we also used a C function for this (i.e. we called c_printptr(c_loc(variable)) and printf-ed its argument). The results were the same, so we are pretty sure that the construct works and the printed addresses are correct.

Thank you in advance,

Mathias

Some additional information:
Compiler Version:
ifort (IFORT) 12.1.0 20110811

System:
Linux curie50 2.6.32-71.24.1.el6.Bull.23.x86_64

Output using gfortran/gcc:

initial address B01B34
initial address B01B30
[infinite loop as expected]

Content/usage of the attached code:

Source code:

  • pthreads.f90, pthreads.c - C-wrapper for pthreads functions and Fortran interface declarations
  • main.f90 - main Program, including thread-function

Makefile usage:

COMP=intel make # compiles using ifort/icc, runs the program and outputs detailed information
COMP=gnu   make # ditto for gfortran/gcc

Example output:

  • output.intel - output of our test runs using the Intel compiler
  • output.gnu - ditto for the GNU compiler (execution had to be terminated manually because of the infinite loop)

 

Attachment: intel-F90C-pthreads.tgz (2.78 KB)

Interesting - you're probably right to suspect c_f_pointer. -threads SHOULD select the thread-safe libraries. If you instead modify the makefile to
FC = ifort -openmp

you'll get the expected behavior. -openmp does indeed select the thread-safe Fortran run-time library (FRTL). I am not sure why -threads didn't do this; that could be a bug. I'll continue to investigate, but for now use -openmp instead of -threads.

ron

Best Reply

Found it; it actually has nothing to do with library choice.

You need

FC = ifort -threads -auto

-auto causes all local, non-SAVEd variables to be allocated on the run-time stack.

The -recursive option will also work, as it sets AUTOMATIC as well.
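
If changing the compiler flags is not an option, the same effect can probably be had in the source by marking the thread function RECURSIVE, since, as far as I know, ifort treats the non-SAVEd locals of a recursive procedure as automatic. A minimal sketch, not taken from the attachment: the module name worker_mod, the single-component t_threaddata and the c_ptr return value are assumptions made for illustration.

```fortran
module worker_mod
  use, intrinsic :: iso_c_binding
  implicit none

  ! Hypothetical stand-in for the t_threaddata type of the attached example.
  type, bind(c) :: t_threaddata
     integer(c_int) :: id
  end type t_threaddata

contains

  ! RECURSIVE asks for a separate instance of every non-SAVEd local per
  ! invocation, so the pointer below should no longer live in static storage.
  recursive function walk_worker_thread(arg) result(res) bind(c)
    type(c_ptr), value :: arg
    type(c_ptr) :: res
    type(t_threaddata), pointer :: my_threaddata

    ! Associate the local Fortran pointer with the per-thread C argument.
    call c_f_pointer(arg, my_threaddata)
    my_threaddata%id = my_threaddata%id + 1

    res = c_null_ptr
  end function walk_worker_thread

end module worker_mod
```

This keeps the fix next to the code that needs it, so it does not depend on every build passing -auto.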

ron

Ron,

Thank you very much for explaining the problem. In fact, -threads -auto completely solved the issue for our example as well as for the complete code, which is significantly larger and showed the problem in several functions.

We had already spent two days trying to find the error and were very unhappy at only being able to find the symptoms. Maybe we should have asked you earlier :-)

Just to further understand the details: I read the ifort manpage again and think I understand what the -auto flag does. However, I could not find a hint about what happens if the flag is not given. The manpage says

-auto         Causes all local, non-SAVEd variables to be allocated on the
              run-time stack (same as -automatic or -nosave). The default is
              -auto-scalar. However, if you specify -recursive or -openmp,
              the default is -automatic.

-auto-scalar  Causes allocation of scalar variables of intrinsic types
              INTEGER, REAL, COMPLEX, and LOGICAL to the run-time stack. This
              is the default. However, if you specify -recursive or -openmp,
              the default is -automatic.

              You cannot specify -save, -auto, or -automatic with this
              option.

Where are a function's local variables of non-intrinsic type stored in the case of -auto-scalar? Are they put into some global storage location, like SAVE variables?

Again, thank you for your immediate and very constructive help,

Mathias

-auto-scalar (the default) puts non-SAVEd scalar variables on the stack, or maybe only in registers, so each thread has its own instance of those variables. To the extent that the doc implies that variables under the control of -auto-scalar would be treated differently by -auto, I don't think that's correct.

I would interpret the documentation as saying that "variables of intrinsic types INTEGER, REAL, COMPLEX, and LOGICAL" are handled in the same way with both options.

What is important but unclear in my interpretation is what happens (for -auto-scalar) to local variables that are not of these types, e.g. "type(t_threaddata), pointer :: myvar" or "type(t_threaddata) :: myvar", and to small (non-allocatable) arrays, e.g. "integer, dimension(3) :: myvar".

Is there a more or less general answer?

-auto will put local variables into stack storage. And since each thread has its own stack area, this makes them private to the thread. I like to think of it as the opposite of SAVE and -save. I think the docs are just really old and out of date with respect to F03 data structures and types.
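
To illustrate the SAVE side of that comparison, here is a small single-threaded sketch (the program and function names are made up for illustration): a SAVEd local exists exactly once for the whole program, so every call sees the value left by the previous one. A local that ends up in static storage behaves the same way, which is presumably why two threads could overwrite each other's my_threaddata pointer.

```fortran
program save_demo
  implicit none
  integer :: i

  do i = 1, 3
     ! Prints 1, 2, 3: the counter survives between calls.
     print '(i0)', next_call()
  end do

contains

  function next_call() result(n)
    integer :: n
    ! SAVE gives this local one static instance shared by all callers.
    integer, save :: calls = 0

    calls = calls + 1
    n = calls
  end function next_call

end program save_demo
```

With stack allocation, each invocation (and therefore each thread) would instead start from its own fresh copy of such a local.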

ron

Thank you again for the clarification. Hence, my conclusion is to simply use -auto whenever I am dealing with a multithreaded application.

Mathias
