Access violation in multithreaded program

Hello there,

I have a problem with a multithreaded program. In that program, I use some subroutines and functions all over the place (like converting a string to an integer, displaying a real number in an edit box...)

Sometimes I get access violations, I think because the function or subroutine foo is being called by two different threads at the same time.

Has anybody had the same problem, or does anybody have a hint about what I could do? Could it be that, because the argument is passed by reference, a second call of foo alters the reference used by the first call of foo?

Thanks in advance,
Markus

A likely cause of such problems is failing to declare a procedure RECURSIVE. Without that declaration, you need to make sure that /Qauto, or an option which implies it, is set. You might also check whether SAVE is used in an incompatible way.
If variables are undefined, /Qauto might expose a failure even with a single thread.
Another possibility might be buggy use of ENTRY.
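
To illustrate the SAVE point: a local variable that is initialised in its declaration gets the SAVE attribute implicitly, so there is a single copy shared by every caller, and in a threaded program by every thread. The sketch below is single-threaded and only demonstrates that the value persists between calls; next_id and counter are made-up names, not taken from the project discussed here.

```fortran
program save_demo
    implicit none

    ! Two calls to the same function: the local counter survives between them.
    print '(a,i0)', 'first call:  ', next_id()
    print '(a,i0)', 'second call: ', next_id()

contains

    integer function next_id()
        ! Initialisation in the declaration implies SAVE:
        ! there is exactly one counter, not one per call (or per thread).
        integer :: counter = 0

        counter = counter + 1
        next_id = counter
    end function next_id

end program save_demo
```

The first call returns 1 and the second returns 2. If two threads called next_id at the same time, both would update that one counter, which is the kind of incompatible SAVE usage worth looking for.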

Markus,

>>Sometimes I get access violations, I think because that the function or subroutine foo is being called by two different threads at the same time.

This situation could cause data corruption but not an access violation.

Access violations occur when a program references (virtual) memory that is not mapped to the application.

Most often the reference is that of an uninitialized pointer/reference, or an attempted reuse of a pointer/reference that has gone out of scope .AND. the memory location holding the pointer/reference was subsequently modified to contain what looks like a pointer/reference to an address that is not mapped.

Is your FORTRAN program calling C/C++ code where the argument(s) is(are) assumed to be NULL terminated?
If you do, and if the FORTRAN string does not contain a NULL, then nasty things will happen.

In the access violation report you usually see the location of the instruction causing the fault, and the location of the data it was attempting to access. You may also get an opportunity to Debug. When a fault occurs, write down the two locations, then try to Debug. If you enter the debugger but receive "No source available..." then look at the call stack. Hopefully you can find the nearest level with source; set focus to that level (double click on the line in the call stack). Then try to look at the statement(s) at/preceding the return address of the call. Something may show up funny.

If nothing is obvious, then add some assert statements for the arguments (e.g. test to assure LOC(arg) is reasonable, and that NULL terminated args are in fact NULL terminated).

Jim Dempsey

www.quickthreadprogramming.com
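
A minimal sketch of the kind of argument check suggested above, assuming the string is about to be handed to C code that expects a terminating NULL. The helper is_null_terminated and the sample string are hypothetical; only standard Fortran and iso_c_binding are used.

```fortran
program check_c_string
    use iso_c_binding, only: c_null_char
    implicit none
    character(len=32) :: name

    name = 'port20100'   ! blank padded, contains no NULL character

    ! A plain Fortran string: the check fails.
    print *, 'raw string terminated?       ', is_null_terminated(name)

    ! The same text with an explicit terminator appended: the check passes.
    print *, 'with c_null_char terminated? ', &
        is_null_terminated(trim(name) // c_null_char)

contains

    ! Hypothetical assert helper: true if s contains a NULL character.
    logical function is_null_terminated(s)
        character(len=*), intent(in) :: s
        is_null_terminated = index(s, c_null_char) > 0
    end function is_null_terminated

end program check_c_string
```

The first check reports false and the second true. A check like this could be placed just before any call into C/C++ code that takes character arguments.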

Thanks for the suggestions.

There are no RECURSIVE functions or subroutines, no SAVE and no ENTRY, it is not a mixed language project and I didn't use /Qauto; the project setting was for default local storage.

I just had an access violation and I made a screenshot of the debugger. The write statement is being called very often and nothing happens. But sometimes it creates an access violation. So I think something else in my project goes wrong and it shows in that line.

I also noticed that this happens when I let my project run in the background and begin to do something else (like posting here). There are a few lines where an access violation happens, and it is always a read or write statement.

Markus

Edit: Now I had this error message:

First-chance exception at 0x006d23e8 in BCT Monitoring Tool.exe: 0xC0000005: Access violation reading location 0xfeeefefa.
HEAP[BCT Monitoring Tool.exe]: HEAP: Free Heap block 70f56d8 modified at 70f5e38 after it was freed
Windows has triggered a breakpoint in BCT Monitoring Tool.exe.

This may be due to a corruption of the heap, which indicates a bug in BCT Monitoring Tool.exe or any of the DLLs it has loaded.

This may also be due to the user pressing F12 while BCT Monitoring Tool.exe has focus.

The output window may have more diagnostic information.

Attachment: access-violation.jpg (625.21 KB)

write will have a single buffer for each open file, so if multiple threads write to the same file, there is a race condition. In fact, if one thread fills the buffer and it begins to flush to disk, it seems very bad things can happen if another thread uses the buffer. So, if you are using a low level threading method, you would put writes in a critical region. auto-parallel or OpenMP might accomplish this automatically.
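
A minimal sketch of putting file writes in a critical region, using OpenMP threads rather than the Win32 threads of the original program (with CreateThread, a Win32 critical section or mutex would play the same role). The file name messages.log and the loop are made up for illustration; build with OpenMP enabled (e.g. -fopenmp or /Qopenmp).

```fortran
program log_from_threads
    use omp_lib
    implicit none
    integer :: unit_log, i

    open(newunit=unit_log, file='messages.log', status='replace', action='write')

    !$omp parallel do
    do i = 1, 8
        ! Only one thread at a time may be inside the named critical region,
        ! so the single I/O buffer of unit_log is never used concurrently.
        !$omp critical (log_io)
        write(unit_log, '(a,i0,a,i0)') 'message ', i, ' from thread ', omp_get_thread_num()
        flush(unit_log)
        !$omp end critical (log_io)
    end do
    !$omp end parallel do

    close(unit_log)
end program log_from_threads
```

The log ends up with eight complete lines; their order depends on thread scheduling.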

Tim, the read and write statements are on a variable, not a file:

subroutine setSequenceUpdateDrives(SequenceNumber)

use globaleVariablen
use iflogm

implicit none

integer(kind=2) SequenceNumber
logical(kind=4) l
character*255 text
include 'resource_online.fd'

write(text,'(i)') SequenceNumber
l=dlgSet(dlgTabPort20102, IDC_EDIT_StatusPort20102, trim(adjustl(text)))

return
end subroutine setSequenceUpdateDrives

The access violation occurs in the line "write(text,'(i)') SequenceNumber".

Still you have a race condition on the text variable, since you didn't declare RECURSIVE or set /Qauto, and you would depend on the implementation of internal write being thread safe even if you take reasonable precautions. If you are setting a shared variable here, you would need to establish atomic access to it anyway.

Do you compile and link your sources with the multithreaded run-time libraries (/threads)? Also use RECURSIVE or set /Qauto, as suggested in the previous responses.

Thanks again for all the suggestions. I got a little bit further, but I couldn't eliminate the access violation, although it got "better", which means I ran my program and the routine got called over eleven thousand times before the access violation occurred.

I set /Qauto and /threads as well; before, the project was only set to Debug QuickWin (/libs:qwin /dbglibs). I declared my subroutines as recursive and set /recursive in the project properties.

Basically, my program receives data via WinSock every 2 seconds. There are 4 WinSock threads listening on different ports. The data is sent from a .NET program. In a WinSock thread I do this (always with different structures and different subroutines for displaying):

! got data in EA_Telegramm_816 struct
write(port,'(a,i6,x,i4,a,5(i2,a),i4)',iostat=iError) 'Message received, #', EA_Telegramm_816%DBX240, EA_Telegramm_816%DBX280,".",EA_Telegramm_816%DBX300,".",EA_Telegramm_816%DBX320,"-",EA_Telegramm_816%DBX340,":",EA_Telegramm_816%DBX360,":",EA_Telegramm_816%DBX380,".",EA_Telegramm_816%DBX400
flush(port)
call setSequenceUpdateComm(EA_Telegramm_816%DBX240)
! waiting again

recursive subroutine setSequenceUpdateComm(SequenceNumber)
use globaleVariablen
use iflogm
implicit none
integer(kind=2) SequenceNumber
logical(kind=4) l
character*255 textComm
include 'resource_online.fd'
write(textComm,'(i)') SequenceNumber
l=dlgSet(dlgTabPort20100, IDC_EDIT_StatusPort20100, trim(adjustl(textComm)))
return
end subroutine setSequenceUpdateComm

I'm logging some data; port is a text file. Then I want to display a counter in my dialog.

Putting the flush statement helped. Removing the write statement helps very much. But there are still access violations in the subroutines setSequenceUpdate_xxx and sometimes in other subroutines or functions, where I use a write or read statement.

Another suggestion by TimP was to do some OpenMP stuff, which I haven't used before. I tried

!$OMP ATOMIC WRITE
write(textDrives,'(i)') SequenceNumber
l=dlgSet(dlgTabPort20102, IDC_EDIT_StatusPort20102, trim(adjustl(textDrives)))

but this doesn't help.

Am I using ATOMIC correctly, or do I have to do something different?

Thanks in advance,
Markus
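
On the ATOMIC question: in OpenMP, ATOMIC applies to a single scalar assignment or update statement, so it cannot protect an I/O statement such as WRITE or a call such as dlgSet. Also, OpenMP directives are treated as plain comments unless OpenMP is enabled at compile time. A minimal sketch of what ATOMIC can cover, assuming OpenMP threads; the names hits and text are made up for illustration:

```fortran
program atomic_scope
    implicit none
    integer :: hits, i
    character(len=16) :: text   ! made private below: one buffer per thread

    hits = 0

    !$omp parallel do private(text)
    do i = 1, 1000
        ! Internal write into a thread-private buffer: nothing is shared here,
        ! so no ATOMIC or CRITICAL is needed for this statement.
        write(text, '(i0)') i

        ! ATOMIC protects exactly one scalar update statement like this one.
        !$omp atomic update
        hits = hits + len_trim(text)
    end do
    !$omp end parallel do

    print '(a,i0)', 'total digits written: ', hits
end program atomic_scope
```

Anything larger than a scalar update, such as a WRITE to a shared buffer followed by a call that uses it, would need a CRITICAL region or another lock instead.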

Markus,

The description of your experience leads me to suspect that a call to a system function or C/C++ routine is passing a reference versus a value, or the address of a pointer/descriptor versus what it points to (the array address described by the descriptor)
.AND.
such incorrect argument usage is functional for the call, however it corrupts something elsewhere in your code (inside the library function for write/read).

This can happen quite easily if you do not use the provided interface modules (or if there is an error in one of the interface declarations in said module).
IOW should you write your own interface, or use none (FORTRAN default calling parameters), then you run the risk of getting something wrong.

For example, what is the interface for dlgSet(dlgTabPort20100, IDC_EDIT_StatusPort20100, trim(adjustl(textComm)))?

Is it expecting an ASCIIZ string pointer?

Jim Dempsey

www.quickthreadprogramming.com

Hi Jim,
dlgSet is a QuickWin function, the arguments are okay (integer, integer, character).

But I use system functions to create the threads... I made a mistake with the CreateThread function, this would explain it.

I'm running a test overnight now. I'll tell you tomorrow what happened.

Thanks,
Markus

Getting the CreateThread right helped a lot, but I still have to remove the write statement that logs into a text file. Tonight the program ran for 14 hours without crashing.

Here is how I implemented it now:


subroutine startThread
    integer(INT_PTR_KIND()) :: threadID_PortComm
    integer(INT_PTR_KIND()) ThreadHandle_PortComm
    integer(INT_PTR_KIND()), PARAMETER :: securityComm = 0
    integer(INT_PTR_KIND()), PARAMETER :: stack_sizeComm = 0
    integer(kind=2) portComm
    interface
        integer(kind=4) function WinsockComm(port)
!DEC$ ATTRIBUTES STDCALL, ALIAS:"_winsockcomm" :: WinsockComm
            integer(kind=2), pointer :: port
        end function
    end interface
!...
    ThreadHandle_PortComm = CreateThread(securityComm, stack_sizeComm, WinsockComm, loc(portComm), CREATE_SUSPENDED, ThreadID_PortComm)
    i = SetThreadPriority(ThreadHandle_PortComm, THREAD_PRIORITY_BELOW_NORMAL)
    i = ResumeThread(ThreadHandle_PortComm)
end subroutine

integer(kind=4) function WinsockComm(port)
!DEC$ ATTRIBUTES STDCALL, ALIAS:"_winsockcomm" :: WinsockComm
    integer(kind=2), pointer :: port
! ...
! still have to exclude this write statement
    !write(port,'(a,i6,x,i4,a,5(i2,a),i4)',iostat=iError) 'Message received, #', EA_Telegramm_816%DBX240, EA_Telegramm_816%DBX280,".",EA_Telegramm_816%DBX300,".",EA_Telegramm_816%DBX320,"-",EA_Telegramm_816%DBX340,":",EA_Telegramm_816%DBX360,":",EA_Telegramm_816%DBX380,".",EA_Telegramm_816%DBX400
    write(textComm,'(i)') EA_Telegramm_816%DBX240
    l=dlgSet(dlgTabPort20100, IDC_EDIT_StatusPort20100, trim(adjustl(textComm)))
! ...
end function WinsockComm

Markus

Does your WinsockComm thread perform blocking I/O (e.g. a read that waits for data), or does it perform polling I/O (a compute loop waiting to see the arrival of data)? It would be better to perform blocking I/O (read the socket with a wait for data or a large timeout). In that case you would not set the priority low; you could make it above normal, since the thread will be waiting almost all of the time.

Jim Dempsey

www.quickthreadprogramming.com

I'm such a fool...

The problem is (or better, was) that I accessed the QuickWin dialog from another thread. Moving the dlgSet(...) call out of the Winsock thread into the main thread solved my issue.

Markus
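
A rough sketch of that pattern, assuming OpenMP threads stand in for the Win32 threads and a print statement stands in for the dlgSet call. The module sequence_mailbox and the routines publish and poll are made-up names. The worker only stores the latest value; the main thread is the only one that would touch the dialog. Build with OpenMP enabled.

```fortran
module sequence_mailbox
    implicit none
    integer :: latest_sequence = 0    ! value handed from worker to main thread
    logical :: has_new = .false.      ! true while an undisplayed value is waiting
    logical :: finished = .false.     ! set by the worker with its last value
contains
    subroutine publish(seq, last)     ! called by the worker thread
        integer, intent(in) :: seq
        logical, intent(in) :: last
        !$omp critical (mailbox)
        latest_sequence = seq
        has_new = .true.
        finished = last
        !$omp end critical (mailbox)
    end subroutine publish

    subroutine poll(seq, got, done)   ! called by the main (GUI) thread
        integer, intent(out) :: seq
        logical, intent(out) :: got, done
        !$omp critical (mailbox)
        seq = latest_sequence
        got = has_new
        has_new = .false.
        done = finished
        !$omp end critical (mailbox)
    end subroutine poll
end module sequence_mailbox

program gui_handoff
    use omp_lib
    use sequence_mailbox
    implicit none
    integer :: seq, i
    logical :: got, done
    character(len=32) :: text

    !$omp parallel num_threads(2) private(seq, got, done, text, i)
    if (omp_get_num_threads() < 2) then
        print '(a)', 'this sketch needs two threads'
    else if (omp_get_thread_num() == 0) then
        ! Main thread: the only place where the dialog would be updated.
        do
            call poll(seq, got, done)
            if (got) then
                write(text, '(i0)') seq
                print '(2a)', 'main thread would call dlgSet with: ', trim(text)
            end if
            if (done) exit
        end do
    else
        ! Worker thread: stands in for the Winsock thread receiving telegrams.
        do i = 1, 5
            call publish(i, i == 5)
        end do
    end if
    !$omp end parallel
end program gui_handoff
```

Intermediate values can be skipped if the worker publishes faster than the main thread polls, which is acceptable for a status counter. In a real GUI the polling would more likely run from a timer or message handler than from a busy loop.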
