odd problem

lklawrie

Large project -- getting an access violation with certain compiler settings.

If I try to run it in the debugger or add any checks (e.g. array bounds), it runs.  Adding /fpe:0 also makes it run.

I'd like to say it's a compiler issue, but...?

Linda
bmchenry

You need to provide some sample error messages. It appears to work in 'debug' only, so could it be you've set a release directory in a location that is for admin only and you aren't running with admin privileges? Or some other resource you’ve specified isn’t available to you?

lklawrie

No, you misunderstand.  It does run in Debug, but it also runs in "release" if I add, say, /check:bounds or /check:uninit.

The error from traceback (added but problem still exists):

forrtl: severe (157): Program Exception - access violation
Image              PC                Routine            Line        Source
Console1.exe       0000000140EDFF3D  SIMAIRSERVINGZONE        5408  SimAirServingZones.f90
Console1.exe       0000000141195E0C  SIZINGMANAGER_mp_         521  SizingManager.f90
Console1.exe       00000001411A72E8  SIMULATIONMANAGER         186  SimulationManager.f90
Console1.exe       00000001411BFCB0  MAIN__                    421  EnergyPlus.f90
Console1.exe       00000001412BE0DC  Unknown               Unknown  Unknown
Console1.exe       00000001412980FF  Unknown               Unknown  Unknown
kernel32.dll       000000007737652D  Unknown               Unknown  Unknown
ntdll.dll          00000000774AC521  Unknown               Unknown  Unknown

Linda
lklawrie

FYI - as we wait for the window with the error to be posted.  This was on x64.  On win32, same settings do not produce an error.

Linda
Steve Lionel (Intel)

Adding any option that changes the sequence of instructions, or the layout of data in memory, can "move" a problem. Each of the options you mention significantly affects the code. A Debug configuration disables optimizations, which changes things much more.

You at least have the line of code where the error occurs. You might be able to add some instrumentation of your own (aka PRINT statements) to see if you can figure out what exactly the problem is. An uninitialized variable would be my guess. Running the program under Intel Inspector XE's memory analysis might provide a clue.

Steve
lklawrie

As soon as you add a print statement the problem goes away.  I will try Inspector.  Note -- this is only the x64 version, not the win32 version (same compiler).

My other post (which showed the line of code where it happened) didn't get posted.  Is it lost in cyberspace?

Linda
Steve Lionel (Intel)

Your post may have been held for moderation - though I can't see it myself.

There are a couple of different strategies I use when presented with a case like this.  One is to do "divide and conquer" with optimization settings, disabling optimization on half the sources and see if the problem goes away, and then refining so I have the minimal number of sources where optimization needs to be on to see the problem. I then try a debug configuration with just those sources set for optimization, and turn off the other run-time checks.

Other times I'll look at the instruction stream where the access violation occurs by running under the debugger in Release mode - this sometimes gives me a clue as to what is going on. I recognize that this won't be usable by many people.

Steve
lklawrie

Moderated post finally showed up.  Not worth too much to try to figure out a compiler problem with the procedures you've suggested as we can just turn on the /check:uninit and the problem goes away.

It also goes away if the code is rewritten at that point to be more explicit (though same code is in previous lines).  But that's hard to tell developers to do -- write it one way in part and differently in another.

Linda
mecej4

Quote:

lklawrie wrote: just turn on the /check:uninit and the problem goes away

I'd be more cautious, and conclude that the pesky bug(s) just went underground instead of "going away". The bug may come back in the future and wreak havoc.

iliyapolak

Access violation bugs can manifest at seemingly random times; it depends on when your code references, for example, inaccessible memory, or tries to call unmapped code in some DLL.

lklawrie

Sure, okay - mecej4 and iliyapolak -- I would love to track it down (if it is a bug in the code) but it runs without error on win32, and seems elusive if you can't find out what the problem is -- print statements don't show it, etc.  What would you suggest?

FYI, using other compilers to try to turn it up did not help either.

Linda
iliyapolak

Who is catching the access violation error? Is it the werfault.exe process?

iliyapolak

If you really would like to investigate this issue, then I recommend installing Application Verifier and stress-testing your app.

lklawrie

See above for the traceback from the execution.  Application verifier?  Does that have another name?

Linda
iliyapolak

Quote:

lklawrie wrote:

See above for the traceback from the execution.  Application verifier?  Does that have another name?

The traceback is not complete. Application Verifier is the actual name of the software I recommended. If you want, give it a try :)

app4619

I thought the Visual Studio application verifier only works with VC++ projects?

jimdempseyatthecove

>> This was on x64.  On win32, same settings do not produce an error.

When the code runs as x32 but not (well) as x64, one of the possible causes is the use of wrong-sized arguments to library calls, in particular the use of "integer" in place of "integer(HANDLE)" or "integer(someOtherSizeForType)". Many of these programming errors can be found with the /gen-interfaces and /warn:interfaces compiler options.
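A minimal sketch of the size mismatch described above. It assumes the portable `c_intptr_t` kind from `iso_c_binding` as a stand-in for the vendor-specific `HANDLE` kind, which is not shown in this thread; the program and variable names are made up for illustration.

```fortran
program handle_size_demo
  use, intrinsic :: iso_c_binding, only: c_intptr_t
  implicit none
  ! Default INTEGER: typically 32 bits unless a compiler option changes it.
  integer :: default_int
  ! Pointer-sized integer: assumed stand-in for a handle/address-sized kind.
  integer(c_intptr_t) :: pointer_sized

  print '(a,i0)', 'default integer bits:       ', storage_size(default_int)
  print '(a,i0)', 'pointer-sized integer bits: ', storage_size(pointer_sized)

  ! On a 32-bit target the two usually match, so passing a default INTEGER
  ! where a handle is expected happens to work. On a 64-bit target it may not.
  if (storage_size(default_int) < storage_size(pointer_sized)) then
    print '(a)', 'A default INTEGER cannot hold a handle or address on this target.'
  else
    print '(a)', 'Both have the same width on this target.'
  end if
end program handle_size_demo
```

Building this once as a 32-bit and once as a 64-bit executable should show why code that passes a plain `integer` can appear correct on win32 and still misbehave on x64.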

Jim Dempsey

www.quickthreadprogramming.com
iliyapolak

Quote:

app4619 wrote:

I thought the Visual Studio application verifier only work with VC++ projects?

What VS application verifier are you talking about?

Application Verifier is a standalone application and, afaik, it is used at MS to stress-test software.

iliyapolak

Quote:

jimdempseyatthecove wrote:

>> This was on x64.  On win32, same settings do not produce an error.

When the code runs as x32 but not (well) as x64, one of the possible causes is the use of wrong-sized arguments to library calls, in particular the use of "integer" in place of "integer(HANDLE)" or "integer(someOtherSizeForType)". Many of these programming errors can be found with the /gen-interfaces and /warn:interfaces compiler options.

Jim Dempsey

Shouldn't that be handled by the WOW64 layer?

lklawrie

FYI, Inspector sees the access violation at the line but does not give much information:

P2: Error: Invalid memory access
 P2.3: Invalid memory access: New
  C:\Working\EnergyPlus\SourceCode\EnergyPlus\SimAirServingZones.f90(5411): Error X4: Read: Function SIMAIRSERVINGZONES_mp_UPDATESYSSIZING: Module C:\Users\lklawrie\Documents\Visual Studio 2008\Projects\Console2\x64\Release\Console2.exe
  Code snippet:
   5409              FinalZoneSizing(CtrlZoneNum)%CoolZoneRetTempSeq(TimeStepIndex) = &
   5410                FinalZoneSizing(CtrlZoneNum)%CoolZoneTempSeq(TimeStepIndex) + RetTempRise * &
  >5411               (1.d0/(1.d0+TermUnitSizing(CtrlZoneNum)%InducRat))
   5412            END IF
   5413            RetTempRise = FinalZoneSizing(CtrlZoneNum)%HeatZoneRetTempSeq(TimeStepIndex) - &

  Stack (1 of 1 instance(s))
  >Console2.exe!SIMAIRSERVINGZONES_mp_UPDATESYSSIZING - C:\Working\EnergyPlus\SourceCode\EnergyPlus\SimAirServingZones.f90:5411
   Console2.exe!SIZINGMANAGER_mp_MANAGESIZING - C:\Working\EnergyPlus\SourceCode\EnergyPlus\SizingManager.f90:521
   Console2.exe!SIMULATIONMANAGER_mp_MANAGESIMULATION - C:\Working\EnergyPlus\SourceCode\EnergyPlus\SimulationManager.f90:186
   Console2.exe!MAIN__ - C:\Working\EnergyPlus\SourceCode\EnergyPlus\EnergyPlus.f90:421
   Console2.exe!main - C:\Users\lklawrie\Documents\Visual Studio 2008\Projects\Console2\x64\Release\Console2.exe:0x00000000012F3877
   Console2.exe!_tmainCRTStartup - f:\dd\vctools\crt_bld\self_64_amd64\crt\src\crt0.c:266
   kernel32.dll!BaseThreadInitThunk - C:\windows\system32\kernel32.dll:0x000000000001652B
   ntdll.dll!RtlUserThreadStart - C:\windows\SYSTEM32\ntdll.dll:0x000000000002C51F

P1: Critical: Unhandled application exception
 P1.4: Unhandled application exception: New
  C:\Working\EnergyPlus\SourceCode\EnergyPlus\SizingManager.f90(521): Critical X5: Exception: Function SIZINGMANAGER_mp_MANAGESIZING: Module C:\Users\lklawrie\Documents\Visual Studio 2008\Projects\Console2\x64\Release\Console2.exe
  Code snippet:
   519 
   520      IF (NumSizingPeriodsPerformed > 0) THEN
  >521        CALL UpdateSysSizing(EndSysSizingCalc)
   522        SysSizingRunDone = .TRUE.
   523      ELSE

  Stack (1 of 1 instance(s))
  >ntdll.dll!MD5Final - C:\windows\SYSTEM32\ntdll.dll:0x00000000000943B6
   ntdll.dll!_C_specific_handler - C:\windows\SYSTEM32\ntdll.dll:0x00000000000185A6
   ntdll.dll!RtlDecodePointer - C:\windows\SYSTEM32\ntdll.dll:0x0000000000029D09
   ntdll.dll!RtlUnwindEx - C:\windows\SYSTEM32\ntdll.dll:0x00000000000191AA
   ntdll.dll!KiUserExceptionDispatcher - C:\windows\SYSTEM32\ntdll.dll:0x0000000000051273
   Console2.exe!SIZINGMANAGER_mp_MANAGESIZING - C:\Working\EnergyPlus\SourceCode\EnergyPlus\SizingManager.f90:521
   Console2.exe!SIMULATIONMANAGER_mp_MANAGESIMULATION - C:\Working\EnergyPlus\SourceCode\EnergyPlus\SimulationManager.f90:186
   Console2.exe!MAIN__ - C:\Working\EnergyPlus\SourceCode\EnergyPlus\EnergyPlus.f90:421
   Console2.exe!main - C:\Users\lklawrie\Documents\Visual Studio 2008\Projects\Console2\x64\Release\Console2.exe:0x00000000012F3877
   Console2.exe!_tmainCRTStartup - f:\dd\vctools\crt_bld\self_64_amd64\crt\src\crt0.c:266
   kernel32.dll!BaseThreadInitThunk - C:\windows\system32\kernel32.dll:0x000000000001652B
   ntdll.dll!RtlUserThreadStart - C:\windows\SYSTEM32\ntdll.dll:0x000000000002C51F

I added a check for allocated TermUnitSizing in the IF Block but of course then the whole code runs. (actually, printed not just a check)

/warn:interfaces did not yield anything.

Linda
iliyapolak

You have an access violation bug, and Inspector will not give you the clearest picture. You need to find the faulting IP, stop on it, and inspect the thread context.

lklawrie

"faulting ip"?  Please be clear in what you are requiring/suggesting.  And why is it fine in Win32 but not x64?

Linda
Steve Lionel (Intel)

Different architectures mean different memory layout, different instructions, etc. "Faulting IP" means the instruction pointer of where the error occurred. Technically, you have that already, as it's in the traceback.

So now we know that the source line of the error is:

   5409              FinalZoneSizing(CtrlZoneNum)%CoolZoneRetTempSeq(TimeStepIndex) = &
   5410                FinalZoneSizing(CtrlZoneNum)%CoolZoneTempSeq(TimeStepIndex) + RetTempRise * &
  >5411               (1.d0/(1.d0+TermUnitSizing(CtrlZoneNum)%InducRat))

but we don't know which part of this statement is the problem. Realistically, it will be either FinalZoneSizing(CtrlZoneNum)%CoolZoneRetTempSeq(TimeStepIndex) or TermUnitSizing(CtrlZoneNum)%InducRat. What I would do, as an experiment, is declare a temporary variable, say, temp, and assign TermUnitSizing(CtrlZoneNum)%InducRat to it before this statement, then substitute temp in the expression. Do you still get the error? Does the error move? If the error still exists but doesn't move, then FinalZoneSizing(CtrlZoneNum)%CoolZoneRetTempSeq(TimeStepIndex) is probably the problem - you'd have to figure out which part of this is wrong.
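The experiment described above could look roughly like the following sketch. The derived types, array sizes, and values are hypothetical stand-ins, since the real EnergyPlus declarations are not shown in this thread; only the variable and component names are taken from the statement in question.

```fortran
program split_statement_demo
  implicit none
  integer, parameter :: dp = kind(1.d0)

  ! Hypothetical stand-ins for the real EnergyPlus types.
  type :: zone_sizing_t
    real(dp), allocatable :: CoolZoneTempSeq(:)
    real(dp), allocatable :: CoolZoneRetTempSeq(:)
  end type zone_sizing_t

  type :: term_unit_sizing_t
    real(dp) :: InducRat = 0.0_dp
  end type term_unit_sizing_t

  type(zone_sizing_t), allocatable :: FinalZoneSizing(:)
  type(term_unit_sizing_t), allocatable :: TermUnitSizing(:)
  real(dp) :: RetTempRise, temp
  integer :: CtrlZoneNum, TimeStepIndex

  ! Made-up sizes and values, only so the sketch runs on its own.
  allocate(FinalZoneSizing(2), TermUnitSizing(2))
  do CtrlZoneNum = 1, 2
    allocate(FinalZoneSizing(CtrlZoneNum)%CoolZoneTempSeq(3), source=24.0_dp)
    allocate(FinalZoneSizing(CtrlZoneNum)%CoolZoneRetTempSeq(3), source=0.0_dp)
    TermUnitSizing(CtrlZoneNum)%InducRat = 0.5_dp
  end do
  RetTempRise = 1.5_dp

  do CtrlZoneNum = 1, 2
    do TimeStepIndex = 1, 3
      ! Step 1: read TermUnitSizing on its own line. A fault reported here
      ! points at TermUnitSizing or CtrlZoneNum.
      temp = TermUnitSizing(CtrlZoneNum)%InducRat
      ! Step 2: the original statement with temp substituted. A fault that
      ! stays here points at FinalZoneSizing or TimeStepIndex instead.
      FinalZoneSizing(CtrlZoneNum)%CoolZoneRetTempSeq(TimeStepIndex) = &
        FinalZoneSizing(CtrlZoneNum)%CoolZoneTempSeq(TimeStepIndex) + RetTempRise * &
        (1.d0/(1.d0+temp))
    end do
  end do

  print '(f8.3)', FinalZoneSizing(1)%CoolZoneRetTempSeq(1)
end program split_statement_demo
```

Splitting the statement gives the traceback two distinct source lines to report, though any such change also alters the generated code and so may itself move or hide the fault.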

Steve
lklawrie

There is no longer an error in that case.  Now what?  And, as I've said before, if you don't change the statement but print something before it (didn't try printing after) -- the file runs.

Linda
Steve Lionel (Intel)

If I had the program here, I'd step through the instructions in the debugger to see what it is doing. It's not something I could easily walk you through.

Steve
lklawrie

Where would you like me to put it?  I can zip up the object/source/project.  It will be quite large. 

I have to backtrack on my previous statement -- it does now terminate where it says:

temp=termunitsizing

I had inadvertently added a "if (allocated(termunitsizing)) stop" statement above that -- and it caused it to run.

Maybe I can work with it further.

Linda
lklawrie

Still no help -- now inspector says it fails there but the debugger information is not helpful (due to the optimization, I'm sure).

Where would you like a zip file, Steve?

Linda
Steve Lionel (Intel)

Please use Intel Premier Support and ask that the issue be assigned to me.

Try building with /standard-semantics (Fortran > Language > Enable F2003 Semantics)

Steve
lklawrie

I am building with F2008 semantics.  I did manage to get the compiler / debugger to show that the index of that structure was 0 -- however it is inside a specific loop with that index as a loop control variable.
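For illustration only, a hand-written guard for an index like the one described above might be sketched as follows. The array, its size, and the `check_index` helper are hypothetical and not part of the project in this thread; it is roughly a manual, targeted version of what /check:bounds does.

```fortran
program index_guard_demo
  use, intrinsic :: iso_fortran_env, only: error_unit
  implicit none
  real, allocatable :: InducRat(:)   ! hypothetical stand-in array
  integer :: CtrlZoneNum

  allocate(InducRat(4), source=0.5)
  do CtrlZoneNum = 1, size(InducRat)
    ! Verify the loop index against the array bounds before using it.
    call check_index('InducRat', CtrlZoneNum, lbound(InducRat, 1), ubound(InducRat, 1))
  end do
  print '(a)', 'all indices in range'

contains

  ! Hypothetical helper: stops the program with a message when idx is
  ! outside lo..hi, instead of letting an invalid access happen later.
  subroutine check_index(name, idx, lo, hi)
    character(len=*), intent(in) :: name
    integer, intent(in) :: idx, lo, hi
    if (idx < lo .or. idx > hi) then
      write(error_unit, '(3a,i0,a,i0,a,i0)') 'index out of range for ', name, ': ', idx, &
        ' not in ', lo, '..', hi
      error stop 1
    end if
  end subroutine check_index
end program index_guard_demo
```

As noted earlier in the thread, adding any such check changes the instruction stream, so in this particular case the guard may hide the fault rather than catch it.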

It may take a while to make it so you can see it -- but to me, it seems like a compiler bug.

Linda
iliyapolak

Quote:

lklawrie wrote:

"faulting ip"?  Please be clear in what you are requiring/suggesting.  And why is it fine in Win32 but not x64?

Sorry for not providing enough explanation. Faulting IP = the instruction pointer at which some fault or exception occurred.

iliyapolak

Please proceed exactly as Steve advised. Run your program under the debugger (source-level first), step through every instruction, and inspect the memory being read and written, paying attention to any pointer dereferences. If that does not help, machine-code-level debugging with the help of Application Verifier should be used to test your app.

Sergey Kostrov

>>...Large project -- getting an access violation with certain compiler settings...

Linda,

The thread is already 5 days old, so why wouldn't you post a complete set of compiler settings for review?

lklawrie

Issue 698745 posted.  Steve, it's set to run with break at the place where it will most likely crash.

No one asked for compiler settings. Compiler settings:

Compiler:
/nologo /debug:minimal /O2 /module:"x64\Release\\" /object:"x64\Release\\" /Fd"x64\Release\vc90.pdb" /traceback /libs:static /threads /c
pasted to command line window (from others)
/nologo /fpp /stand:f08 /Qdiag-disable:5268 /fpscomp:none /nogen-interfaces /F8388608 /DWINDOWS /O2 /DNDEBUG

Linker:
/OUT:"x64\Release\Console3.exe" /INCREMENTAL:NO /NOLOGO /MANIFEST /MANIFESTFILE:"C:\Users\lklawrie\Documents\Visual Studio 2008\Projects\Console3\x64\Release\Console3.exe.intermediate.manifest" /MANIFESTUAC:"level='asInvoker' uiAccess='false'" /DEBUG /PDB:"C:\Users\lklawrie\Documents\Visual Studio 2008\Projects\Console3\x64\Release\Console3.pdb" /SUBSYSTEM:CONSOLE /IMPLIB:"C:\Users\lklawrie\Documents\Visual Studio 2008\Projects\Console3\x64\Release\Console3.lib" /STACK:8388608

Linda
Steve Lionel (Intel)

Thanks - I got it. I will get to it soon.

Steve
Sergey Kostrov

>>...Adding /fpe:0 and it runs...

Thanks for the command line options and I'd like to confirm that you have Access Violations in Release Configuration only.

I think your workaround is very interesting and why did you decide to use 0?

Steve Lionel (Intel)

/fpe:0 changes the choice of instructions used. 3 is the default and the only other choice is 1, which hardly anyone bothers with. (2 was supported with DEC compilers.) It does seem that most anything that changes the instruction stream makes the problem "go away" based on what Linda has said to date.

Steve
iliyapolak

So, Linda, do you get the AV exception in debug or in release mode?

lklawrie

only in release mode, only with the particular compiler settings shown.

Linda
iliyapolak

Strange, because rigorous stack checking is relaxed in release mode. Have you tried stepping through the code?

Sergey Kostrov

>>... Adding /fpe:0 and it runs...

Steve, I see the following in Fortran compiler help:

...
/fp: name
enable floating point model variation
except[-] - enable/disable floating point semantics
fast[=1|2] - enables more aggressive floating point optimizations
precise - allows value-safe optimizations
source - enables intermediates in source precision
strict - enables /fp:precise /fp:except, disables contractions and enables pragma stdc fenv_access
...

and I don't see any numeric values similar to what Linda uses.

Sergey Kostrov

>>... I don't see any numeric values similar to what Linda uses...

Sorry, I missed it... Please ignore my previous post.

iliyapolak

Linda, any updates related to your project?

lklawrie

Steve is looking at it -- including the assembler from the module that is showing the access violation.  He called it "odd" as well.

Linda
Steve Lionel (Intel)

It's defying easy analysis, I can tell you that much.... But I CAN reproduce it with a build from sources, so that helps...

Steve
