Debugging floating point registers

Is there a way in the latest debugger (V 12.0) to set a watchpoint on a register? I'm having stack corruption, and want to break when register $f7 is updated, but "watch $f7" produces a syntax error.



I believe the answer is NO, you can't trap on a register, but I'll check with an expert. It's easy to hook memory to generate a trap on access, but I don't believe that mechanism is applicable to registers.

I'll get back to you on this.

I am afraid I don't have good news for you on this topic either.

With one exception, watchpoints can only be set on ordinary expressions, which means the location you want to watch has to be memory mapped. A memory-mapped device register would therefore work fine for a watchpoint, but a floating point register or processor register would not.

The exception is the set of registers for which predefined convenience variables exist: $fp, $pc, $ps and $sp. There is no such variable for f7.

I guess you could define one yourself by assigning the f7 register contents to a variable and then setting a watchpoint on that variable, but the problem is that you would have to keep forcing the variable update yourself .... not a good solution. Can you set a watchpoint in the routine that does the f7 update?

Try running with the uninitialized variables check and subscript out of bounds checking enabled. It wouldn't hurt to build with gen-interfaces/warn-interfaces too.

Stack corruption can also be caused by calling convention differences (when calling across languages and/or into libraries other than those shipped with your compiler, in this case IVF). In that case, the error may not be noticed until well after it occurs.

You can also salt your program with tests containing inline assembly (conditionally compiled such that you can easily remove or modify the diagnostic code later). Make the function a PURE logical function, and do not modify any registers other than the return eax/rax. Expect the problem to go away or move when you attempt to look at it (Heisenbug principle).

Jim Dempsey

I finally tracked down the issue, but I guess the kernel (or something else) is really at fault here. I was performing an instruction that was generating a floating point underflow, and after examining the FPU status register, I could indeed confirm that the error flag and underflow exception flags were both set after the instruction passed. HOWEVER, the kernel did not raise a SIGFPE until the next floating point instruction, which in my particular case, was well after the memory addresses and variables in question had been overwritten. (Because the instruction where the SIGFPE was being raised was not going to cause an exception, I was led to believe there was something else at work.)

So, after hours of debugging, my question I guess is why is the check on the floating point status register apparently being done before a floating point instruction rather than after it? It would seem to me that the usual place where this register is going to be changed by the processor will be after an instruction. Looking back, I guess this hasn't been an issue in the past because in our programs there is no shortage of floating point operations, and you usually don't have to go back more than 4 or 5 lines of code to find the offending line, but in this case the pointers which were being dereferenced and multiplied had changed since the actual fault, thus creating headaches.

Note: The compiler flags for the main object are:
-extend-source 132 -assume nounderscore -assume nobscc -align dcommons -static-libgcc -zero -fp_port -save -c -fpe0 -ftz -prec_div -fp-stack-check -ccdefault fortran -traceback -xSSE2 -axSSE2 -g -debug full -debug-parameters -check bounds -O0 -m32

The link flags are:
-static-libgcc -Wl,-d -Wl,--sort-common -Wl,-export-dynamic -m32

Compiler version 12.0.0


What you may be seeing here is that the error was not noticed (signaled) until the pipeline emptied, which may coincidentally have occurred at the next floating point instruction because that instruction had a dependency on the register being referenced (the result register of the instruction that generated the error).

What happens if you insert non-SSE delay diagnostic code between the instruction causing the error and the instruction using the result of the error? The delay has to be long enough to cover any cache level latency or RAM latency that may be delaying the execution of the instruction causing the error. IOW, delay longer than the time it takes for the dependency to be satisfied. Make sure whatever you insert there does not compile into code that uses SSE registers. If the delay loop is long enough, this will tell you whether the SIGFPE occurs when the error is detected or when the next instruction is executed. This information may be helpful in generating defensive code should you decide to take that route.
Do not use instructions that may cause a synchronizing event.

Jim Dempsey

This was all diagnosed in the debugger (which has pretty sweet register debugging tools, by the way), and very repeatable, so I doubt it was coincidence. One time I actually stepped over the instruction that triggered the underflow, then went to lunch and debugged the registers after I got back, and hours later stepped to the instruction that triggered the SIGFPE. Unless CPU cache and/or instructions are on a per-process basis (in which case debugging wouldn't have any effect on timing)....

Also, I had changed the code between the triggering event and the line that the kernel FPE'd on several times, (though never inserting any new FP operations......mostly print statements, etc), and it always signaled at the same point, which was the first FP instruction after the underflow (FLDS in this case).

It should be noted that the result was actually valid (and correct) in the floating point result register......the value (1E-64) was just too small to store in a single precision variable.

>>It should be noted that the result was actually valid (and correct) in the floating point result register......the value (1E-64) was just too small to store in a single precision variable

Are you talking FPU registers or SSE registers?

The FPU code computes internally with greater precision, so underflow/overflow is reported only on store. SSE, by contrast, computes only in the precision of the source/destination.

Jim Dempsey
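
The point about the store can be illustrated with a small C sketch (not taken from the poster's Fortran code): 1E-64 is an ordinary value in double or extended precision, but it is below the smallest normalized single precision value, so the narrowing store is where the loss happens. The helper names `too_small_for_float` and `narrow_to_float` are hypothetical.

```c
#include <float.h>

/* Hypothetical helper: 1 if nonzero `x` is smaller in magnitude than the
   smallest normalized float (FLT_MIN), 0 otherwise. */
int too_small_for_float(double x)
{
    double mag = (x < 0.0) ? -x : x;
    return mag != 0.0 && mag < (double)FLT_MIN;
}

/* Narrow through a volatile float so the value is really rounded to
   single precision, even where wider temporaries are in use. */
float narrow_to_float(double x)
{
    volatile float narrow = (float)x;
    return narrow;
}
```

A check like `too_small_for_float` placed before the store is one possible form of defensive code: it flags the value while the operands that produced it are still intact.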

Don't know. Whatever $f1 is. I'm green when it comes to register debugging.

IA32 and Intel64 have two different floating point instruction sets (three if you count MMX)

FPU aka FP87 is a stack structured instruction set that is a carryover from the 8087 FPU. It is used for two reasons. 1) Old legacy code uses it, and 2) the internal computations are performed with 80-bit internal format (higher precision).

The newer instruction set was MMX, but is now various revision levels of SSE, and is just now being extended to AVX.

MMX, SSE and AVX are register oriented instruction sets and, more importantly, are SIMD (Single Instruction Multiple Data). The compiler options shown in your earlier post indicate you are using SSE2 options, which will generally use the SSE2 instructions but may fall back on FPU instructions under some circumstances (or some libraries may use the FPU instruction set).

The FPU stack is observed in the debugger using $fn, where n is the stack level.
The SSE registers are observed, depending on the debugger, as $xmm0 through $xmm15.
If AVX is available, then alternatively $ymm0 through $ymm15.

The FPU stack holds 80-bit temps (one value per entry).
The SSE xmmn registers hold 128 bits of information as 2 doubles, or 4 floats, or 2 quad words, or 4 dwords, or 8 words, or 16 bytes. The AVX ymm registers are twice as wide as the SSE registers.

Jim Dempsey
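
The xmm layouts listed above can be pictured as a C union, where the same 128 bits are viewed as different lane types. This is only an illustration: the names `xmm_view`, `xmm_lanes` and `first_float_bits` are hypothetical, and the sketch assumes IEEE 754 floats with 4-byte `float` and 8-byte `double`.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative only: one 128-bit value seen through each lane layout. */
union xmm_view {
    double   f64[2];   /* 2 doubles    */
    float    f32[4];   /* 4 floats     */
    uint64_t u64[2];   /* 2 quad words */
    uint32_t u32[4];   /* 4 dwords     */
    uint16_t u16[8];   /* 8 words      */
    uint8_t  u8[16];   /* 16 bytes     */
};

static_assert(sizeof(union xmm_view) == 16, "an xmm register is 128 bits");

/* Number of lanes of a given element size (in bytes) in one xmm register. */
int xmm_lanes(int element_bytes)
{
    return 16 / element_bytes;
}

/* Store a float in lane 0 and read the same bits back as a dword. */
uint32_t first_float_bits(float value)
{
    union xmm_view v = {{0.0, 0.0}};
    v.f32[0] = value;
    return v.u32[0];
}
```

This is why a debugger may offer several views of the same register: the bits do not change, only the interpretation of the lanes.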
