Break on NaN in gdb
Recently I had to debug a case where, somewhere during a
numerically-intensive computation (solving an ordinary differential
equation), a value would become NaN (“not a number”). This
happens, for example, when taking a logarithm or a square root of a
negative number, dividing 0 by 0, stuff like that. However, I had no
idea where and why NaNs appeared in this particular program.
So here I’ll show how to detect this using gdb, the GNU debugger. Here is the program that we will be debugging:
/* nan.c */
#include <math.h>
#include <stdio.h>
#include <stdlib.h>
void f1(double *y) {
  *y -= 2.0;
}

void f2(double *z) {
  *z = sqrt(*z);
  *z += 1.0;
}

int main() {
  double *x = malloc(sizeof(double));
  *x = 5e4;
  int i;
  for (i = 0; i < 1000; i++) {
    f1(x);
    f2(x);
  }
  printf("%f\n", *x);
  free(x);
}
Compile it with
gcc -g -Wall -pedantic -o nan nan.c -lm
and confirm that it produces a NaN:
% ./nan
-nan
Now, start gdb and proceed to the point where the storage for our double value is allocated:
Reading symbols from ./nan...
(gdb) start
Temporary breakpoint 1 at 0x4011cd: file nan.c, line 15.

Temporary breakpoint 1, main () at nan.c:15
15	  double *x = malloc(sizeof(double));
(gdb) next
16	  *x = 5e4;
At this point we can do two useful things.
First, we turn on the program execution log, so that we can go back in time once we encounter a NaN.
(gdb) set record full stop-at-limit off
(gdb) record full
Second, we set a watchpoint that will trigger once *x becomes NaN. But how do we express that?
If we were programming in C, we would use isnan(). But isnan() is not a C function; it is a preprocessor macro, and by default it is not available in gdb. And even if I make it available (by compiling the program with -g3), it still doesn’t work, at least on my system:
(gdb) macro expand isnan(1.0)
expands to: __builtin_isnan (1.0)
(gdb) print isnan(1.0)
No symbol "__builtin_isnan" in current context.
Luckily, there are a few workarounds. Perhaps the simplest and the most reliable one is to exploit the fact that NaN is the only floating-point value not equal to itself. Therefore we can set a conditional watchpoint like this:
(gdb) watch *x if *x != *x
Hardware watchpoint 2: *x
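The same self-comparison trick can be written in C. This is a minimal sketch; `is_nan_by_self_compare` is a name made up for this example.

```c
#include <stdbool.h>

/* Hypothetical helper: returns true only for NaN, because NaN is the
   only floating-point value that compares unequal to itself. */
bool is_nan_by_self_compare(double v) {
    return v != v;
}
```

One caveat for compiled code: under options such as gcc's `-ffast-math`, the compiler is allowed to assume that NaNs never occur, so a comparison like this may be optimized away. That concerns the C version; the gdb watchpoint condition is evaluated by the debugger.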
Some other options are:
- Figure out the underlying C function used to implement the isnan macro. On my system, this seems to work:

  (gdb) p ((int (*)(double))__isnan)(sqrt(-1.0))
  $1 = 1
  (gdb) p ((int (*)(double))__isnan)(sqrt(1.0))
  $2 = 0

  However, it may differ on your system/compiler/standard library.

- Wrap the standard isnan macro in a C function in your program to make it available inside gdb:

  int myisnan(double x) { return isnan(x); }

  (Naming your wrapper function isnan might work too. In particular, it doesn’t necessarily lead to an infinite recursion because the inner isnan call will be expanded by the preprocessor. However, the C standard explicitly says (section 7.1.3) that the standard macro names are reserved identifiers, and redefining these identifiers results in undefined behavior.)

- You can try to inspect the bit pattern of the floating-point number directly. See e.g. this answer by Paul Pluzhnikov. Note, however, that IEEE 754 NaNs do not have a fixed bit pattern as Paul appears to assume (they may have an arbitrary fraction part), so unless you know exactly what NaN you are expecting, you have to be more careful.
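To make the “more careful” bit-pattern check concrete, here is a minimal sketch. It assumes that double is a 64-bit IEEE 754 binary64 value; the helper names are made up for this example. It checks the class of the value (all exponent bits set, nonzero fraction) instead of comparing against one specific pattern.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers, assuming double is IEEE 754 binary64:
   1 sign bit, 11 exponent bits, 52 fraction bits. */

/* Reinterpret a 64-bit pattern as a double (memcpy avoids aliasing issues). */
double double_from_bits(uint64_t bits) {
    double v;
    memcpy(&v, &bits, sizeof v);
    return v;
}

/* A value is NaN when all exponent bits are set AND the fraction is nonzero.
   The sign bit and the exact fraction value are deliberately ignored;
   a zero fraction with the same exponent would be an infinity. */
bool is_nan_bits(double v) {
    uint64_t bits;
    memcpy(&bits, &v, sizeof bits);
    uint64_t exponent = (bits >> 52) & 0x7FF;
    uint64_t fraction = bits & ((UINT64_C(1) << 52) - 1);
    return exponent == 0x7FF && fraction != 0;
}
```

With this check, patterns such as 0x7FF8000000000000 and 0xFFF8000000000000 are both classified as NaN, while 0x7FF0000000000000 (positive infinity) is not.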
In any case, once we’ve started recording the execution log and set a watchpoint, we are ready to resume the program and wait for the condition to trigger:
(gdb) continue
Continuing.

Hardware watchpoint 2: *x

Old value = -0.19219211672498604
New value = -nan(0x8000000000000)
f2 (z=0x4052a0) at nan.c:11
11	  *z += 1.0;
Now we know that the NaN was produced in f2 right before line 11, that is, by the sqrt call on line 10. If this is not enough to diagnose the bug, we can use the execution log to go back in time. Let’s say we want to find out what the value of *x was before the last f1 call.
(gdb) tbreak f1
Temporary breakpoint 3 at 0x40115e: file nan.c, line 6.
(gdb) reverse-continue
Continuing.

Temporary breakpoint 3, f1 (y=0x4052a0) at nan.c:6
6	  *y -= 2.0;
(gdb) print *y
$1 = 1.807807883275014