Debugging tips for segmentation faults?

Tanktalus has asked for the wisdom of the Perl Monks concerning the following question:

Does anyone have some general tips for debugging segmentation faults that they'd like to share? Most of the time, perl -d is sufficient for debugging, but here, since the debugger and debuggee are in the same process, that segmentation fault kinda brings both down at the same time.

And my manager is kinda finicky that we don't check in anything that causes a segmentation fault :-)

I'm trying to narrow it down, but as I said, I can usually use perl -d, so I can much more easily get stack traces at warnings and the like, so I don't have nearly the experience at dealing with these as I'd like. I'm sure I'm doing something wrong, but I must be doing it wrong in such a way that the perl virtual machine gets really confused...

Update: And it doesn't help that the program hangs on filesystem calls every once in a while... (e.g., we have a driver that runs over NFS which has proved less than perfectly reliable on Linux where I'm testing this... as if NFS wasn't unreliable enough...)

Update 2: I should have said so at the beginning, but Anonymonk is reminding me so I'll say that I've had similar experiences moving from perl 5.6.1 to perl 5.8.0 solving a number of problems I had integrating XML::Twig into this same project (that requirement then went away). So I'm now testing/developing/going into production with perl 5.8.6. It took us nearly a year to get them to upgrade to 5.8.6, I'm not going to be able to convince them to move up again for at least another year, possibly two. However, I'm pretty convinced that I should be able to work around this segfault as this section of code was working just two weeks ago (last time I tested it), so I must have done something to twist perl's understanding of what to do.

Further, to hv's comment, I don't get core. I'm running as root. And my code won't work without root privileges. :-) But compiling a debugging perl may be an option. Thanks.

Re: Debugging tips for segmentation faults? by hv (Prior) on Jul 12, 2005 at 22:44 UTC
I love segfaults - anything that dumps core gives you a major headstart for debugging.

Before you start, think about what the program was asked to do: try to be clear in your mind what inputs it was given, and what it should have been trying to do.

If you have information on screen grab it now before it scrolls off.

If the inputs are sufficiently constrained that you feel any problems should be reproducible, try reproducing it: do it now, while the steps are fresh in your mind.

Have a perl built with debugging symbols; I usually build with:

```
./Configure -des -Doptimize='-g -O2'
```

.. but setting just '-g' will make the code a bit easier to step through. Keep the build tree handy, so that `gdb` can find the source files. If you can afford to, use this as the default perl - if you can't reproduce the problem it can be frustrating having a core file only for a non-debug build, but the debug build will be bigger and slower¹.

Point `gdb` at the core file, examine the stack trace to get an idea of where we've come from, then look at the current source line and examine the variables it's accessing: the direct cause is usually obvious, but this line is rarely where the bug is. (In my experience, the actual bug is mostly one or two steps of indirection away: "over there we allow the pointer to become garbage; when we get here we dereference it" is commonest.)

Assuming we're not looking at the line with the bug on it, if the crash is not reproducible it now gets difficult: consider instrumenting the code to show before/after values anywhere the bad data might get modified, in the hope that next time the problem occurs it will reveal more information about the cause. Or just examine such areas of code, and hope to understand enough to be able to spot where there is an incorrect assumption or bad logic.

If it is a reproducible crash, you can save time now by cutting the test case down. You're not trying to create a minimal test case at this point (I've wasted many hours forgetting that), just trying to make it easy to get to the point where the crash occurs (or some relevant point before that) - try to hardcode data that was coming from external sources (and in particular anything from the keyboard), and try to cut out irrelevant code that takes more than a few seconds.

Then use breakpoints, and inspect the code and data, to zero in on the cause of the problem. I can't give much specific advice here - the path you take will depend entirely on what you discover, and you'll need to have (or develop) a good understanding of C, and the way C is used in the perl interpreter, and possibly XS if you're trying to debug modules using it.

One trick I have found useful is to set a breakpoint on a problem function, set it to ignore many times, then run the test case and see where it stops:

```
(gdb) break S_regmatch
Breakpoint 1 at 0x810d0be: file regexec.c, line 2272.
(gdb) ignore 1 100000
Will ignore next 100000 crossings of breakpoint 1.
(gdb) run
[...]
(gdb) info break
Num Type           Disp Enb Address    What
1   breakpoint     keep y   0x0810d0be in S_regmatch at regexec.c:2272
        breakpoint already hit 1713 times
        ignore next 98287 hits
```

Now you know how many times the break point was reached, you can set the ignore count to 1 or 2 less than that to stop it at a useful point.

Hope some of this is helpful,

Hugo

¹: C code doesn't run slower just because you have debugging symbols around, but `Configure` will spot `-g` in the C flags, and turn on `-DDEBUGGING` which has a fair amount of overhead. I don't know how to stop it doing that.
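For anyone who wants to script the two gdb steps described above, here is a minimal sketch. The function names `core_backtrace` and `ignore_commands` are made up for illustration, and the generated commands assume a fresh gdb session in which the breakpoint gets number 1. With 1713 recorded hits, an ignore count of 1712 stops on the 1713th (last) crossing and 1711 stops one crossing earlier.

```shell
#!/bin/sh
# Hypothetical helpers; the function names are illustrative, not from the thread.

# Print a stack trace from a core file without an interactive session.
# $1 = path to the perl binary that crashed (ideally the debugging build)
# $2 = path to the core file
core_backtrace() {
    perl_bin=$1
    core_file=$2
    if [ ! -x "$perl_bin" ] || [ ! -r "$core_file" ]; then
        echo "usage: core_backtrace /path/to/perl /path/to/core" >&2
        return 2
    fi
    # -batch exits once the commands have run; each -ex supplies one gdb command.
    gdb -batch -ex 'bt' "$perl_bin" "$core_file"
}

# Emit gdb commands that stop shortly before the crash.
# $1 = function to break on
# $2 = hit count reported by "info break" on the crashing run
# $3 = how many less than that to ignore (default 1)
ignore_commands() {
    func=$1
    hits=$2
    early=${3:-1}
    printf 'break %s\n' "$func"
    # Assumes this is breakpoint number 1 in a fresh gdb session.
    printf 'ignore 1 %d\n' "$((hits - early))"
    printf 'run\n'
}
```

One way to use the second helper would be `ignore_commands S_regmatch 1713 > stop.gdb` followed by `gdb -x stop.gdb --args ./perl foo.pl`, where `stop.gdb` and `foo.pl` are placeholder names.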
Re^2: Debugging tips for segmentation faults? by Steve_p (Priest) on Jul 13, 2005 at 03:34 UTC
Adding the `-g` flag turns on `-DDEBUGGING` immediately following the `Configure` check for the optimize flags. If you want to compile without `-DDEBUGGING`, you'll have to run Configure by hand and remove the `-DDEBUGGING` from the cflags. If you're brave (or experienced in dealing with the config.sh file), you can edit config.sh by hand and remove the `-DDEBUGGING` from it after completing a run of `Configure`.
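A rough sketch of the config.sh edit, for the brave. The helper name `strip_ddebugging` is invented here, and which variables in config.sh actually carry the flag is an assumption, so it is worth running `grep -n DDEBUGGING config.sh` first and diffing against the backup afterwards. Other flags that merely start with the same text (for example `-DDEBUGGING_MSTATS`) are deliberately left alone.

```shell
#!/bin/sh
# Hypothetical helper: remove every standalone -DDEBUGGING token from a
# config.sh-style file, keeping a backup copy next to it.
strip_ddebugging() {
    config=$1
    [ -f "$config" ] || { echo "no such file: $config" >&2; return 1; }
    cp "$config" "$config.orig" || return 1    # backup to diff against later
    # First expression: flag followed by a non-identifier character
    # (space, quote, ...), which is kept. Second: flag at end of line.
    sed -e 's/ *-DDEBUGGING\([^A-Za-z0-9_]\)/\1/g' \
        -e 's/ *-DDEBUGGING$//' \
        "$config.orig" > "$config"
}
```

After an edit like this, the INSTALL file in the perl source tree describes running `sh Configure -S` to propagate config.sh changes before rebuilding; check the copy that ships with your version.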
Re: Debugging tips for segmentation faults? by Anonymous Monk on Jul 12, 2005 at 23:00 UTC
First thing to do, try several different versions of perl to see if it's a bug that's already been fixed in recent releases, or a regression which needs re-fixing. That process usually smokes out about 95% of perl bugs for me. YMMV.
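If several perl builds are installed side by side, a small loop can do the comparison. This is only a sketch: `try_perls` and `describe_status` are made-up names, and the interpreter paths are whatever you have locally. It relies on the shell convention that a process killed by signal N is reported as status 128+N, so a SIGSEGV (signal 11 on Linux) shows up as 139. A program that deliberately exits with a status above 128 would be misreported.

```shell
#!/bin/sh
# Hypothetical helpers: run one script under several perl binaries and
# report how each run ended.

# Translate a shell exit status into a short description.
describe_status() {
    code=$1
    if [ "$code" -eq 0 ]; then
        echo "ok"
    elif [ "$code" -gt 128 ]; then
        # Shells report "killed by signal N" as 128+N.
        echo "killed by signal $((code - 128))"
    else
        echo "exit $code"
    fi
}

# $1 = script to run; remaining arguments = perl binaries to try.
try_perls() {
    script=$1
    shift
    for perl_bin in "$@"; do
        rc=0
        "$perl_bin" "$script" >/dev/null 2>&1 || rc=$?
        printf '%s: %s\n' "$perl_bin" "$(describe_status "$rc")"
    done
}
```

A call might look like `try_perls foo.pl /usr/bin/perl /opt/perl-5.8.7/bin/perl`, where both the script name and the second path are placeholders.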
Re: Debugging tips for segmentation faults? by Zaxo (Archbishop) on Jul 13, 2005 at 03:31 UTC
You can run `strace perl foo.pl` for a quick and dirty look at what kind of thing is going wrong, and where. That probably won't get you all the answers, but it'll likely get you looking in the right places.

After Compline, Zaxo
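The strace output can get long, so it may help to save it to a file and look only at what happened just before the signal. A minimal sketch, assuming the log mentions the signal by name (`SIGSEGV`) and that your grep supports `-B` (GNU and BSD grep do); `calls_before_segv` is a made-up helper name.

```shell
#!/bin/sh
# Hypothetical helper: show the lines leading up to a SIGSEGV in a saved
# strace log, with line numbers.
calls_before_segv() {
    log=$1
    context=${2:-20}    # how many preceding lines to show
    grep -n -B "$context" 'SIGSEGV' "$log"
}
```

The log itself could be produced with `strace -f -o trace.log perl foo.pl` (`-f` follows child processes, `-o` writes to a file), then inspected with `calls_before_segv trace.log`.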
Re: Debugging tips for segmentation faults? by Steve_p (Priest) on Jul 13, 2005 at 03:38 UTC
"Further, to hv's comment, I don't get core. I'm running as root. And my code won't work without root privileges. :-) But compiling a debugging perl may be an option. Thanks."

You probably have your ulimits set to not produce a core file. Before testing your script, set your ulimits (assuming `bash`) with `ulimit -c unlimited`
Re^2: Debugging tips for segmentation faults? by Tanktalus (Canon) on Jul 13, 2005 at 13:08 UTC
Thanks - although I've run that once now, too, and didn't get a core (at least, not one that I can find). I'm really not sure where it may have ended up. If it was created at all.
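A quick way to check the settings that decide whether and where a core file appears. This is a sketch only; `core_settings` is a made-up name and the `/proc` path is Linux-specific. Two things worth keeping in mind: the limit is inherited, so `ulimit -c unlimited` has to be run in the same shell that then launches the program, and by default the core is written to the crashing process's current working directory, which is not the launch directory if the script has called `chdir`.

```shell
#!/bin/sh
# Hypothetical helper: summarise the settings that affect core file creation.
core_settings() {
    # A limit of 0 means no core file is written.
    echo "core size limit: $(ulimit -c)"
    echo "working directory: $(pwd)"
    # Linux only: the kernel's naming pattern for core files.
    if [ -r /proc/sys/kernel/core_pattern ]; then
        echo "core_pattern: $(cat /proc/sys/kernel/core_pattern)"
    fi
}
```

Running `ulimit -c unlimited; core_settings` just before the failing command shows what that command will inherit.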
Re: Debugging tips for segmentation faults? by BUU (Prior) on Jul 13, 2005 at 03:16 UTC
"I should have said so at the beginning, but Anonymonk is reminding me so I'll say that I've had similar experiences moving from perl 5.6.1 to perl 5.8.0 solving a number of problems I had integrating XML::Twig into this same project (that requirement then went away). So I'm now testing/developing/going into production with perl 5.8.6. It took us nearly a year to get them to upgrade to 5.8.6, I'm not going to be able to convince them to move up again for at least another year, possibly two."

However, at the very least, you should try it with these new versions. If you find that it works in one version but breaks in the previous version, then you have a much smaller area to examine for changes. It would probably be much easier to try to diff, say, perl 5.8.5 and perl 5.8.6 to see exactly what bug you've hit. Or possibly not.

