http://qs321.pair.com?node_id=143111


in reply to Trouble with mod_perl, Apache::Session::Postgres, and DBI

I'll start off by admitting that I typically use MySQL, and that I'm only starting to get into mod_perl and CGI in Perl. That said, I had a segfault problem just yesterday with a non-obvious solution. This is a story. I have no idea if this will help you, but I'll tell it because it just might.

My system is Debian GNU/Linux of no particular version, having been upgraded in bits and pieces since it was installed eons ago. (Package names I mention here will be the Debian packages.) The script in question worked fine from the command line, but causeed an irritatingly uninstructive "[notice] child pid 32570 exit signal Segmentation fault (11)" to appear in Apache's error log when run under mod_perl. Other scripts ran just fine under mod_perl, and even the very one causing trouble ran fine if I removed the DBI->connect() call.

I started my journey at uh... segfault. mod_perl, Apache::DBI and other atrocities., which suggests compiling Apache with --disable-rule=EXPAT. Unfortunately, that didn't seem to help me at all. I think this one's worth looking at anyway, though, as it seems to be a common problem. It's common enough that it's documented in the Apache sources.

Following this lead, I started recompiling everything that could be possibly related to this problem. It sure looked like some kind of library mismatch problem. I recompiled mod_perl, I recompiled libmysqlclient10, I recompiled Apache a few more times, and so on. I stopped at recompiling libc6 because that was a bit of a serious undertaking, but I didn't think that would help anyway.

Next, I scoured the web a bit more, using Google. I turned up mod_perl guide: Debugging mod_perl, which showed me a few more things to try.

I ran apache with the -X option, so I could observe what it was doing. I ran tail -f on the Apache logs. The server started up, I connected and requested the troublesome page, and then the server segfaulted. Okay, that was not helpful. Then I ran it under strace to see what it was doing. Unfortunately, it made 160k of output, and nothing was obviously wrong near the end. That wasn't of much help.

I fired up apache under gdb, as suggested by the guide. The last call was to db_open in libdb.so.3. Well, that didn't tell me anything really interesting, I thought, because my script was only segfaulting when I called DBI->connect().

I did another web search, though, and discovered that installing libdb3 causes libdb2 programs to break on my type of system, in particular causing things like libns-db (which allows /etc/passwd and /etc/shadow to be databases) to break. Now, my system doesn't use database passwd or shadow, but I thought perhaps the libdb2/3 problem was affecting Apache and friends. First, I tried uninstalling libdb3, but that didn't work because libapache-mod-ssl depended on it. Hrmm, interesting. I tried upgrading both libdb3 and libapache-mod-ssl, but they were both the newest version. Even more interesting. Obviously libdb2 was the newest version, being one of the first things I tried upgrading. Well, let's just see what happens if I upgrade libns-db. .... oh, it's installing. Hm.

Run apache.
No segfault.
Do the dance.

That was a completely, totally, absolutely, unexpected solution.

So, now we return to this problem. Have you tried running the script from the command line? How about under Apache, without mod_perl? Have you strace/truss'd apache -X, and did it tell you anything interesting? Did you try gdb (or your favourite debugger) on apache -X to see what it was doing right at the end? Some of these just might give you the clue you need to solve this one.

Good luck!