Beefy Boxes and Bandwidth Generously Provided by pair Networks
good chemistry is complicated,
and a little bit messy -LW
 
PerlMonks  

debugging apache response status 104 for cgi script

by blahblahblah (Priest)
on Oct 20, 2006 at 18:09 UTC ( [id://579664]=perlquestion: print w/replies, xml ) Need Help??

blahblahblah has asked for the wisdom of the Perl Monks concerning the following question:

Hi,

I'm looking for advice on debugging an error in a cgi script. First, some background info: the script in this case is accepting utf8 data in many languages, including Japanese. Several Japanese users are experiencing occasional errors where they get no response from the server. At first we were inclined to blame the network, but now we know that other language users on the same network do not see the errors. Also, when the users swap machines the problems follow the Japanese users exclusively. And when the user goes back and re-types the same info into the form, it works on the next submission.

From the apache logs and from my own logging, I'm inclined to believe that this might be an apache error. The access_log shows this:

clientip - - [20/Oct/2006:16:53:56 +0000] "POST /...myscript.pl HTTP/1 +.1" 104 617 0
this breaks down as:
LogFormat "%h %t \"%r\" %>s %b %T" %...h: Remote host %...t: Time, in common log format time format %...r: First line of request %>..s: Status %...b: Bytes sent, excluding HTTP headers. %...T: seconds taken to serve the request.
Aside from the date/time, the line is always the same. The first interesting thing to me was the 104 status code. Then I noticed that 617 bytes is the same size as the standard "500 internal server error" page. There is nothing in the error_log when these errors occur.

I write a log file at the start of my script, but when this 104 error happens the log file does not get written. So maybe this is an apache problem more than a perl problem? The 104 code would at least make me think that apache is getting a little confused.

My boss is still focused on ruling out the network. He wants the user to try yet another client machine, bypassing the proxy server. If that also fails, I want to have a next step in mind. We already verified (with a local proxy on the client's machine) that the request looked ok. Is there a way in apache to make it log the entire request before any processing occurs? Then we could definitely rule out the network.

How would you debug this problem? Could I attach strace to every apache process -- and if so, how could I make sense of the massive amounts of resulting data? Is there some apache option that might help here?

I found only 1 similar problem when searching google for the 104 status code: http://mail-archives.apache.org/mod_mbox/httpd-users/200302.mbox/%3C04aa01c2d76b$e890a020$08071bce@WS4%3E But there was no resolution in that thread.

Since I don't know what's relevant as far as versions, here's all the info I could think of mentioning:

[root@xxx4 debugging]# uname -a Linux xxx4.xxx.net 2.4.20-021stab028.17.777-enterprise #1 SMP Tue Jul +19 19:31:27 MSD 2005 i686 i686 i386 GNU/Linux [root@xxx4 debugging]# /usr/sbin/httpd -l Compiled in modules: core.c prefork.c http_core.c mod_so.c [root@xxx4 debugging]# /usr/sbin/httpd -V Server version: Apache/2.0.51 Server built: Feb 22 2005 19:17:50 Server's Module Magic Number: 20020903:9 Architecture: 32-bit Server compiled with.... -D APACHE_MPM_DIR="server/mpm/prefork" -D APR_HAS_SENDFILE -D APR_HAS_MMAP -D APR_HAVE_IPV6 (IPv4-mapped addresses enabled) -D APR_USE_SYSVSEM_SERIALIZE -D APR_USE_PTHREAD_SERIALIZE -D SINGLE_LISTEN_UNSERIALIZED_ACCEPT -D APR_HAS_OTHER_CHILD -D AP_HAVE_RELIABLE_PIPED_LOGS -D HTTPD_ROOT="/etc/httpd" -D SUEXEC_BIN="/usr/sbin/suexec" -D DEFAULT_PIDLOG="run/httpd.pid" -D DEFAULT_SCOREBOARD="logs/apache_runtime_status" -D DEFAULT_LOCKFILE="run/accept.lock" -D DEFAULT_ERRORLOG="logs/error_log" -D AP_TYPES_CONFIG_FILE="conf/mime.types" -D SERVER_CONFIG_FILE="conf/httpd.conf" [root@xxx4 debugging]# /usr/sbin/httpd -v Server version: Apache/2.0.51 Server built: Feb 22 2005 19:17:50 perl -V Summary of my perl5 (revision 5.0 version 8 subversion 1) configuratio +n: Platform: osname=linux, osvers=2.2.12-20, archname=i586-linux uname='linux mars 2.2.12-20 #1 mon sep 27 10:25:54 edt 1999 i586 u +nknown ' config_args='' hint=recommended, useposix=true, d_sigaction=define usethreads=undef use5005threads=undef useithreads=undef usemultipl +icity=undef useperlio=define d_sfio=undef uselargefiles=define usesocks=undef use64bitint=undef use64bitall=undef uselongdouble=undef usemymalloc=n, bincompat5005=undef Compiler: cc='cc', ccflags ='-fno-strict-aliasing -D_LARGEFILE_SOURCE -D_FIL +E_OFFSET_BITS=64 -I/usr/include/gdbm', optimize='-O2', cppflags='-fno-strict-aliasing -I/usr/include/gdbm' ccversion='', gccversion='egcs-2.91.66 19990314/Linux (egcs-1.1.2 +release)', gccosandvers='' intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234 d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=1 +2 ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', + lseeksize=8 alignbytes=4, prototype=define Linker and Libraries: ld='cc', ldflags =' -L/usr/local/lib' libpth=/usr/local/lib /lib /usr/lib libs=-lnsl -lndbm -lgdbm -ldb -ldl -lm -lcrypt -lutil -lc -lposix perllibs=-lnsl -ldl -lm -lcrypt -lutil -lc -lposix libc=/lib/libc-2.1.2.so, so=so, useshrplib=false, libperl=libperl. +a gnulibc_version='2.1.2' Dynamic Linking: dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-rdynami +c' cccdlflags='-fpic', lddlflags='-shared -L/usr/local/lib' Characteristics of this binary (from libperl): Compile-time options: USE_LARGE_FILES Built under linux Compiled at Feb 18 2004 10:50:23 @INC: /usr/my_perl/lib/5.8.1/i586-linux /usr/my_perl/lib/5.8.1 /usr/my_perl/lib/site_perl/5.8.1/i586-linux /usr/my_perl/lib/site_perl/5.8.1 /usr/my_perl/lib/site_perl

Thanks,
Joe

Update: added readmore tags

Replies are listed 'Best First'.
Re: debugging apache response status 104 for cgi script
by shmem (Chancellor) on Oct 20, 2006 at 22:37 UTC
    I'd use a network sniffer like wireshark, ethereal or tcpdump and capture the packets of interest on the server, possibly into a ring buffer. tail -f $access_log | grep 104 should then suffice to stop capturing in a timely manner to have the whole client-server conversation in the buffer.

    BTW, are you using strict and warnings in your perl code? (obligatory question :-)

    --shmem

    _($_=" "x(1<<5)."?\n".q·/)Oo.  G°\        /
                                  /\_¯/(q    /
    ----------------------------  \__(m.====·.(_("always off the crowd"))."·
    ");sub _{s./.($e="'Itrs `mnsgdq Gdbj O`qkdq")=~y/"-y/#-z/;$e.e && print}
      Thanks for the advice. It took some effort, but we were able to get ethereal going to monitor the server traffic. There is a definite pattern to the failures. Every one of them coincides with a TCP RST command, like this:
      client > server HTTP POST server > client ACK client > server RST
      I don't know enough about networking to interpret what that means, but I can see a one-to-one correlation between those RST's and the 104 status codes in our log.

      My best guess is that some firewall/antivirus/router/whatever in between their citrix and our server is examining the content of the traffic and deciding to kill the session because the Japanese UTF-8 data looks malicious.

      Does that seem like a reasonable theory? Anyone else know TCP well enough to offer other theories?

      Thanks,
      Joe

        Your theory is reasonable, but I can't tell whether it's accurate without having the logs. Next would be a network log at the client machine, which will be even more difficult than the server side sniffing, I guess. But now you should focus on determining the guilty party. Is it really the original client sending the RST, or is it some other component?

        Some hints.. you talk of Several Japanese users in the OP, do they come along the same network path? A traceroute to the client could tell you. The ethereal gui allows you to look at the packet's content and at various header fields. Are the sequence numbers like what is expected? Does the ACK from the server carry any payload? It shouldn't, and specially there shouldn't be any UTF-8 data present. What's the time difference between the ACK and the client RST? Maybe the client is running into a timeout?

        Again, without having the capture it's difficult to tell.

        --shmem

        _($_=" "x(1<<5)."?\n".q·/)Oo.  G°\        /
                                      /\_¯/(q    /
        ----------------------------  \__(m.====·.(_("always off the crowd"))."·
        ");sub _{s./.($e="'Itrs `mnsgdq Gdbj O`qkdq")=~y/"-y/#-z/;$e.e && print}
Re: debugging apache response status 104 for cgi script
by blahblahblah (Priest) on Nov 07, 2006 at 15:01 UTC
    Just an update, in case someone else has the same problem and stumbles across this node. It did in fact turn out to be some sort of network problem, though our customer's network people didn't explain any specifics to us. Shmem, thanks for all your help.

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://579664]
Approved by ww
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others drinking their drinks and smoking their pipes about the Monastery: (5)
As of 2024-04-25 08:45 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found