http://qs321.pair.com?node_id=17704
Category: Web Stuff
Author/Contact Info: Les Howard les@lesandchris.com
Description: This program reads from an Apache logfile pipe. It automatically compresses the data and splits it into files, each containing one day's worth of data. Set the directory to write the log files to in the environment variable LOG_DIR. Log files are named access_log_YYYY_MM_DD.N.gz. Here is how I have Apache configured to call this program:

CustomLog |/path/to/websplit.pl combined

#!/usr/bin/perl -w

use strict;

use Time::ParseDate;
use Time::CTime;
use Compress::Zlib;

my $out_dir=$ENV{LOG_DIR};

my $cur_str='';
my $outlog;

my $line;

$SIG{TERM}= sub {
  # handle Apache shutting down by closing the .gz file properly
  if(defined $outlog){
    $outlog->gzclose();
    undef $outlog;
  }
  # exit so the loop below never writes to the closed handle
  exit 0;
};

  while (defined($line=<STDIN>)){
    chomp $line;
    if($line=~/\[(\d+)\/(\w+)\/(\d+)\:/){
      #lines must contain an extractable date
      my ($dd,$mm,$yy)=($1,$2,$3);
      $dd=~s/^0//g;
      my $str="$dd $mm $yy";
      if($str ne $cur_str){
        # new day, rotate to next file
        $cur_str=$str;
        my $dt=parsedate($str);
        my $ts=strftime("%Y_%m_%d",localtime($dt));
        $ts=~s/\s/0/g;
        # close current log (If necessary)
        $outlog->gzclose() if defined $outlog;
        # find a unique name for the new log (access_log_YYYY_MM_DD.N.gz)
        my $ct=0;
        my $fn="$out_dir/access_log_$ts.$ct.gz";
        while(-f $fn){
          $ct++;
          $fn="$out_dir/access_log_$ts.$ct.gz";
        }
        $outlog=gzopen($fn,"a")
          or die "Cannot open $fn: $gzerrno\n";
      }
      $outlog->gzwrite($line."\n");
    }else{
      # unrecognized line.... there should probably be some error checking here
    }
  }

  if(defined $outlog){
    $outlog->gzclose();
  }
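
The date-extraction step can be tried on its own, which also gives a place to decide what to do with unrecognized lines. What follows is only a minimal sketch: it swaps Time::ParseDate and Time::CTime for a hard-coded month table so it runs with core Perl alone, and the helper name log_date_stamp and the sample log line are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Month abbreviations as they appear in Apache's default timestamp format.
my %MONTH = (
  Jan => 1, Feb => 2, Mar => 3, Apr => 4,  May => 5,  Jun => 6,
  Jul => 7, Aug => 8, Sep => 9, Oct => 10, Nov => 11, Dec => 12,
);

# Hypothetical helper: return "YYYY_MM_DD" for a log line, or undef if the
# line carries no recognizable [dd/Mon/yyyy: timestamp.
sub log_date_stamp {
  my ($line) = @_;
  return undef unless $line =~ m{\[(\d{1,2})/(\w{3})/(\d{4}):};
  my ($dd, $mon, $yyyy) = ($1, $2, $3);
  my $mm = $MONTH{$mon} or return undef;   # unknown month name
  return sprintf("%04d_%02d_%02d", $yyyy, $mm, $dd);
}

# Made-up sample line in the combined log format.
my $sample = '127.0.0.1 - - [12/Jun/2000:19:29:00 -0400] "GET / HTTP/1.0" 200 512';
my $stamp  = log_date_stamp($sample);
print defined $stamp ? "$stamp\n" : "unrecognized line\n";
```

A caller could count or log the lines for which the helper returns undef rather than dropping them silently.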
Replies are listed 'Best First'.
RE: Apache log splitter/compressor
by Anonymous Monk on Jun 12, 2000 at 19:29 UTC
    All the standalone programs are great, and this one is no exception. However, it would be nice to rewrite these as an Apache module, preferably a C version for the lusers out there that haven't installed mod_perl (what are they thinking?!). Anybody know if there is a decent package either in C or Perl?
RE: Apache log splitter/compressor
by jjhorner (Hermit) on Jun 12, 2000 at 19:32 UTC

    lhoward, some AM below requested this in mod_perl format.

    Do you mind if I rewrite for use in mod_perl? Or are you going to do it?

    J. J. Horner
    Linux, Perl, Apache, Stronghold, Unix
    jhorner@knoxlug.org http://www.knoxlug.org/
    
      Feel free to rewrite it in mod_perl format. I currently have no plans to rewrite it in mod_perl myself. I originally wrote this outside of mod_perl because I was not using mod_perl when I developed it.

      One feature I would love to add to this program is real-time reverse-DNS lookups of browsing hosts. The tricky part about that is you have to do it in a way that doesn't block the webserver (so it isn't slowed down).
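
      One partial step in that direction is to cache the lookups, so that each address costs at most one resolver round trip. The sketch below shows only that caching idea and does not make the lookup itself asynchronous; make_cached_resolver and blocking_lookup are hypothetical names, and the usage lines pass in a stub lookup so nothing touches the network.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Socket qw(inet_aton AF_INET);

# Default lookup: a blocking reverse-DNS query via the core gethostbyaddr.
sub blocking_lookup {
  my ($ip) = @_;
  my $packed = inet_aton($ip) or return undef;
  return scalar gethostbyaddr($packed, AF_INET);
}

# Hypothetical helper: wrap a lookup function in a per-process cache so each
# address is resolved at most once. Failed lookups are cached too (as the
# bare IP) so the resolver is not asked repeatedly for the same dead host.
sub make_cached_resolver {
  my ($lookup) = @_;
  $lookup ||= \&blocking_lookup;
  my %cache;
  return sub {
    my ($ip) = @_;
    unless (exists $cache{$ip}) {
      my $name = $lookup->($ip);
      $cache{$ip} = defined $name ? $name : $ip;
    }
    return $cache{$ip};
  };
}

# Usage with a stub lookup so the example runs without touching the network.
my $calls   = 0;
my $resolve = make_cached_resolver(sub { $calls++; return "host-$_[0].example" });
printf "%s\n", $resolve->('192.0.2.1') for 1 .. 3;
print "lookups performed: $calls\n";
```

      The first request from a new host still waits on the resolver, so a busy server would also want a timeout or a separate resolver process on top of this.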

        I sometimes have trouble deciding when something should be programmed or scripted. If I need my nightly logs DNS-checked, should I write the program to do it automagically, or should I just set up a cron job to run logresolve every night?

        I guess I'm not the only person who has ever had this issue.

        J. J. Horner
        Linux, Perl, Apache, Stronghold, Unix
        jhorner@knoxlug.org http://www.knoxlug.org/