http://qs321.pair.com?node_id=695105

kabeldag has asked for the wisdom of the Perl Monks concerning the following question:

Hey all. I'm just curious,

Webalizer is a great log statistics tool for a webserver, but I am wondering whether there is a Perl module (or combo of Perl modules) that does what Webalizer does, but with more detail?

A lot of organizations and individuals do a great deal of log parsing (in this case web server log parsing), and they have their reasons for that. However, is there a platform-independent Perl solution for providing web server statistics in a fashion that equals, or perhaps does more than, Webalizer? I'm not talking about a full-blown CMS, no. I mean a log watcher that provides methods for statistical purposes, either on a single site or on multiple sites, and that scales -- in terms of the number of sites/hosts and CGI integration ability per site/host.

I don't want to take anything away from the Webalizer developers, because they do a great job. I'm just wondering whether somebody, or some people, have gone to the effort in Perl to achieve the same goals & more, and if so, what the pros and cons are.

Anybody?

Re: Scalable Perl Webalizer Alternative(s)
by marto (Cardinal) on Jul 02, 2008 at 09:06 UTC

      I see that AWStats uses MaxMind's GeoIP 'Perl API'.
      I know you can obtain CIDR country nets from ftp.(apnic)|(arin)|(lacnic)|(afrinic)|(ripe).net, yet they all have different FTP directory trees, which makes creating modules for such a use hopeless -- unless they 'honest to god' kept those FTP directory trees constant (do they keep their directory trees constant?).

      On a different subject. Country CIDR-net firewalling is something that a lot of countries/orgs/individuals practice these days. It's quite sad. But you have to do it when you have to do it.
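
For the country CIDR-net idea above, here is a minimal sketch of mapping a client IP to a country once the nets have been fetched. It is written in Python only for brevity. The table contents are assumptions: the ranges are reserved documentation networks and the country codes `AA` and `BB` are made up, as are the helper names `build_index` and `country_for`. Nothing here reflects how AWStats or the GeoIP module work internally.

```python
import ipaddress

# Hypothetical country-to-CIDR table. The ranges are reserved documentation
# networks and the country codes are invented for illustration only.
COUNTRY_NETS = {
    "AA": ["192.0.2.0/24"],
    "BB": ["198.51.100.0/24", "203.0.113.0/25"],
}

def build_index(table):
    """Flatten the table into (network, country) pairs, most specific first."""
    pairs = [
        (ipaddress.ip_network(cidr), country)
        for country, cidrs in table.items()
        for cidr in cidrs
    ]
    return sorted(pairs, key=lambda p: p[0].prefixlen, reverse=True)

def country_for(ip, index):
    """Return the country code whose network contains ip, or None."""
    addr = ipaddress.ip_address(ip)
    for net, country in index:
        if addr in net:
            return country
    return None

INDEX = build_index(COUNTRY_NETS)
print(country_for("198.51.100.7", INDEX))
print(country_for("203.0.113.200", INDEX))
```

The linear scan is fine for a handful of nets; a full registry dump would probably want a sorted-range or trie lookup instead.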

      marto, have you used AWStats? Is it stable? And have you used MaxMind's GeoIP Perl module (either one) on its own?

        I have used awstats in the past, and am about to start using it again for some projects I am working on at the moment (logging normal web traffic as well as streaming media).

        In the past I have never had any problems with it, and I know that several large ISPs that I have worked for in the past use it. Also it is written in Perl, which was a requirement you mentioned in your post.

        Martin
Re: Scalable Perl Webalizer Alternative(s)
by grinder (Bishop) on Jul 02, 2008 at 11:22 UTC

    You do realise of course that Webalizer was originally written in Perl? It was only later, once the feature set had settled down, that the author rewrote it in C for speed. For heavy traffic sites where you're crunching millions of lines per day, the decrease in run time is appreciable.

    In the eight years or so I've been using Webalizer, none of the other applications I've evaluated have shown themselves to be better. awstats may be flashier (Webalizer is pretty stodgy), but I don't find its results as useful.

    I do like Visitors a lot. It's fast and its results are gorgeous. It just doesn't do history as well as Webalizer, but in other respects it's quite complementary.

    • another intruder with the mooring in the heart of the Perl

Re: Scalable Perl Webalizer Alternative(s)
by moritz (Cardinal) on Jul 02, 2008 at 09:15 UTC
    I don't want to take anything away from the Webalizer developers

    webalizer is open source (GPL). You can contribute to that cool project by sending patches with your enhancements. (If they don't like the direction you take, you can still fork the project without taking anything away from them.)

    (Update: d'oh, just noticed that I confused webalizer with awstats. Both are open source, but webalizer is written in C, so perhaps not the answer you're looking for.)

    BTW the other day I saw a cool Perl project that dumps Apache log files into a SQLite database. Depending on your goals that might help you: you could then do much of your analysis in SQL.
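
To make the log-to-SQLite idea concrete, here is a rough sketch, in Python for brevity, of loading access-log lines into a table and summing bytes per client. It is not the project mentioned above and makes no claim about how that tool works. The regex assumes the leading Common Log Format fields, the sample lines are made up, and the table and column names (`logs`, `source`, `size`) are borrowed from the query quoted later in this thread.

```python
import re
import sqlite3

# Leading Common Log Format fields:
# host ident user [time] "request" status size
LINE_RE = re.compile(
    r'^(?P<source>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\d+|-)'
)

def load_lines(conn, lines):
    """Insert every parseable log line into the logs table; return the count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS logs "
        "(source TEXT, time TEXT, request TEXT, status INTEGER, size INTEGER)"
    )
    rows = []
    for line in lines:
        m = LINE_RE.match(line)
        if m is None:
            continue  # skip lines that do not match the expected format
        # A size of "-" means no body was sent; store it as 0 bytes.
        size = 0 if m["size"] == "-" else int(m["size"])
        rows.append((m["source"], m["time"], m["request"], int(m["status"]), size))
    conn.executemany("INSERT INTO logs VALUES (?, ?, ?, ?, ?)", rows)
    return len(rows)

# Made-up sample lines using reserved documentation addresses.
SAMPLE = [
    '192.0.2.1 - - [02/Jul/2008:09:15:00 +0000] "GET / HTTP/1.1" 200 1024',
    '192.0.2.1 - - [02/Jul/2008:09:15:02 +0000] "GET /a.png HTTP/1.1" 200 2048',
    '198.51.100.7 - - [02/Jul/2008:09:16:00 +0000] "GET / HTTP/1.1" 304 -',
    'not a log line',
]

conn = sqlite3.connect(":memory:")
loaded = load_lines(conn, SAMPLE)
traffic = conn.execute(
    "SELECT source, SUM(size) FROM logs GROUP BY source ORDER BY source"
).fetchall()
print(loaded, traffic)
```

Once the rows are in a table, ad-hoc questions (top referrers, error rates per hour) become one-line queries rather than new parsing code.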

      I'm going to assume that you're talking about my asql tool, which was previously mentioned here as Querying Apache logfiles via SQL, and can be very useful for ad-hoc queries.

      But being console-based it does lack the prettiness of awstats, webalizer, etc.

      Steve
      --

        Hi, Steve:

        Your tool is very cool! It could be even better if you added a '-e' command-line switch like MySQL's; then we could easily automate sending the output data to generate graphs or tables, without needing an 'expect' script to handle the interactive parts, e.g.

        asql -e 'select source, sum(size) from logs group by source' > ip_traffic.dat

        thanks

        lihao

Re: Scalable Perl Webalizer Alternative(s)
by Your Mother (Archbishop) on Jul 02, 2008 at 22:04 UTC

    I've done a lot of log parsing myself and it can be fun and rewarding but my advice is: Don't! Google's Urchin/Analytics is so good now that there is no reason I can think of to not use it and many that commend it far above regular log parsers. You will get a level of detail that is amazing and it will be much more accurate for things like physical location of visitors, visit times, site abandonment, etc.