PerlMonks  

Re: [OT] Simple access_log analyzer in Perl

by l2kashe (Deacon)
on Apr 09, 2003 at 19:33 UTC


in reply to [OT] Simple access_log analyzer in Perl

I have a great one!! It's just a matter of getting it from my mind to a source file. :P

Seriously though, look at what you are asking for, and what you didn't provide:

Written in Perl:
This is easy; the only question is what platform? *nix, Mac (which I guess is a *nix now), MS?

Simple, not many features:
Ok, what features do you want? Why?

Neat code, at least use strict, preferably with warnings on:
Ok, first off, neat code. By whose standards? Yours? The Perl community at large? By an X platform developer? By an X language developer?

Next, use strict.. Hrm, people are either for or against its blind use; I guess in this case this is a relatively straightforward requirement.
With warnings on: Why? If this is going to get run out of cron, and it warns on stuff, you are going to get an email every time it runs..
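
If you do want warnings on under cron, one way around the mail flood is to route them to a file. A rough sketch, not a recommendation; install_warn_logger and the log file name are made up for illustration:

```perl
use strict;
use warnings;

# Route warnings to a log file so a cron run stays quiet on STDERR.
# install_warn_logger and the log path are hypothetical names for this sketch.
sub install_warn_logger {
    my ($path) = @_;
    $SIG{__WARN__} = sub {
        my ($message) = @_;
        if (open my $log, '>>', $path) {
            print {$log} scalar(localtime), " $message";
            close $log;
        }
        else {
            print STDERR $message;    # fall back if the log is unwritable
        }
    };
    return;
}

install_warn_logger('analyzer-warnings.log');

my %hits;
my $count = $hits{'no-such-day'};
warn "no hits recorded\n" unless defined $count;    # lands in the log file
```

Whether silencing cron mail is a good idea is a separate question; somebody still has to read that log.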

Just ONE script, 1 file:
Again, why? How much functionality do you want? Looking ahead, we can find the data points you think are relevant.

Unique hits per day:
Ok, at least one pass over the data, assuming no one is using NAT at any point (otherwise several visitors collapse into one address), keeping a hash of hits; simple enough.
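
For the sake of argument, the hash of hits could look something like this. It is only a sketch: it assumes Common Log Format (client host first, timestamp in square brackets), treats one address as one visitor, and unique_hits_per_day is a name I made up:

```perl
use strict;
use warnings;

# Count distinct client addresses per day.
# Assumes Common Log Format; lines that do not match are skipped.
sub unique_hits_per_day {
    my @lines = @_;
    my %seen;    # $seen{$day}{$host} = requests from $host on $day
    for my $line (@lines) {
        next unless $line =~ m{^(\S+) \S+ \S+ \[(\d{2}/\w{3}/\d{4}):};
        my ($host, $day) = ($1, $2);
        $seen{$day}{$host}++;
    }
    # Collapse each inner hash to a count of distinct hosts.
    my %unique;
    for my $day (keys %seen) {
        $unique{$day} = scalar keys %{ $seen{$day} };
    }
    return %unique;
}

my @sample = (
    '10.0.0.1 - - [09/Apr/2003:19:33:00 +0000] "GET / HTTP/1.0" 200 512',
    '10.0.0.1 - - [09/Apr/2003:19:34:10 +0000] "GET /a HTTP/1.0" 200 128',
    '10.0.0.2 - - [09/Apr/2003:20:01:45 +0000] "GET / HTTP/1.0" 200 512',
    '10.0.0.1 - - [10/Apr/2003:08:00:00 +0000] "GET / HTTP/1.0" 200 512',
);

my %unique = unique_hits_per_day(@sample);
printf "%s: %d\n", $_, $unique{$_} for sort keys %unique;
```

Note that the day keys sort as plain strings here, which only works within a single month; a real script would want to normalize the date first.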

Referrers:
Provided that data is in the log as well, we now either A) need to enhance the first hash to keep ticks on who came from where when we count them, or B) keep a second data structure with ref counts..
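
Option B might be sketched like so. This assumes Combined Log Format, where the referrer is the first quoted field after the status code and byte count; count_referrers is a hypothetical name:

```perl
use strict;
use warnings;

# Tally referrers in a second, separate hash (option B).
# Assumes Combined Log Format; "-" means no referrer was sent.
sub count_referrers {
    my @lines = @_;
    my %refs;
    for my $line (@lines) {
        next unless $line =~ /" \d{3} \S+ "([^"]*)"/;
        my $ref = $1;
        next if $ref eq '-' or $ref eq '';
        $refs{$ref}++;
    }
    return %refs;
}

my @sample = (
    '10.0.0.1 - - [09/Apr/2003:19:33:00 +0000] "GET / HTTP/1.0" 200 512 "http://www.example.com/links.html" "Mozilla/4.0"',
    '10.0.0.2 - - [09/Apr/2003:19:40:00 +0000] "GET / HTTP/1.0" 200 512 "http://www.example.com/links.html" "Mozilla/4.0"',
    '10.0.0.3 - - [09/Apr/2003:19:41:00 +0000] "GET / HTTP/1.0" 200 512 "-" "Mozilla/4.0"',
);

my %refs = count_referrers(@sample);
# Most frequent referrer first, ties broken alphabetically.
for my $ref (sort { $refs{$b} <=> $refs{$a} or $a cmp $b } keys %refs) {
    printf "%4d  %s\n", $refs{$ref}, $ref;
}
```

If the server only writes Common Log Format, the referrer simply isn't there and none of this applies.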

Search engine keywords:
What search engine? Do all of their log entries appear the same (i.e. are they all in the same format)? Does the web server know when it's being accessed by a browser as opposed to a search engine?
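
To show why the format question matters: pulling keywords out of a referrer means knowing, per engine, which query parameter holds the search terms. A sketch only; the %engine_param table and search_keywords are assumptions for illustration, and the table would need an entry for every engine you care about:

```perl
use strict;
use warnings;

# Hypothetical table: substring of the referrer host => query parameter
# that carries the search terms. The entries are assumptions for the sketch.
my %engine_param = (
    'google.' => 'q',
    'yahoo.'  => 'p',
);

# Return the decoded search terms from a referrer URL, or nothing if
# the referrer is not from a known engine.
sub search_keywords {
    my ($referrer) = @_;
    my ($host, $query) = $referrer =~ m{^https?://([^/?#]+)[^?#]*\?([^#]*)}
        or return;
    for my $engine (keys %engine_param) {
        next unless index(lc($host), $engine) >= 0;
        for my $pair (split /[&;]/, $query) {
            my ($key, $value) = split /=/, $pair, 2;
            next unless defined $value and $key eq $engine_param{$engine};
            $value =~ tr/+/ /;                           # "+" encodes a space
            $value =~ s/%([0-9A-Fa-f]{2})/chr hex $1/ge; # undo %XX escapes
            return $value;
        }
    }
    return;
}

my $terms = search_keywords('http://www.google.com/search?q=perl+log+analyzer&hl=en');
print "keywords: $terms\n" if defined $terms;
```

Every engine that does it differently means another table entry, or another special case.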

a nice graph as well possibly:
So now we have to map out all the data points in some format. What format would you like that in? Plain text via ASCII art, GIF, JPEG, PNG, etc.? Do you want to be able to store some graphs in one format and others in different formats? How should the script do that? Should it graph usage for just this log file or should it maintain a cache for X period of time? How much room are you willing to give up for data points over X time frame to be saved? Do you want to be able to build dynamic graphs or only static graphs? What level of granularity should the graphs provide? How many graphs do you want to save? How much room are you willing to give up for those graph files?
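
Even the cheapest answer, plain text, takes some decisions (scaling, width, label layout). A minimal sketch, with ascii_chart and the sample counts invented for illustration:

```perl
use strict;
use warnings;

# Render label => count pairs as a plain-text bar chart, scaling the
# largest count to $width characters (default 40).
sub ascii_chart {
    my ($counts, $width) = @_;
    $width ||= 40;
    my ($max) = sort { $b <=> $a } values %$counts;
    return '' unless $max;
    my $out = '';
    for my $label (sort keys %$counts) {
        # Round each bar to the nearest whole character.
        my $bar = '#' x int($counts->{$label} / $max * $width + 0.5);
        $out .= sprintf "%-12s %s %d\n", $label, $bar, $counts->{$label};
    }
    return $out;
}

print ascii_chart(
    { '2003-04-07' => 120, '2003-04-08' => 60, '2003-04-09' => 30 },
    20,
);
```

Anything beyond this (images, history, multiple formats) is where the questions above start to bite.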

No offense, as I realize I am coming across harshly. You should pick a system that is relatively close to what you want and start coding from there. The reqs are kinda vague; they don't seem to show an understanding of everything that could happen in a log file, and simply wave it off as technological magic. You want something simple, but you're asking for a simple *complex* application tailored exactly to your needs. If there is any code out there, it's going to be more general purpose, i.e. able to handle, say, Apache logs and everything that could happen in them, so that it only needs to be written once to deal with any data mining from that type of log.

If you want to see how to graph data on the fly, look into the excellent GD module family. Wonderful set of modules, but if you want graphs without drawing all the lines yourself, look at the GD::Graph family. Also, if you want it to be 3D, look at the GD::Graph3d modules.

I'll stop now because I can simply feel the XP draining away, and I'm sorry if I'm coming across harshly; I probably shouldn't be posting when in this mood.

A simple short answer is: more than likely someone somewhere has written exactly what you want. Finding it may or may not be possible, but more than likely it will *not* be just off the beaten e-tracks.

/* And the Creator, against his better judgement, wrote man.c */

Replies are listed 'Best First'.
Re: Re: [OT] Simple access_log analyzer in Perl
by Jaap (Curate) on Apr 09, 2003 at 20:04 UTC
    I could answer all your questions (and I could have answered them in the first post), but that'd be such a long story that nobody would read it. It's a K.I.S.S. thing.

    I do know the complexities of creating a Log Analyzer, which is why I'm not doing it myself (yet).

      To parse the logfile, you might have a look at regexp-log, HTTPD-Log-Filter or Log-Detect. Even if you can't use these modules directly, they will certainly give you some good ideas on how to tackle your task!

      How nice, I have used the above text twice today!

      CountZero

      "If you have four groups working on a compiler, you'll get a 4-pass compiler." - Conway's Law
