Re: Question about the most efficient way to read Apache log files without All-In-One Modules from CPAN (personal learning exercise)
by kcott (Bishop) on Jun 17, 2015 at 07:19 UTC
Welcome to the Monastery.
Reading an entire logfile into memory prior to processing would be very much the exception; the norm would be to process the file a line at a time.
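As a minimal sketch of line-at-a-time reading (the sub name `count_lines` and the in-memory sample data are made up for illustration; a real script would open the logfile itself):

```perl
use strict;
use warnings;

# Read records one at a time from an open filehandle. Only the current
# line is held in memory, however large the file is.
sub count_lines {
    my ($fh) = @_;
    my $count = 0;
    while (my $line = <$fh>) {    # scalar context: one line per iteration
        $count++;
    }
    return $count;
}

# In-memory stand-in for a real logfile so the sketch runs anywhere.
# With a real file you would use something like:
#   open my $fh, '<', $path or die "Cannot open $path: $!";
my $data = "first\nsecond\nthird\n";
open my $fh, '<', \$data or die "Cannot open in-memory data: $!";
print count_lines($fh), " lines\n";
close $fh;

# The pattern to avoid for large logs: list context reads every line at once.
#   my @all = <$fh>;
```

The `while` loop is where the per-record work would go; the counter is only there to give the loop something to do.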
The format of each log entry is defined in the Apache configuration file (httpd.conf or whatever you've called it). From my httpd.conf, here are the lines that describe the access_log:
See the documentation in Apache Module mod_log_config for a description of the %X codes and other related information.
With that information to hand, it's fairly easy to construct a regex to parse the log records. Here's a script to do that. The three DATA lines are taken verbatim from my access_log file.
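A minimal sketch of that kind of script follows. It assumes Apache's stock `common` LogFormat (`%h %l %u %t \"%r\" %>s %b`), which may not match your configuration. The sample records are illustrative rather than taken from a real access_log, and the field names and the `parse_line` sub are hypothetical.

```perl
use strict;
use warnings;

# Assumed format (Apache's stock "common" LogFormat; adjust to your own):
#   LogFormat "%h %l %u %t \"%r\" %>s %b" common
my $log_re = qr{
    ^ (\S+) \s+             # %h   remote host
      (\S+) \s+             # %l   remote logname, usually "-"
      (\S+) \s+             # %u   remote user
      \[ ([^\]]+) \] \s+    # %t   timestamp
      " ([^"]*) " \s+       # %r   first line of request
      (\d{3}) \s+           # %>s  final status
      (\d+|-)               # %b   response size in bytes, or "-"
    \s* $
}x;

# Return a hashref of named fields, or nothing if the line does not match.
sub parse_line {
    my ($line) = @_;
    my @fields = $line =~ $log_re or return;
    my %rec;
    @rec{qw(host logname user time request status bytes)} = @fields;
    return \%rec;
}

# Illustrative sample records (not from a real access_log).
my $sample = <<'END';
127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326
192.0.2.7 - - [17/Jun/2015:07:19:00 +0000] "GET /index.html HTTP/1.1" 304 -
this line does not match
END

open my $fh, '<', \$sample or die "Cannot open sample data: $!";
while (my $line = <$fh>) {
    chomp $line;
    my $rec = parse_line($line);
    unless ($rec) {
        warn "Skipping unparsable line: $line\n";
        next;
    }
    print join(' | ', @{$rec}{qw(host time request status bytes)}), "\n";
}
close $fh;
```

Using the `/x` modifier keeps one `%X` code per line of the regex, so adapting it to a different LogFormat is mostly a matter of adding, removing or reordering those lines.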
Be aware that your configuration may use other logfiles with different LogFormat directives; however, you should be able to construct a suitable regex using the script above as a template. And, of course, you'll probably want to do something more useful than just printing the data.