http://qs321.pair.com?node_id=387953

monger has asked for the wisdom of the Perl Monks concerning the following question:

Greetings monks!

I'm trying to wrestle with some coding that's new to me. I've been rummaging through log files with Perl for some time, but now I need some more help, so I turn to the Perl Faithful!

Here's a snippet of what I'm looking at:

2004/09/01:10:37:57,cbt54632,192.168.1.253,F - Account Not Registered

This is simulating a function in SAS (which I don't use or know). What I would like to do is read through this file (standard format here), grab each username (here, cbt54632) and count the number of occurrences in the log for the reason listed, here "Account Not Registered".

I looked at 6039, but it didn't quite fit.

Any suggestions?

Thanks - monger

Monger +++++++++++++++++++++++++ Munging Perl on the side

Re: Counting In a Log File from Multiple Variables
by Rhose (Priest) on Sep 02, 2004 at 16:05 UTC
    Here is a quick and dirty script... if you want just the "Account Not Registered" and not the whole "F - Account Not Registered", you will need to parse the last field.

    Please note: I am assuming a comma is the field delimiter.

    #!/usr/bin/perl
    use strict;
    use warnings;

    my $Data;

    while(<DATA>) {
        chomp;
        @_=split(',');
        $Data->{$_[1]}->{$_[3]}++;
    }

    foreach my $User (sort keys %{$Data}) {
        print 'User: ',$User,"\n";
        foreach my $Error (sort keys %{$Data->{$User}}) {
            print ' [',$Error,'] happened ',$Data->{$User}->{$Error},' time(s)',"\n";
        }
    }

    __DATA__
    2004/09/01:10:37:57,cbt54632,192.168.1.253,F - Account Not Registered
    2004/09/01:10:37:57,cbt99999,192.168.1.253,F - Account Not Registered
    2004/09/01:10:37:57,cbt54632,192.168.1.255,A - Some Other Error
    2004/09/01:10:37:57,cbt99999,192.168.1.255,B - Yet Another Error
    2004/09/01:11:37:57,cbt54632,192.168.1.253,F - Account Not Registered

    I hope this helps!
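One way to parse the last field so that only the reason text remains might look like the sketch below. The helper name user_and_reason is hypothetical, and the sketch assumes the last field always has the shape "<code> - <reason>"; if it does not, the whole field is kept.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: split one log line and return (user, reason),
# with the leading status code (such as "F - ") removed from the reason.
sub user_and_reason {
    my ($line) = @_;
    chomp $line;

    # A limit of 4 keeps any commas inside the reason text intact.
    my (undef, $user, undef, $status) = split /,/, $line, 4;

    # Assumption: the last field looks like "<code> - <reason>".
    my (undef, $reason) = split /\s+-\s+/, $status, 2;
    $reason = $status unless defined $reason;

    return ($user, $reason);
}

my ($user, $reason) = user_and_reason(
    '2004/09/01:10:37:57,cbt54632,192.168.1.253,F - Account Not Registered'
);
print "$user: $reason\n";
```

The pair returned here could be used as the two hash keys in place of $_[1] and $_[3].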

      I think you'll be a whole lot happier when you start writing
      print "User: $User\n";
      print " [$Error] happened $Data->{$User}{$Error} time(s)\n";
      _____________________________________________________
      Jeff japhy Pinyan, P.L., P.M., P.O.D, X.S.: Perl, regex, and perl hacker
      How can we ever be the sold short or the cheated, we who for every service have long ago been overpaid? ~~ Meister Eckhart
        I think about that all the time. *Smiles* When I started writing perl code, I made some mistakes with "" and '', so I started putting literals in '', special characters (like \t and \n) in "", and I pulled out all variables.

        It's just one of those habits which has stuck around in my code since.
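For comparison, here is a small sketch that builds the same line both ways and checks that the results match. The sample user, reason, and count are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Made-up sample data in the same shape as the script's structure.
my $User  = 'cbt54632';
my $Error = 'F - Account Not Registered';
my $Data  = { $User => { $Error => 2 } };

# List style: literals in '', escapes in "", variables outside the quotes.
my $list_style = join '',
    ' [', $Error, '] happened ', $Data->{$User}->{$Error}, ' time(s)', "\n";

# Interpolated style: one double-quoted string. The arrow between
# subscripts is optional, so $Data->{$User}{$Error} is the same element.
my $interpolated = " [$Error] happened $Data->{$User}{$Error} time(s)\n";

print $list_style eq $interpolated ? "same output\n" : "different output\n";
```

Both styles produce identical text, so the choice is one of readability and habit.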

      Thanks, Rhose. This has gotten it started.

      I did some mods to the code for my environment, and not being fluent yet in HoHs, or any more advanced data structures, I'm going to throw back my mods and plead for aid. I've looked at perldsc, but I'm still a bit fuzzy. I'm heading in the right direction, but need a bit more of a push. Here's the code:

      #!/usr/bin/perl
      use strict;
      use warnings;

      my $line;
      my $Error;
      my $Data;
      my $File = "L:\\cybor\\20040831.log";

      open FILE, $File || die "Can't open log file: $!";

      while(<FILE>) {
          # chomp;
          #@_=split(',');
          foreach $line (<FILE>) {
              @_=split(',');
              $Data->{$_[1]}->{$_[3]}++;
          }
      }

      foreach my $User (sort keys %{$Data}) {
          print 'User: ',$User,"\n";
          foreach my $Error (sort keys %{$Data->{$User}}) {
              print ' [',$Error,'] happened ',$Data->{$User}->{$Error},' time(s +)',"\n";
          }
      }

      Here's the output from cygwin:

      User: wfmccahi
       [P - Password Reset
      ] happened 172 time(s +)

      So, it's not counting quite right. The phrase "P - Password Reset" only occurs four times in the file in question. Any suggestions?

      Thanks, monger

        I would change:

        while(<FILE>) {
            # chomp;
            #@_=split(',');
            foreach $line (<FILE>) {
                @_=split(',');
                $Data->{$_[1]}->{$_[3]}++;
            }
        }

        to

        while(<FILE>) {
            chomp;
            @_=split(',');
            $Data->{$_[1]}->{$_[3]}++;
        }

        The "chomp" will get rid of that pesky newline you are seeing in the output. Also, the "while(<FILE>)" will process each line of your file (one at a time) and will assign each line to $_. This means your "foreach $line (<FILE>)" is not needed.

        Note: The foreach loop in your code would assign data to the $line variable. Your split is splitting $_ (nothing specified means you get to use the default $_)... if you were to use your foreach, you would want to change your split to "@_=split(',',$line);"
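The inflated count can be reproduced with a small sketch. It reads three sample lines through an in-memory filehandle (an assumption made so it runs without the real log), and the sub names count_nested and count_single are hypothetical. With the nested loops, the while reads the first line into $_, the foreach then reads every remaining line in one go, and split keeps splitting $_, so the first line is counted once per remaining line and no other user is ever recorded. That would explain a count of 172 for a single user if the file had 173 lines. Separately, `open FILE, $File || die ...` parses as `open FILE, ($File || die ...)`, so the die cannot fire when the open fails; the sketch uses `or` instead.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Three sample lines in the thread's format (made up for illustration).
my $log = <<'END';
2004/09/01:10:37:57,cbt54632,192.168.1.253,F - Account Not Registered
2004/09/01:10:37:57,cbt99999,192.168.1.253,F - Account Not Registered
2004/09/01:10:37:57,cbt99999,192.168.1.255,B - Yet Another Error
END

# Nested loops, as in the modified script: split works on $_ (line 1)
# once for every remaining line that the foreach reads into $line.
sub count_nested {
    my %count;
    open my $fh, '<', \$log or die "Can't open in-memory log: $!";
    while (<$fh>) {
        foreach my $line (<$fh>) {
            my @f = split(',');
            $count{$f[1]}{$f[3]}++;
        }
    }
    close $fh;
    return \%count;
}

# Single loop: each line is read once, chomped, and split explicitly.
sub count_single {
    my %count;
    open my $fh, '<', \$log or die "Can't open in-memory log: $!";
    while (my $line = <$fh>) {
        chomp $line;
        my @f = split /,/, $line;
        $count{$f[1]}{$f[3]}++;
    }
    close $fh;
    return \%count;
}

my $nested = count_nested();
my $single = count_single();
print 'nested loops saw users: ', join(', ', sort keys %$nested), "\n";
print 'single loop saw users:  ', join(', ', sort keys %$single), "\n";
```

In the nested version only the first user appears, with a count equal to the number of remaining lines and a trailing newline still attached to the reason.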

Re: Counting In a Log File from Multiple Variables
by wfsp (Abbot) on Sep 02, 2004 at 16:05 UTC
    That looks like a job for a HoH (hash of hashes).
    $count{$username}{$reason}++
    The docs are very good: perlreftut, perldsc and perllol.
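A minimal sketch of a hash of hashes keyed by username and then by reason follows. The hash name %count and the sample pairs are hypothetical, chosen only for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical name: $count{username}{reason} holds the occurrences.
my %count;

# Made-up (username, reason) pairs standing in for parsed log lines.
my @pairs = (
    [ 'cbt54632', 'Account Not Registered' ],
    [ 'cbt99999', 'Account Not Registered' ],
    [ 'cbt54632', 'Account Not Registered' ],
);

for my $pair (@pairs) {
    my ($username, $reason) = @$pair;
    # Autovivification creates the inner hash the first time a user is seen.
    $count{$username}{$reason}++;
}

# Walk the outer keys (users), then the inner keys (reasons).
for my $username (sort keys %count) {
    for my $reason (sort keys %{ $count{$username} }) {
        print "$username, $reason: $count{$username}{$reason}\n";
    }
}
```

Because the inner hashes spring into existence on first use, no setup is needed before incrementing.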