Re: IP Parse and Count from logfile

by mscharrer (Hermit)
on Sep 17, 2008 at 16:09 UTC


in reply to IP Parse and Count from logfile

Hi indkebr,
Some tips for your future Perl code:
  • Use more loops, especially for initialisations
  • Avoid hardcoding names such as spf2 where it isn't needed; use e.g. a constant array instead (see below)
  • To compare two strings for equality use eq, not =~
  • The easiest way to test whether a name exists in a list is to store the list as a hash (as you did) and use the exists operator (see below)
  • Use local scope variables, especially for loops, e.g. foreach my $var (@vars)
  • Use the readmore tags on PerlMonks for medium-sized or large code that isn't needed to present your basic question.
  • For some things you can use hash (or array) slices: my %hash; @hash{"key1","key2","key5"} = (0,0,0);
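The tips on eq, exists and hash slices can be sketched in a few lines. This is only an illustration: the sub names (is_known, same_eq, same_match) and the shortened domain list are made up for the example and are not part of your script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hash slice: initialise several keys in one statement.
my %count;
@count{qw(ebay.com paypal.com usbank.com)} = (0) x 3;

# exists: test whether a name is in the list without looping over it.
sub is_known {
    my ($name) = @_;
    return exists $count{$name} ? 1 : 0;
}

# eq tests for exact equality.
sub same_eq {
    my ( $x, $y ) = @_;
    return $x eq $y ? 1 : 0;
}

# =~ is a pattern match: an unanchored pattern also matches any longer
# string that merely contains it, and "." matches any character.
sub same_match {
    my ( $x, $y ) = @_;
    return $x =~ /$y/ ? 1 : 0;
}

print is_known('paypal.com'), "\n";                 # 1
print is_known('example.org'), "\n";                # 0
print same_eq( 'notebay.com', 'ebay.com' ), "\n";   # 0
print same_match( 'notebay.com', 'ebay.com' ), "\n";# 1
```

The last two lines show why =~ is the wrong tool for an equality test: the pattern match succeeds on a string that is not equal to the domain name.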

I changed your code a little to improve it and added the IP counter and the print commands. I'm not sure I understood exactly what you need, so it might not be correct.

Please note that I didn't check your IP regex, but it looks OK at first glance. You didn't provide a test input file, so I couldn't test whether my changes introduced functional bugs.
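One thing worth checking in a pattern of this shape is which group my ($ip) = ... actually receives: in list context a match returns the capture groups in order, so the whole address has to be group 1. Here is a minimal sketch of such a check. The sub name extract_ip and the sample lines are hypothetical, since the real log format was not posted.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# One octet, 0-255, written with a non-capturing group.
my $octet = qr/(?:25[0-5]|2[0-4][0-9]|[01]?[0-9]{1,2})/;

# The outer parentheses are the only capturing group, so a list-context
# match returns the complete address.
my $ip_re = qr/ip=((?:$octet\.){3}$octet)\s/;

# Returns the captured address, or undef if the line does not match.
sub extract_ip {
    my ($line) = @_;
    my ($ip) = $line =~ $ip_re;
    return $ip;
}

# Hypothetical log lines, only for trying out the pattern.
my @samples = (
    "domain=ebay.com ip=192.168.1.10 spf=1 dkim=0 \n",
    "domain=ebay.com ip=999.1.1.1 spf=1 dkim=0 \n",
);

for my $line (@samples) {
    my $ip = extract_ip($line);
    print defined $ip ? "matched $ip\n" : "no match\n";
}
```

Running a handful of real lines from the log through a helper like this would show quickly whether the pattern captures what you expect.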


#!/usr/bin/perl
use strict;
use warnings;

my %domains;
my %ipcount;
my @domain_array = qw(ebay.com paypal.com americanbank.com usbank.com americangreetings.com);
my @vars = qw(counter dkim0 dkim1 dkim2 dkim3 dkim4 spf0 spf1 spf2 spf3 spf4);
my @bad_domains;

# Initialise every counter of every known domain to 0
foreach my $domain (@domain_array) {
    foreach my $var (@vars) {
        $domains{$domain}{$var} = 0;
    }
}

open( my $log_domain, '<', 'logdata' ) or die "Cannot open logdata: $!";

while (<$log_domain>) {
    my ($host) = /domain=([\w\.]+?)\s/;    # domain name
    my ($spf)  = /spf=([0-4])\s/;          # spf value, 0-4
    my ($dkim) = /dkim=([0-4])\s/;         # dkim value, 0-4

    # IP regex: only the outer group captures, so $ip gets the whole address
    my ($ip) = /ip=((?:(?:[0-1]?[0-9]{1,2}|2[0-4][0-9]|25[0-5])\.){3}(?:[0-1]?[0-9]{1,2}|2[0-4][0-9]|25[0-5]))\s/;

    # Count how often this ip is found
    # (or use $domains{$host}{"ipcount"}++ below if you want the ip's per domain)
    $ipcount{$ip}++ if defined $ip;

    # Skip lines without a domain field
    next unless defined $host;

    # if host is in domain list, increment counters
    if ( exists $domains{$host} ) {
        $domains{$host}{"counter"}++;
        $domains{$host}{"spf$spf"}++   if defined $spf;
        $domains{$host}{"dkim$dkim"}++ if defined $dkim;
    }
    else {
        # else save it as bad domain
        push( @bad_domains, $host );
    }
}

# You could also build this header line from the @vars array
print "Domain,\"Domain Count\",Dkim0,Dkim1,Dkim2,Dkim3,Dkim4,Spf0,Spf1,Spf2,Spf3,Spf4\n";
foreach my $domain ( keys %domains ) {
    # Hash slice on the inner hash, in the same order as the header
    print join( ',', $domain, @{ $domains{$domain} }{@vars} ), "\n";
}

# Print every ip that was seen more than once
while ( my ( $ip, $count ) = each %ipcount ) {
    if ( $count > 1 ) {
        print "$ip,$count\n";
    }
}

print "The total amount of domains that we don't care about is "
    . scalar(@bad_domains) . "\n";
close($log_domain);
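The comment about taking the @vars array for the header line could look roughly like this. It is only a sketch: the subs header_line and domain_line and the sample counts are invented for illustration, and the only special case is that "counter" is printed as "Domain Count".

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @vars = qw(counter dkim0 dkim1 dkim2 dkim3 dkim4 spf0 spf1 spf2 spf3 spf4);

# Build the CSV header from the variable names: "counter" becomes
# "Domain Count", every other name gets its first letter upper-cased.
sub header_line {
    my @cols = map { $_ eq 'counter' ? '"Domain Count"' : ucfirst($_) } @_;
    return join( ',', 'Domain', @cols );
}

# Build one CSV row with a hash slice, in the same order as the header.
sub domain_line {
    my ( $domain, $stats, @names ) = @_;
    return join( ',', $domain, @{$stats}{@names} );
}

# Hypothetical counts for one domain, only to show the output shape.
my %stats = map { $_ => 0 } @vars;
@stats{qw(counter spf1 dkim2)} = ( 3, 2, 1 );

print header_line(@vars), "\n";
print domain_line( 'ebay.com', \%stats, @vars ), "\n";
```

Because both lines are generated from the same array, adding or reordering a counter in @vars keeps the header and the rows in step.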
