PerlMonks  

Counting Occurences, (hash referencing another hash)

by penguinfuz (Pilgrim)
on Jul 16, 2002 at 18:52 UTC ( [id://182198]=perlquestion )

penguinfuz has asked for the wisdom of the Perl Monks concerning the following question:

Fellow monks, I have pondered, read, rewritten, (and consumed a couple pints) in the process for enlightenment; I now ask for divine intervention.

My basic goal is to take a source and destination IP address pair, count the occurrences of destination IPs, and insert into a database along the lines of "srcIP, dstIP, visit_count".
The trouble I have is counting the occurrences. =(

My thoughts (at this point) look like:
%hash0 = ( $srcIP => \%hash1 );
%hash1 = ( $dstIP => $count );
Now for the code snippet (which only counts accesses at the moment):
foreach $entry (@entries) {
    $info = '(some).*(snazzy).*(regexp)';
    ($time,$src,$dst) = ($1,$2,$3) if $entry =~ /$info/;
    $src =~ s/SRC=//g;
    $dst =~ s/DST=//g;
    %ips = ( $src => "$dst");
    foreach $ws (keys(%ips)) {
        push @ips, $ips{$ws};
        #%visit_count = ( $src => $cnt );
        #%src_dst_cnt = ( $ws => \$visit_count );
    }
}
$count{$_}++ for @ips;
print "$_\t visits: $count{$_}\n" for (keys(%count));
return @ips;
I have left 2 commented lines in the 'foreach' loop, which hopefully illustrate my thought process.

BTW: Cheers for counting ideas!

Replies are listed 'Best First'.
Re: Counting Occurences, (hash referencing another hash)
by thpfft (Chaplain) on Jul 16, 2002 at 23:26 UTC

    this is a little garbled, but i assume from the database fields you mention that the goal is to build a table of number of visits by each source to each destination?

    one good approach to this kind of problem: first draw up the data structure which is most suited to holding the information you have in the relations you want, then write a script in two parts: first to build that structure, then to use it. in the end you'll often find some way of eliminating the middle step and - in your case - writing to the database directly as you while through the file.

    in this case i'd suggest that the data structure you want is something like:

    %totals = (
        $source => { $destination => $count, },
    );

    ie, a hash of hashrefs each of which refers to a { destination => count } pair. which is easily built:

    my %totals;
    while (<LOGFILE>) {
        $totals{$1}->{$2}++ if m/(...)(...)(...)/;
    }
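A minimal sketch of that build step, with the placeholder pattern swapped for one concrete guess. The sample lines and the SRC=/DST= regexp are assumptions made for illustration; the real log format is never shown in this thread.

```perl
use strict;
use warnings;

# Hypothetical log lines; only the SRC=/DST= fields matter here.
my @lines = (
    'SRC=192.168.0.10 DST=66.39.54.27',
    'SRC=192.168.0.10 DST=66.39.54.27',
    'SRC=192.168.0.10 DST=10.0.0.1',
    'SRC=192.168.0.11 DST=66.39.54.27',
    'a line with no addresses',
);

my %totals;
for my $line (@lines) {
    # Assumed pattern: a dotted quad after SRC= and another after DST=.
    if ($line =~ /SRC=(\d{1,3}(?:\.\d{1,3}){3}).*?DST=(\d{1,3}(?:\.\d{1,3}){3})/) {
        # $1 is the source, $2 the destination; the inner hash is
        # created automatically the first time a source is seen.
        $totals{$1}{$2}++;
    }
}

for my $src (sort keys %totals) {
    print "$src -> $_: $totals{$src}{$_}\n" for sort keys %{ $totals{$src} };
}
```

Lines that do not match are simply skipped, so stale values of $1 and $2 from an earlier match never get counted.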

    and then you just need to loop on the constructed hash to feed it all into the database, with something like:

    my $sth = $dbh->prepare('insert into totals (srcIP, dstIP, visit_count) values (?, ?, ?)');
    foreach my $source (keys %totals) {
        $sth->execute($source, $_, $totals{$source}->{$_}) for keys %{ $totals{$source} };
    }
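Before wiring the loop to a real database handle, it can help to look at the rows it would produce. The sketch below leaves DBI out entirely and collects each (srcIP, dstIP, visit_count) triple in an array instead of calling execute. The counts in %totals are made up for illustration.

```perl
use strict;
use warnings;

# Made-up counts standing in for a hash built from the log.
my %totals = (
    '192.168.0.10' => { '66.39.54.27' => 49, '10.0.0.1' => 3 },
    '192.168.0.11' => { '66.39.54.27' => 7 },
);

# One row per (source, destination) pair, in a stable sorted order.
my @rows;
for my $source (sort keys %totals) {
    for my $dest (sort keys %{ $totals{$source} }) {
        push @rows, [ $source, $dest, $totals{$source}{$dest} ];
    }
}

# Each row holds the three values that would be bound to the placeholders.
printf "%s, %s, %d\n", @$_ for @rows;
```

Each element of @rows is the argument list for one execute call, so the loop body could later become `$sth->execute(@$_)`.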

    incidentally, and in case i've completely missed the point, the main trouble you're probably having is illustrated here:

    %ips = ( $src => "$dst"); foreach $ws (keys(%ips)) { ... }

    The %ips hash will have exactly one key, $src, so looping on it isn't doing much. Which suggests that you haven't quite grasped how perl's hashes work and how much they can do for you. Some time with perldata, perldsc and perllol will help, along with a bit of strictness and if possible the perl cookbook.
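A small sketch of why the list assignment loses information: assigning a list to a hash replaces its whole contents, so only the last pair survives, whereas updating a single nested element accumulates. The address pairs are invented for the example.

```perl
use strict;
use warnings;

# Invented (source, destination) pairs.
my @pairs = (
    [ '192.168.0.10', '66.39.54.27' ],
    [ '192.168.0.10', '10.0.0.1'    ],
    [ '192.168.0.11', '66.39.54.27' ],
);

my (%ips, %seen);
for my $pair (@pairs) {
    my ($src, $dst) = @$pair;
    %ips = ( $src => $dst );   # list assignment: wipes the hash, then stores one pair
    $seen{$src}{$dst}++;       # element update: everything stored earlier is kept
}

my $ips_keys  = keys %ips;     # number of keys left in %ips
my $seen_keys = keys %seen;    # number of distinct sources in %seen
print "ips has $ips_keys key(s), seen has $seen_keys key(s)\n";
```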

      thpfft spoke:
      this is a little garbled...
      hehehe, you've peered through the looking-glass of my mind.

      a hash of hashrefs each of which refers to a { destination => count } pair. which is easily built:
      my %totals; while (<LOGFILE>) { $totals{$1}->{$2}++ if m/(...)(...)(...)/; }
      This gives me an idea, but I'm not sure if I am seeing it correctly. $totals{$1} gets the first memory match from the regexp, and ->{$2} gets the second, then ->{$2} gets incremented, right?
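That reading can be checked by spelling the one-liner out. In the sketch below, $src and $dst are stand-ins for $1 and $2; the long form does by hand what the short form does through autovivification.

```perl
use strict;
use warnings;

# Stand-ins for the two captures.
my ($src, $dst) = ('192.168.0.10', '66.39.54.27');

# Long form: create the inner hash and the counter explicitly.
my %long;
for (1 .. 3) {
    $long{$src} = {} unless exists $long{$src};                # inner hash for this source
    $long{$src}{$dst} = 0 unless defined $long{$src}{$dst};    # counter for this destination
    $long{$src}{$dst} = $long{$src}{$dst} + 1;                 # increment it
}

# Short form: Perl creates the inner hash and the counter on demand.
my %short;
$short{$src}->{$dst}++ for 1 .. 3;

print "long: $long{$src}{$dst}, short: $short{$src}{$dst}\n";
```

So $totals{$1} selects (or creates) the inner hash for that source, ->{$2} selects the counter for that destination inside it, and ++ increments the counter.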

      hum, Maybe another example...
      The values that I am concerned with from my regexp look like:
      (the destination IP address will change BTW)
      SRC=192.168.0.10, DST=66.39.54.27

      I then log the source IP, the destination IP, and how many accesses per destination. My output should hopefully be something like:

      192.168.0.10 accessed 66.39.54.27 49 times.
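A sketch of how that line could be produced from a source => { destination => count } hash. The contents of %totals are hard-coded here purely for illustration; in practice they would come from the log.

```perl
use strict;
use warnings;

# Hard-coded counts for illustration only.
my %totals = (
    '192.168.0.10' => { '66.39.54.27' => 49 },
);

# One sentence per (source, destination) pair.
my @report;
for my $src (sort keys %totals) {
    for my $dst (sort keys %{ $totals{$src} }) {
        push @report, "$src accessed $dst $totals{$src}{$dst} times.";
    }
}

print "$_\n" for @report;
```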

      ...and in case i've completely missed the point, the main trouble you're probably having...
      Not at all, I think you got it spot-on! I have been fooling around with that %ips hash quite a bit, and you've validated my thoughts that it's not going to work like that.

      Which suggests that you haven't quite grasped how perl's hashes work
      It's that obvious, huh? ;) I'm trying to use hashes more and more every day for this exact reason.

      ...Some time with perldata, perldsc and perllol will help...
      I will definitely start reading some over there, thanks.

      ...along with a bit of strictness
      Yup, I do that, and with warnings too! I firmly understand the advantage of having Perl tell me "don't do that!" =) The bit of code I posted is just the 'counting' sub of a larger beast.

      ...and if possible the perl cookbook.
      I got it.

      Thanks for taking time to respond to my garbled posting. When I get stuck on something, I have a hard time explaining the issue, but if I could explain it I probably wouldn't be so stuck, if you know what I mean.
Re: Counting Occurences, (hash referencing another hash)
by mfriedman (Monk) on Jul 16, 2002 at 20:42 UTC
    I don't understand what your data structure is supposed to be. How do these source and destination IPs relate to one another? What is @entries? Why are you referencing a hash that isn't defined yet? (%hash0 = ( $srcIP => \%hash1);)

    Why aren't you using strict?

    Could you describe in more detail exactly what you're trying to count?

Node Type: perlquestion [id://182198]
Approved by VSarkiss