Converting a growing hash into an array of arrays

madbombX has asked for the wisdom of the Perl Monks concerning the following question:

Monks,

I am attempting to write a script that reads a log file and parses out one value per message, i.e., it takes the amavisd-new logfile and pulls out the SPAM hit points per message. It continually tails the file (using File::Tail) in a forked (daemonized) process and adds values to a hash as new messages come in. I am also trying to graph these values in a bar graph in order to see trends. I have settled on using GD::Graph::bars as opposed to RRDs.

Incrementing a value in a 'Key => Value' pair is easy, but (at least to me) it is not easy in an array that is required to look like this:

@data = (
           ["1.6","2.2","3.4","3.6","5.4","6.2","7.1", "8.1", "9.0"],
           [    1,    2,    5,    6,    3,  15,    4,     3,     4],
           [ sort { $a <=> $b } (1, 2, 5, 6, 3, 15, 4, 3, 4) ]
         );

Is there a better way to do this without regenerating the array every time a message is added?

Or would it potentially be better to do this all using RRDs (even though the aspect of time that RRD takes full advantage of is irrelevant)? I just want to keep the number of messages per point total. (I know the second portion of the question is barely Perl related, but I know many out there are more experienced than I.)

Thanks. Eric

UPDATE: Here is the shortened version of the completed (working) code.

$log = File::Tail->new( name => $MAILLOG, tail => -1);
while (defined(my $line=$log->read)) {
 IncrData(Get_Hits($line));
 if (($msgs{Total} % 200) == 1) { Create_Graph(); }
}

sub IncrData ($) {
   my $values = shift;

   if (exists $hits{$values} ) { ${$hits{$values}}++; }
   else {   
     my $idx = 0;
     my $endIdx = scalar(@{$data[0]});
     while ($idx < $endIdx && $data[0][$idx] < $values) {
       $idx++;
     }
     splice(@{$data[0]},$idx,0,$values);
     splice(@{$data[1]},$idx,0,1);
     $hits{$values} = \$data[1][$idx];
   }
}

Since the file is being tailed, I have the graph being recreated every 200 incoming messages. Just before the graph gets recreated, I run the sort code: @{$data[2]} = sort { $a <=> $b } @{$data[1]};
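
As a minimal sketch of deferring the sort until a graph is about to be drawn: the helper name refresh_sorted_row, the sample counts, and the simulated message counter are made up for illustration, and the interval of 200 is taken from the loop above. The actual graphing call is left out.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the original script): rebuild the
# sorted third row of @data from the counts in the second row.
sub refresh_sorted_row {
    my ($data) = @_;
    $data->[2] = [ sort { $a <=> $b } @{ $data->[1] } ];
    return;
}

my @data = (
    [ "1.6", "2.2", "3.4" ],   # hit scores (x axis), sample values
    [ 1, 5, 2 ],               # message counts per score, sample values
    [],                        # sorted counts, filled in only when needed
);

my $GRAPH_EVERY = 200;   # assumed interval, as in the tail loop above
my $sorts       = 0;

# Simulate 450 incoming messages instead of tailing a real log.
for my $total (1 .. 450) {
    # ... the per-message increment would happen here ...
    next unless $total % $GRAPH_EVERY == 1;
    refresh_sorted_row(\@data);   # sort only when a graph is due
    $sorts++;
}

print "sorted row: @{ $data[2] }\n";
print "sorts performed: $sorts\n";
```

With this arrangement the sort runs once per graph (at messages 1, 201 and 401 in the simulation) rather than once per message.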

Again, thanks to all for all the help.


Replies are listed 'Best First'.
Re: Converting a growing hash into an array of arrays by jmcada (Acolyte) on Jul 14, 2006 at 16:52 UTC
splice?

use Data::Dumper;

my @data = (
    ["1.6","2.2","3.4","3.6","5.4","6.2","7.1", "8.1", "9.0"],
    [    1,    2,    5,    6,    3,   15,    4,     3,     4],
    [ sort { $a <=> $b } (1, 2, 5, 6, 3, 15, 4, 3, 4) ]
);

splice(@{$data[0]}, 1, 0, "2.1");
splice(@{$data[1]}, 1, 0, 7);
$data[2] = [sort { $a <=> $b } @{$data[1]}];

print Dumper(\@data);
Re^2: Converting a growing hash into an array of arrays by madbombX (Hermit) on Jul 14, 2006 at 17:17 UTC
I think I should have been a little more specific. When a new message comes through the log, I have to increment the message count for its specific hit size by one. So if I have 3 messages that scored 2.1, the hash entry would look like $hits{"2.1"} = 3, and when a new message comes in with a hit count of 2.1, it becomes $hits{"2.1"} = 4. I know how to do this with a hash ($hits{"2.1"}++), but is there a way to do this with the multi-dimensional array I have listed above? I know splice will work for the one array ($data[0]), but I need to adjust its corresponding value in the subsequent arrays.

Thanks. Eric
Re^3: Converting a growing hash into an array of arrays by jmcada (Acolyte) on Jul 14, 2006 at 18:18 UTC
If I'm understanding this correctly, you need to increment the value in the second sub-array based on the value found in the corresponding position of the first sub-array. If so, this is probably a little over-thinking it, but it should work:

my @data = (
    ["1.6","2.2","3.4","3.6","5.4","6.2","7.1", "8.1", "9.0"],
    [    1,    2,    5,    6,    3,   15,    4,     3,     4],
    [ sort { $a <=> $b } (1, 2, 5, 6, 3, 15, 4, 3, 4) ]
);

print join(", ", @{$data[1]}), "\n";

# Compare numerically rather than with the unanchored pattern /3.4/,
# which would also match values such as "13.4" or "3x4".
for (0..$#{$data[0]}) {
    $data[1]->[$_]++ if $data[0]->[$_] == 3.4;
}

print join(", ", @{$data[1]}), "\n";

--(0)> perl test.pl
1, 2, 5, 6, 3, 15, 4, 3, 4
1, 2, 6, 6, 3, 15, 4, 3, 4
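The scan described in this reply could also be wrapped in a small sub that stops at the first match and reports whether the key was found. This is only a sketch: incr_by_scan and its return convention are made up for illustration and are not code from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: bump the count for $key by scanning row 0.
# Returns the index that was incremented, or -1 if $key is not present.
sub incr_by_scan {
    my ($data, $key) = @_;
    for my $i (0 .. $#{ $data->[0] }) {
        # Numeric compare, so "3.40" would also match "3.4".
        if ($data->[0][$i] == $key) {
            $data->[1][$i]++;
            return $i;          # stop at the first match
        }
    }
    return -1;
}

my @data = (
    [ "1.6", "2.2", "3.4" ],    # hit scores
    [ 1, 2, 5 ],                # message counts per score
);

my $hit  = incr_by_scan(\@data, "3.4");   # present: count goes from 5 to 6
my $miss = incr_by_scan(\@data, "9.9");   # absent: nothing changes

print "counts: @{ $data[1] }\n";
print "hit index: $hit, miss index: $miss\n";
```

The -1 return gives the caller a place to fall back to a splice when the score has not been seen before.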
Re^4: Converting a growing hash into an array of arrays by holli (Abbot) on Jul 14, 2006 at 19:42 UTC
Re^4: Converting a growing hash into an array of arrays by madbombX (Hermit) on Jul 14, 2006 at 19:24 UTC
Re^3: Converting a growing hash into an array of arrays by shonorio (Hermit) on Jul 14, 2006 at 21:06 UTC
Eric, I just don't understand why you are trading the easy way with a hash for the hard way with an array. Do you have a performance problem? Is it about the hash size? I don't know how long your array/hash will be, but if it's going to be large, RRD would be a good solution (or some other kind of database).

Solli Moreira Honorio
Sao Paulo - Brazil
Re: Converting a growing hash into an array of arrays by rodion (Chaplain) on Jul 14, 2006 at 21:24 UTC
If you've got a lot of data to work with, you really want to do your incrementing updates with a hash; otherwise you will be scanning through the elements of the $data[0] array every time you want to increment. In the worst case the number of compares goes up with the square of the number of log entries, which can get expensive. (The solutions by jmcada and holli, however, are nice and clear, and are efficient as long as the sub-arrays don't get too long. I'd go with that approach if your logs are short.)

The sub below uses an external hash of references to the elements of $data[1], called %data_refs_hash. Using that hash you can do the increment directly. You only need to scan through the array when you're adding a new key, and you only need to do that if you're keeping the keys in order; if not, leave out the while() scan and just push the new values on the end instead of splicing them.

sub IncrData {
    my $key = shift;

    if (exists $data_refs_hash{$key} ) {   # if there's a hash entry
        ${$data_refs_hash{$key}}++;        # increment it
    }
    else {                                 # otherwise, find a spot and put one in
        my $idx = 0;
        my $endIdx = scalar(@{$data[0]});
        while ($idx < $endIdx && $data[0][$idx] < $key) {
            $idx++;
        }
        splice(@{$data[0]},$idx,0,$key);
        splice(@{$data[1]},$idx,0,1);
        $data_refs_hash{$key} = \$data[1][$idx];
    }
    # if you need to update the sort each time, then do
    @{$data[2]} = sort { $a <=> $b } @{$data[1]};
    # but you shouldn't update it until you're going to use it,
    # since it's expensive to sort for every increment
} # IncrData

Updated: Reformatted code to move testing into "readmore" and revised non-code portion
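A small self-contained run of the reference-hash approach, as a sketch: the input scores are made up, and they are fed in out of order on purpose so that later splices shift earlier elements to the right. The sort of the third row is left out here, on the assumption that it is done only when graphing.

```perl
use strict;
use warnings;

my @data = ( [], [] );     # row 0: hit scores, row 1: message counts
my %data_refs_hash;        # score => reference to its count in row 1

sub IncrData {
    my $key = shift;

    if ( exists $data_refs_hash{$key} ) {
        ${ $data_refs_hash{$key} }++;       # known score: bump via the reference
        return;
    }

    # New score: find the insert position that keeps row 0 sorted.
    my $idx = 0;
    $idx++ while $idx < @{ $data[0] } && $data[0][$idx] < $key;

    splice @{ $data[0] }, $idx, 0, $key;
    splice @{ $data[1] }, $idx, 0, 1;
    $data_refs_hash{$key} = \$data[1][$idx];   # remember where the count lives
    return;
}

# Made-up scores, deliberately unsorted.
IncrData($_) for qw(5.4 2.2 5.4 1.6 5.4 2.2);

print "scores: @{ $data[0] }\n";   # scores: 1.6 2.2 5.4
print "counts: @{ $data[1] }\n";   # counts: 1 2 3
```

The stored references keep pointing at the right counts after a splice because splice moves the element scalars themselves within the array rather than copying their values into new ones.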
Re: Converting a growing hash into an array of arrays by kwaping (Priest) on Jul 14, 2006 at 21:22 UTC
How about using `values %hash` or a hash slice? Sample code:

#!/usr/bin/perl
use strict;
use warnings;

my %hash = (1 => 2, 3 => 4, 5 => 6);

my @array1 = keys %hash;
print "@array1$/";

my @array2 = values %hash;
print "@array2$/";

my @array3 = @hash{@array1};
print "@array3$/";

__END__
output (hash key order is not guaranteed, so yours may differ):
1 3 5
2 4 6
2 4 6

---
It's all fine and dandy until someone has to look at the code.
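Building on the hash-slice idea, the three rows could be generated from the counting hash only when a graph is needed. This is a sketch under assumptions: hits_to_rows is a hypothetical helper and the sample scores are invented. Sorting the keys numerically first makes the row order deterministic.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a score => count hash into the three rows
# (scores, counts in score order, counts sorted ascending).
sub hits_to_rows {
    my ($hits) = @_;
    my @scores = sort { $a <=> $b } keys %$hits;
    my @counts = @{$hits}{@scores};            # hash slice, same order as @scores
    my @sorted = sort { $a <=> $b } @counts;
    return ( \@scores, \@counts, \@sorted );
}

my %hits = ( "2.2" => 7, "1.6" => 1, "3.4" => 5 );   # invented sample data
$hits{"2.2"}++;                                       # a new message scored 2.2

my @data = hits_to_rows(\%hits);

print "scores: @{ $data[0] }\n";   # scores: 1.6 2.2 3.4
print "counts: @{ $data[1] }\n";   # counts: 1 8 5
print "sorted: @{ $data[2] }\n";   # sorted: 1 5 8
```

The per-message work stays a single hash increment; the cost of sorting the keys is paid only at graph time.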