I can't tell for sure since you didn't show your __DATA__ section, but since it's complaining about not having a string to split after chomp, it's probably running into a blank line there. You can test for that before splitting (see below). On your other question: yes, it's worth sorting the database names into an array. That's not because of how many databases there are, but because you don't want to sort them again for each of 100_000 records, as my pseudo-code did.
I went ahead and wrote a working version that pulls in the data and produces the per-hour report the way I think you want it. I added a lot of comments, but feel free to ask about anything you don't understand. You should be able to add the per-minute section yourself (see the comments for where), based on how the per-hour section works.
Once you're comfortable with how it works, one way to make the printing nicer would be to calculate the width of each column (for printf) based on the maximum width of the items in that column. I didn't get into that here, to keep it simple.
#!/usr/bin/env perl
use 5.010; use strict; use warnings;
my %h; my %m; my %db; # per-hour hash, per-minute hash, database names
while(<DATA>){
  next unless /\w/;                    # skip blank lines
  my($datetime,$database,$speed) = (split)[1,2,3];
  # substr works well here since the lengths are static
  my $ddhhmm = substr $datetime,0,16;  # e.g. 2015-06-01T12:40
  my $ddhh   = substr $datetime,0,13;  # e.g. 2015-06-01T12 (no minutes)
  $h{$ddhh  }{$database} += $speed;    # add the speed to this hour & database
  $m{$ddhhmm}{$database} += $speed;    # add the speed to this minute & database
  $db{$database} = 1;                  # save the database name
}
# sort and save database names as an array,
# since we'll be looping through them many times
my @db = sort keys %db;
# HOUR SECTION START
# print out the per-hour stats
# starting with a header line
printf "%-16s", 'collectionTime';      # same width as the key column below
printf "%11s", $_ for (@db);           # print each database name as a header, 11 characters wide
print "\n";                            # end of line
for my $key (sort keys %h){
  printf "%-16s", $key;                # print the date/hour key
  # print the value for each database that goes with this key
  # (0 if that database had no records in this hour)
  printf "%11s", $h{$key}{$_} // 0 for (@db);
  print "\n";
}
# HOUR SECTION END
# MINUTE SECTION START (using %m instead of %h)
# MINUTE SECTION END
__DATA__
server01: 2015-06-01T12:40:03-04:00 DB101 10 MB/sec
server01: 2015-06-01T12:40:03-04:00 DB202 5 MB/sec
server01: 2015-06-01T12:40:03-04:00 ASM 2 MB/sec
server01: 2015-06-01T12:40:03-04:00 MYDB101 2 MB/sec
server01: 2015-06-01T12:40:03-04:00 MYDB202 5 MB/sec
server01: 2015-06-01T12:40:03-04:00 _OTHER_DB_ 30 MB/sec
server01: 2015-06-01T12:41:03-04:00 DB101 3 MB/sec
server01: 2015-06-01T12:41:03-04:00 DB202 4 MB/sec
server01: 2015-06-01T12:41:03-04:00 ASM 2 MB/sec
server01: 2015-06-01T12:41:03-04:00 MYDB101 9 MB/sec
server01: 2015-06-01T12:41:03-04:00 MYDB202 7 MB/sec
server01: 2015-06-01T12:41:03-04:00 _OTHER_DB_ 50 MB/sec
server02: 2015-06-01T12:40:03-04:00 DB101 90 MB/sec
server02: 2015-06-01T12:40:03-04:00 DB202 9 MB/sec
server02: 2015-06-01T12:40:03-04:00 ASM 2 MB/sec
server02: 2015-06-01T12:40:03-04:00 MYDB101 3 MB/sec
server02: 2015-06-01T12:40:03-04:00 MYDB202 1 MB/sec
server02: 2015-06-01T12:40:03-04:00 _OTHER_DB_ 90 MB/sec
server02: 2015-06-01T12:41:03-04:00 DB101 1 MB/sec
server02: 2015-06-01T12:41:03-04:00 DB202 4 MB/sec
server02: 2015-06-01T12:41:03-04:00 ASM 2 MB/sec
server02: 2015-06-01T12:41:03-04:00 MYDB101 7 MB/sec
server02: 2015-06-01T12:41:03-04:00 MYDB202 7 MB/sec
server02: 2015-06-01T12:41:03-04:00 _OTHER_DB_ 55 MB/sec
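
For the column-width idea mentioned above, here is a rough sketch of one way it could go. It is not part of the script; the sub names (widest, column_formats, format_row) and the small %h sample are made up for illustration. %h has the same shape as the per-hour hash above. The sketch builds one printf format per column from the longest item in that column, then reuses those formats for the header and every row.

```perl
#!/usr/bin/env perl
use 5.010; use strict; use warnings;
use List::Util qw(max);

# Hypothetical sample data, shaped like %h in the script above:
# hour key => { database name => total speed }
my %h = (
    '2015-06-01T12' => { DB101 => 104, _OTHER_DB_ => 225 },
    '2015-06-01T13' => { DB101 => 7 },
);
my %db;
$db{$_} = 1 for map { keys %$_ } values %h;
my @db = sort keys %db;

# length of the longest string among the arguments
sub widest { return max map { length } @_ }

# build one printf format per column: the key column is left-justified,
# each database column is right-justified and as wide as its longest item
sub column_formats {
    my ($h, $db) = @_;
    my $keyw = widest('collectionTime', keys %$h);
    my @fmt  = ("%-${keyw}s");
    for my $name (@$db) {
        my $w = widest($name, map { $_->{$name} // 0 } values %$h);
        push @fmt, "%${w}s";
    }
    return @fmt;
}

# format one line of cells, separated by single spaces
sub format_row {
    my ($fmt, @cells) = @_;
    return join(' ', map { sprintf($fmt->[$_], $cells[$_]) } 0 .. $#cells) . "\n";
}

my @fmt = column_formats(\%h, \@db);
print format_row(\@fmt, 'collectionTime', @db);
for my $key (sort keys %h) {
    print format_row(\@fmt, $key, map { $h{$key}{$_} // 0 } @db);
}
```

The trade-off is an extra pass over the data to measure the widths before anything is printed, which should be cheap here because the per-hour hash is already aggregated.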
Aaron B.
Available for small or large Perl jobs and *nix system administration; see my home node.