Re^2: Computing results through Arrays

After some searching, I found an elegant solution from Aaron, but I am still not able to meet my requirement.

#!/usr/bin/env perl
use strict; use warnings;

my $key = shift;
my @cols = @ARGV;
my %h;

die "Usage: grp.pl id_column field1 [field2]...\n" unless @cols;
# $ARGV[0] is the Base column
# $ARGV[1..x] is the list of columns to add up

while(<DATA>){
  chomp;
  my @f = split;
  for (@cols){
    $h{$f[$key]}{$_}{t} += $f[$_];
    $h{$f[$key]}{$_}{n}++;
  }
}

for my $k (sort keys %h){
  print $k;
  print "\t$h{$k}{$_}{n}\t$h{$k}{$_}{t}" for @cols;
  print "\n";
}

__DATA__
server01: 2015-06-01T12:40:03-04:00  DB101                  10 MB/sec
server01: 2015-06-01T12:40:03-04:00  DB202                   5 MB/sec
server01: 2015-06-01T12:40:03-04:00  ASM                     2 MB/sec
server01: 2015-06-01T12:40:03-04:00  MYDB101                 2 MB/sec
server01: 2015-06-01T12:40:03-04:00  MYDB202                 5 MB/sec
server01: 2015-06-01T12:40:03-04:00  _OTHER_DB_             30 MB/sec
server01: 2015-06-01T12:41:03-04:00  DB101                   3 MB/sec
server01: 2015-06-01T12:41:03-04:00  DB202                   4 MB/sec
server01: 2015-06-01T12:41:03-04:00  ASM                     2 MB/sec
server01: 2015-06-01T12:41:03-04:00  MYDB101                 9 MB/sec
server01: 2015-06-01T12:41:03-04:00  MYDB202                 7 MB/sec
server01: 2015-06-01T12:41:03-04:00  _OTHER_DB_             50 MB/sec
server02: 2015-06-01T12:40:03-04:00  DB101                  90 MB/sec
server02: 2015-06-01T12:40:03-04:00  DB202                   9 MB/sec
server02: 2015-06-01T12:40:03-04:00  ASM                     2 MB/sec
server02: 2015-06-01T12:40:03-04:00  MYDB101                 3 MB/sec
server02: 2015-06-01T12:40:03-04:00  MYDB202                 1 MB/sec
server02: 2015-06-01T12:40:03-04:00  _OTHER_DB_             90 MB/sec
server02: 2015-06-01T12:41:03-04:00  DB101                   1 MB/sec
server02: 2015-06-01T12:41:03-04:00  DB202                   4 MB/sec
server02: 2015-06-01T12:41:03-04:00  ASM                     2 MB/sec
server02: 2015-06-01T12:41:03-04:00  MYDB101                 7 MB/sec
server02: 2015-06-01T12:41:03-04:00  MYDB202                 7 MB/sec
server02: 2015-06-01T12:41:03-04:00  _OTHER_DB_             55 MB/sec

I got the result below:
./grp.pl 2 3
ASM     4       8
DB101   4       104
DB202   4       22
MYDB101 4       21
MYDB202 4       20
_OTHER_DB_      4       225

Could you please help me find a way to group on the time column by hour and print the output as per my requirement?

Sample of Required output:

Frequency Minute:
           collectionTime DB101 DB202 ASM MYDB101 MYDB202 _OTHER_DB_
2015-06-01T12:40:03-04:00   100    14   4       5       6        120
2015-06-01T12:41:03-04:00     4     8   4      16      14        105

Frequency Hour:
           collectionTime DB101 DB202 ASM MYDB101 MYDB202 _OTHER_DB_
2015-06-01T12:00:00-04:00     
2015-06-01T13:00:00-04:00


Replies are listed 'Best First'.
Re^3: Computing results through Arrays by aaron_baugher (Curate) on Jun 05, 2015 at 12:34 UTC
As Laurent_R said, your requirements have another dimension (or two), so my script won't work for this except for some of the basic ideas. You're probably going to want three hashes: one to collect per-hour values (%h), one to collect per-minute values (%m), and one to collect database names (%db). Then you'll need to:

for each line
  parse out the date-hour, date-hour-minute, database name, and speed
  add speed to $h{date-hour}{database name}
  add speed to $m{date-hour-minute}{database name}
  $db{database name} = 1   # put database name in hash
loop through sorted keys of %db
  print them as headers, formatted to fit what's coming below
loop through keys of %h (sorted if you want)
  print the key (the date-hour)
  loop through sorted keys of %db
    print $h{$key}{database name}
  print a newline
now do the same with the per-minute hash %m

Try coding that, and let us know if you need help.

Aaron B.
Available for small or large Perl jobs and *nix system administration; see my home node.
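The accumulation half of that outline might be sketched roughly as follows. This is only a sketch under stated assumptions: the helper names `bucket_keys` and `accumulate` are hypothetical (they do not appear in the thread), each line is assumed to carry the whitespace-separated fields server, timestamp, database name, speed, and unit, and the seconds are zeroed in the per-minute key so that samples from the same minute land in one bucket.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: derive the per-hour and per-minute keys from a
# timestamp such as 2015-06-01T12:40:03-04:00. Returns an empty list
# if the timestamp does not look like that.
sub bucket_keys {
    my ($stamp) = @_;
    my ($date_hour, $minute, $offset) =
        $stamp =~ /^(\d{4}-\d\d-\d\dT\d\d):(\d\d):\d\d(.*)$/
        or return;
    return ("$date_hour:00:00$offset", "$date_hour:$minute:00$offset");
}

# Hypothetical helper: fold a list of log lines into the three hashes
# described above and return references to them.
sub accumulate {
    my (@lines) = @_;
    my (%h, %m, %db);
    for my $line (@lines) {
        # Assumed fields: server, timestamp, database name, speed, unit
        my (undef, $stamp, $name, $speed) = split ' ', $line;
        next unless defined $speed;              # skip short lines
        my ($hour, $minute) = bucket_keys($stamp) or next;
        $h{$hour}{$name}   += $speed;            # per-hour total
        $m{$minute}{$name} += $speed;            # per-minute total
        $db{$name} = 1;                          # remember the database name
    }
    return (\%h, \%m, \%db);
}

# Three DB101 samples taken from the data in the question.
my @sample = (
    'server01: 2015-06-01T12:40:03-04:00  DB101  10 MB/sec',
    'server02: 2015-06-01T12:40:03-04:00  DB101  90 MB/sec',
    'server01: 2015-06-01T12:41:03-04:00  DB101   3 MB/sec',
);
my ($h, $m, $db) = accumulate(@sample);
print "$_ DB101=$h->{$_}{DB101}\n" for sort keys %$h;
print "$_ DB101=$m->{$_}{DB101}\n" for sort keys %$m;
```

With those three lines, the hour bucket holds 103 for DB101 and the two minute buckets hold 100 and 3. Returning hash references keeps the helper free of globals, so the same function can be fed `<DATA>` or lines read from a file.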
Re^4: Computing results through Arrays by yasser8@gmail.com (Novice) on Jun 05, 2015 at 17:33 UTC
Thanks a lot Aaron Sir !!!!

I tried coding the way you said, but I am not sure where I am going wrong; I am getting a lot of "Use of uninitialized value in string" errors. I tried debugging, but in vain. I also have no idea how to print the keys on a single line, so I am not able to meet this requirement: "print them as headers, formatted to fit what's coming below".

Please do not mind these silly mistakes; I am still a beginner in Perl.

#!/usr/bin/env perl
use strict; use warnings;

my %h;
my %m;
my %db;

while(<DATA>){
  chomp;
  my @fields = split;
  my ($date,$database_name,$speed) = @fields[1,2,3];
  my ($date_hour,$minute) = split /:/, $date ;
  my $date_hour_minute = join (':',$date_hour,$minute) ;
  $h{$date_hour}{$database_name} += $speed;
  $m{$date_hour_minute}{$database_name} += $speed;
  $db{$database_name} = 1;
}

for my $db_keys (sort keys %db){
  print "$db_keys";
  for my $h_keys (sort keys %h){
    print $h_keys;
    for my $db_keys (sort keys %db){
      print "$h{$h_keys}{$db_keys}";
      print "\n";
    }
  }
}

I will be thankful if you could help me, please...
Re^5: Computing results through Arrays by aaron_baugher (Curate) on Jun 05, 2015 at 18:44 UTC
You're getting close! The main problem is with your loop logic. You want to print a header line starting with "collectionTime", followed by the database names. You can do that with something like this:

print " collectionTime";
for my $db_keys (sort keys %db){
  print " $db_keys";   # adjust spaces to line things up
}
print "\n";

Now you want to start going through the actual data, printing it so that it lines up with the headers. So this loop follows the previous one, instead of being inside it:

for my $h_keys (sort keys %h){
  print $h_keys;   # print the date/hour
  for my $db_keys (sort keys %db){
    print " $h{$h_keys}{$db_keys}";   # pad with enough spaces to match header
  }
  print "\n";   # this goes outside the inner loop, to end the line
}

I haven't tested that, but it's just a bit of an adjustment to what you had. Once it works, the next thing you'll probably want to look at is replacing the print statements with printf, which will help you line things up in columns even though the values are of different lengths.

One more thought: for efficiency's sake, we should probably sort the %db hash keys once and put them in an array, rather than re-sorting them every time we print a line. But it'll work this way, so we can deal with that next time.

Aaron B.
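The two follow-up suggestions in that reply (printf for alignment, and sorting the database names once) could look something like the sketch below. The function name `format_report`, the 25-character time column, and the minimum column width of 5 are assumptions made for illustration, not something the thread specifies. The `//` operator needs Perl 5.10 or later. The 13:00 row is invented purely to show how a missing value is handled.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: render one aggregate hash (per-hour or per-minute)
# as an aligned table and return it as a single string.
sub format_report {
    my ($agg, $db) = @_;
    my @names = sort keys %$db;        # sort once, reuse for every row

    # Each column is as wide as its header, but at least 5 characters.
    my %width;
    for my $name (@names) {
        my $len = length $name;
        $width{$name} = $len > 5 ? $len : 5;
    }

    my $out = sprintf '%-25s', 'collectionTime';
    $out .= sprintf ' %*s', $width{$_}, $_ for @names;
    $out .= "\n";

    for my $time (sort keys %$agg) {
        $out .= sprintf '%-25s', $time;
        # A database may have no samples in a bucket; print 0 instead of
        # triggering "Use of uninitialized value" warnings.
        $out .= sprintf ' %*d', $width{$_}, $agg->{$time}{$_} // 0 for @names;
        $out .= "\n";
    }
    return $out;
}

my %db = (ASM => 1, DB101 => 1, _OTHER_DB_ => 1);
my %h  = (
    # Totals for the 12:00 hour, taken from the output in the question.
    '2015-06-01T12:00:00-04:00' => { ASM => 8, DB101 => 104, _OTHER_DB_ => 225 },
    # Made-up row with two databases missing.
    '2015-06-01T13:00:00-04:00' => { DB101 => 7 },
);
print format_report(\%h, \%db);
```

The `%*s` and `%*d` conversions take the field width from the argument list, so the header and the data rows share one width table and stay aligned however long the database names are.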
Re^6: Computing results through Arrays by yasser8@gmail.com (Novice) on Jun 05, 2015 at 20:51 UTC
Re^7: Computing results through Arrays by aaron_baugher (Curate) on Jun 05, 2015 at 21:31 UTC
Some notes below your chosen depth have not been shown here
Re^3: Computing results through Arrays by Laurent_R (Canon) on Jun 05, 2015 at 11:00 UTC
Hi, your data structure is obviously too simplified for the level of detail that you want to display: your `%h` hash needs an extra level of information (the time). Once you have added that, it should only be a matter of displaying it correctly.
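One way to read that suggestion, keeping the `{t}` (total) and `{n}` (count) slots of the original script and adding the time bucket as the outermost key, is sketched below. The helper name `add_sample` and the shape of the bucket string are assumptions for illustration only.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: record one sample under time bucket -> database
# name, tracking both the running total and the number of samples.
sub add_sample {
    my ($h, $bucket, $name, $value) = @_;
    $h->{$bucket}{$name}{t} += $value;   # running total
    $h->{$bucket}{$name}{n}++;           # sample count
    return;
}

my %h;
# DB101 samples from the question, keyed by date-hour-minute.
add_sample(\%h, '2015-06-01T12:40', 'DB101', 10);
add_sample(\%h, '2015-06-01T12:40', 'DB101', 90);
add_sample(\%h, '2015-06-01T12:41', 'DB101', 3);

for my $bucket (sort keys %h) {
    for my $name (sort keys %{ $h{$bucket} }) {
        printf "%s %s n=%d total=%d\n",
            $bucket, $name, $h{$bucket}{$name}{n}, $h{$bucket}{$name}{t};
    }
}
```

Perl autovivifies the intermediate hashes, so no explicit initialisation of a bucket or database entry is needed before the first `+=`.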

