http://qs321.pair.com?node_id=949334

hok_si_la has asked for the wisdom of the Perl Monks concerning the following question:

Good localtime monks,

I asked a question early last week about extracting information from a specific file format (Extracting information from file to Hash), and BrowserUK was good enough to point me in the right direction. However, I am now having trouble sorting my AoH. The error I get when running a command-line trace is: "Can't use string ("1") as a HASH ref while "strict refs" in use at getCollections.pl line 164, <ARCFILE> line 10." I can reference and print the unsorted keys just fine, but the sorted AoH appears to be empty. For instance, $collectionData[$i]{'Missing'} contains a value, whereas $sortedCollectionData[$i]{'Missing'} is undef.

Here is my file format:
CollectionId=>26154 Framecount=>6 Status=>SC Missing=>0 Modified=>01/22/2012 22:12:09
CollectionId=>26155 Framecount=>6 Status=>I Missing=>4 Modified=>01/22/2012 22:12:20
CollectionId=>25000 Framecount=>6 Status=>SC Missing=>0 Modified=>01/22/2012 22:13:07
CollectionId=>25002 Framecount=>6 Status=>I Missing=>5 Modified=>01/22/2012 22:13:14
CollectionId=>25009 Framecount=>6 Status=>I Missing=>5 Modified=>01/22/2012 22:13:19
CollectionId=>25309 Framecount=>6 Status=>I Missing=>5 Modified=>01/22/2012 22:13:25
CollectionId=>25349 Framecount=>6 Status=>I Missing=>5 Modified=>01/22/2012 22:13:31
CollectionId=>25318 Framecount=>6 Status=>I Missing=>5 Modified=>01/22/2012 22:13:37
CollectionId=>21318 Framecount=>6 Status=>I Missing=>5 Modified=>01/22/2012 22:13:43
CollectionId=>21342 Framecount=>6 Status=>I Missing=>5 Modified=>01/22/2012 22:13:56

Here is my sub:
sub printCollectionData {
    my $arcFile = shift;
    my $orderBy = shift;
    my (@collectionData, @sortedCollectionData);
    my $semaphore = $arcFile . '.lock';
    my ($longStatus, $rowStyle, $rowColor);

    open(LOCKFILE, ">>$semaphore") or die "$semaphore: $!";
    flock(LOCKFILE, LOCK_EX) or die "flock() failed for $semaphore: $!";
    open (ARCFILE, "<$arcFile") or die "Failed to open $arcFile: $!";

    # Retrieve file information from arcFile as an array of hashes
    while( <ARCFILE> ) {
        my( $col, $cnt, $stat, $miss, $mod) = m[
            ^
            CollectionId \s* => \s* (\d+)? \s*
            Framecount \s* => \s* (\d+)? \s*
            Status \s* => \s* (\w+)? \s*
            Missing \s* => \s* ([\d,]+)? \s*
            Modified \s* => \s* ([\d/]+\s[\d:]+)? \s*
            $
        ]x or warn "Bad format at line $.\n" and next;

        my( $modday, $modmon, $modyear, $modhrs, $modmin, $modsec )
            = $mod =~ m[(\d+)/(\d+)/(\d+) (\d+):(\d+):(\d+)]
            or warn "Bad date format in line $." and next;

        push @collectionData, {
            CollectionId => $col,
            Framecount   => $cnt,
            Status       => $stat,
            Missing      => $miss,
            Modified     => sprintf(
                "%4d/%02d/%02d %02d:%02d:%02d",
                $modyear, $modmon, $modday, $modhrs, $modmin, $modsec
            ),
        };
    }

    # Sort the collection according to orderBy param
    if($orderBy eq "collection") {
        @sortedCollectionData = sort {
            $collectionData[ $b ]{CollectionId} <=> $collectionData[ $a ]{CollectionId}
            ||
            $collectionData[ $b ]{Modified} cmp $collectionData[ $a ]{Modified}
        } 0 .. $#collectionData;
    }elsif($orderBy eq "framecount") {
        @sortedCollectionData = sort {
            $collectionData[ $b ]{Framecount} <=> $collectionData[ $a ]{Framecount}
            ||
            $collectionData[ $b ]{Modified} cmp $collectionData[ $a ]{Modified}
        } 0 .. $#collectionData;
    }elsif($orderBy eq "status") {
        @sortedCollectionData = sort {
            $collectionData[ $a ]{Status} cmp $collectionData[ $b ]{Status}
            ||
            $collectionData[ $b ]{Modified} cmp $collectionData[ $a ]{Modified}
        } 0 .. $#collectionData;
    }elsif($orderBy eq "missing") {
        @sortedCollectionData = sort {
            $collectionData[ $b ]{Missing} <=> $collectionData[ $a ]{Missing}
            ||
            $collectionData[ $b ]{Modified} cmp $collectionData[ $a ]{Modified}
        } 0 .. $#collectionData;
    }else {
        @sortedCollectionData = sort {
            $collectionData[ $b ]{Modified} cmp $collectionData[ $a ]{Modified}
        } 0 .. $#collectionData;
    }

    my $transactions = scalar(@collectionData);

    for (my $i=0; $i < $transactions; $i++) {
        if($sortedCollectionData[$i]{'Status'} eq "I") {
            $longStatus = "Incomplete";
        }elsif($sortedCollectionData[$i]{'Status'} eq "SI") {
            $longStatus = "Submitted Incomplete";
        }elsif($sortedCollectionData[$i]{'Status'} eq "C") {
            $longStatus = "Submitted";
        }else{
            $longStatus = "Submitted Complete";
        }

        if (($i%2) == 0){
            $rowStyle = "oddrow";
            $rowColor = "#e5e5e5";
        } else {
            $rowStyle = "evenrow";
            $rowColor = "#ffffff";
        }

        print qq{
        <tr class=$rowStyle style=Cursor:hand onclick= \"location.href=\'getCollection.pl?id=$collectionData[$i]{'CollectionId'}\';\" onMouseOver=\"style.backgroundColor='#c5c5c5'\" onMouseOut=\"style.backgroundColor='$rowColor'\">
        <th class=graycenter>$sortedCollectionData[$i]{'CollectionId'}</th>
        <th class=graycenter>$sortedCollectionData[$i]{'Framecount'}</th>
        <th class=graycenter>$longStatus</th>
        <th class=graycenter>$sortedCollectionData[$i]{'Missing'}</th>
        <th class=graycenter>$sortedCollectionData[$i]{'Modified'}</th>
        </tr>
        </div>
        };
    }
}

Replies are listed 'Best First'.
Re: Sorting an array or hashes
by Corion (Patriarch) on Jan 23, 2012 at 08:57 UTC

    The problem is that @sortedCollectionData only contains the indices into @collectionData, but you access it as if it contained the elements themselves when you print the data:

    ... <th class=graycenter>$sortedCollectionData[$i]{'CollectionId'}</th> ...

    should be

    <th class=graycenter>$collectionData[$sortedCollectionData[$i]]{'CollectionId'}</th>
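
    To make the difference between indices and elements concrete, here is a minimal sketch. The three records are made-up sample data, not the poster's file, and the variable names @records, @order, @ids and @sorted are only for illustration. It sorts a list of indices, shows that the result is plain numbers, and then reaches the hashes either through the original array or through an array slice:

```perl
use strict;
use warnings;

# Hypothetical sample records, not the poster's real data
my @records = (
    { CollectionId => 25000, Status => 'SC' },
    { CollectionId => 26155, Status => 'I'  },
    { CollectionId => 21342, Status => 'I'  },
);

# Sorting the index list yields indices, ordered by the records they point to
my @order = sort {
    $records[$b]{CollectionId} <=> $records[$a]{CollectionId}
} 0 .. $#records;

# @order holds plain numbers, so go back through @records to reach each hash
my @ids = map { $records[$_]{CollectionId} } @order;

# An array slice turns the sorted indices into the sorted records in one step
my @sorted = @records[@order];

print "indices: @order\n";   # prints "indices: 1 0 2"
print "ids:     @ids\n";     # prints "ids:     26155 25000 21342"
```

    Writing $order[0]{CollectionId} here would fail under strict refs in the same way as the reported error, because $order[0] is the number 1 rather than a hash reference.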

    The deeper problem is that your subroutine does too many things at once, which makes debugging much harder than it needs to be. I split the subroutine into three steps: readCollectionData, sortCollectionData (where I thought the problem was) and printCollectionData (where I found the problem). That made it much easier to separate the pieces and see what each step returns.

    my $orderby = 'collection';
    my @data = readCollectionData('dummyFilename.txt');
    my @indices = sortCollectionData($orderby, @data);
    print Dumper \@indices;
    printCollectionData(\@data, @indices);

    As an aside, you had relatively large if ... elsif ... else ... blocks that decide on what to sort and the same kind again to translate the short status code into a long status message. I replaced them by a hash lookup:

    ...
    if($sortedCollectionData[$i]{'Status'} eq "I") {
        $longStatus = "Incomplete";
    }elsif($sortedCollectionData[$i]{'Status'} eq "SI") {
        $longStatus = "Submitted Incomplete";
    }elsif($sortedCollectionData[$i]{'Status'} eq "C") {
        $longStatus = "Submitted";
    }else{
        $longStatus = "Submitted Complete";
    }
    ...

    becomes

    my %translateLongStatus = (
        'I'  => 'Incomplete',
        'SI' => 'Submitted Incomplete',
        'C'  => 'Submitted',
    );
    ...
        my ($longStatus, $rowStyle, $rowColor);
        # Go through the sorted index to reach the row being printed
        $longStatus = $translateLongStatus{ $collectionData[ $sortedCollectionData[$i] ]{Status} } ||'Submitted Complete';
    ...
    }

    In the end, my program looks like this (with much of the HTML printing removed):

    #!perl -w
    use strict;
    use Data::Dumper;

    sub readCollectionData {
        my $arcFile = shift;
        my (@collectionData);
        my $semaphore = $arcFile . '.lock';
        my ($longStatus, $rowStyle, $rowColor);

        #open(LOCKFILE, ">>$semaphore") or die "$semaphore: $!";
        #flock(LOCKFILE, LOCK_EX) or die "flock() failed for $semaphore: $!";
        #open (ARCFILE, "<$arcFile") or die "Failed to open $arcFile: $!";
        local *ARCFILE = *DATA;

        # Retrieve file information from arcFile as an array of hashes
        while( <ARCFILE> ) {
            my( $col, $cnt, $stat, $miss, $mod) = m[
                ^
                CollectionId \s* => \s* (\d+)? \s*
                Framecount \s* => \s* (\d+)? \s*
                Status \s* => \s* (\w+)? \s*
                Missing \s* => \s* ([\d,]+)? \s*
                Modified \s* => \s* ([\d/]+\s[\d:]+)? \s*
                $
            ]x or warn "Bad format at line $.\n" and next;

            # Dates in the file are month/day/year
            my( $modmon, $modday, $modyear, $modhrs, $modmin, $modsec )
                = $mod =~ m[(\d+)/(\d+)/(\d+) (\d+):(\d+):(\d+)]
                or warn "Bad date format in line $." and next;

            push @collectionData, {
                CollectionId => $col,
                Framecount   => $cnt,
                Status       => $stat,
                Missing      => $miss,
                Modified     => sprintf(
                    "%4d/%02d/%02d %02d:%02d:%02d",
                    $modyear, $modmon, $modday, $modhrs, $modmin, $modsec
                ),
            };
        }
        return @collectionData
    };

    # Map the program orderby names to the internal names in the hash
    my %sort_columns = (
        collection => 'CollectionId',
        framecount => 'Framecount',
        status     => 'Status',
        missing    => 'Missing',
    );

    sub sortCollectionData {
        my ($orderby, @collectionData) = @_;

        # Sort the collection according to orderBy param
        my $sort_col = $sort_columns{ $orderby } || 'Modified';

        # Status is a string and sorts ascending with cmp; the numeric columns
        # sort descending with <=>. Modified is handled by the tie-breaker alone.
        my $compare
            = $sort_col eq 'Status'   ? sub { $_[0]{Status} cmp $_[1]{Status} }
            : $sort_col eq 'Modified' ? sub { 0 }
            :                           sub { $_[1]{$sort_col} <=> $_[0]{$sort_col} };

        my @sortedCollectionData = sort {
            $compare->( $collectionData[ $a ], $collectionData[ $b ] )
            ||
            $collectionData[ $b ]{Modified} cmp $collectionData[ $a ]{Modified}
        } 0 .. $#collectionData;

        return @sortedCollectionData
    }

    my %translateLongStatus = (
        'I'  => 'Incomplete',
        'SI' => 'Submitted Incomplete',
        'C'  => 'Submitted',
    );

    sub printCollectionData {
        my ($collectionData, @sortedCollectionData) = @_;
        my @collectionData = @$collectionData;
        my $transactions = scalar(@collectionData);

        for (my $i=0; $i < $transactions; $i++) {
            my ($longStatus, $rowStyle, $rowColor);

            # Look each row up through its sorted index
            my $row = $collectionData[ $sortedCollectionData[$i] ];
            $longStatus = $translateLongStatus{ $row->{Status} } ||'Submitted Complete';

            if (($i%2) == 0){
                $rowStyle = "oddrow";
                $rowColor = "#e5e5e5";
            } else {
                $rowStyle = "evenrow";
                $rowColor = "#ffffff";
            }
            print $i, $longStatus, $row->{CollectionId}, "\n";
        }
    }

    my $orderby = 'collection';
    my @data = readCollectionData('dummyFilename.txt');
    my @indices = sortCollectionData($orderby, @data);
    print Dumper \@indices;
    printCollectionData(\@data, @indices);

    __DATA__
    CollectionId=>26154 Framecount=>6 Status=>SC Missing=>0 Modified=>01/22/2012 22:12:09
    CollectionId=>26155 Framecount=>6 Status=>I Missing=>4 Modified=>01/22/2012 22:12:20
    CollectionId=>25000 Framecount=>6 Status=>SC Missing=>0 Modified=>01/22/2012 22:13:07
    CollectionId=>25002 Framecount=>6 Status=>I Missing=>5 Modified=>01/22/2012 22:13:14
    CollectionId=>25009 Framecount=>6 Status=>I Missing=>5 Modified=>01/22/2012 22:13:19
    CollectionId=>25309 Framecount=>6 Status=>I Missing=>5 Modified=>01/22/2012 22:13:25
    CollectionId=>25349 Framecount=>6 Status=>I Missing=>5 Modified=>01/22/2012 22:13:31
    CollectionId=>25318 Framecount=>6 Status=>I Missing=>5 Modified=>01/22/2012 22:13:37
    CollectionId=>21318 Framecount=>6 Status=>I Missing=>5 Modified=>01/22/2012 22:13:43
    CollectionId=>21342 Framecount=>6 Status=>I Missing=>5 Modified=>01/22/2012 22:13:56

      Thanks for the help Corion. I cleaned up my code a bit and used the following to print sorted elements of my AoH (@collectionData):
      foreach $j (@sortedCollectionData) {
          my ($longStatus, $rowStyle, $rowColor);
          $longStatus = $translateLongStatus{ $collectionData[$j]->{Status} } ||'Submitted Complete';

          if (($i%2) == 0){
              $rowStyle = "oddrow";
              $rowColor = "#e5e5e5";
          } else {
              $rowStyle = "evenrow";
              $rowColor = "#ffffff";
          }

          print qq{
          <tr class=$rowStyle style=Cursor:hand onclick= \"location.href=\'getCollection.pl?id=$collectionData[$j]->{CollectionId}\';\" onMouseOver=\"style.backgroundColor='#c5c5c5'\" onMouseOut=\"style.backgroundColor='$rowColor'\">
          <th class=graycenter>$collectionData[$j]->{'CollectionId'}</th>
          <th class=graycenter>$collectionData[$j]->{'Framecount'}</th>
          <th class=graycenter>$collectionData[$j]->{'Missing'}</th>
          <th class=graycenter>$longStatus</th>
          <th class=graycenter>$collectionData[$j]->{'Modified'}</th>
          </tr>
          };
Re: Sorting an array or hashes
by moritz (Cardinal) on Jan 23, 2012 at 08:51 UTC
    @sortedCollectionData = sort {
        $collectionData[ $b ]{CollectionId} <=> $collectionData[ $a ]{CollectionId}
        ||
        $collectionData[ $b ]{Modified} cmp $collectionData[ $a ]{Modified}
    } 0 .. $#collectionData;

    What you are storing here are numbers (specifically from 0 to $#collectionData), so @sortedCollectionData now contains these numbers in some order or another. And then you write $sortedCollectionData[$i]{'Status'}, and try to access one of these numbers as if it was a hash reference.

    You might want to sort your hash refs directly instead:

    @sortedCollectionData = sort {
        $b->{CollectionId} <=> $a->{CollectionId}
        ||
        $b->{Modified} cmp $a->{Modified}
    } @collectionData;
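
    Sorting the hash refs directly also combines well with a lookup table of comparators, one per orderBy value. The following is only a sketch under assumptions: sortRecords, %compare and $by_modified are hypothetical names, the three records are made-up sample data, and Modified is assumed to already be in a sortable "YYYY/MM/DD hh:mm:ss" form.

```perl
use strict;
use warnings;

# Tie-breaker shared by every ordering: newest Modified first
my $by_modified = sub { $_[1]{Modified} cmp $_[0]{Modified} };

# One comparator per orderBy value; each receives two hash refs
my %compare = (
    collection => sub { $_[1]{CollectionId} <=> $_[0]{CollectionId} },
    framecount => sub { $_[1]{Framecount}   <=> $_[0]{Framecount} },
    status     => sub { $_[0]{Status}       cmp $_[1]{Status} },
    missing    => sub { $_[1]{Missing}      <=> $_[0]{Missing} },
);

# sortRecords is a hypothetical helper; it returns the sorted hash refs
sub sortRecords {
    my ($orderBy, @records) = @_;
    my $primary = $compare{$orderBy} || sub { 0 };   # unknown key: Modified only
    return sort { $primary->($a, $b) || $by_modified->($a, $b) } @records;
}

# Hypothetical sample records
my @records = (
    { CollectionId => 26154, Framecount => 6, Status => 'SC', Missing => 0,
      Modified => '2012/01/22 22:12:09' },
    { CollectionId => 26155, Framecount => 6, Status => 'I',  Missing => 4,
      Modified => '2012/01/22 22:12:20' },
    { CollectionId => 25002, Framecount => 6, Status => 'I',  Missing => 5,
      Modified => '2012/01/22 22:13:14' },
);

my @by_status = map { $_->{CollectionId} } sortRecords('status', @records);
my @by_id     = map { $_->{CollectionId} } sortRecords('collection', @records);

print "status:     @by_status\n";
print "collection: @by_id\n";
```

    Because the result holds the hash refs themselves, the printing code can use $sorted[$i]{Status} and friends without going back through the unsorted array.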
Re: Sorting an array or hashes
by salva (Canon) on Jan 23, 2012 at 10:07 UTC
    Those very similar, ugly sorting blocks can be generated from metadata, especially if you use a sorting module from CPAN such as Sort::Key or Sort::Maker:
    # untested!
    use Sort::Key;

    my %key_type = (CollectionId => 'int',
                    Framecount   => 'int',
                    Status       => 'str',
                    Missing      => 'int',
                    Modified     => 'str');

    my %order = (collection => [qw(-CollectionId -Modified)], # the minus
                 framecount => [qw(-Framecount -Modified)],   # sign means
                 status     => [qw(Status -Modified)],        # descending order
                 missing    => [qw(-Missing -Modified)],
                 modified   => [qw(-Modified)]);

    my %sorter;
    for my $order (keys %order) {
        my @types;
        my @keys;
        for (@{$order{$order}}) {
            /^(-?)(\w+)$/ or die;
            push @types, "$1$key_type{$2}";
            push @keys, $2;
        }
        $sorter{$order} = Sort::Key::multikeysorter { @{$_}{@keys} } @types;
    }

    sub printCollectionData {
        ...
        # Sort the collection according to orderBy param
        @sortedCollectionData = $sorter{$orderBy}->(@collectionData);
        ...
    }
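
    For comparison, the same metadata-driven idea can be sketched in core Perl with closures. This does not use Sort::Key and is not a drop-in replacement for it. The %key_type and %order tables follow the snippet above; makeComparator and the sample records are hypothetical, added only for illustration.

```perl
use strict;
use warnings;

my %key_type = (CollectionId => 'int', Framecount => 'int', Status => 'str',
                Missing => 'int', Modified => 'str');

my %order = (collection => [qw(-CollectionId -Modified)],
             framecount => [qw(-Framecount -Modified)],
             status     => [qw(Status -Modified)],
             missing    => [qw(-Missing -Modified)],
             modified   => [qw(-Modified)]);

# Build one comparator closure for a spec such as "-CollectionId"
sub makeComparator {
    my ($spec) = @_;
    my ($desc, $key) = $spec =~ /^(-?)(\w+)$/ or die "Bad sort spec '$spec'";
    my $type = $key_type{$key} or die "Unknown key '$key'";
    my $cmp = $type eq 'int'
        ? sub { $_[0]{$key} <=> $_[1]{$key} }
        : sub { $_[0]{$key} cmp $_[1]{$key} };
    # A leading minus swaps the arguments, which reverses the order
    return $desc ? sub { $cmp->($_[1], $_[0]) } : $cmp;
}

# One sorter per ordering: try each comparator until one decides
my %sorter;
for my $name (keys %order) {
    my @chain = map makeComparator($_), @{ $order{$name} };
    $sorter{$name} = sub {
        return sort {
            my $result = 0;
            for my $cmp (@chain) {
                $result = $cmp->($a, $b);
                last if $result;
            }
            $result;
        } @_;
    };
}

# Hypothetical sample records
my @records = (
    { CollectionId => 26154, Framecount => 6, Status => 'SC', Missing => 0,
      Modified => '2012/01/22 22:12:09' },
    { CollectionId => 26155, Framecount => 6, Status => 'I',  Missing => 4,
      Modified => '2012/01/22 22:12:20' },
    { CollectionId => 25002, Framecount => 6, Status => 'I',  Missing => 5,
      Modified => '2012/01/22 22:13:14' },
);

my @by_status = map { $_->{CollectionId} } $sorter{status}->(@records);
print "status: @by_status\n";
```

    A chain of closures like this is likely slower than a dedicated sorting module on large inputs, but it keeps the ordering rules in one table and has no dependencies outside core Perl.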