http://qs321.pair.com?node_id=1062160

monkini has asked for the wisdom of the Perl Monks concerning the following question:

I have an array of arrays:

@AoA = (
    [ 0.5, "b1", "c0" ],
    [ 0.4, "b1", "c1" ],
    [ 0.7, "b2", "c2" ],
    [ 0.3, "b3", "c3" ],
    [ 0.6, "b3", "c4" ],
);

I would like to sort it on the first column (in descending order) and then analyze the sorted array row by row. If the value in the middle column ($b) is a duplicate, I resample the elements in that row, re-sort the remaining rows (including the current one) by the first column, and continue analyzing as before, until all $b's are unique.

How do I sort over the remaining part of array?

my @sorted = sort @AoA;    #sort on $a
my %seen = ();
my $j = 0;
while ($j < $#sorted) {
    my $var = shift;
    my $b = $var[1];
    my $c = $var[2];
    if ( ! $seen{$b}++ ) {
        # $b not seen before
        DO STH WITH $b AND $c
        $seen{$b}++;
    }
    else {
        #seen $b before
        my ($b, $a) = RESAMPLE;
        # sort remaining array, from now on... and proceed analyzing line by line
    }
}

Replies are listed 'Best First'.
Re: Sort over remaining part of array
by hdb (Monsignor) on Nov 12, 2013 at 10:03 UTC

    Sorting with respect to the first column is a bit more involved compared to your code, see below. Sorting only part of the array can be achieved using array slices:

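    use Data::Dumper;    # provides Dumper

    @AoA = (
        [ 0.5, "b1", "c0" ],
        [ 0.4, "b1", "c1" ],
        [ 0.7, "b2", "c2" ],
        [ 0.3, "b3", "c3" ],
        [ 0.6, "b3", "c4" ],
    );

    # sort only the rows from index 2 onwards, numerically on the first column
    @AoA[2..$#AoA] = sort { $a->[0] <=> $b->[0] } @AoA[2..$#AoA];
    print Dumper \@AoA;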

      The OP said descending order, so the key line of code should probably be:

      @AoA[2..$#AoA] = sort { $b->[0] <=> $a->[0] } @AoA[2..$#AoA];
      But that is only a detail.

      That helps, thanks!

      However, I still get an error "Use of uninitialized value in numeric comparison (<=>)" and it seems like the loop never ends...

      @AoA[0..$#AoA] = sort { $b->[0] <=> $a->[0] } @AoA[0..$#AoA];
      my %seen = ();
      my $j = 0;
      while ($j < $#AoA) {
          my $b = $AoA[$j][1];
          my $c = $AoA[$j][2];
          if ( ! $seen{$b}++ ) {
              # $b not seen before
              DO STH WITH $b AND $c
              $seen{$b}++;
          }
          else {
              #seen $b before
              my ($b, $a) = RESAMPLE;
              $AoA[$j][0] = $a;
              $AoA[$j][1] = '$b';
              @AoA[$j..$#AoA] = sort { $b->[0] <=> $a->[0] } @AoA[$j..$#AoA];
          }
      }

      ...any ideas?

        You re-sort the part of the array starting at the row that holds $a after resampling $a. What if $a is higher after resampling? The row would then have to be inserted somewhere before the part you re-sort. Instead of sorting the complete array just because one element changes, you could find out where the resampled element has to go and use splice to extract it and insert it at the new place.
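A minimal sketch of the splice idea, assuming the rows are already sorted in descending order of the first column. The helper name `reposition` and the resampled values (0.65, "b4") are made up for illustration; they are not from the thread.

```perl
use strict;
use warnings;

# Rows already sorted on column 0, descending.
my @AoA = (
    [ 0.7, "b2", "c2" ],
    [ 0.6, "b3", "c4" ],
    [ 0.5, "b1", "c0" ],
    [ 0.4, "b1", "c1" ],
    [ 0.3, "b3", "c3" ],
);

# Hypothetical helper: move the row at index $from to the position that
# keeps @$rows in descending order of column 0. Returns the new index.
sub reposition {
    my ( $rows, $from ) = @_;
    my ($row) = splice @$rows, $from, 1;    # extract the changed row
    my $to = 0;
    $to++ while $to < @$rows && $rows->[$to][0] >= $row->[0];
    splice @$rows, $to, 0, $row;            # insert it at its new place
    return $to;
}

# Pretend that resampling row 3 produced a higher value (made-up numbers).
@{ $AoA[3] }[ 0, 1 ] = ( 0.65, "b4" );
my $new_index = reposition( \@AoA, 3 );

print "moved to index $new_index\n";
print join( " ", map { $_->[0] } @AoA ), "\n";
```

The linear scan costs at most one pass over the array per changed row, so no full re-sort is needed. If the row lands before the index currently being analyzed, the surrounding loop has to decide whether rows that were already processed need another look; that decision is left open here.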

        Even better: Don't sort at all, resample until all $b's are unique and then do just one sort.
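The "resample first, sort once" suggestion could look roughly like this. The thread never shows RESAMPLE, so the `resample` sub below is a hypothetical stand-in that returns a random value and a fresh label.

```perl
use strict;
use warnings;

my @AoA = (
    [ 0.5, "b1", "c0" ],
    [ 0.4, "b1", "c1" ],
    [ 0.7, "b2", "c2" ],
    [ 0.3, "b3", "c3" ],
    [ 0.6, "b3", "c4" ],
);

# Hypothetical stand-in for the OP's RESAMPLE: returns a (value, label) pair.
my $counter = 0;
sub resample {
    $counter++;
    return ( rand(), "new$counter" );
}

my %seen;
for my $row (@AoA) {
    # Keep drawing while the label in column 1 has already been used.
    while ( $seen{ $row->[1] } ) {
        @$row[ 0, 1 ] = resample();
    }
    $seen{ $row->[1] } = 1;
}

# A single sort at the end, descending on column 0.
my @sorted = sort { $b->[0] <=> $a->[0] } @AoA;
print join( ", ", @$_ ), "\n" for @sorted;
```

Note that this visits the rows in their original order, so which of two duplicates gets resampled can differ from the OP's scheme, where the row with the higher first-column value keeps its label. If that matters, one option is to sort once before the loop as well.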

        Well, at a glance it looks like $j will remain zero forever...
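A sketch of the loop with the index actually advancing. Two further assumptions are worth flagging. First, the posted code declares lexical `my $b` and `my $a` in the same scope as the sort block; lexicals shadow the package variables that `sort` sets, which may well be where the "uninitialized value" warning comes from, so the sketch uses other names. Second, `resample` and its fixed pool of values are hypothetical, chosen only to make the run reproducible.

```perl
use strict;
use warnings;

my @AoA = (
    [ 0.5, "b1", "c0" ],
    [ 0.4, "b1", "c1" ],
    [ 0.7, "b2", "c2" ],
    [ 0.3, "b3", "c3" ],
    [ 0.6, "b3", "c4" ],
);

# Hypothetical stand-in for RESAMPLE: hands out made-up (value, label) pairs.
my @pool = ( [ 0.35, "b1" ], [ 0.2, "b4" ], [ 0.1, "b5" ] );
sub resample {
    my $next = shift @pool or die "resample pool exhausted\n";
    return @$next;
}

@AoA = sort { $b->[0] <=> $a->[0] } @AoA;

my %seen;
my $j = 0;
while ( $j <= $#AoA ) {    # <= so that the last row is examined too
    my $label = $AoA[$j][1];
    if ( !$seen{$label} ) {
        $seen{$label} = 1;
        # ... do something with $label and $AoA[$j][2] here ...
        $j++;              # advance only once the row is accepted
        next;
    }

    # Label seen before: resample this row, then re-sort the tail.
    my ( $value, $new_label ) = resample();
    @{ $AoA[$j] }[ 0, 1 ] = ( $value, $new_label );
    @AoA[ $j .. $#AoA ] = sort { $b->[0] <=> $a->[0] } @AoA[ $j .. $#AoA ];
    # $j stays put: whichever row is now at index $j is examined next.
}

print join( ", ", @$_ ), "\n" for @AoA;
```

With this particular pool the resampled values are all lower than the rows already accepted, so re-sorting only the tail is enough. A resampled value higher than an earlier row would still end up out of place, as pointed out above.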

Re: Sort over remaining part of array
by Anonymous Monk on Nov 12, 2013 at 09:42 UTC

    I would like to sort it on the 1st column ($a are numerical, $b and $c are strings), take it line by line - if $b was seen before, resample $a and $b, then sort the remaining rows in array (including the current one) and analyze as before - until all $b's are unique.

    Can you translate that into simple English please? And maybe post real sample data instead of "$a0"...?

    Thanks

      I'd sort first, then remove duplicates

      I've still no idea what resample means

      #!/usr/bin/perl --
      use strict;
      use warnings;
      use Data::Dump qw/ dd pp /;

      my @AoA = (
          [ 0.5, "b1", "c0" ],
          [ 0.4, "b1", "c1" ],
          [ 0.7, "b2", "c2" ],
          [ 0.3, "b3", "c3" ],
          [ 0.6, "b3", "c4" ],
      );
      dd\@AoA;

      @AoA = sort { $$b[0] <=> $$a[0] } @AoA;
      dd\@AoA;

      {
          my %seen;
          @AoA = grep{!$seen{$$_[1]}++}@AoA;
      }
      dd\@AoA;

      __END__
      [
        [0.5, "b1", "c0"],
        [0.4, "b1", "c1"],
        [0.7, "b2", "c2"],
        [0.3, "b3", "c3"],
        [0.6, "b3", "c4"],
      ]
      [
        [0.7, "b2", "c2"],
        [0.6, "b3", "c4"],
        [0.5, "b1", "c0"],
        [0.4, "b1", "c1"],
        [0.3, "b3", "c3"],
      ]
      [[0.7, "b2", "c2"], [0.6, "b3", "c4"], [0.5, "b1", "c0"]]

      Also worth considering are Sort::Key ("the fastest way to sort anything in Perl") and Sort::Key::External, which sorts huge lists that do not fit in the available memory.