knowmad has asked for the wisdom of the Perl Monks concerning the following question:
In case the subject isn't entirely clear, what I'm trying to achieve is a sort of an array of hashes by a field which contains the following three keywords -- Local, State, National. The trick is that I need to sort the records in that order. Thus a straight-forward alpha sort will not work.
I checked the Q&A, searched PM and Google and checked my various Perl books to no avail. I have a solution but am hoping a more clever monk could make it more efficient.
#!/usr/bin/perl
use strict;
use Data::Dumper;
my @advocates = (
{
fld_title => 'Rec1',
fld_type => 'National',
},
{
fld_title => 'Rec2',
fld_type => 'State',
},
{
fld_title => 'Rec3',
fld_type => 'Local',
},
);
sub by_locale {
my $a_type;
my $b_type;
$a_type = 1 if ($a->{'fld_type'} =~ /local/i);
$a_type = 2 if ($a->{'fld_type'} =~ /state/i);
$a_type = 3 if ($a->{'fld_type'} =~ /national/i);
$b_type = 1 if ($b->{'fld_type'} =~ /local/i);
$b_type = 2 if ($b->{'fld_type'} =~ /state/i);
$b_type = 3 if ($b->{'fld_type'} =~ /national/i);
$a_type <=> $b_type;
}
my @sorted_advocates = sort by_locale @advocates;
warn Dumper("unsorted_advocates", \@advocates);
warn Dumper("sorted_advocates", \@sorted_advocates);
Thanks for any pointers,
-Wm
Re: Efficiently sorting array by non-alpha strings
by Abigail-II (Bishop) on Feb 20, 2004 at 15:57 UTC
|
my %bucket;
map {push @{$bucket {$_ -> {fld_type}}} => $_} @advocates;
my @sorted = map {@{$bucket {$_}}} qw /Local State National/;
Abigail | [reply] [d/l] |
|
Thanks for the reply. I've not used the Schwarztian Transform before so tend to shy away from it. However, I read a bit more about it and discovered that it won't work for my actual code which is an array of objects.
Your bucketsort is the elegant solution that I was seeking. Interesting though it's about twice as slow as my ugly version as indicated by this benchmark:
Benchmark: timing 500000 iterations of bucketsort, sort...
bucketsort: 2 wallclock secs ( 1.72 usr + 0.00 sys = 1.72 CPU) @ 29
+0697.67/s (n=500000)
sort: 0 wallclock secs ( 0.89 usr + 0.00 sys = 0.89 CPU) @ 56
+1797.75/s (n=500000)
Rate bucketsort sort
bucketsort 290698/s -- -48%
sort 561798/s 93% --
Thanks
-Wm | [reply] [d/l] |
|
Your bucketsort is the elegant solution that I was seeking. Interesting though it's about twice as slow as my ugly version as indicated by this benchmark:
That's because you didn't fix my typos. Here's a different
benchmark, with some of the other suggestions (I didn't include any that went beyond sorting on 'fld_type'):
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw /timethese cmpthese/;
our @advocates = map {{fld_title => 'Rec' . $_,
fld_type => [qw /National State Local/] -> [ra
+nd 3]}}
1 .. shift || 1000;
sub by_locale {
my $a_type;
my $b_type;
$a_type = 1 if ($a->{'fld_type'} =~ /local/i);
$a_type = 2 if ($a->{'fld_type'} =~ /state/i);
$a_type = 3 if ($a->{'fld_type'} =~ /national/i);
$b_type = 1 if ($b->{'fld_type'} =~ /local/i);
$b_type = 2 if ($b->{'fld_type'} =~ /state/i);
$b_type = 3 if ($b->{'fld_type'} =~ /national/i);
$a_type <=> $b_type;
}
my $order = "\0local\0state\0national\0";
sub by_locale_b {
index ($order, "\0" . lc ($a -> {fld_type}) . "\0") <=>
index ($order, "\0" . lc ($b -> {fld_type}) . "\0")
}
our (@knowmad, @abigail, @borisz, @pirate);
cmpthese -1 => {
knowmad => '@knowmad = sort by_locale @advocates',
abigail => 'my %bucket;
map {push @{$bucket {$_ -> {fld_type}}} => $_} @advo
+cates;
@abigail = map {@{$bucket {$_}}} qw /Local State Nat
+ional/;',
borisz => 'my %h = (National => 3, State => 2, Local => 1);
@borisz = sort {$h {$a -> {fld_type}} <=>
$h {$b -> {fld_type}}} @advocates;',
pirate => '@pirate = map {$_ -> [1]}
sort {$a -> [0] <=> $b -> [0]}
map {[lc ($_ -> {fld_type}), $_]} @advoca
+tes',
browseruk => '@browseruk = sort by_locale_b @advocates',
};
__END__
Rate knowmad browseruk borisz pirate abigail
knowmad 39.4/s -- -48% -70% -83% -92%
browseruk 75.5/s 91% -- -42% -67% -85%
borisz 130/s 229% 72% -- -43% -74%
pirate 229/s 480% 203% 76% -- -53%
abigail 491/s 1147% 551% 279% 115% --
Abigail | [reply] [d/l] |
|
|
|
|
|
Re: Efficiently sorting array by non-alpha strings
by BrowserUk (Patriarch) on Feb 20, 2004 at 16:18 UTC
|
#!/usr/bin/perl
use strict;
use Data::Dumper;
my @advocates = (
{ fld_title => 'Rec1', fld_type => 'National', },
{ fld_title => 'Rec2', fld_type => 'State', },
{ fld_title => 'Rec3', fld_type => 'Local',
},
);
my $order = "\0local\0state\0national\0";
sub by_locale {
index( $order, "\0" . lc( $a->{ fld_type } ) . "\0" )
<=>
index( $order, "\0" . lc( $b->{ fld_type } ) . "\0" )
}
my @sorted_advocates = sort by_locale @advocates;
warn Dumper("unsorted_advocates", \@advocates);
warn Dumper("sorted_advocates", \@sorted_advocates);
| [reply] [d/l] |
|
| [reply] |
Re: Efficiently sorting array by non-alpha strings
by borisz (Canon) on Feb 20, 2004 at 16:05 UTC
|
Use a hash to remap your field.
my %h = ( National => 3, State => 2, Local => 1 );
my @sorted_advocates = sort { $h{$a->{fld_type}} <=> $h{$b->{fld_type}
+} } @advocates;
| [reply] [d/l] |
Re: Efficiently sorting array by non-alpha strings
by Limbic~Region (Chancellor) on Feb 20, 2004 at 16:09 UTC
|
knowmad,
I am assuming you want to sort first by locality type and second by title ASCIIBetically.
#!/usr/bin/perl
use strict;
use warnings;
my @advocates = (
{fld_title => 'Rec1', fld_type => 'National'},
{fld_title => 'Rec2', fld_type => 'State' },
{fld_title => 'Rec3', fld_type => 'Local' }
);
my %order = (LOCAL => 'A', STATE => 'B', NATIONAL => 'C');
my @sorted = map { $_->[0] }
sort { $a->[1] cmp $b->[1]
||
$a->[2] cmp $b->[2] }
map { [ $_ , $order{uc $_->{fld_type}}, $_->{fld_title} ]
+} @advocates;
Cheers - L~R
I am waiting for tye to comment on my persistent use of the ST.
| [reply] [d/l] |
|
L~R,
I had not considered the secondary sort by title. Nice touch!
I added your example into my benchmark tests which now include my original hack, Abigail's bucketsort example and yours (which looks to me like a Schwarztian Transform, no?). Yours takes the cake:
Benchmark: timing 500000 iterations of bucketsort, sort, st...
bucketsort: 1 wallclock secs ( 1.73 usr + 0.00 sys = 1.73 CPU) @ 28
+9017.34/s (n=500000)
sort: 1 wallclock secs ( 0.90 usr + 0.00 sys = 0.90 CPU) @ 55
+5555.56/s (n=500000)
st: 0 wallclock secs ( 0.66 usr + 0.00 sys = 0.66 CPU) @ 75
+7575.76/s (n=500000)
Rate bucketsort sort st
bucketsort 289017/s -- -48% -62%
sort 555556/s 92% -- -27%
st 757576/s 162% 36% --
Thanks,
-Wm | [reply] [d/l] |
|
knowmad,
Actually, you can avoid the ST all together, which should be even faster.
#!/usr/bin/perl
use strict;
use warnings;
my @advocates = (
{fld_title => 'Rec1', fld_type => 'National'},
{fld_title => 'Rec2', fld_type => 'State' },
{fld_title => 'Rec3', fld_type => 'Local' }
);
my %order = (LOCAL => 'A', STATE => 'B', NATIONAL => 'C');
my @sorted = sort { $order{uc $a->{fld_type}} cmp $order{uc $b->{fld_t
+ype}}
||
$a->{fld_title} cmp $b->{fld_title} } @advocates;
print "$_->{fld_title}\n" for @sorted;
Cheers - L~R | [reply] [d/l] |
|
Re: Efficiently sorting array by non-alpha strings
by dreadpiratepeter (Priest) on Feb 20, 2004 at 16:09 UTC
|
This ought to do it. WARNING: barely tested.
#!/usr/bin/perl
use strict;
use Data::Dumper;
my @advocates = (
{
fld_title => 'Rec1',
fld_type => 'National',
},
{
fld_title => 'Rec2',
fld_type => 'State',
},
{
fld_title => 'Rec3',
fld_type => 'Local',
},
);
my %map = (
local => 1,
state => 2,
national => 3
);
my @sorted_advocates =
map {$_->[1]}
sort {$a->[0] <=> $b->[0]}
map {[$map{lc($_->{fld_type})},$_]} @advocates;
warn Dumper("unsorted_advocates", \@advocates);
warn Dumper("sorted_advocates", \@sorted_advocates);
-pete
"Worry is like a rocking chair. It gives you something to do, but it doesn't get you anywhere."
| [reply] [d/l] |
|
|