Re: Efficiently sorting array by non-alpha strings

Your bucketsort is the elegant solution that I was seeking. Interesting though it's about twice as slow as my ugly version as indicated by this benchmark:

That's because you didn't fix my typos. Here's a different benchmark, with some of the other suggestions (I didn't include any that went beyond sorting on 'fld_type'):

#!/usr/bin/perl

use strict;
use warnings;

use Benchmark qw /timethese cmpthese/;
    
our @advocates = map {{fld_title => 'Rec' . $_,
                       fld_type  => [qw /National State Local/] -> [ra
+nd 3]}}
                     1 .. shift || 1000;
    
sub by_locale {
    my $a_type;
    my $b_type;
    $a_type = 1 if ($a->{'fld_type'} =~ /local/i);
    $a_type = 2 if ($a->{'fld_type'} =~ /state/i);
    $a_type = 3 if ($a->{'fld_type'} =~ /national/i);
    $b_type = 1 if ($b->{'fld_type'} =~ /local/i);
    $b_type = 2 if ($b->{'fld_type'} =~ /state/i);
    $b_type = 3 if ($b->{'fld_type'} =~ /national/i);
    $a_type <=> $b_type;
}

my $order = "\0local\0state\0national\0";
sub by_locale_b {
    index ($order, "\0" . lc ($a -> {fld_type}) . "\0") <=>
    index ($order, "\0" . lc ($b -> {fld_type}) . "\0")
}
                       
our (@knowmad, @abigail, @borisz, @pirate);
    
cmpthese  -1  => {
    knowmad   => '@knowmad = sort by_locale @advocates',
    abigail   => 'my %bucket;
                  map {push @{$bucket {$_ -> {fld_type}}} => $_} @advo
+cates;
                  @abigail = map {@{$bucket {$_}}} qw /Local State Nat
+ional/;',
    borisz    => 'my %h = (National => 3, State => 2, Local => 1);
                  @borisz = sort {$h {$a -> {fld_type}} <=>
                                  $h {$b -> {fld_type}}} @advocates;',
    pirate    => '@pirate = map {$_ -> [1]}
                            sort {$a -> [0] <=> $b -> [0]}
                            map  {[lc ($_ -> {fld_type}), $_]} @advoca
+tes',
    browseruk => '@browseruk = sort by_locale_b @advocates',
};

    
__END__
            Rate   knowmad browseruk    borisz    pirate   abigail
knowmad   39.4/s        --      -48%      -70%      -83%      -92%
browseruk 75.5/s       91%        --      -42%      -67%      -85%
borisz     130/s      229%       72%        --      -43%      -74%
pirate     229/s      480%      203%       76%        --      -53%
abigail    491/s     1147%      551%      279%      115%        --
[download]

Abigail

Comment on Re: Efficiently sorting array by non-alpha strings Download Code

Replies are listed 'Best First'.
Re: Re: Efficiently sorting array by non-alpha strings by borisz (Canon) on Feb 20, 2004 at 16:48 UTC
But the Hash (%h) should only setup once in my solution, at the top of a script or so. That gives: `Rate knowmad browseruk pirate abigail borisz knowmad 22.1/s -- -49% -80% -93% -95% browseruk 43.4/s 96% -- -61% -86% -89% pirate 111/s 403% 156% -- -64% -73% abigail 308/s 1291% 609% 177% -- -25% borisz 411/s 1758% 847% 270% 34%` [download] -- Boris	[reply] [d/l]
Re: Efficiently sorting array by non-alpha strings by Abigail-II (Bishop) on Feb 20, 2004 at 17:02 UTC
Highly suspicious results. You didn't publish source, but I have a hunch your code has `%h` lexical to the main program. This will make the sort a linear pass (no sorting happens). Try again with `our %h`. Abigail	[reply]
Re: Re: Efficiently sorting array by non-alpha strings by borisz (Canon) on Feb 20, 2004 at 17:11 UTC
You are right, very impressive abigail-ii found my bug without any source. Here the corrected results: `Rate knowmad browseruk borisz pirate abigail knowmad 22.1/s -- -49% -72% -80% -93% browseruk 43.0/s 94% -- -45% -62% -86% borisz 78.5/s 255% 83% -- -30% -74% pirate 112/s 407% 161% 43% -- -64% abigail 308/s 1291% 616% 292% 174% --` [download] Boris	[reply] [d/l]
Re: Re: Efficiently sorting array by non-alpha strings by knowmad (Monk) on Feb 20, 2004 at 17:04 UTC
I did fix a few typos but my benchmark isn't nearly as sophisticated as yours. I reran the tests using your code and am getting similar results, albeit my rate is much slower :(. Following BorisZ's comment, I placed the hash setup outside the cmpthese and indeed it is coming out as the speed demon. Interesting, I tried doing the same with the `my %bucket` line in your code and the performance decreased dramatically. I guess that using local variables provides better performance than globals? Thanks again for the insights! -Wm	[reply] [d/l]
Re: Efficiently sorting array by non-alpha strings by Abigail-II (Bishop) on Feb 20, 2004 at 17:12 UTC
Referring to lexical variables from within a Benchmark is the mistake made most often when benchmarking. And a very large percentage of the benchmarks I've seen in the past years make this mistake. It has prompted me to give a talk about this (and other mistakes) at YAPC::NA last year. Moving `my %h` from BorisZ code to the main program is wrong. Moving `my %bucket` from my code to the main program is wrong (although for a different reason!). Abigail	[reply]


There's more than one way to do things
	PerlMonks