Re: Sorting an array of arrays by field

I love this site and have learned more about PERL here, than any other website

Not much, I guess ;^). Please, it's "Perl" (mind the capitalization).

@look = <INF>;

When posting code, it's better to read from <DATA>. That way, the data goes with the program and you don't have to provide it separately.

@out = sort { (split '|', $a, 12)[4] <=> (split '|', $b, 12)[4] } @look;

You need to escape the pipe ('\|'), because split treats its first argument as a regex.
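
A quick way to see what goes wrong without the backslash is to count what split returns. This is only a sketch: the record is a shortened line from your data and the variable names are mine.

```perl
use strict;
use warnings;

# Shortened sample record; the variable names are only for illustration.
my $record = 'accred|143|0|0|412';

# Bare pipe: an alternation of two empty patterns, so it matches
# between every character and you get the string back in pieces.
my @chars = split '|', $record;

# Escaped pipe: a literal delimiter.
my @fields = split /\|/, $record;

printf "unescaped: %d pieces\n", scalar @chars;
printf "escaped:   %d fields, fifth is %s\n", scalar @fields, $fields[4];
```

With the bare pipe, [4] picks out a single character ("e" in this record) rather than a field, so the sort is comparing the wrong thing.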

@fields = @$line;

$line is not an arrayref. The split you did only extracted the fifth field of the string (index 4) for sorting, nothing more. @out is a regular array of strings, like @look.
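
If you really do want an array of arrays, one way to get there is to split every line into an arrayref first and sort on the element. This is a sketch of that alternative, not what your code does now; @rows and @sorted are names I made up, and the three records stand in for whatever is in @look.

```perl
use strict;
use warnings;

# Three sample records standing in for the contents of @look.
my @look = (
    "accred|143|0|0|412|0|0|0|0|0|0|0|0|\n",
    "accu-b|36|0|0|103|0|38|0|0|0|0|2|0|\n",
    "accua|35|0|0|27|0|37|0|0|0|0|1|0|\n",
);

# Turn each line into a reference to an array of its fields.
my @rows = map { chomp( my $copy = $_ ); [ split /\|/, $copy ] } @look;

# Sort the arrayrefs numerically on the element at index 4.
my @sorted = sort { $a->[4] <=> $b->[4] } @rows;

foreach my $line (@sorted) {
    my @fields = @$line;    # $line is an arrayref here, so this works
    print "$fields[0]: $fields[4]\n";
}
```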

Applying these comments, and adding strict and warnings, your code becomes:

use strict;
use warnings;

my @look = <DATA>;
my @out = sort { (split '\|', $a, 12)[4] <=> (split '\|', $b, 12)[4] } @look;
foreach my $line (@out) {
    print $line;
}
__DATA__
accred|143|0|0|412|0|0|0|0|0|0|0|0|
accu-b|36|0|0|103|0|38|0|0|0|0|2|0|
accua|35|0|0|27|0|37|0|0|0|0|1|0|
[...]

If you want to access the fields individually, you have to perform another split inside the foreach loop.
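
Something along these lines, for instance. It is only a sketch: the two lines stand in for an already sorted @out, and @report exists just to make the result easy to look at.

```perl
use strict;
use warnings;

# Already-sorted sample lines standing in for @out.
my @out = (
    "accua|35|0|0|27|0|37|0|0|0|0|1|0|\n",
    "accu-b|36|0|0|103|0|38|0|0|0|0|2|0|\n",
);

my @report;    # collected only so the result is easy to inspect
foreach my $line (@out) {
    chomp( my $record = $line );         # work on a copy, leave @out alone
    my @fields = split /\|/, $record;    # the second split, one per line
    push @report, "$fields[0] has $fields[4] in the fifth field";
}
print "$_\n" for @report;
```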

Going further, with a Guttman-Rosler transform it could be:

## tested
my @out =
    map { substr $_, 6 }   ## hardcoded "6" here...
    sort
    map { my $val = (split /\|/)[4]; sprintf "%06d$_",$val }  ## ... and here
    <DATA>;
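
In case the transform looks cryptic, here it is spelled out in three steps. Take it as a sketch of the same idea rather than a drop-in replacement: $WIDTH, @packed and @ordered are my names, it assumes the key is a non-negative integer of at most six digits, and it passes the record through %s so that a stray % in the data cannot be read as a format directive.

```perl
use strict;
use warnings;

my $WIDTH = 6;    # assumed maximum number of digits in the key

my @lines = (
    "accred|143|0|0|412|0|0|0|0|0|0|0|0|\n",
    "accu-b|36|0|0|103|0|38|0|0|0|0|2|0|\n",
    "accua|35|0|0|27|0|37|0|0|0|0|1|0|\n",
);

# 1. Prefix each record with its zero-padded key.
my @packed = map { sprintf "%0${WIDTH}d%s", (split /\|/)[4], $_ } @lines;

# 2. A plain string sort now orders by the numeric key,
#    with no sort block to call for every comparison.
my @ordered = sort @packed;

# 3. Strip the key prefix to get the original records back.
my @out = map { substr $_, $WIDTH } @ordered;

print @out;
```

The zero padding is what makes the string comparison agree with the numeric one, which is why the width has to be large enough for the biggest key.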

my @d = <DATA>; my @e; my @key = map { (split /\|/)[4] } @d;
use Benchmark qw/cmpthese/;
cmpthese (4e4, {
    grt    => sub { @e = map { substr $_, 6 } sort map { my $val = (split /\|/)[4]; sprintf "%06d$_",$val} @d; },
    raw    => sub { @e = sort { (split '\|', $a, 12)[4] <=> (split '\|', $b, 12)[4] } @d; },
    ambrus => sub { @e = @d[ sort { $key[$a] <=> $key[$b] } 0 .. @d - 1 ]; },
});

          Rate    raw    grt ambrus
raw     2742/s     --   -58%   -92%
grt     6547/s   139%     --   -80%
ambrus 33058/s  1106%   405%     --
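
For anyone puzzled by the ambrus entry: it sorts indices instead of strings. The keys are extracted once into @key, the positions of @d are sorted by looking up those keys, and the sorted positions are used to slice @d. Spelled out, with sample data and the name @order being mine:

```perl
use strict;
use warnings;

my @d = (
    "accred|143|0|0|412|0|0|0|0|0|0|0|0|\n",
    "accu-b|36|0|0|103|0|38|0|0|0|0|2|0|\n",
    "accua|35|0|0|27|0|37|0|0|0|0|1|0|\n",
);

# One split per record, done once up front.
my @key = map { (split /\|/)[4] } @d;

# Sort the positions 0 .. $#d by the precomputed keys.
my @order = sort { $key[$a] <=> $key[$b] } 0 .. $#d;

# An array slice pulls the records out in sorted order.
my @e = @d[@order];

print @e;
```

Note that in the benchmark the @key line sits outside the timed sub, so the cost of the splits is not counted for this entry.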

Update: Added benchmark. || Included ambrus' solution.

--
David Serrano

Re^2: Sorting an array of arrays by field by salva (Canon) on Oct 16, 2006 at 12:29 UTC
Your benchmark is unfair as the key extraction for ambrus solution is not being measured. Anyway, using Sort::Key is, as usual, the fastest solution!

use Sort::Key 'ikeysort';
my @d = <DATA>; my @e;
use Benchmark qw/cmpthese/;
cmpthese (4e4, {
    grt    => sub { @e = map { substr $_, 6 } sort map { my $val = (split /\|/)[4]; sprintf "%06d$_",$val} @d; },
    raw    => sub { @e = sort { (split '\|', $a, 12)[4] <=> (split '\|', $b, 12)[4] } @d; },
    ambrus => sub { my @key = map { (split /\|/)[4] } @d;
                    @e = @d[ sort { $key[$a] <=> $key[$b] } 0 .. @d - 1 ]; },
    sk     => sub { @e = ikeysort { (split '\|', $_, 12)[4] } @d }
});

on my computer says...

          Rate    raw    grt ambrus     sk
raw     1887/s     --   -63%   -69%   -74%
grt     5141/s   172%     --   -15%   -29%
ambrus  6061/s   221%    18%     --   -16%
sk      7246/s   284%    41%    20%     --
Re^2: Sorting an array of arrays by field by Anonymous Monk on Oct 15, 2006 at 02:54 UTC
Thanks for the help, David.

