
numbers OK; Re: sorting comma separated value file

by tye (Sage)
on Aug 22, 2000 at 22:04 UTC (#29070=note)

in reply to sorting comma separated value file

The following code automatically detects numeric fields, and even fields that mix text and numbers, and sorts them properly (one of my favorite tricks). It works by replacing each run of digits in the sort key with its 4-byte pack("N") form, so that a plain cmp orders the numbers numerically. Note that it doesn't handle quoted fields that contain commas, nor data where some lines quote a field and others don't and the quotes are not supposed to affect the sort order. If you have that kind of data, replace the simple split/\s*,\s*/ with a use of the CSV module.

#!/usr/bin/perl -w
use strict;

die "Usage: $0 col[,col[...]] [file[,...]]\n"
    unless @ARGV;
my @cols= map { $_-1 } split/,/,shift;
my @lines= <>;
my @sort= map {
    my $x=join"\0"x5,(split/\s*,\s*/)[@cols];
    $x =~ s/(^|[^\d.])(\d+)/$1.pack("N",$2)/eg;
    $x
} @lines;
print @lines[ sort { $sort[$a] cmp $sort[$b] } 0..$#sort ];
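To see why the pack("N") substitution lets a plain cmp sort numbers correctly, here is a minimal sketch that applies the same regex to a few standalone strings. The helper name sort_key and the sample data are my own assumptions for illustration, not part of the script above. Digits that directly follow a "." are left as text, and numbers larger than 2**32-1 will not fit in the 4 packed bytes.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: replace every run of digits (not preceded by a
# digit or a ".") with its 4-byte big-endian encoding, so that a plain
# string comparison orders the embedded numbers numerically.
sub sort_key {
    my ($text) = @_;
    (my $key = $text) =~ s/(^|[^\d.])(\d+)/$1 . pack("N", $2)/eg;
    return $key;
}

# Made-up sample data mixing text and numbers.
my @names  = ("file10", "file9", "file100", "file2a");
my @sorted = sort { sort_key($a) cmp sort_key($b) } @names;

print "@sorted\n";    # prints: file2a file9 file10 file100
```

A plain sort on the same list would put "file10" and "file100" before "file2a", because "1" compares less than "2" character by character.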

Note that this code deliberately avoids nifty nested map tricks because they tend to slow things down. For example, the code above was more than twice as fast as the following sexier code in my large-file tests:

die "Usage: $0 col[,col[...]] [file[,...]]\n"
    unless @ARGV;
my @cols= map { $_-1 } split/,/,shift;
print map {
    $_->[1]
} sort {
    $a->[0] cmp $b->[0]
} map {
    my $x=join"\0\0\0\0",(split/\s*,\s*/)[@cols];
    $x =~ s/(^|[^\d.])(\d+)/$1.pack("N",$2)/eg;
    [$x,$_]
} <>;
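For comparing the two shapes side by side, here is a minimal sketch that runs both on a small in-memory list and shows that they produce the same order. The names make_key, index_sort and st_sort, the chomp, and the sample lines are assumptions made for illustration. It does not measure speed, so it says nothing about the timing difference described above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: build the comparison key for the given 0-based
# columns, packing digit runs so that cmp orders them numerically.
sub make_key {
    my ($line, @cols) = @_;
    chomp(my $copy = $line);
    my $key = join "\0" x 5, (split /\s*,\s*/, $copy)[@cols];
    $key =~ s/(^|[^\d.])(\d+)/$1 . pack("N", $2)/eg;
    return $key;
}

# Index sort: keep the keys in a parallel array and sort the indices.
sub index_sort {
    my ($cols, @lines) = @_;
    my @keys = map { make_key($_, @$cols) } @lines;
    return @lines[ sort { $keys[$a] cmp $keys[$b] } 0 .. $#keys ];
}

# Nested maps: pair each line with its key, sort the pairs, unwrap.
sub st_sort {
    my ($cols, @lines) = @_;
    return map  { $_->[1] }
           sort { $a->[0] cmp $b->[0] }
           map  { [ make_key($_, @$cols), $_ ] } @lines;
}

# Made-up sample data, sorted on the second column (index 1).
my @lines = ("pear,10,x\n", "apple,9,y\n", "fig,100,z\n");
print index_sort([1], @lines);
print st_sort([1], @lines);
```

Both calls print the apple line first, then pear, then fig, since 9 < 10 < 100 once the digits are packed.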

P.S. The reason that this nested-map version is slow is not that I lack tilly's illustrious patch (just to counter tilly's down-playing of how neat his patch is). Those are all 1-to-1 maps. (:

P.P.S. I think that this is a Schwartzian Transform, but I wasn't sure I'd done it right and didn't want to mislabel it. :) Update: While I was typing, an example of a Schwartzian Transform was posted just above and, other than mixing 1 and 0, I did write one.

        - tye (my smileys are ambidextrous!)
