PerlMonks  

Re: Sort questions (use a GRT)

by grinder (Bishop)
on Jan 28, 2005 at 22:03 UTC ( [id://426137]=note )


in reply to Sort questions

This is not too shabby to start with; you're on the right track. To sort by the day of the week, use a hash as a lookup to retrieve the day value:

    my %dow = (qw( Sunday 0 Monday 1 Tuesday 2 Wednesday 3 Thursday 4 Friday 5 Saturday 6 ));

    # ...
    $dow{$a[7]} <=> $dow{$b[7]}

You might want to make Sunday => 7 if you want it to sort higher, to put the weekend at the end. I'm using a qw() quote-word construct, which returns a list and assigns that to a hash. Just a bit less typing.
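As a minimal sketch of the lookup in a complete sort, assuming the records are comma-separated strings with the day name in field 7 (zero-based): the sample records and the `day_of` helper are hypothetical, made up for illustration, and this version uses the Sunday => 7 variant so the weekend sorts last.

```perl
use strict;
use warnings;

# Day name => sort rank. Sunday is 7 here so the weekend sorts last.
my %dow = qw( Monday 1 Tuesday 2 Wednesday 3 Thursday 4 Friday 5 Saturday 6 Sunday 7 );

# Hypothetical records: comma-separated, day name in field 7 (zero-based).
my @in = (
    'a,x,x,x,x,x,x,Sunday,10:00:00',
    'b,x,x,x,x,x,x,Tuesday,09:30:00',
    'c,x,x,x,x,x,x,Monday,23:15:00',
);

# Hypothetical helper: pull the day name out of one record.
sub day_of { (split /,/, $_[0])[7] }

# Compare the numeric ranks, not the day names themselves.
my @by_day = sort { $dow{ day_of($a) } <=> $dow{ day_of($b) } } @in;

print "$_\n" for @by_day;
```

Note that this splits each record on every comparison, which is the cost the GRT avoids.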

For the timestamps, if you can guarantee that they will always follow a ddd:dd:dd format, you can just sort them alphabetically as you are already doing. Fixed-width, zero-padded fields compare correctly as plain strings.
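A quick illustration of why the fixed-width guarantee matters. The timestamp values here are made up, and I'm assuming zero-padded colon-separated fields:

```perl
use strict;
use warnings;

# Fixed-width, zero-padded timestamps: string order matches time order.
my @padded = sort qw( 10:05:00 09:30:00 09:04:59 );
print "padded:   @padded\n";

# Without zero padding the string comparison goes wrong,
# because the character "1" sorts before the character "9".
my @unpadded = sort qw( 10:05:00 9:30:00 );
print "unpadded: @unpadded\n";
```

If the padding can't be guaranteed, the timestamps would need to be normalised (with sprintf, for instance) before being used as a sort key.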

To combine them all into an efficient sort statement, use a Guttman-Rosler Transform (GRT)*. The idea is to build a prefix to the string so that you can use a bare sort operation, and then throw away the prefix after the sort and recover the original record.

    my @out =
        map { (split /%/, $_, 3)[2] }
        sort
        map {
            my @field = split /,/;
            "$dow{$field[7]}%$field[8]%$_"
        } @in;

Since the first field of the record is also part of the sort criteria, you don't actually have to make it part of the prefix: the whole record follows the prefix, so its first field breaks any remaining ties. So you only have to add on a munged day of week and the timestamp to have it sort correctly. Then, after the sort, split it up on % and throw the first two pieces away to recover your record.
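Here is the whole thing as a runnable sketch, read from the bottom up. The sample records are hypothetical; the only assumptions are the ones already made above (comma-separated records, day name in field 7, timestamp in field 8). The split limit of 3 is there so that a % inside the record itself would not truncate it.

```perl
use strict;
use warnings;

my %dow = qw( Sunday 0 Monday 1 Tuesday 2 Wednesday 3 Thursday 4 Friday 5 Saturday 6 );

# Hypothetical records: day name in field 7, timestamp in field 8.
my @in = (
    'b,x,x,x,x,x,x,Tuesday,09:30:00',
    'a,x,x,x,x,x,x,Tuesday,09:30:00',
    'c,x,x,x,x,x,x,Monday,23:15:00',
    'd,x,x,x,x,x,x,Tuesday,08:00:00',
);

my @out =
    map { (split /%/, $_, 3)[2] }    # 3. strip the two-part prefix, keep the record
    sort                             # 2. plain string sort, no comparison block
    map {                            # 1. prepend "daynumber%timestamp%" to each record
        my @field = split /,/;
        "$dow{$field[7]}%$field[8]%$_";
    } @in;

print "$_\n" for @out;
```

Records a and b share a day and a timestamp, so they end up ordered by the record text itself, which is the tie-break on the first field described above. The single-digit day numbers are what make the bare string sort safe; a two-digit key would need zero padding.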

If the array can be really large, say, over 200_000 entries, then you should consider preparing data files so that the operating system sort command can be used. Otherwise I'd just stick with Perl.

* see Advanced Sorting - GRT - Guttman Rosler Transform for a discussion on the GRT.

- another intruder with the mooring in the heart of the Perl
