Beefy Boxes and Bandwidth Generously Provided by pair Networks
Perl: the Markov chain saw
 
PerlMonks  

how does these map statements work

by martin_100 (Initiate)
on Aug 06, 2014 at 08:06 UTC ( [id://1096398]=perlquestion: print w/replies, xml ) Need Help??

martin_100 has asked for the wisdom of the Perl Monks concerning the following question:

The following code was an answer posted to a query a long time back on how to sort a text file that had 5 columns of numbers and the user wanted to sort based on 2nd column. I don't understand the solution and was hoping someone could explain it.

use strict; use File::Slurp qw(read_file); print map { $_->[1] } sort { $a->[0] <=> $b->[0] } map { [(split /\s+/)[1], $_] } read_file( $ARGV[0] );

Replies are listed 'Best First'.
Re: how does these map statements work
by frozenwithjoy (Priest) on Aug 06, 2014 at 08:56 UTC

    Edit to include code from martin_100's question on sorting lines based on column 2 of a space-delimited file:

    print map { $_->[1] } sort { $a->[0] <=> $b->[0] } map { [(split /\s+/)[1], $_] } read_file( $ARGV[0] );

    Let's read the code in reverse order.

    Read the file
    read_file( $ARGV[0] );

    Start by reading in the file line by line.

    Convert each line into an anonymous array: [ 'column 2 text', 'full line of text']
    map { [(split /\s+/)[1], $_] }

    The lines are passed (as $_) to the 'first' map. $_ is implicitly passed to split /\s+/, which splits the line on spaces. (Note: the , $_ has nothing to do with the split!) Since it is wrapped in (...)[1], the split output is treated like a list in which the 2nd element is returned. This, itself, is inside [ ..., $_], which gives you an anonymous array containing the second element from the split ((split /\s+/)[1]) and the entire original line ($_).

    Sort arrays based on first element of array (e.g., 'column 2 text')
    sort { $a->[0] <=> $b->[0] }

    These anonymous arrays are all passed along to the sort. Sort orders the arrays based on the first element of each anonymous array, hence the ->[0].

    Pass second elements of sorted arrays (e.g., 'full line of text') to print statement
    map { $_->[1] }

    Now the sorted anonymous arrays containing both the second column of text and the entire line of text are passed to the 'second' map. This map passes only the second element of each array (the full original line) to the print statement, thereby printing the lines sorted based on the second column!

Re: how does these map statements work
by AppleFritter (Vicar) on Aug 06, 2014 at 09:40 UTC

    This is a standard technique called the Schwartzian Transform (after Randal L. Schwartz, aka merlyn, though he wasn't the one who named it). The idea is that you augment your data with additional information (that's the innermost map), then use that extra info to sort (the sort call, naturally), and finally discard it again (the outermost map), leaving only the original data, in the right order.

    The reason why this is done is that often, you want to sort based on the result of some potentially expensive operation. Using the Schwartzian transform, you only have to perform this operation once per item, rather than twice per call to your comparison function. Since Perl's sort (mergesort, if you're curious) is O(n log(n)), and this is optimal for sorting algorithms in the general case, the Schwartzian transform is almost always a good idea if you need to process list elements in some fashion before sorting -- and the performance gain will be larger the more expensive processing an item is.

    Here's merlyn's classic column on the technique.

Re: how does these map statements work
by AnomalousMonk (Archbishop) on Aug 06, 2014 at 10:25 UTC
Re: how does these map statements work
by Anonymous Monk on Aug 06, 2014 at 08:36 UTC

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://1096398]
Front-paged by Corion
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others making s'mores by the fire in the courtyard of the Monastery: (6)
As of 2024-04-19 04:00 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found