Beefy Boxes and Bandwidth Generously Provided by pair Networks
Perl: the Markov chain saw
 
PerlMonks  

Twisted sort requirements

by forrest (Beadle)
on Feb 01, 2006 at 06:56 UTC ( [id://526984]=perlquestion: print w/replies, xml ) Need Help??

forrest has asked for the wisdom of the Perl Monks concerning the following question:

Suppose I have the following list of things to sort:
@things_to_sort = ( { author => 'bart', title => 'skateboarding' }, { author => 'lisa', title => 'postmodernism' }, { author => 'marge', title => 'hairstyles' }, { author => 'lisa', title => 'THIS BOOK FIRST' }, { author => 'homer', title => 'donuts' }, { author => 'bart', title => 'coolness' } );
My sort requirements are: If the title is 'THIS BOOK FIRST', that needs to go first, followed by any other books by the same author alphabetically by title. All other books follow alphabetically by author, then title. Is it possible to write a sort routine so that
@sorted_things = sort by_twisted_requirements @things_to_sort
will do what I need? Or do I need to break it up into @things_at_the_top and @the_rest, sort those and then smack the list back together? I think I can do that, but it's hella ugly. Can anyone suggest an elegant solution to this problem?

Replies are listed 'Best First'.
Re: Twisted sort requirements
by BrowserUk (Patriarch) on Feb 01, 2006 at 07:42 UTC

    You are going to need a pre-pass of the data to find the author of the titled book. I've lower cased the titles for the sort so that "the book of ... " and "The book of..." will sort together, but that is easily removed. This assumes that your authors names and book titles don't contain null chars.

    #! perl -slw use strict; my @things_to_sort = ( { author => 'bart', title => 'skateboarding' }, { author => 'lisa', title => 'postmodernism' }, { author => 'marge', title => 'hairstyles' }, { author => 'lisa', title => 'THIS BOOK FIRST' }, { author => 'homer', title => 'donuts' }, { author => 'bart', title => 'coolness' } ); my $firstTitle = 'THIS BOOK FIRST'; ## Find the author of the title my $firstAuthor; $_->{ title } eq $firstTitle and $firstAuthor = $_->{ author } for @things_to_sort; die "Title $firstTitle not found" unless $firstAuthor; my @sorted = map{ $_->[0] } sort { $a->[ 1 ] cmp $b->[ 1 ] } map { [ $_, sprintf "%s%s%s\0%s\0", $_->{ title } eq $firstTitle ? chr(0) : '', $_->{ author } eq $firstAuthor ? chr(0) : '', lc( $_->{ author } ), lc( $_->{ title } ) ] } @things_to_sort; printf "%-8s %-20s %-6s %-10s\n", %$_ for @sorted; __END__ c:\Perl\test>junk title THIS BOOK FIRST author lisa title postmodernism author lisa title coolness author bart title skateboarding author bart title donuts author homer title hairstyles author marge

    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
Re: Twisted sort requirements
by davido (Cardinal) on Feb 01, 2006 at 07:08 UTC

    This should do it:

    use strict; use warnings; use Data::Dumper; my @things_to_sort = ( { author => 'bart', title => 'skateboarding' }, { author => 'lisa', title => 'postmodernism' }, { author => 'marge', title => 'hairstyles' }, { author => 'lisa', title => 'THIS BOOK FIRST' }, { author => 'homer', title => 'donuts' }, { author => 'bart', title => 'coolness' } ); my @sorted = sort { $a->{title} eq "THIS BOOK FIRST" && -1 or $b->{title} eq "THIS BOOK FIRST" && 1 or $a->{author} cmp $b->{author} or $a->{title} cmp $b->{title} } @things_to_sort; print Dumper \@sorted;

    This uses logical short-circuiting to "fall through" until one of the comparisons produces a result. First, $a is tested to see if it has the key phrase. If not, then $b is tested. If not, then $a and $b are compared to see if there is inequality in author names. If so, great, but if not, fall through to the title comparison.


    Dave

      That's the kind of thing I'm looking for, but you've missed the part of the requirement that after I get "THIS BOOK FIRST" I need all books by the same author to follow immediately after that (and finally the rest).

      I guess that might not be possible with a simple sort call.

        It is definately possible.

        graff kindly /msg'ed me to let me know I missed a specification. Update: I have now re-written my original method in such a way that the author is stored for future comparisons if title is "THIS BOOK FIRST". I then rewrote the logic into basic 'if' conditionals so that I could visualize the flow a little better. But I guess I've stared at it too long, and am simply missing something that someone's going to spot quickly. Unfortunately, that "something" is eluding me. Here is my code so far, that doesn't result in the correct sort order. I hope someone spots my flaw because it's going to keep me awake tonight.

        use strict; use warnings; use Data::Dumper; my @things_to_sort = ( { author => 'bart', title => 'skateboarding' }, { author => 'lisa', title => 'postmodernism' }, { author => 'marge', title => 'hairstyles' }, { author => 'lisa', title => 'THIS BOOK FIRST' }, { author => 'homer', title => 'donuts' }, { author => 'bart', title => 'coolness' } ); my @sorted; my $author = ''; @sorted = sort { if ( $a->{title} eq "THIS BOOK FIRST" ) { $author = $a->{author}; return -1; } elsif ( $b->{title} eq "THIS BOOK FIRST" ) { $author = $b->{author}; return 1; } elsif ( $a->{author} eq $author and $b->{author} eq $author ) { return( $a->{title} cmp $b->{title} ); } elsif ( $a->{author} eq $author ) { return -1; } elsif ( $b->{author} eq $author ) { return 1; } else { return ( $a->{author} cmp $b->{author} or $a->{title} cmp $b->{title} ) } } @things_to_sort; print Dumper \@sorted;

        I know it's really ugly. The "if" statements tend to do that, but I converted over to "if"s to help in visualizing the flow. ...obviously it didn't help me see the problem. I can't wait to hear if anyone else can spot it.


        Dave

Re: Twisted sort requirements
by GrandFather (Saint) on Feb 01, 2006 at 07:49 UTC

    With a little work before hand sort is the tool to use:

    use warnings; use strict; my @things_to_sort = ( { author => 'bart', title => 'skateboarding' }, { author => 'lisa', title => 'postmodernism' }, { author => 'marge', title => 'hairstyles' }, { author => 'lisa', title => 'THIS BOOK FIRST' }, { author => 'homer', title => 'donuts' }, { author => 'bart', title => 'coolness' } ); my @authors = grep {$_->{title} eq 'THIS BOOK FIRST'} @things_to_sort; print "$_->{author}: $_->{title}\n" for sort TwistedSort @things_to_s +ort; sub TwistedSort { if ($a->{author} eq $b->{author}) {# Same author, just check title return -1 if $a->{title} eq $authors[0]->{title}; return 1 if $b->{title} eq $authors[0]->{title}; return $a->{title} cmp $b->{title}; } return -1 if $a->{author} eq $authors[0]->{author}; return 1 if $b->{author} eq $authors[0]->{author}; return $a->{author} cmp $b->{author}; }

    Prints:

    lisa: THIS BOOK FIRST lisa: postmodernism bart: coolness bart: skateboarding homer: donuts marge: hairstyles

    DWIM is Perl's answer to Gödel
Re: Twisted sort requirements
by graff (Chancellor) on Feb 01, 2006 at 07:37 UTC
    Working from the first reply, which was a good start, I'm not sure this would qualify as "elegant", but it does what you want:
    use strict; use warnings; use Data::Dumper; my @things_to_sort = ( { author => 'bart', title => 'skateboarding' }, { author => 'lisa', title => 'postmodernism' }, { author => 'marge', title => 'hairstyles' }, { author => 'lisa', title => 'THIS BOOK FIRST' }, { author => 'homer', title => 'donuts' }, { author => 'bart', title => 'coolness' } ); my ($first) = grep { $_->{title} eq 'THIS BOOK FIRST' } @things_to_sor +t; my @sorted = sort { $a->{title} cmp $b->{title} } grep { $_->{author} eq $first->{author} } @things_to_sort; push @sorted, sort { $a->{author} cmp $b->{author} or $a->{title} cmp $b->{title +} } grep { $_->{author} ne $first->{author} } @things_to_sort; print Dumper \@sorted;
    I couldn't figure out a way to do it all in a single sort block -- especially when it comes to figuring out which author should be listed first. That just seemed to need an extra step at the outset. Maybe there's a way to use just one sort block after grepping for "THIS BOOK FIRST", but I haven't found it...

    update: my first posting had a much more complicated block for the first sort -- which was a failed attempt to do everything in one sort -- but I simplified it to just the appropriate logic.

Re: Twisted sort requirements
by jonadab (Parson) on Feb 01, 2006 at 13:29 UTC

    You can get most of what you need with a Schwartzian Transform, but there's one little wrinkle that complicates things: during the first map, when you run across an arbitrary title (say, 'postmodernism' by 'lisa'), you don't yet know whether the same author also has a 'THIS BOOK FIRST' title. So you need an extra pass, and an orcish manoeuver, to tell you that.

    (If terms like 'Schwartzian Transform' and 'orcish manoeuver' are new for you, you should first look those up (Google is good here) and try to understand them before trying to use the code below.)

    my %orcish; my @sorted = map { # Final pass: restore the original datum, dropping # the extra stuff we added in the second pass: $$_[4] } sort { # Books whose author has a 'THIS BOOK FIRST' title are first: $$a[0] <=> $$b[0] # Within that, sort by author: or $$a[1] <=> $$b[1] # Within that, sort by whether it's THIS BOOK FIRST or not: or $$a[2] <=> $$b[2] # Within that, sort by the title, alphabetically: or $$a[3] <=> $$b[3] } map { # Second pass: gather enough info that sort can sort like we want: [$orcish{$$_{author}}, $$_{author}, ($$_{title} eq 'THIS BOOK FIRST'), $$_{title}, $_] } map { # First pass: take note of which authors have 'THIS BOOK FIRST': $orcish{$$_{author}}++ if $$_{title} eq 'THIS BOOK FIRST'; # But just pass through the data unmodified for now: $_ } @things_to_sort;

    Note that the way I've written this, if an author has _two_ books with the magic title, that author's works will be listed before the ones who only have _one_ such title. If this is undesireable, change the ++ to =1 in the first pass.

    Also note that with this structure in place it's easy to now make further refinements to the sort order, by making slight modifications to the second pass and/or the sort. For instance, if I wanted the words "A", "An", and "The" on the beginnings of titles to be ignored for sorting purposes, then in the second pass, instead of using $$_{title} directly, I would use title_for_sort_purposes($$_{title}) and then write that subroutine to strip initial articles from its input and return what is left. Case unification could also be performed, whitespace consolidation, and whatever other forms of normalization you desire. The author's name, too, can be handled in this way, and in fact could be parsed and the last name split out (if your author names aren't already in "last, first" form), and so forth. The original data are retained as the last element in each anonymous array, so you can go nuts making adjustments to the other elements to tweak how they sort, without changing how they appear once sorted.

Re: Twisted sort requirements
by salva (Canon) on Feb 01, 2006 at 14:06 UTC
    other monks have already suggested ways to do the "twisted sort", but have you considered just changing your data structures on the first place?

    You are storing in the same array two kinds of unrelated information: a list of books and author ordering information. Using a magic marker is a bad idea almost every time.

    In that case, the obvious data structures to use are an array to store book descriptions and a hash to store author ordering information.

    If you are getting the information from a DB, the information should already be structured that way there. You should even be able to get the list of books ordered from the database!

    If you are reading the information from a file (or from the network) you can build those two structures on the fly on the parsing phase with almost identical effort and then other parts of your code will become much cleaner.

Re: Twisted sort requirements
by injunjoel (Priest) on Feb 01, 2006 at 17:58 UTC
    Greetings all,
    A little late I see but here is what I would do...
    #!/usr/bin/perl -w use strict; my @things_to_sort = ( { author => 'bart', title => 'skateboarding' }, { author => 'lisa', title => 'postmodernism' }, { author => 'marge', title => 'hairstyles' }, { author => 'lisa', title => 'THIS BOOK FIRST' }, { author => 'homer', title => 'donuts' }, { author => 'bart', title => 'coolness' } ); #this begs for a new structure. #lets use a hash. my %authors_books; #turn your AoH's into a HoA's #see perldsc for reference foreach (@things_to_sort){ push @{$authors_books{$_->{author}}}, $_->{title}; } #sort the books for each author foreach my $author (keys %authors_books){ @{$authors_books{$author}} = sort{ $a eq "THIS BOOK FIRST" && -1 or $b eq "THIS BOOK FIRST" && 1 or $a cmp $b }@{$authors_books{$author}}; } #figure out who to display first. my @display_keys = sort{ (grep /THIS BOOK FIRST/, @{$authors_books{$a}}) && -1 or (grep /THIS BOOK FIRST/, @{$authors_books{$b}}) && 1 or $a cmp $b } keys %authors_books; #display the information. foreach my $display_key (@display_keys){ print "$display_key books:\n"; foreach (@{$authors_books{$display_key}}){ print "\t$_\n"; } } exit;
    The Output
    lisa books: THIS BOOK FIRST postmodernism bart books: coolness skateboarding homer books: donuts marge books: hairstyles


    -InjunJoel

    P.S.
    thanks davido for the sort idea.

    "I do not feel obliged to believe that the same God who endowed us with sense, reason and intellect has intended us to forego their use." -Galileo
Re: Twisted sort requirements
by NetWallah (Canon) on Feb 02, 2006 at 01:28 UTC
    Yes - this is VERY late to this party, but, this code should be relevant for reference.

    I liked injunjoel's streightforward code, but I wanted to simplify it, and make special cases stand out. I also dislike handling special cases inside sort routines, because these are called more often than the number of items to be sorted. I also believe I have minimized the number of loops.

    The only part I am not happy with is the "Special Marker" that is passed into the sort. That is a compromise made in order to simplify the code:

    #!/usr/bin/perl -w use strict; my @things_to_sort = ( { author => 'bart', title => 'skateboarding' }, { author => 'lisa', title => 'postmodernism' }, { author => 'marge', title => 'hairstyles' }, { author => 'lisa', title => 'THIS BOOK FIRST' }, { author => 'homer', title => 'donuts' }, { author => 'lisa', title => 'Eating Veggies' }, { author => 'bart', title => 'coolness' } ); #this begs for a new structure. #lets use a hash. my (%authors_books, $FirstAuth); my $FirstTitle = "THIS BOOK FIRST"; my $Special_Low_Sort_Marker = "0x00" x 100; # A hundred nulls #turn your AoH's into a HoA's #Find first Author #Mark Specially-named books #see perldsc for reference foreach (@things_to_sort){ if ( $_->{title} eq $FirstTitle){ $FirstAuth = $_->{author} ; push @{$authors_books{ $_->{author} }}, $Special_Low_Sort_Marker . $_->{title}; # Special Marker - + Push to top of SORT }else{ push @{$authors_books{ $_->{author} }}, $_->{title}; } } #display the information. if ($FirstAuth){ PrintAuthor($FirstAuth); delete $authors_books{$FirstAuth}; } foreach my $author ( sort keys %authors_books){ PrintAuthor($author); } ######################## sub PrintAuthor{ my $author = shift; print "$author books:\n"; foreach (sort @{$authors_books{$author}}){ s/^$Special_Low_Sort_Marker//; # Remove Special marker print "\t$_\n"; } }
    I added an extra data line, to ensure that sorting was alpha, according the OP's spec. The output:
    lisa books: THIS BOOK FIRST Eating Veggies postmodernism bart books: coolness skateboarding homer books: donuts marge books: hairstyles
    Extending this to allow Multiple "First Authors" is left as an exercise for the reader.

         "For every complex problem, there is a simple answer ... and it is wrong." --H.L. Mencken

      Another late one but never mind! I also adopted the extra data line. Then a pass through the data once to find who wrote "THIS BOOK FIRST" and the rest can be done in a Schwartzian transform.
      #!/usr/local/bin/perl -w # use strict; our @thingsToSort = ( {author => 'bart', title => 'skateboarding'}, {author => 'lisa', title => 'postmodernism'}, {author => 'marge', title => 'hairstyles'}, {author => 'lisa', title => 'THIS BOOK FIRST'}, {author => 'homer', title => 'donuts'}, {author => 'bart', title => 'coolness'}, {author => 'lisa', title => 'eating veggies'}); our $firstAuthor = q(); foreach (@thingsToSort) { next unless $_->{title} eq 'THIS BOOK FIRST'; $firstAuthor = $_->{author}; last; } print map {"$_->[0]->{author}: $_->[0]->{title}\n"} sort { $b->[1] <=> $a->[1] || $b->[2] <=> $a->[2] || $a->[0]->{author} cmp $b->[0]->{author} || $a->[0]->{title} cmp $b->[0]->{title} } map { [ $_, $_->{title} eq 'THIS BOOK FIRST', $_->{author} eq $firstAuthor ] } @thingsToSort;
      I tried to think of a way to accomplish the first pass in the transform to make things simpler but failed.

      Cheers,

      JohnGG

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://526984]
Approved by jbrugger
Front-paged by jbrugger
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others pondering the Monastery: (4)
As of 2024-03-28 17:49 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found