http://qs321.pair.com?node_id=1103673

fionbarr has asked for the wisdom of the Perl Monks concerning the following question:

I'm trying to sort data that looks like this:
29  '0,wupra00a0537'
30  '31,wppra00a0513'
31  '0,wupra00a0535'
32  '0,wuprd02a0089'
33  '0,wupwa07a0663'
34  '0,wdpwa00a0013'
35
my code looks like this:
my @array = ();
@array = map  { $_->[0] }
         sort { $b->[1] <=> $a->[1] }
         map  { [ $_, /.+,/ ] } @cpu_lt_40
I want to sort by the leading number in descending order, but the sort result is (partial):
147  '0,wuprm00a0539'
148  '0,wuprm00a0539'
149  '28,wppra02a0015'
150  '24,wppra01a0016'
151  '0,wupra00a0532'
152  '15,wppra01a0015'
153  '22,wppra01a0095'

Replies are listed 'Best First'.
Re: schwartzian transform
by Corion (Patriarch) on Oct 13, 2014 at 21:15 UTC

    What steps have you undertaken to debug this issue?

    What values do $b->[1] and $a->[1] have? How could you find this out yourself?

    What does the following code output?

    $_= '31,wppra00a0513'; print /.+,/;

    Compare the output with

    $_= '31,wppra00a0513'; print /(.+),/;

    Maybe you want to read Regexp Quote-Like Operators (in perlop) on how m/.../ behaves in list context.

    Update: My interpretation of your input data might be off-base. The debugging steps outlined above still hold.

      /(.+),/ makes ALL the difference....thanks
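For readers following along, here is a minimal sketch of the transform with a capturing group in place. The sample records are taken from the question; the stricter pattern /^(\d+),/ and the @sorted name are assumptions made for illustration, not the original poster's final code.

```perl
use strict;
use warnings;

# Sample records in the same 'number,hostname' shape as the question.
my @cpu_lt_40 = (
    '0,wupra00a0537',
    '31,wppra00a0513',
    '0,wupra00a0535',
    '28,wppra02a0015',
    '15,wppra01a0015',
);

my @sorted =
    map  { $_->[0] }                # 3. unwrap: keep only the original string
    sort { $b->[1] <=> $a->[1] }    # 2. numeric, descending, on the cached key
    map  { [ $_, /^(\d+),/ ] }      # 1. wrap: the parentheses capture the leading number
    @cpu_lt_40;

print "$_\n" for @sorted;
```

In list context a match with capturing parentheses returns the captured text, so $_->[1] holds the number itself rather than the 1 that a successful match without captures returns. Records that do not match would leave the key undefined and trigger warnings in the sort, so real data may need a fallback.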
Re: schwartzian transform
by GrandFather (Saint) on Oct 13, 2014 at 21:16 UTC

    Neither your sample nor your description makes it clear how you want the data sorted. However, the following may help you sort out whatever it is you want to do:

    use strict;
    use warnings;

    my @array =
        map  {$_->[0]}
        sort {$b->[5] <=> $a->[5] || $b->[3] <=> $a->[3]}
        map  {[$_, /(\d+),(\D+)(\d+)(\D+)(\d+)/]}
        map  {chomp; s/'//g; $_} <DATA>;

    print join "\n", @array;

    __DATA__
    31 '0,wupra00a0535'
    147 '0,wuprm00a0539'
    148 '0,wuprm00a0539'
    149 '28,wppra02a0015'
    150 '24,wppra01a0016'
    151 '0,wupra00a0532'
    152 '15,wppra01a0015'

    Prints:

    147 0,wuprm00a0539
    148 0,wuprm00a0539
    31 0,wupra00a0535
    151 0,wupra00a0532
    150 24,wppra01a0016
    149 28,wppra02a0015
    152 15,wppra01a0015
    Perl is the programming world's equivalent of English
Re: schwartzian transform
by Ea (Chaplain) on Oct 14, 2014 at 08:10 UTC
    Your problem is in the first map (or the last map reading left to right which is not how I think when I'm building a Schwartzian transform). The /.+,/ in
    map { [ $_, /.+,/ ] }
    is only returning the number of matches (or 1 in this case). As a result it should give you back the original list. If you're trying to sort on computed fields, you need to extract them first and _then_ bundle them up in the anonymous array.

    So something like
    map     { my @fields = split /,/; [ $_, $fields[0] ] }
    might get you closer to what you're looking for.

    Once you've mastered that, you'll next want to look at the Orcish manoeuvre which speeds up your sorts

    Sometimes I can think of 6 impossible LDAP attributes before breakfast.
      I don't really see how an Orcish manoeuvre would improve a Schwartzian Transform sort.

        In this particular case, I'm not sure it would. In general, it can, in situations where the (chronologically) first transform consumes a lot of time for some reason, if you have duplicated data in the list that you are sorting. Not needing to perform the same time-intensive computation repeatedly saves time. But in this case, it doesn't look like there are going to be a lot of duplicate data (though only someone with access to the original poster's real data could say that for sure), and furthermore that regular expression in the first transform doesn't look like it would be much of a bottleneck, so I'm not sure much if anything would be gained.
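A minimal sketch of the Orcish manoeuvre described above, assuming Perl 5.10 or later for the //= operator. The names sort_key, %key_for and $calls are hypothetical, and the regex stands in for a genuinely expensive computation; the sample list contains one duplicate record to show where the cache helps.

```perl
use strict;
use warnings;

my @records = (
    '0,wupra00a0537',
    '31,wppra00a0513',
    '0,wupra00a0537',    # duplicate of the first record
    '28,wppra02a0015',
);

my $calls = 0;    # counts how often the key is actually computed

# Hypothetical stand-in for an expensive key computation.
sub sort_key {
    my ($record) = @_;
    $calls++;
    my ($number) = $record =~ /^(\d+),/;
    return $number;
}

my %key_for;      # cache: record string => sort key
my @sorted = sort {
    ( $key_for{$b} //= sort_key($b) ) <=> ( $key_for{$a} //= sort_key($a) )
} @records;

print "$_\n" for @sorted;
print "key computed $calls times for ", scalar(@records), " records\n";
```

The traditional form uses ||= (hence "or-cache"), which recomputes keys that are false, such as 0; //= avoids that here. Because the cache is keyed on the record string, the duplicate record costs only one computation, which is the saving discussed above. In a Schwartzian transform each key is already computed once per element, so the cache only pays off when the list contains duplicates.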

        You're right, Laurent. You only calculate the sort value once in the first call to map, so there are no savings to be made by caching.

        Those two techniques must be stored together in the "neat things to use with sort" area of my brain. Apologies.

        Sometimes I can think of 6 impossible LDAP attributes before breakfast.
      The /.+,/ ... is only returning the number of matches ...

      Not that it makes any difference (the logic is still wrong), but it's returning the success of the match:

      c:\@Work\Perl>perl -wMstrict -MData::Dump -le
      "my @strings = ('x,x', 'yy,yy,yy', 'z,zzz,z,zz', 'foo');
       ;;
       my @ra = map { [ $_, /[xyz]+,/ ] } @strings;
       dd \@ra;
       ;;
       @ra = map { [ $_, /[xyz]+,/g ] } @strings;
       dd \@ra;
       ;;
       @ra = map { [ $_, scalar(/[xyz]+,/) ] } @strings;
       dd \@ra;
       ;;
       dd \@strings;
      "
      [["x,x", 1], ["yy,yy,yy", 1], ["z,zzz,z,zz", 1], ["foo"]]
      [
        ["x,x", "x,"],
        ["yy,yy,yy", "yy,", "yy,"],
        ["z,zzz,z,zz", "z,", "zzz,", "z,"],
        ["foo"],
      ]
      [["x,x", 1], ["yy,yy,yy", 1], ["z,zzz,z,zz", 1], ["foo", ""]]
      ["x,x", "yy,yy,yy", "z,zzz,z,zz", "foo"]
      See Regexp Quote-Like Operators (in perlop) -> /PATTERN/msixpodualgc -> "Matching in list context".

      Update: Edited code example for space.