
Schwartzian transform deformed with impunity

by biohisham (Priest)
on Apr 22, 2012 at 16:23 UTC (#966478, perlquestion)

biohisham has asked for the wisdom of the Perl Monks concerning the following question:

Well, I guess I got away with this in some magical way I can't fathom in Perl 5.12.4. I confess I was being abusive when I tried to numerically compare a group of strings that merely contain a number in order to sort them, and the guilty feeling of seeing the warnings is irksome. I have a list of files that I want to sort by the number in their names, and I thought I would achieve that with a Schwartzian transform. My files are named 'sequence<n>.gb.txt', where <n> is any number.

My code walks the directory, picks up these file names, and pushes them into an array. Although the files appear in order in the directory, they are not in order in the array, so my option was @sorted = map{$_->[0]} sort{$a->[2]<=>$b->[2]} map{[$_,split/sequence/]} @unsorted. I landed on this after trying various split patterns (I tried splitting around /./, /\d+/, etc.). Clearly sort() is generous; I also tried cmp, just to see what the output looks like. The code sorts @unsorted and yet complains of 'arguments being not numeric in numeric comparison (<=>)', blah blah.

So Perl's sort() gracefully understood what I meant, yet I got forgivingly pinched. I wonder how I can best avoid introducing such warnings (going "no warnings" of course is not an option for me ;)). Any ideas?

    use strict;
    use warnings;

    my @unsorted;
    my @sorted;
    while(my $file = <DATA>){
        chomp $file;
        push @unsorted, $file;
    }

    @sorted = map{$_->[0]} sort{$a->[2] <=> $b->[2]} map{[$_, split/sequence/]} @unsorted;
    print join("\n",@sorted);
    __DATA__
    ##OUTPUT##
    Argument "" isn't numeric in numeric comparison (<=>) at SortQ line 11, <DATA> line 9.
    ...
    ...
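A small sketch of what split /sequence/ returns for a single name may help explain where the warnings come from. The file name below is made up to match the stated 'sequence<n>.gb.txt' format; it is not taken from the original data.

```perl
use strict;
use warnings;

# Hypothetical file name in the 'sequence<n>.gb.txt' format.
my $name = 'sequence12.gb.txt';

# The separator matches at the very start of the string, so split
# yields a leading empty field followed by the rest of the name.
my @fields = split /sequence/, $name;

# The record built inside the transform is therefore
# [ $name, '', '12.gb.txt' ].
my $record = [ $name, @fields ];

printf "element %d: '%s'\n", $_, $record->[$_] for 0 .. $#$record;
```

Element 2 is '12.gb.txt', so <=> works from the leading digits and warns that the string as a whole is not numeric. Element 1 is the empty string.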
UPDATE: Apparently the powers of a Schwartzian ensemble are so crazy

David R. Gergen said "We know that second terms have historically been marred by hubris and by scandal." and I am a two y.o. monk today :D, June,12th, 2011...

Replies are listed 'Best First'.
Re: Schwartzian transform deformed with impunity
by moritz (Cardinal) on Apr 22, 2012 at 16:39 UTC
Re: Schwartzian transform deformed with impunity
by jwkrahn (Monsignor) on Apr 22, 2012 at 23:16 UTC
    use strict;
    use warnings;

    chomp( my @unsorted = <DATA> );

    my @sorted =
        map unpack( 'x4a*', $_ ),
        sort
        map pack( 'Na*', /(\d+)\D*\z/, $_ ),
        @unsorted;

    print map "$_\n", @sorted;
    __DATA__
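The pack/sort/unpack pipeline above can be unrolled into three steps to show what each stage contributes. This is only a sketch: the sample names and the intermediate arrays @packed and @ordered are introduced here for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample names; the original __DATA__ section is not shown.
my @unsorted = qw( sequence10.gb.txt sequence2.gb.txt sequence300.gb.txt );

# 1. Prefix each name with its number packed as a 4-byte big-endian
#    unsigned integer ('N'), followed by the name itself ('a*').
my @packed = map { pack 'Na*', /(\d+)\D*\z/, $_ } @unsorted;

# 2. A plain string sort now orders by number, because big-endian
#    bytes compare the same way as the integers they encode.
my @ordered = sort @packed;

# 3. Skip the 4-byte prefix ('x4') and keep the original name.
my @sorted = map { unpack 'x4a*', $_ } @ordered;

print "$_\n" for @sorted;
```

Since 'N' is a 32-bit unsigned format, this form assumes the numbers fit in that range and that every name contains a trailing run of digits.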
Re: Schwartzian transform deformed with impunity
by dave_the_m (Monsignor) on Apr 22, 2012 at 19:27 UTC
    If the file format really is as fixed as you describe, with only the number component varying, then you could strip off everything except the number, sort the numbers, then print out or assign to an array while reconstructing the file name from the number:
    use strict;
    use warnings;

    print map "sequence$_.gb.txt\n", sort { $a <=> $b } map { /(\d+)/; $1 } <DATA>;
    __DATA__
    ....


      map { /(\d+)/; $1 }

      No, that is wrong. If /(\d+)/ doesn't match then $1 will not contain valid data:

      $ perl -le'
      use Data::Dumper;
      my @y = map { /(\d+)/; $1 } qw/ ab123cd ab456cd abcdefg ab789cd /;
      print Dumper \@y;
      '
      $VAR1 = [
                '123',
                '456',
                undef,
                '789'
              ];

      Just use:

      $ perl -le'
      use Data::Dumper;
      my @y = map /(\d+)/, qw/ ab123cd ab456cd abcdefg ab789cd /;
      print Dumper \@y;
      '
      $VAR1 = [
                '123',
                '456',
                '789'
              ];

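      The regular expression by itself will just do the right thing: in list context a successful match returns its captures and a failed match returns the empty list, so items that don't match are simply skipped.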
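Applied to the file-name problem, the list-context match might look like the sketch below. The sample list, including the non-matching 'notes.txt', is hypothetical, and the name is rebuilt from the format given in the question.

```perl
use strict;
use warnings;

# Hypothetical input with one entry that contains no digits.
my @lines = ( 'sequence10.gb.txt', 'notes.txt', 'sequence2.gb.txt' );

# The bare match in list context contributes its capture, or nothing
# at all when the match fails, so no undef reaches the numeric sort.
my @numbers = sort { $a <=> $b } map { /(\d+)/ } @lines;

# Rebuild the names from the stated 'sequence<n>.gb.txt' format.
my @names = map { "sequence$_.gb.txt" } @numbers;

print "$_\n" for @names;
```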

Re: Schwartzian transform deformed with impunity
by salva (Canon) on Apr 23, 2012 at 08:58 UTC
    going "no warnings" of course is not an option for me

    Why not? There is nothing wrong with disabling them, at least if you know why they are happening:

    @sorted = map{ $_->[0]} sort{ no warnings 'numeric'; $a->[2] <=> $b->[2] } map{[$_, split/sequence/]} @unsorted;
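The point of putting no warnings 'numeric' inside the sort block is that the pragma is lexically scoped, so only that block is affected. A rough sketch of the scoping follows; the helper subs and the warning counter are made up for the demonstration.

```perl
use strict;
use warnings;

my @seen;    # warning messages captured for inspection
local $SIG{__WARN__} = sub { push @seen, @_ };

# Hypothetical helpers: each compares fresh copies of its arguments.
sub compare_quiet {
    my ( $left, $right ) = @_;
    no warnings 'numeric';    # in effect only until the end of this sub
    return $left <=> $right;
}

sub compare_loud {
    my ( $left, $right ) = @_;
    return $left <=> $right;  # 'numeric' warnings are still enabled here
}

my $quiet_result   = compare_quiet( '12.gb.txt', '3.gb.txt' );
my $quiet_warnings = @seen;

my $loud_result   = compare_loud( '12.gb.txt', '3.gb.txt' );
my $loud_warnings = @seen - $quiet_warnings;

print "quiet: result $quiet_result, warnings $quiet_warnings\n";
print "loud:  result $loud_result, warnings $loud_warnings\n";
```

Both calls compare 12 against 3 and return the same result; only the one outside the pragma's scope reports the non-numeric arguments.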

    However, for this particular case, and as already stated in other monks' answers, it is easier to extract just the number.
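For instance, a Schwartzian transform that captures the number with a regex instead of split could be sketched as follows. The sample names are hypothetical, and the sketch assumes every name contains at least one digit.

```perl
use strict;
use warnings;

# Hypothetical sample names in the 'sequence<n>.gb.txt' format.
my @unsorted = qw(
    sequence10.gb.txt
    sequence2.gb.txt
    sequence33.gb.txt
    sequence1.gb.txt
);

my @sorted =
    map  { $_->[0] }                # 3. recover the original name
    sort { $a->[1] <=> $b->[1] }    # 2. compare the captured numbers
    map  { [ $_, /(\d+)/ ] }        # 1. pair each name with its number
    @unsorted;

print "$_\n" for @sorted;
```

Because the sort key is now a pure digit string, the numeric comparison has nothing to warn about.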

    Besides that, if you are concerned about sort performance, you should try Sort::Key and Sort::Key::Radix:

    use Sort::Key::Radix 'ukeysort';
    my @sorted = ukeysort { /(\d+)/; $1 } @unsorted;

Node Type: perlquestion [id://966478]
Approved by marto
Front-paged by planetscape