Sorting text-number values

by merrymonk (Hermit)
on Nov 29, 2016 at 13:44 UTC

merrymonk has asked for the wisdom of the Perl Monks concerning the following question:

A recent question of mine asked how to sort data of the form <alpha characters><numeric characters>, for example <Ab1234>.
Several Monks kindly gave suggestions and the ‘heart’ of one of them was:

    %out = ();
    for (@in){
        if (/^([A-Za-z_]+)(\d+)$/){
            # build hash of arrays
            # with alpha part uppercase:original as key
            push @{$out{join ':',uc $1,$1}},$2;
        } else {
            warn "\nInput data error <$_>\n";
        }
    }

This gave exactly what I wanted.

As is so often the case, the problem has become more complicated, since I now find that:

1. There can be a number or numbers in the middle as well as at the end;
2. There can be some numeric values only.

Numbers in the middle case

A typical set of values when numbers are in the middle is:

    blank_5_str_1
    blank_5_str_10
    blank_5_str_11
    blank_5_str_12
    blank_5_str_13
    blank_5_str_14
    blank_5_str_2
    blank_5_str_3
    blank_5_str_4
    blank_5_str_5
    blank_5_str_6
    blank_5_str_7
    blank_5_str_8
    blank_5_str_9

Note: the numbers are not necessarily 'surrounded' by '_' characters.

For this data I would like the ‘text’ part to be blank_5_str_ and the number part to be the number after the last _.

Therefore the first 3 items of the sorted list will be:

blank_5_str_1 blank_5_str_2 blank_5_str_3

Can the sort be changed to allow for this case?
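One way the original loop might be adapted, as a minimal sketch: loosening the pattern to `^(.*\D)(\d+)$` treats everything up to the last non-digit as the text part and the trailing digits as the number. The sub names `group_by_prefix` and `sorted_items` are made up for illustration, and splitting the key on the first ':' assumes the text part never contains a colon. Numbers-only values still fall into the warning branch here.

```perl
use strict;
use warnings;

# Hypothetical helper: build the same hash of arrays as the original loop,
# but let the text part contain digits (everything up to the last non-digit).
sub group_by_prefix {
    my @in = @_;
    my %out;
    for (@in) {
        if ( /^(.*\D)(\d+)$/ ) {
            # key is "UPPERCASE:original", value is the list of trailing numbers
            push @{ $out{ join ':', uc $1, $1 } }, $2;
        }
        else {
            warn "Input data error <$_>\n";
        }
    }
    return %out;
}

# Hypothetical helper: rebuild the items in sorted order.
# Assumes the text part never contains ':'.
sub sorted_items {
    my %out = group_by_prefix(@_);
    my @sorted;
    for my $key ( sort keys %out ) {
        my $text = ( split /:/, $key, 2 )[1];    # original-case text part
        push @sorted, map { $text . $_ } sort { $a <=> $b } @{ $out{$key} };
    }
    return @sorted;
}

print "$_\n" for sorted_items(qw( blank_5_str_10 blank_5_str_2 blank_5_str_1 ));
```

The numbers within each text group are compared with `<=>`, so 2 sorts before 10.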

Numbers only

One solution for this is to:

1. Split the data into 2 lists: one for text/number data and one for just numbers
2. Sort both lists independently
3. Join the lists so that the numbers are at the start or end

Is there a better solution than this?
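The three steps above could be sketched roughly as follows. `sort_split_lists` is a hypothetical name, and ordering the text/number items by text part and then by trailing number is an assumption about the desired order.

```perl
use strict;
use warnings;

# Hypothetical helper: numbers-only items first, then text/number items.
sub sort_split_lists {
    my @items = @_;

    # Step 1: split into purely numeric values and everything else.
    my @numbers = grep {  /^\d+$/ } @items;
    my @mixed   = grep { !/^\d+$/ } @items;

    # Step 2: sort each list independently.
    @numbers = sort { $a <=> $b } @numbers;
    @mixed =
        map  { $_->[0] }
        sort { $a->[1] cmp $b->[1] or $a->[2] <=> $b->[2] }
        map  { my ( $text, $num ) = /^(.*\D)(\d+)$/; [ $_, $text // $_, $num // 0 ] }
        @mixed;

    # Step 3: join the lists, numbers at the start.
    return ( @numbers, @mixed );
}

print "$_\n" for sort_split_lists(qw( blank_5_str_10 17 blank_5_str_2 5 blank_5_str_1 ));
```

Swapping the two lists in the `return` puts the numbers at the end instead.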

Replies are listed 'Best First'.
Re: Sorting text-number values
by haukex (Archbishop) on Nov 29, 2016 at 13:53 UTC

    Hi merrymonk,

    Did you check out the other replies you got in that thread? For example, a search on PerlMonks as suggested by Corion would have given you, among other things, "natural sort on array of arrays" (Update: which in turn contains links to other places like "How do I do a natural sort on an array?"), and salva suggested Sort::Key:

        use Sort::Key::Natural qw/natsort/;

        my @data = qw/blank_5_str_1 blank_5_str_10 blank_5_str_11
            blank_5_str_12 blank_5_str_13 blank_5_str_14 blank_5_str_2
            blank_5_str_3 blank_5_str_4 blank_5_str_5 blank_5_str_6
            blank_5_str_7 blank_5_str_8 blank_5_str_9 /;
        print "$_\n" for natsort @data;
        __END__
        blank_5_str_1
        blank_5_str_2
        blank_5_str_3
        ...
        blank_5_str_13
        blank_5_str_14

    Hope this helps,
    -- Hauke D

Re: Sorting text-number values
by johngg (Canon) on Nov 29, 2016 at 15:19 UTC

    Use a regex with captures, making the string part optional. By simply sorting on the packed string and then the final number, the numbers-only lines come out at the start.

        johngg@shiraz:~/perl/Monks > perl -Mstrict -Mwarnings -E '
        my @data = qw{
            this_5_string_12 some_12_garbage_23 this_5_string_8 17
            this_5_string_23 some_12_garbage_6 102 this_5_string_19
            5 this_5_string_101
        };
        my $width = 50;
        say for
            map { substr $_, 54 }
            sort
            map {
                do {
                    no warnings qw{ uninitialized };
                    pack qq{A${width}NA*}, m{(.*\D)?(\d+)$}, $_;
                }
            }
            @data;'
        5
        17
        102
        some_12_garbage_6
        some_12_garbage_23
        this_5_string_8
        this_5_string_12
        this_5_string_19
        this_5_string_23
        this_5_string_101

    Getting the numbers-only lines at the end involves substituting a "high values" string for the missing string part.

        johngg@shiraz:~/perl/Monks > perl -Mstrict -Mwarnings -E '
        my @data = qw{
            this_5_string_12 some_12_garbage_23 this_5_string_8 17
            this_5_string_23 some_12_garbage_6 102 this_5_string_19
            5 this_5_string_101
        };
        my $width = 50;
        say for
            map { substr $_, 54 }
            sort
            map {
                do {
                    m{(.*\D)?(\d+)$};
                    pack qq{A${width}NA*},
                        ( $1 ? $1 : qq{\x7f} x $width ), $2, $_;
                }
            }
            @data;'
        some_12_garbage_6
        some_12_garbage_23
        this_5_string_8
        this_5_string_12
        this_5_string_19
        this_5_string_23
        this_5_string_101
        5
        17
        102

    Sorting by final number then string mixes the numbers only lines in with the rest.

        johngg@shiraz:~/perl/Monks > perl -Mstrict -Mwarnings -E '
        my @data = qw{
            this_5_string_12 some_12_garbage_23 this_5_string_8 17
            this_5_string_23 some_12_garbage_6 102 this_5_string_19
            5 this_5_string_101
        };
        my $width = 50;
        say for
            map { substr $_, 54 }
            sort
            map {
                do {
                    no warnings qw{ uninitialized };
                    pack qq{NA${width}A*}, reverse( m{(.*\D)?(\d+)$} ), $_;
                }
            }
            @data;'
        5
        some_12_garbage_6
        this_5_string_8
        this_5_string_12
        17
        this_5_string_19
        some_12_garbage_23
        this_5_string_23
        this_5_string_101
        102

    I hope this is helpful.

    Cheers,

    JohnGG

      As the first part of johngg’s suggestion gave what I wanted, I copied the Perl into a file and ran this in an MSDOS window.

          my @data = qw{
              this_5_string_12 some_12_garbage_23 this_5_string_8 17
              this_5_string_23 some_12_garbage_6 102 this_5_string_19
              5 this_5_string_101
          };
          my $width = 50;
          say for
              map { substr $_, 54 }
              sort
              map {
                  do {
                      no warnings qw{ uninitialized };
                      pack qq{A${width}NA*}, m{(.*\D)?(\d+)$}, $_;
                  }
              }
              @data;

      Perl ran but sadly did not give the sorted data at the end.

      Can you tell me where the result is stored and how to store it in an array? I did add lines to print out @data but that, probably as expected, just gave me what was stored with the ‘qw’ at the beginning of the code.

      I have used Perl for many years but I have never got to grips with this sort of coding!

        So, what did Perl output?
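A minimal sketch of how the first one-liner might look as a script file that keeps the result in an array. Two points are assumptions rather than anything confirmed in the thread: that the missing output was related to `say` (the `-E` switch enables it on the command line, but a script file has to ask for it with `use feature`), and that `@sorted` is a suitable name for the result. In the one-liner the sorted list is never stored anywhere; it is passed straight to `say for`, and `@data` itself is left unchanged.

```perl
use strict;
use warnings;
use feature 'say';    # -E turns this on for one-liners; a file needs it explicitly

my @data = qw{
    this_5_string_12 some_12_garbage_23 this_5_string_8 17
    this_5_string_23 some_12_garbage_6 102 this_5_string_19
    5 this_5_string_101
};
my $width = 50;

# Each sort key is: the text part padded to $width bytes (A50), the final
# number as a 4-byte big-endian integer (N), then the original string (A*).
# The original string therefore starts at offset $width + 4 = 54.
my @sorted =
    map { substr $_, $width + 4 }
    sort
    map {
        do {
            no warnings qw{ uninitialized };    # numbers-only lines have no text part
            pack qq{A${width}NA*}, m{(.*\D)?(\d+)$}, $_;
        }
    }
    @data;

say for @sorted;
```

Text parts longer than `$width` characters are truncated in the key by `A50`, so `$width` should be at least as long as the longest text part.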

Re: Sorting text-number values
by LanX (Saint) on Nov 29, 2016 at 13:56 UTC
    Please note that </br> is not a Markup in the Monastery

    Better use <br> (rarely) or <p> , otherwise some monks may see only broken text and consequently downvote you.

    HTH! :)

    Cheers Rolf
    (addicted to the Perl Programming Language and ☆☆☆☆ :)
    Je suis Charlie!

    Update

    from Markup in the Monastery

    Tags You Should NOT Use

    ...

    Inserting a <br> tag forces a newline at the point at which it's inserted. Monastery documents differ on its acceptability.

Re: Sorting text-number values
by poj (Abbot) on Nov 29, 2016 at 14:23 UTC
    Is there a better solution than this?

    It depends: are the values in the set unique when converted to upper case, and are the numbers always shorter than, say, 10 digits?

    Update: assuming the answers are yes, try

        #!perl
        use strict;
        my %out=();
        while (<DATA>){
          chomp;
          my $key = uc $_;
          $key =~ s/(\d+)/sprintf("%010d",$1)/eg;
          $out{$key} = $_;
        }
        print "$out{$_} ($_)\n" for (sort keys %out);
        __DATA__
        blank_5_str_1
        blank_5_str_10
        blank_5_str_11
        blank_5_str_12
        blank_5_str_13
        blank_5_str_14
        blank_5_str_2
        blank_5_str_3
        blank_5_str_4
        blank_5_str_5
        blank_5_str_6
        blank_5_str_7
        blank_5_str_8
        blank_5_str_9
    poj
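To illustrate the key construction used above, here is a small sketch. `padded_key` is a hypothetical helper name. Using it inside a sort comparison rather than as a hash key is a variation of my own, not poj's code, and it means the values no longer have to be unique after upper-casing.

```perl
use strict;
use warnings;

# Hypothetical helper: uppercase the value and zero-pad every run of digits
# to 10 places, so that a plain string sort orders the numbers numerically.
sub padded_key {
    my ($value) = @_;
    my $key = uc $value;
    $key =~ s/(\d+)/sprintf("%010d", $1)/eg;
    return $key;
}

print padded_key('blank_5_str_12'), "\n";    # BLANK_0000000005_STR_0000000012

my @sorted = sort { padded_key($a) cmp padded_key($b) }
    qw( blank_5_str_10 17 blank_5_str_2 5 );
print "$_\n" for @sorted;
```

Because digits compare lower than letters in a string sort, the numbers-only values end up at the start.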
Re: Sorting text-number values
by tybalt89 (Monsignor) on Nov 29, 2016 at 16:26 UTC
        #!/usr/bin/perl

        # http://perlmonks.org/?node_id=1176809

        use strict;
        use warnings;

        print sort {$a =~ s/.*[^\d\n]//r <=> $b =~ s/.*[^\d\n]//r} <DATA>;

        __DATA__
        this_5_string_12
        some_12_garbage_23
        this_5_string_8
        17
        this_5_string_23
        some_12_garbage_6
        102
        this_5_string_19
        5
        this_5_string_101

    Outputs:

        5
        some_12_garbage_6
        this_5_string_8
        this_5_string_12
        17
        this_5_string_19
        some_12_garbage_23
        this_5_string_23
        this_5_string_101
        102

    Is this what you want as sort order?

      First, a general thank you to all who have offered suggestions in response to my query.

      It so happens that the first part of johngg’s method has given the order I was looking for. However, the additional parts of his and the other suggestions show me just how careful you have to be in specifying what is wanted, and also just how many ways there are ‘to boil an egg’!

Re: Sorting text-number values
by tybalt89 (Monsignor) on Nov 30, 2016 at 16:20 UTC
        #!/usr/bin/perl

        # http://perlmonks.org/?node_id=1176809

        use strict;
        use warnings;

        # sort by last number in each string

        print map $_->[0],
          sort {$a->[-1] <=> $b->[-1]}
          map [$_, /\d+/g], <DATA>;

        __DATA__
        this_5_string_12
        some_12_garbage_23
        this_5_string_8
        17
        this_5_string_23
        some_12_garbage_6
        102
        this_5_string_19
        5
        this_5_string_101

    prints:

        5
        some_12_garbage_6
        this_5_string_8
        this_5_string_12
        17
        this_5_string_19
        some_12_garbage_23
        this_5_string_23
        this_5_string_101
        102
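The map/sort/map chain above reads from the bottom up. As a sketch, the same idea can be wrapped in a sub that takes a list instead of reading `<DATA>`. `sort_by_last_number` is a hypothetical name, and the code assumes every string contains at least one digit.

```perl
use strict;
use warnings;

# Hypothetical helper: sort strings by the last number found in each one.
sub sort_by_last_number {
    return
        map  { $_->[0] }                  # 3. recover the original string
        sort { $a->[-1] <=> $b->[-1] }    # 2. compare the last captured number
        map  { [ $_, /\d+/g ] }           # 1. [ original, every number in it ]
        @_;
}

print "$_\n" for sort_by_last_number(qw( this_5_string_12 17 some_12_garbage_6 102 5 ));
```

Only the last number takes part in the comparison, which is why numbers-only values end up mixed in with the rest.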
