http://qs321.pair.com?node_id=11101345

edujs7 has asked for the wisdom of the Perl Monks concerning the following question:

Hi folks, hoping someone can help me.

I am trying to sort a file alphabetically. It contains several rows of numbers and words, and I need to sort the rows based on the word column. Below is an idea of what the file contains.

1 2 3 delta
1 2 3 apricot
1 2 3 charlie
1 2 3 bravo
1 2 3 echo
1 2 3 fox

My struggle is that I cannot make it sort alphabetically on only the last column (the words).

Any advice will be greatly appreciated. Thanks so much.

My problem is

Replies are listed 'Best First'.
Re: Sort alphabetically from file
by hippo (Bishop) on Jun 14, 2019 at 13:33 UTC

    TIMTOWTDI but here is one to get your teeth into.

    #!/usr/bin/env perl
    use strict;
    use warnings;

    print sort { ($a =~ /\S+\n$/g)[0] cmp ($b =~ /\S+\n$/g)[0] } <DATA>
    __DATA__
    1 2 3 delta
    1 2 3 apricot
    1 2 3 charlie
    1 2 3 bravo
    1 2 3 echo
    1 2 3 fox

      Thanks man, but for some reason it's not working. BTW, I'm using a .txt file as input to sort:

      print sort { ($a =~ /\S+\n$/g)[0] cmp ($b =~ /\S+\n$/g)[0] } file.txt

      Not sure if I'm using it correctly.

        You have just demonstrated why haukex advised you to "post a question effectively". Your problem now has more to do with reading a file than it does with sorting. You must create a filehandle by opening (open) the file and then read from the filehandle with the diamond operator <> (perlop). All the answers you have received use the special filehandle DATA, which is open by default.

        For additional info on sorting, refer to FAQ How do I sort an array by (anything)?

        Bill

        You'll need to open the file first. eg:

        open my $in, '<', 'file.txt' or die "Cannot open file for reading: $!";
        print sort { ($a =~ /\S+\n$/g)[0] cmp ($b =~ /\S+\n$/g)[0] } <$in>;
        close $in;

        See open and close for all the details.
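
        The open, read, sort and close steps can also be sketched with the comparison key pulled out into a small sub. This is only a sketch: the helper name last_field is made up for illustration, and the sample data lives in a scalar so the snippet runs on its own. With a real file you would pass 'file.txt' to open instead of the scalar reference.

```perl
use strict;
use warnings;

# Hypothetical helper: return the last whitespace-separated field of a line.
sub last_field {
    my ($line) = @_;
    my @fields = split ' ', $line;
    return @fields ? $fields[-1] : '';
}

# Sample data held in a scalar so the sketch is self-contained.
# For a file on disk: open my $in, '<', 'file.txt' or die ...
my $sample = "1 2 3 delta\n1 2 3 apricot\n1 2 3 charlie\n";

# Open a filehandle, read all lines through it, sort on the last field.
open my $in, '<', \$sample or die "Cannot open sample for reading: $!";
my @sorted = sort { last_field($a) cmp last_field($b) } <$in>;
close $in;

print @sorted;
```

        The point is that sort receives the lines read from the handle, never the file name itself.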

Re: Sort alphabetically from file
by daxim (Curate) on Jun 14, 2019 at 14:05 UTC
    sort -k4 < the_input_file

      I had no idea the Unix sort could do that. Thanks for posting. I've used GNU Utilities for Win32 before but it hasn't been updated since 2003 and is getting hard to find. I found a new project with the same objective called GNU on Windows that looks promising.

      From reading the Wikipedia page for Sort it looks like the left angle bracket redirect, '<', isn't needed to read the input file. Also, I noticed that Windows Server 2012 includes a command called sort that doesn't have the -k option. That's a pity since not only does it have less functionality than the Unix version but also it complicates setting up a path to the GNU version with the same name.

        If you're stuck in Wintendo land there's also Cygwin which has the GNU stuff and lots of other open source stuff.

        The cake is a lie.
        The cake is a lie.
        The cake is a lie.

      Just for the laugh of it:

      Windows (the trademarked (C) "operating system" the swiss-...-cheese of operating systems) offers its own cmd line sort function aptly named sort. (I have learned this fact from: Re: Sorting a text file). From some initial (and final) enquiries it looks like it sorts on unit-character columns! So, it is practically a useless executable occupying space just like so many of its siblings. The ever so helpful people (don't ever ask there a question how to suicide lest you want an overflow) of stack overflow suggest using a spreadsheet to sort a file if you find yourself in a Windows system! Or install GNU utils in order to benefit from the Unix sort.

      Thanking Unix, thanking GNU, thanking Perl for my everyday sanity.

Re: Sort alphabetically from file
by haukex (Archbishop) on Jun 14, 2019 at 13:25 UTC
Re: Sort alphabetically from file
by tybalt89 (Monsignor) on Jun 15, 2019 at 02:54 UTC
    #!/usr/bin/perl

    # https://perlmonks.org/?node_id=11101345

    use strict;
    use warnings;

    print sort { $a =~ s/[\s\d]*//r cmp $b =~ s///r } <DATA>;

    __DATA__
    1 2 3 delta
    1 2 3 apricot
    1 2 3 charlie
    1 2 3 bravo
    1 2 3 echo
    1 2 3 fox

      $a =~ s/[\s\d]*//r cmp $b =~ s///r

      Doesn't the proper operation of the compare using this trick (for which, I think, see "The empty pattern //" in perlop) assume a left-side-then-right-side order in which cmp (see Equality Operators in perlop) evaluates its operands (which I don't see specified anywhere)? I.e.,
          $a =~ s///r cmp $b =~ s/[\s\d]*//r
      fails (testing under Perl version 5.14). One can imagine that such a situation might easily arise during a cut/paste change of sorting order from ascending to descending. Isn't this a bug waiting to be born (can we call it a larva)? Doesn't it at least merit a prominent comment?
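
      One possible way to sidestep the evaluation-order question is to stop relying on the empty pattern and apply the full pattern to both operands through a helper. A minimal sketch, assuming Perl 5.14 or later for the /r flag; the sub name word_key and the anchored pattern are my own choices for illustration, not part of the code above.

```perl
use strict;
use warnings;

# Hypothetical helper: drop the leading run of digits and whitespace,
# leaving the word column as the sort key. Requires Perl 5.14+ for /r.
sub word_key {
    my ($line) = @_;
    return $line =~ s/^[\s\d]*//r;
}

my @lines = ("1 2 3 delta\n", "1 2 3 apricot\n", "1 2 3 charlie\n");

# Neither comparison depends on which operand is evaluated first,
# so swapping $a and $b for a descending sort is safe.
my @ascending  = sort { word_key($a) cmp word_key($b) } @lines;
my @descending = sort { word_key($b) cmp word_key($a) } @lines;

print @ascending, @descending;
```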


      Give a man a fish:  <%-{-{-{-<

Re: Sort alphabetically from file
by johngg (Canon) on Jun 15, 2019 at 14:04 UTC

    In case your 4th column contains spaces you can split with a limit. Here I am opening the file contained in a HEREDOC as opposed to using the automatically opened DATA file handle.

    johngg@shiraz:~/perl/Monks$ perl -Mstrict -Mwarnings -E '
    open my $inFH, q{<}, \ <<__EOF__ or die $!;
    1 2 3 delta
    1 2 3 apricot
    1 2 3 blue cup
    1 2 3 charlie
    1 2 3 yellow banana
    1 2 3 bravo
    1 2 3 echo
    1 2 3 fox
    __EOF__

    print for
        map { substr $_, 50 }
        sort
        map { pack q{A50A*}, ( split m{\s+}, $_, 4 )[ -1 ], $_ }
        <$inFH>;

    close $inFH or die $!;'
    1 2 3 apricot
    1 2 3 blue cup
    1 2 3 bravo
    1 2 3 charlie
    1 2 3 delta
    1 2 3 echo
    1 2 3 fox
    1 2 3 yellow banana

    Just another way out of many.

    Cheers,

    JohnGG
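
    To see what the limit argument to split does in isolation, here is a small sketch using one of the sample lines above. The variable names are made up for illustration.

```perl
use strict;
use warnings;

my $line = "1 2 3 yellow banana\n";

# Without a limit, the phrase is broken into two separate fields.
my @all = split m{\s+}, $line;

# With a limit of 4, everything after the third separator stays together,
# including the embedded space and the trailing newline.
my @limited = split m{\s+}, $line, 4;

printf "%d fields without a limit, %d with\n", scalar @all, scalar @limited;
print "Last field: $limited[-1]";
```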

Re: Sort alphabetically from file
by james28909 (Deacon) on Jun 15, 2019 at 01:56 UTC
    Using a hash of arrays is simple enough.

    use strict;
    use warnings;

    my %hash;
    while (<DATA> =~ /(\d)\s+(\d)\s+(\d)\s+(\w+)/){
        push @{$hash{$4}}, $1, $2, $3;
    }
    print "@{$hash{$_}}[0..2] $_\n" for sort keys %hash;

    __DATA__
    1 2 3 delta
    1 2 3 apricot
    1 2 3 charlie
    1 2 3 bravo
    1 2 3 echo
    1 2 3 fox

    EDIT: changed up code slightly

      Original contents:

      thanks I tried the below and no errors but nothing happens, sure it's me though.

      if ($ARGV[0] eq "-a") {
          open (INFILE, "$ARGV[1]") or die "$ARGV[1] cannot be openned : $!";
          my %hash;
          while ($source_file =~ /(\d)\s+(\d)\s+(\d)\s+(\w+)/) {
              push @{$hash{$4}}, $1, $2, $3;
          }
          print "@{$hash{$_}}[0..2] $_\n" for sort keys %hash;
      }

      BTW I should've mentioned I must call my program as follows: myprogram.pl($ARGV[0]) option($ARGV[1]) mytextfile.txt($ARGV[-1]). I'm assigning the text file to a variable with $source_file = "$ARGV[-1]", hence using $source_file to refer to the text file, but it's not working. Am I missing something?
      Thanks - will try this out. :)

      2019-06-17 Athanasius restored original contents and added code tags around the program call

        • open (INFILE, "$ARGV[1]") or die "$ARGV[1] cannot be openned : $!";
          while ($source_file =~ /(\d)\s+(\d)\s+(\d)\s+(\w+)/)
          Maybe (probably) in your head there's a connection between INFILE and $source_file, but not in your script.
          For Perl, they're unrelated…
        • if ($ARGV[0] eq "-a")
          How do you know your $ARGV[0] really equals "-a"?
          What happens if it doesn't?
          I recommend to add an else block with a print "whatever\n", at least while you're still experimenting.
        • Many of the suggestions to your questions include using strict and warnings. Although this seems to impede or slow down one's development, I recommend it, too:
          • Yes, the time it takes before your script "runs", will be longer
          • The time until your script works correctly, will be shorter
          One of strict's messages is confusing for beginners:
          Global symbol "$x" requires explicit package name
          should read
          Do you really want "$x" to be a global symbol? Better declare it with my
        If you are new to Perl, you might like diagnostics, which won't throw more errors but will give messages that are (hopefully) more informative. So, your script(s) should start with
        use strict; use warnings; use diagnostics;

        It is better to open the file using open INFILE, "<", $ARGV[1] or, even better, open $inFILE_HANDLE, "<", $ARGV[1]. The two-argument form is not causing you a problem now because open() defaults to opening for reading when the mode (e.g. "<" for reading) is not set explicitly.

        There is a problem reading the file. First you open a file using open(), for that you get a fileHANDLE, e.g. the INFILE or $inFILE_HANDLE. Then you loop reading from the FILE-HANDLE using the diamond operator while(<$inFILE_HANDLE>){ print $_ } and then you close the file(HANDLE): close $inFILE_HANDLE;. You can't read any file contents from a variable which just stores the fileNAME.
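
        The three steps can be sketched as follows. This is only an illustration: it writes a throwaway temporary file so that it runs on its own, whereas in your script the file name would come from $ARGV[-1]. The variable names are made up.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a throwaway input file so the sketch is self-contained;
# in the real script the name would come from $ARGV[-1].
my ($tmp_fh, $filename) = tempfile(UNLINK => 1);
print {$tmp_fh} "1 2 3 delta\n1 2 3 apricot\n";
close $tmp_fh or die "Cannot close $filename: $!";

# 1. Open: turn the file NAME into a file HANDLE.
open my $in_fh, '<', $filename or die "Cannot open $filename: $!";

# 2. Read: loop over the handle, not over the name.
my @lines;
while ( my $line = <$in_fh> ) {
    push @lines, $line;
}

# 3. Close the handle when done.
close $in_fh or die "Cannot close $filename: $!";

print scalar(@lines), " lines read\n";
```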

        In the case your input is not unique wrt the fourth column you will miss input, all similar column-four lines will go to the hash keyed on column-four overriding any previous line with same column-four key. One solution is not to use a hash but an array of arrays (these are formed in exactly the same way as you do now with the regex) and sort works on arrays.
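
        A minimal sketch of the array-of-arrays idea, with made-up sample lines in which "delta" appears twice. It reuses the same regex and keeps one array reference per input line, so lines with a repeated fourth column all appear in the output.

```perl
use strict;
use warnings;

# Sample lines; "delta" appears twice to show that duplicates survive.
my @lines = ("1 2 3 delta\n", "4 5 6 apricot\n", "7 8 9 delta\n");

my @rows;
for my $line (@lines) {
    # Same regex idea as above: three single digits, then a word.
    if ( my @fields = $line =~ /(\d)\s+(\d)\s+(\d)\s+(\w+)/ ) {
        push @rows, \@fields;    # one array reference per input line
    }
}

# Sort on the fourth element (index 3) of each row.
my @sorted = sort { $a->[3] cmp $b->[3] } @rows;

print "@$_\n" for @sorted;
```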

        See Re^3: Sort alphabetically from file and Re^3: Sort alphabetically from file, they already gave you hints for file open/read problems.