
### Re^3: Sort alphabetically from file

by soonix (Canon)
 on Jun 15, 2019 at 09:50 UTC

in reply to Re^2: Sort alphabetically from file
in thread Sort alphabetically from file

•   open (INFILE, "$ARGV[1]") or die "$ARGV[1] cannot be openned : $!";
    while ($source_file =~ /(\d)\s+(\d)\s+(\d)\s+(\w+)/)

  Maybe (probably) in your head there's a connection between INFILE and $source_file, but not in your script. For Perl, they're unrelated…

•   if ($ARGV[0] eq "-a")

  How do you know your $ARGV[0] really equals "-a"? What happens if it doesn't? I recommend adding an else block with a print "whatever\n", at least while you're still experimenting.

• Many of the suggestions to your questions include using strict and warnings. Although this seems to impede or slow down one's development, I recommend it, too:
  • Yes, the time it takes before your script "runs" will be longer
  • The time until your script works correctly will be shorter

  One of strict's messages is confusing for beginners:

    Global symbol "$x" requires explicit package name

  Do you really want "$x" to be a global symbol? Better declare it with my.

  If you are new to Perl, you might like diagnostics, which won't throw more errors, but messages that (hopefully) are more informative.

So, your script(s) should start with

    use strict;
    use warnings;
    use diagnostics;

### Re^4: Sort alphabetically from file

by edujs7 (Novice)
 on Jun 15, 2019 at 10:24 UTC

noted. Thank you very much for your support.

Also, if you're on Windows like me, I open($file, '<', shift) or die "$!"; and then immediately binmode($file);.

Commands, parameters, or filenames added to the command line when calling the script are put into an array called @ARGV, and when you call shift it increments $ARGV[0] to $ARGV[1] to $ARGV[2] and so on for each shift used. So, if you used C:\path\to\script\perl my_script.pl file_1.txt outfile.txt, then you could use shift again to open() the output file (use three-arg open) and, instead of printing to the console window, write the output to $outFile.

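    use strict;
    use warnings;

    my %hash;

    # Input and output filenames come from the command line, in that order.
    open (my $inFile, '<', shift) or die "$!";
    open (my $outFile, '>', shift) or die "$!";

    binmode($inFile); binmode($outFile);

    # Read one line at a time; skip lines that do not match instead of
    # ending the loop at the first non-matching line.
    while (my $line = <$inFile>) {
        next unless $line =~ /(\d)\s+(\d)\s+(\d)\s+(\w+)/;
        push @{$hash{$4}}, $1, $2, $3;
    }

    # One output line per word, sorted alphabetically by the word.
    print $outFile "@{$hash{$_}}[0..2] $_\n" for sort keys %hash;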
Usage: C:\path\to\script\perl my_script.pl inFile.txt outFile.txt
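As a possible safeguard for the usage shown above, the argument list could be checked before anything is shifted off it. This is only a sketch: parse_args is a hypothetical helper name, and the assumption that exactly two filenames are expected comes from the usage line, not from any stated requirement.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): validate the argument list
# before the filenames are used. Assumes exactly two arguments are expected.
sub parse_args {
    my @args = @_;
    die "Usage: perl my_script.pl inFile.txt outFile.txt\n"
        unless @args == 2;
    my ($in_name, $out_name) = @args;
    return ($in_name, $out_name);
}

# In a script this would be called as:
#   my ($in_name, $out_name) = parse_args(@ARGV);
```

With a check like this, a missing filename produces a usage message rather than a failure inside open().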

Also, please note that this removes one space from each column per row. As long as that does not corrupt your data set it should be fine. It actually may save you some hard drive space. :)

EDITED: fixed typo in matching patterns, thanks haukex

EDITED: changed and made obvious that the individual needs to make absolutely certain that this does not corrupt anything in their data set.

EDITED: had to add a new paragraph so my second EDIT looked ok.

"when you call shift it increments $ARGV[0] to $ARGV[1] to $ARGV[2] and so on for each shift used."

No, shift removes the first element of @ARGV on each call, returning the element it removed.

    /(\d)\s*(\d)\s*(\d)\s*(\w*)/

Note that this will also match a line as simple as "123", or really anything that has three consecutive digits, since that's the only thing this regex requires. I would strongly recommend using \s+, \d+, and \w+, and anchoring the regex to the beginning and end of the string with ^ and $, respectively.
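Both points can be tried out in a few lines. The following is a minimal sketch, not code from the thread: @args is a stand-in for @ARGV, matches is a hypothetical helper, and the anchored pattern (including its allowance for leading and trailing whitespace) is just one possible reading of the recommendation.

```perl
use strict;
use warnings;

# shift removes and returns the first element, so the array gets shorter.
my @args  = ('inFile.txt', 'outFile.txt');   # stand-in for @ARGV
my $first = shift @args;                     # @args now holds one element

# The pattern quoted above, and an anchored variant using + quantifiers.
# The anchored variant is an assumption about what "anchoring" might look like.
my $loose    = qr/(\d)\s*(\d)\s*(\d)\s*(\w*)/;
my $anchored = qr/^\s*(\d+)\s+(\d+)\s+(\d+)\s+(\w+)\s*$/;

# Hypothetical helper: returns 1 if the line matches the pattern, else 0.
sub matches {
    my ($re, $line) = @_;
    return $line =~ $re ? 1 : 0;
}
```

Calling matches($loose, "123") succeeds, while matches($anchored, "123") does not, because the anchored pattern needs three whitespace-separated numbers followed by a word.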

"As long as that does not corrupt your data set it should be fine (and i am sure it is fine)"

Sorry, but how can you be sure? Some file formats require \t as a column separator.

Update: Expanded the last quote and highlighted the part I was reacting to.
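To illustrate the concern about separators, here is one way the sorting could be done without touching the column separators at all. It is only a sketch under assumptions not stated in the thread: the input is taken to be tab-separated with the sort key in the last column, every line is taken to be non-empty, and sort_by_last_column is a hypothetical helper name.

```perl
use strict;
use warnings;

# Hypothetical helper: sort tab-separated lines by their last column
# and rejoin the fields with tabs, so the separators are preserved.
sub sort_by_last_column {
    my @lines = @_;
    chomp @lines;
    # A limit of -1 keeps trailing empty fields instead of dropping them.
    my @rows = map { [ split /\t/, $_, -1 ] } @lines;
    return map  { join("\t", @$_) . "\n" }
           sort { $a->[-1] cmp $b->[-1] } @rows;
}
```

For a file whose format uses some other separator, the split pattern and the join string would both need to change accordingly.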

Node Type: note [id://11101391]