PerlMonks  

New Perl User Question

by Rpick (Novice)
on Feb 01, 2002 at 21:20 UTC ( [id://142769]=perlquestion )

Rpick has asked for the wisdom of the Perl Monks concerning the following question:

I'm a Perl Newbie, and this is my first attempt at a Perl script.

Below is the source code.
I would like to make this work with the filename to be processed given as a command-line argument. However, every time I've tried using the while(<>){ } construct I see in the Perl books, the program runs once for every line in the file.

I know this code is clunky, and probably a lot longer than it needs to be, suggestions on shortening it would also be appreciated.

    ####################################################
    # InvClean.pl : Raw Scanned File Processing program
    # Version 1.0
    # Written by Robb Pickinpaugh
    # 01/31/2002
    # for use on Windows NT
    ####################################################
    use strict;

    # Get Filename to process.
    my $processfilename='';
    print "\nEnter filename to process (type exit to quit): ";
    chomp ($processfilename = <STDIN>);

    ###########################################
    #
    # Setting the Rules for Processing
    #
    ###########################################

    ###########################################
    #
    # This sets to name of the file to which
    # the corrected data will be saved
    #
    ###########################################
    my $cleanfilename = "$processfilename.clean";

    ###########################################
    #
    # This sets the numeric value
    # for the "usual" starting character
    # for each line in the raw file
    #
    ###########################################
    my $correctstartchar = 16;

    ############################################
    #
    # This sets the "usual" starting length for
    # lines starting with the "usual" starting
    # character.
    #
    ############################################
    my $correctstartlength = 16;

    ############################################
    #
    # This sets the correct length of lines
    # after they have been stripped of extra
    # characters.
    #
    ############################################
    my $correctcleanlength = 13;

    ##############################################
    #
    # This sets the length of lines that do not
    # include the extra stop and start characters
    # that are sometimes included in scanned data
    #
    ##############################################
    my $typedlength = 14;

    ###############################################
    #
    # Do not change these values, they are used to
    # report the number of lines read, and written
    #
    ###############################################
    my $rawfilelength = 0;
    my $cleanfilelength = 0;

    ###########################
    #
    # Call Processing Routine
    #
    ###########################
    &ProcessFile;

    #############################################
    #
    # Report number of lines read from raw file,
    # and written to "cleaned" file.
    #
    #############################################
    print "$rawfilelength lines read from $processfilename\n";
    print "$cleanfilelength lines written to $cleanfilename\n";

    #################################
    #
    # Actual Processing of the File
    #
    #################################
    sub ProcessFile {
        my $data='';
        my $datalength=0;
        my $startchar='';
        open (RAWFILE, "$processfilename") || die "cannot open: $!";
        open (CLEANFILE, ">$cleanfilename") || die "cannot open: $!";
        while (<RAWFILE>){
            $rawfilelength++;
            $data = $_;
            $datalength = length($data);
            $startchar = ord($data);
            if ($startchar == $correctstartchar){
                if($datalength == $correctstartlength){
                    chomp $data;
                    chop $data;
                    $data = reverse ($data);
                    chop $data;
                    $data = reverse ($data);
                }else{
                    next;
                }
                if (length($data) == $correctcleanlength){
                    print CLEANFILE "$data\n";
                    $cleanfilelength++;
                }
            }elsif ($datalength == $typedlength){
                print CLEANFILE "$data";
                $cleanfilelength++;
            }elsif ($datalength > $correctcleanlength) {
                my $datalengthtrack = $datalength;
                chomp $data;
                $datalengthtrack--;
                chop $data;
                $datalengthtrack--;
                $data = reverse ($data);
                while ($datalengthtrack > $correctcleanlength){
                    chop $data;
                    $datalengthtrack--;
                }
                $data = reverse ($data);
                print CLEANFILE "$data\n";
                $cleanfilelength++;
            }elsif ($datalength < $correctcleanlength) {
                next;
            }
        }
        close (RAWFILE) || die "cannot close $processfilename: $!";
        close (CLEANFILE) || die "cannot close $cleanfilename: $!";
    }
    print "\a";
    exit(0);

Replies are listed 'Best First'.
Re: New Perl User Question
by BazB (Priest) on Feb 01, 2002 at 21:35 UTC

    This is fairly straightforward - you might want to use Super Search for other examples.

    Commandline arguments are available in the array @ARGV

    Try something like this:

        #!/usr/bin/perl -w
        use strict;

        my $in_file = shift @ARGV;
        open(INFILE, "$in_file") or die "Can't open input file!: $!\n";
        while (<INFILE>) {
            # do stuff with each line of INFILE until
            # there are no more lines to process
        }
        close(INFILE);
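
A hedged variation on the same approach (not BazB's exact code): this sketch checks that a filename was actually given before using it, and swaps in a lexical filehandle with three-argument open. The helper name count_lines and the usage message are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: count the lines in the named file.
sub count_lines {
    my ($in_file) = @_;
    open(my $in, '<', $in_file)
        or die "Can't open input file '$in_file': $!\n";
    my $lines = 0;
    while (my $line = <$in>) {
        $lines++;    # do stuff with each line here
    }
    close($in) or die "Can't close '$in_file': $!\n";
    return $lines;
}

# Only act when a filename was actually supplied on the command line.
if (@ARGV) {
    my $in_file = shift @ARGV;
    print count_lines($in_file), " lines read from $in_file\n";
}
else {
    print "usage: $0 filename\n";
}
```

Run with no arguments, it prints the usage line instead of dying on an undefined filename.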

    Hope that helps.

    BazB.

      BazB,

      Thanks a lot, that did exactly what I was looking for.
      I guess I missed the need for the shift before the @ARGV.

      Thanks again

Re: New Perl User Question
by screamingeagle (Curate) on Feb 01, 2002 at 21:40 UTC
    You could also use the following modules to help with parsing command-line parameters:
    a) Getopt::Std
    b) Getopt::Long
    If you decide to extend your programs with additional command-line parameters, or need to ensure that the correct data types are being passed on the command line, these modules should come in handy.
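
A rough sketch of what that might look like with Getopt::Long. The option names --input and --clean-length are hypothetical, chosen only to mirror variables in the original script, and parse_args is a made-up helper. GetOptionsFromArray is used instead of GetOptions so the example parses a fixed list rather than the real @ARGV.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical option names; the default mirrors $correctcleanlength.
sub parse_args {
    my @args = @_;
    my %opt  = (clean_length => 13);
    GetOptionsFromArray(
        \@args,
        'input=s'        => \$opt{input},           # string value
        'clean-length=i' => \$opt{clean_length},    # must be an integer
    ) or return;    # parsing failed; Getopt::Long has already warned
    return \%opt;
}

my $opt = parse_args('--input', 'scan.txt', '--clean-length', '12');
print "input=$opt->{input} clean_length=$opt->{clean_length}\n";
```

The =i specifier is what gives the type check: a non-numeric value for --clean-length makes the parse fail rather than slipping through as a string.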
Re: New Perl User Question
by sparkyichi (Deacon) on Feb 01, 2002 at 21:44 UTC
    To pass a file from the command line use @ARGV instead of <STDIN>. So your code:

        my $processfilename='';
        print "\nEnter filename to process (type exit to quit): ";
        chomp ($processfilename = <STDIN>);

    Could be:

        my $processfilename=$ARGV[0];


    Sparky
    FMTEYEWTK
Re: New Perl User Question
by CharlesClarkson (Curate) on Feb 02, 2002 at 13:10 UTC

    I know this code is clunky, and probably a lot longer than it needs to be, suggestions on shortening it would also be appreciated.

    Careful what you wish for:

    Take a look at perlstyle. It is included with the standard perl distribution. One style rule mentions the use of the underscore _ to separate the words in variable names. This makes reading variables faster and easier, especially for non-native speakers of English. I also prefer to avoid mixed-case variable and subroutine names to avoid mistyping.

    Other style rules mentioned in perlstyle include: always use spaces around operators, use 4-column indentation, and add a space after commas. These rules are not set in concrete. The best thing is to find your style, compare it with that of others, and then be consistent.

    Keeping this in mind, we can apply BazB's advice:

        my $process_file_name = shift @ARGV;

      Charles,

      Thanks for the warning on being careful what I wish.

      Actually what you showed me was exactly what I was looking for.
      My programming background is in C++, and a bit of VB, and VBA.
      I'm just starting out in perl, and wasn't aware of functions like substr.

      That was exactly the kind of explanation I was looking for.

      Thanks again.
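
Charles's substr example is not reproduced above, so the following is only a guess at the general idea, not his code. It sketches how substr could stand in for the chop/reverse/chop/reverse sequence in the original script. The helper names strip_ends and keep_last, and the sample line, are invented for illustration.

```perl
use strict;
use warnings;

# Drop the first and last character in one step.
sub strip_ends {
    my ($data) = @_;
    return substr($data, 1, -1);
}

# Keep only the last $n characters (the whole string if it is shorter).
sub keep_last {
    my ($data, $n) = @_;
    return length($data) > $n ? substr($data, -$n) : $data;
}

# Made-up sample: start character (16), 13 digits, a stop character, newline.
my $line = chr(16) . '1234567890123' . 'X' . "\n";
chomp $line;
print strip_ends($line), "\n";
```

A negative length in substr means "stop that many characters before the end", and a negative offset counts back from the end of the string.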

Re: New Perl User Question
by talexb (Chancellor) on Feb 02, 2002 at 04:34 UTC
    I'd like to suggest formatting more like this for your code.

    Code formatting is a religious topic, so let me just walk around the topic gingerly by suggesting that you want to make the code as readable as possible so that it's easy to maintain, easy to document, and easy to debug. Never assume that once you write a piece of code you're never going to have to deal with it again. Unless it's a one-liner, you're probably going to have to go back to it.

Re: New Perl User Question
by edebill (Scribe) on Feb 02, 2002 at 18:24 UTC

    People seem to have missed the rather idiomatic:

        while (my $data = <>) {
            # process the line of data
        }

    <> in a while loop like this is a special case: it automatically opens each file listed on the command line in turn, or reads from standard input if none are listed. You should be able to save $processfilename = $ARGV[0]; beforehand for use in making your $cleanfilename. Using this construct saves you from needing to open and close the input file manually.
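
A minimal sketch of that idea, under a few assumptions: process_files is a hypothetical helper, the loop body just copies each line through as a stand-in for the real cleaning logic, and the sub localizes @ARGV so it can be called with an explicit list of files.

```perl
use strict;
use warnings;

# Hypothetical helper: copy every line of the named files to "<first name>.clean".
sub process_files {
    my @files = @_;
    return 0 unless @files;            # this sketch never falls back to STDIN
    local @ARGV = @files;              # <> reads the files named in @ARGV
    my $clean_name = "$ARGV[0].clean"; # save the name before <> shifts it off
    open(my $out, '>', $clean_name)
        or die "cannot open $clean_name: $!";
    my $written = 0;
    while (my $data = <>) {
        print {$out} $data;            # real per-line processing would go here
        $written++;
    }
    close($out) or die "cannot close $clean_name: $!";
    return $written;
}

if (@ARGV) {
    print process_files(@ARGV), " lines written\n";
}
```

The name has to be saved first because <> removes each filename from @ARGV as it opens the file.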

    Using my where you first use your variables is more readable than "declaring" them beforehand. This is one of the shortcomings of C that every language since seems to have gone out of its way to overcome.

    A little indentation would also make your code a little easier to parse visually :-)

    Oh, and you might not want to unconditionally decrement $datalengthtrack after a chomp(). Chomp doesn't always remove characters, so depending on the input dataset, you might get errors. It DOES however return the number of chars removed, so you can capture that info and use it ($datalengthtrack -= chomp($data); or the like)
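
To illustrate the chomp point, here is a small sketch (chomped_length is a made-up helper) that adjusts the tracked length by whatever chomp actually removed, so a final line with no trailing newline is not miscounted.

```perl
use strict;
use warnings;

# Return the line without its newline, plus its length after chomp.
sub chomped_length {
    my ($data) = @_;
    my $length = length($data);
    $length -= chomp($data);    # chomp returns how many characters it removed
    return ($data, $length);
}

# One line with a newline and one without: both end up with length 5.
for my $raw ("12345\n", "12345") {
    my ($data, $length) = chomped_length($raw);
    print "$data has length $length\n";
}
```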

Re: New Perl User Question
by grinder (Bishop) on Feb 02, 2002 at 21:26 UTC
    My one word of advice would be to
    Ditch those comments.

    Seriously. They distract from understanding the code. They will slowly drift out of sync. If you need to comment what purpose a variable serves, you have named it poorly.

    --
    g r i n d e r
    print@_{sort keys %_},$/if%_=split//,'= & *a?b:e\f/h^h!j+n,o@o;r$s-t%t#u';
