And reducing the comments some:
# file name for corrected data
my $clean_file_name = "$processfilename.clean";
# numeric value of the "usual" line starting character
my $correct_start_char = 16;
# "usual" starting length of lines starting with $correct_start_char
my $correct_start_length = 16;
# correct length of lines after they have been stripped
my $correct_clean_length = 13;
# length of lines that do not include the extra stop and start charact
+ers
my $typed_length = 14;
# Do not change these values, they are used to
# report the number of lines read, and written
my $raw_file_length = 0;
my $clean_file_length = 0;
You might take alook at perlsub. It is also included with the
standard perl distribution and gives some excellent reasons for not
prefixing subroutine calls with the ampersand &. One more
rule you might adopt is to avoid using global values in subroutines. When
possible attempt to pass all the needed variables to a sub and return
expected values from the sub. This makes the sub more portable and the
program easier to maintain.
With that in mind, we could change:
&ProcessFile;
to:
my($raw_file_length, $clean_file_length) =
process_file(
$process_file_name, $clean_file_name, $correct_start_char,
$correct_start_length, $correct_clean_length, $typed_length);
All this might be shortened, without a loss of clarity with an array:
my @args = (
$ARGV[0], # file to process
"$ARGV[0].clean", # processed file
16, # numeric value of the "usual"
# line starting character
16, # "usual" starting length of lines
# starting with $correct_start_char
13, # correct length of lines after they
# have been stripped
14 # length of lines that do not include
# the extra stop and start characters
);
my($raw_file_length, $clean_file_length) = process_file( @args );
NOTE: We could pass @args with a reference, but I'm not willing
to add too many new concepts at one time. perlref and perlsub have
more information.
Before we examine the subroutine, let's add the end of the program:
print "$raw_file_length lines read from $args[0]\n";
print "$clean_file_length lines written to $args[1]\n";
print "\a";
exit(0);
For the subroutine, we'll examine the logic of the if/else
and we'll use the substr function in place of the
chomp/chop method you used.
chomp $data;
chop $data;
$data = reverse ($data);
chop $data;
$data = reverse ($data);
Could be replaced with:
$data = substr($data, 1, -2);
And:
chomp $data;
$datalengthtrack--;
chop $data;
$datalengthtrack--;
$data = reverse ($data);
while ($datalengthtrack > $correctcleanlength){
chop $data;
$datalengthtrack--;
}
$data = reverse ($data);
print CLEANFILE "$data\n";
Could be replaced with:
print CLEANFILE substr( substr($data, 0, -2), -$correct_clean_length), "\n";
The
if/else block might be re-written as:
if ( $start_char == $correct_start_char ) {
next if $data_length != $correct_start_length;
if ( $data_length == $correct_start_length ) {
print CLEANFILE substr($data, 1, -2), "\n";
$clean_file_length++;
}
} else {
if ( $data_length > $correct_clean_length ) {
print CLEANFILE substr( substr($data, 0, -2), -$correct_clean_
+length), "\n";
$clean_file_length++;
} elsif ( $data_length == $typed_length ) {
print CLEANFILE $data;
$clean_file_length++;
}
}
We'll have to retrieve those variables we passed to the sub:
my ( $file_name, $clean_file_name, $correct_start_char,
$correct_start_length, $correct_clean_length, $typed_length) =
+ @_;
And I took the liberty of shortening the file handle names:
open RAW, $file_name or die "Cannot open $file_name: $!";
open CLEAN, ">$clean_file_name" or die "Cannot open $clean_file_name:
+$!";
And since we were actually keeping track of lines not file lengths, I used:
my($raw_lines, $clean_lines);
in the sub, instead of:
my($raw_file_length, $clean_file_length)
Putting it all together:
#!/usr/bin/perl
####################################################
# InvClean.pl : Raw Scanned File Processing program
# Version 1.0
# Written by Robb Pickinpaugh
# 01/31/2002
# for use on Windows NT
####################################################
use strict;
use warnings;
my @args = (
$ARGV[0], # file to process
"$ARGV[0].clean", # processed file
16, # numeric value of the "usual"
# line starting character
16, # "usual" starting length of lines
# starting with $correct_start_char
13, # correct length of lines after they
# have been stripped
14 # length of lines that do not include
# the extra stop and start characters
);
my($raw_file_length, $clean_file_length) = process_file( @args );
print "$raw_file_length lines read from $args[0]\n";
print "$clean_file_length lines written to $args[1]\n";
print "\a";
exit(0);
sub process_file {
my (
$file_name, $clean_file_name, $correct_start_char,
$correct_start_length, $correct_clean_length, $typed_length) =
+ @_;
open RAW, $file_name or die "Cannot open $file_name: $!
+";
open CLEAN, ">$clean_file_name" or die "Cannot open $clean_file_na
+me: $!";
my ($raw_lines, $clean_lines);
while ( my $line = <RAW> ) {
my $line_length = length $line;
my $start_char = ord $line;
if ( $start_char == $correct_start_char ) {
next if $line_length != $correct_start_length;
if ( $line_length == $correct_start_length ) {
print CLEAN substr($line, 1, -2), "\n";
$clean_lines++;
}
} else {
if ( $line_length > $correct_clean_length ) {
print CLEAN substr( substr($line, 0, -2), -$correct_cl
+ean_length), "\n";
$clean_lines++;
} elsif ( $line_length == $typed_length ) {
print CLEAN $line;
$clean_lines++;
}
}
$raw_lines = $.;
}
close RAW or die "cannot close $file_name: $!";
close CLEAN or die "cannot close $clean_file_name: $!";
return ($raw_lines, $clean_lines);
}
__END__
HTH,
Charles K. Clarkson
HTH,
Charles K. Clarkson
Clarkson Energy Homes, Inc.