I think I recognise some of this ;-) First, your if(){}elsif(){}else{} is redundant. The 'assignment, next if match' construction does exactly the same thing.
Second, let's rethink the data structure. Using arrays means that you have to hard-code labels, keep track of positions, and generally keep a lot of state in your program. I think this screams for hashes.
I'll take the code in order:
#!/usr/bin/perl -Tw
use strict;
use Text::ParseWords;
my $fname = "CommaSample.dat";
my $pretty = 1;
my %major_PER_Data = ();
my %major_EMP_Data = ();
my %major_ADR_Data = ();
{
    my ($toggle, @data) = (0);
    open FH, "< $fname" or die "Cannot open datfile: ", $!;
    while (<FH>) {
        chomp;
        @data = quotewords('\s+', 0, $_);
        # a tag line sets the record type for the data line that follows it
        $toggle = 1, next if $data[0]=~/PER/;
        $toggle = 2, next if $data[0]=~/EMP/;
        $toggle = 3, next if $data[0]=~/ADR/;
        last if $data[0]=~/EOS/;
        die "Unknown or missing record tag: ",
            "Got $_ on line $., ",
            "datafile $fname.$/"
            if !$toggle;
        chomp;
        @data = quotewords('\s+', 0, $_);
        # $data[0] is the entity; the remaining fields are stored by name
        if ($toggle == 1) {
            $major_PER_Data{$data[0]} = {
                Name => [$data[1]],
                Color => [split /\s*,\s*/, $data[2]],
                Date => [$data[3]]};
        }
        elsif ($toggle == 2) {
            $major_EMP_Data{$data[0]} = {
                Company => [$data[1]],
                Position => [$data[2]],
                Date => [$data[3]],
                City => [$data[4]],
                State => [$data[5]],
                SecretCode => [$data[6]],
                RoundNumber => [$data[7]],
                Bugeyes => [$data[8]]};
        }
        elsif ($toggle == 3) {
            $major_ADR_Data{$data[0]} = {
                Street => [$data[1]],
                City => [$data[2]],
                State => [$data[3]],
                Zip => [$data[4]],
                Mileage => [$data[5]],
                MoreCode => [$data[6]]};
        }
        # reset so the next line must be a record tag again
        $toggle = 0;
    }
}
print "Personal Data...$/";
for my $entity ( sort keys %major_PER_Data) {
    # only getting these from one hash, using for all three
    print $entity,$/;
    for my $item (qw(Name Color Date)) {
        print map {"$item: $_$/"} @{$major_PER_Data{$entity}->{$item}};
    }
    # ... and so on for %major_others
}
I kept your organization as much as possible, but it would be better to key first on "Entity" within a single hash, with each entity referencing a hash with keys 'PER', 'ADR', and 'EMP'. I'm assuming Entity is a unique identifier.
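Here's a rough sketch of what I mean by that single-hash layout. It's untested against your real data; the entity name 'Smith' and all the field values are made up for illustration, and it only works if Entity really is unique.

```perl
use strict;
use warnings;

# Hypothetical sample data: one top-level hash keyed on Entity,
# each entity holding one sub-hash per record type.
my %data = (
    Smith => {
        PER => { Name => ['John Smith'], Color => ['red', 'blue'] },
        EMP => { Company => ['Acme'], Position => ['Clerk'] },
        ADR => { Street => ['1 Main St'], City => ['Springfield'] },
    },
);

for my $entity (sort keys %data) {
    print $entity, $/;
    for my $type (qw(PER EMP ADR)) {
        # skip record types this entity does not have
        my $record = $data{$entity}{$type} or next;
        for my $item (sort keys %$record) {
            print "  $type $item: $_$/" for @{ $record->{$item} };
        }
    }
}
```

Everything known about one entity now lives under one key, so a single loop can print all three record types instead of three near-identical loops.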
There are lots of ways to improve this. I'd try factoring out the data parsing to subroutines, keeping arrays of each line's field names, and making use of slices.
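To make that a bit more concrete, here is a minimal sketch of the subroutine-plus-slice idea. The helper name parse_record and the sample line are my own inventions, and it assumes whitespace-separated fields with the entity first, as in the code above. For brevity it stores plain scalars rather than one-element arrays.

```perl
use strict;
use warnings;
use Text::ParseWords;

# Field names per record type, in file order. The first field of
# every data line is assumed to be the entity key.
my %fields = (
    PER => [qw(Name Color Date)],
    EMP => [qw(Company Position Date City State SecretCode RoundNumber Bugeyes)],
    ADR => [qw(Street City State Zip Mileage MoreCode)],
);

# parse_record is a hypothetical helper: it splits one data line and
# returns the entity key plus a hash ref filled in with a hash slice.
sub parse_record {
    my ($type, $line) = @_;
    my ($entity, @values) = quotewords('\s+', 0, $line);
    my %record;
    @record{ @{ $fields{$type} } } = @values;   # hash slice: names => values
    return ($entity, \%record);
}

# Made-up sample line, for illustration only
my ($entity, $rec) = parse_record('ADR', 'Smith "1 Main St" Springfield IL 62701 12 xyz');
print "$entity: $rec->{Street}, $rec->{City}$/";
```

Adding a record type or a field then means editing %fields only; the parsing code doesn't change. A multi-valued field like Color would still need its own split.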
After Compline, Zaxo