in reply to Re^4: Module for parsing tables from plain text document
in thread Module for parsing tables from plain text document
Let me throw in some assumptions here (having to deal with quite a few other text-based formats at work):
- You can skip the headers, because they are standardized across files
- Everything before the first number (or minus sign) is the location name
- Data columns always contain a value
- The only column that can contain spaces is the location name
That means we can just collapse runs of spaces. The location name needs special handling, but after that we can use split to recover the columns:
#!/usr/bin/env perl
use strict;
use warnings;
use Data::Dumper;
use Carp;

my @sites;

open(my $ifh, '<', 'eclipse.txt') or croak($!);

# Skip header
for(1..5) {
    my $tmp = <$ifh>;
}

while((my $line = <$ifh>)) {
    chomp $line;
    next if($line eq ''); # Ignore empty lines

    my %entry;
    $line =~ s/\ +/ /g; # Collapse spaces
    if($line =~ /^(.*?)\s[-\d]/) {
        $entry{location} = $1;

        # Remove location name
        $line =~ s/^.*?\s([-\d])/$1/;

        # Split along spaces
        my @parts = split/\ /, $line;
        foreach my $name (qw[long1 long2 lat1 lat2 elevation h m s PA Alt]) {
            $entry{$name} = shift @parts;
        }
        push @sites, \%entry;
    }
}
close $ifh;

print Dumper(\@sites);
That results in an array of hashes:
$VAR1 = [
    {
        's' => '59',
        'elevation' => '0',
        'long2' => '45.',
        'lat2' => '55.',
        'lat1' => '-36',
        'location' => 'Auckland',
        'm' => '33',
        'h' => '4',
        'long1' => '174',
        'PA' => '313',
        'Alt' => '13'
    },
    {
        'h' => '4',
        'm' => '40',
        'PA' => '326',
        'Alt' => '11',
        'long1' => '173',
        'lat2' => '35.',
        'long2' => '55.',
        's' => '34',
        'elevation' => '30',
        'location' => 'Blenheim',
        'lat1' => '-41'
    },
    {
        'h' => '4',
        'm' => '42',
        'PA' => '327',
        'Alt' => '9',
        'long1' => '175',
        'lat2' => '35.',
        'long2' => '25.',
        's' => '28',
        'elevation' => '0',
        'location' => 'Cape Palliser',
        'lat1' => '-41'
    },
    ...
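If the assumptions above turn out not to hold for every file, it may help to check them per line instead of trusting them silently. The following is only a sketch: parse_line() is a hypothetical helper that is not part of the script above, the two sample lines are rebuilt from the Dumper output with made-up spacing, and the regex is slightly stricter than the original (it wants a digit right after an optional minus sign). It returns undef when a line does not yield exactly ten data columns.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Column names as used in the script above
my @COLUMNS = qw[long1 long2 lat1 lat2 elevation h m s PA Alt];

# parse_line() is a hypothetical helper, not part of the original script.
# Returns a hash reference, or undef if the line does not fit the assumptions.
sub parse_line {
    my ($line) = @_;

    # Location name: everything before the first whitespace-separated
    # token that starts with a digit or a minus sign followed by a digit
    return unless $line =~ /^\s*(.*?)\s+(-?\d.*)$/;
    my ($location, $rest) = ($1, $2);

    # split ' ' splits on runs of whitespace, so no separate collapse step
    my @parts = split ' ', $rest;

    # Assumption check: every data column must hold a value
    return unless @parts == @COLUMNS;

    my %entry = (location => $location);
    @entry{@COLUMNS} = @parts;    # hash slice assigns all columns at once
    return \%entry;
}

# Sample lines rebuilt from the Dumper output; the spacing is made up
for my $line (
    'Auckland         174 45.   -36 55.     0   4 33 59   313  13',
    'Cape Palliser    175 25.   -41 35.     0   4 42 28   327   9',
) {
    my $entry = parse_line($line);
    print "$entry->{location}: lat $entry->{lat1} $entry->{lat2}, Alt $entry->{Alt}\n";
}
```

In the main loop this could replace the body of the if block, with a warn for every line where parse_line() returns undef, so that a file breaking one of the assumptions shows up instead of producing shifted columns.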