There are a number of ways to do it with straight regular expressions, but the technology you are feeding this into (and thus the necessary input-output mapping) is unfamiliar to me. Parsing one of your lines may be as easy as /^\s*(\S+)\s+(\S+)\s*$/, though it may need the m modifier depending on context, i.e. when you match against one multi-line string rather than one line at a time. This particular expression will fail on every line of your sample other than the data lines, since all it does is (thanks to YAPE::Regex::Explain):
The regular expression:
(?m-isx:^\s*(\S+)\s+(\S+)\s*$)
matches as follows:
NODE                     EXPLANATION
----------------------------------------------------------------------
(?m-isx:                 group, but do not capture (with ^ and $
                         matching start and end of line) (case-
                         sensitive) (with . not matching \n)
                         (matching whitespace and # normally):
----------------------------------------------------------------------
  ^                        the beginning of a "line"
----------------------------------------------------------------------
  \s*                      whitespace (\n, \r, \t, \f, and " ") (0 or
                           more times (matching the most amount
                           possible))
----------------------------------------------------------------------
  (                        group and capture to \1:
----------------------------------------------------------------------
    \S+                      non-whitespace (all but \n, \r, \t, \f,
                             and " ") (1 or more times (matching the
                             most amount possible))
----------------------------------------------------------------------
  )                        end of \1
----------------------------------------------------------------------
  \s+                      whitespace (\n, \r, \t, \f, and " ") (1 or
                           more times (matching the most amount
                           possible))
----------------------------------------------------------------------
  (                        group and capture to \2:
----------------------------------------------------------------------
    \S+                      non-whitespace (all but \n, \r, \t, \f,
                             and " ") (1 or more times (matching the
                             most amount possible))
----------------------------------------------------------------------
  )                        end of \2
----------------------------------------------------------------------
  \s*                      whitespace (\n, \r, \t, \f, and " ") (0 or
                           more times (matching the most amount
                           possible))
----------------------------------------------------------------------
  $                        before an optional \n, and the end of a
                           "line"
----------------------------------------------------------------------
)                        end of grouping
----------------------------------------------------------------------
and can be shown to work on this example with
#!/usr/bin/perl -w
use strict;
use Data::Dumper;
$_ = <<'EOT';
Port 1 Database Assignments
Region Data Type # Records
GLOBAL --
LOCAL --
BUF --
D1 Unused
D2 Unused
D3 Unused
D4 Unused
D5 Unused
D6 Unused
D7 Unused
D8 Unused
A1 Unused
A2 Unused
A3 Unused
USER Unused
EOT
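my %hash;
# /m lets ^ and $ match at every line break; /g steps through each matching line.
while (/^\s*(\S+)\s+(\S+)\s*$/mg) {
    $hash{$1} = $2;    # Region => Data Type
}
print Dumper \%hash;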
However, it'll break pretty quickly if your input is not representative; e.g. if your Region or Data Type values contain white space (this looks fixed width to me) or if # Records is not empty.
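If the report really is fixed width, unpack may hold up better than the regex, since it cuts on column positions rather than on white space. What follows is only a sketch: the widths in $TEMPLATE (10 and 12 characters), the parse_row helper, and the sample rows (including the 'Bit Data' value and the record count of 128) are assumptions made up for illustration, so measure the real columns in your output before relying on it.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;

# ASSUMPTION: hypothetical column widths for Region, Data Type, # Records.
# Measure the real output and adjust before using this.
my $TEMPLATE = 'A10 A12 A*';

# Split one fixed-width line into (region, type, records).
# The 'A' format strips trailing spaces only, so embedded spaces survive.
sub parse_row {
    my ($line) = @_;
    return unpack $TEMPLATE, $line;
}

# ASSUMPTION: made-up rows, padded to match $TEMPLATE.
my @lines = (
    sprintf('%-10s%-12s%s', 'GLOBAL', '--',       ''),
    sprintf('%-10s%-12s%s', 'D1',     'Unused',   ''),
    sprintf('%-10s%-12s%s', 'USER',   'Bit Data', '128'),
);

my %hash;
for my $line (@lines) {
    my ($region, $type, $records) = parse_row($line);
    $hash{$region} = { type => $type, records => $records };
}
print Dumper \%hash;
```

Unlike the regex, this does not skip the title and header lines for you, and it does not strip leading white space, so you would still need to filter and trim those yourself.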
#11929 First ask yourself `How would I do this without a computer?' Then have the computer do it the same way.