Here's another approach. It doesn't extract all the fields in one swell foop (the fields matched by the pattern I'm calling $real have to be pulled out in a separate step), but one can imagine that some degree of customization is possible for different types of records:
c:\@Work\Perl\monks>perl -wMstrict -le
"my @lines = (
'C31 6 3 2.4 1.5 2.6 ',
'C32 2 7 3 1.0 ',
'H31 1 1 0 21.0 11.2 5.3 1.4',
'T11 2 1 0 6.0 1.1 2.2',
'L06 1 1 0 1.0 3.3',
'L99 1 1 0 1.1 2.2 3.3 4.4 5.5',
);
;;
my $int = qr{ (?<! \d) \d+ (?! \d) }xms;
my $real = qr{ $int [.] $int }xms;
my $header = qr{ [[:upper:]] \d\d }xms;
;;
my $n = 4;
my $extract = qr{
($header) \s+ ($int) \s+ ($int) \s+ ($int) ((?: \s+ $real){1,$n}) \s*
}xms;
;;
for my $line (@lines) {
printf q{'%s' -> }, $line;
my $got = my ($h, $d1, $d2, $d3, $r) = $line =~ m{ \A $extract \z }xms;
;;
if ($got) {
my @reals = $r =~ m{ $real }xmsg;
print qq{'$h' '$d1' '$d2' '$d3' (@reals)};
}
else {
print 'unknown';
}
}
"
'C31 6 3 2.4 1.5 2.6 ' -> unknown
'C32 2 7 3 1.0 ' -> 'C32' '2' '7' '3' (1.0)
'H31 1 1 0 21.0 11.2 5.3 1.4' -> 'H31' '1' '1' '0' (21.0 11.2 5.3 1.4)
'T11 2 1 0 6.0 1.1 2.2' -> 'T11' '2' '1' '0' (6.0 1.1 2.2)
'L06 1 1 0 1.0 3.3' -> 'L06' '1' '1' '0' (1.0 3.3)
'L99 1 1 0 1.1 2.2 3.3 4.4 5.5' -> unknown
Update: Tested under Perl versions 5.14.4 and 5.8.9.
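As for the customization hinted at above, here is a rough sketch of one way it might go, written as a script file rather than a one-liner. It assumes, purely for illustration, that the first letter of the header identifies the record type and that each type allows a different maximum number of reals. The %max_reals table, its limits, and the parse_record name are all made up for the example and are not taken from any real record specification.

```perl
use strict;
use warnings;

my $int  = qr{ (?<! \d) \d+ (?! \d) }xms;
my $real = qr{ $int [.] $int }xms;

# Hypothetical limits: record-type letter => maximum number of reals.
my %max_reals = (C => 3, H => 4, T => 3, L => 2);

# Build one anchored extraction regex per record type.
my %extract_for;
for my $type (keys %max_reals) {
    my $n = $max_reals{$type};
    $extract_for{$type} = qr{
        \A ($type \d\d) \s+ ($int) \s+ ($int) \s+ ($int)
        ((?: \s+ $real){1,$n}) \s* \z
    }xms;
}

# Returns (header, int, int, int, reals...) or an empty list on failure.
sub parse_record {
    my ($line) = @_;
    my ($type) = $line =~ m{ \A ([[:upper:]]) }xms or return;
    my $rx = $extract_for{$type} or return;    # unknown record type
    my ($h, $d1, $d2, $d3, $r) = $line =~ $rx or return;
    my @reals = $r =~ m{ $real }xmsg;          # second step, as before
    return ($h, $d1, $d2, $d3, @reals);
}

for my $line ('H31 1 1 0 21.0 11.2 5.3 1.4', 'L06 1 1 0 1.0 2.0 3.0') {
    my @fields = parse_record($line);
    print "'$line' -> ", (@fields ? "(@fields)" : 'unknown'), "\n";
}
```

With these made-up limits the H record is accepted with its four reals, while the L record is rejected because it carries three reals against a limit of two. The reals still have to be split out in a second match, just as in the one-liner.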