PerlMonks
Re: Applying regex to each line in a record
by haukex (Archbishop) on Oct 25, 2020 at 10:45 UTC ( [id://11123145]=note )
To comment on your code specifically:
You're activating paragraph mode with $/ = "", meaning that for your sample data, each call to <$fh> in scalar context (e.g. my $para = <$fh>) will return one paragraph (e.g. "first:\nthis:that\nhere:there..."). However, in your while, you're calling <$fh> in list context (because you're assigning to an array), which will cause it to return all records, i.e. all paragraphs. This means that the second time the while tries to execute, it won't get anything from $fh, so the while loop will only execute once, and that makes the while loop kind of useless in this code. You can see this yourself by adding a print Dumper(\@records); at the top of the while (I'd also strongly recommend setting $Data::Dumper::Useqq=1;).

Next, based on your variable naming and code, I guess that what you are expecting is that foreach my $line (@records) will loop over the lines in each paragraph. However, Perl doesn't do this automatically: with this code, you'd have to split each element of @records manually. What you're doing instead is looping over the paragraphs.

Here is the code I think you were trying to write:
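A minimal sketch of that idea, not necessarily the code originally posted here: it reads one paragraph per iteration in scalar context and splits each paragraph into lines by hand. The sample data, the in-memory filehandle, the @found array, and the key:value regex are all assumptions made for illustration.

```perl
use warnings;
use strict;
use Data::Dumper;
$Data::Dumper::Useqq = 1;    # show newlines as \n in the dumps

# Hypothetical sample data, loosely modeled on the example above.
my $data = "first:\nthis:that\nhere:there\n\nsecond:\nfoo:bar\n";
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

my @found;
{
    local $/ = "";    # paragraph mode, limited to this block
    # Scalar context: each call to <$fh> returns exactly one paragraph.
    while ( my $para = <$fh> ) {
        print Dumper($para);
        # Perl does not split a paragraph into lines for us.
        for my $line ( split /\n/, $para ) {
            # Hypothetical regex: "key:value" with a non-empty value.
            if ( $line =~ /^(\w+):(\w+)$/ ) {
                push @found, "$1=$2";
                print "key=$1 value=$2\n";
            }
        }
    }
}
close $fh;
```

Using local $/ inside a block keeps paragraph mode from leaking into other code that reads files.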
As you can see, the problem actually occurs before your code even gets to the regex. The above approach is ok, as long as the paragraphs don't get too large to fit comfortably into RAM. Otherwise, you'd have to choose a more memory-efficient approach, like reading the file line by line and recognizing paragraphs with a state-machine-type approach. The other monks have shown you several examples of different approaches.
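For illustration, a line-by-line state machine could look roughly like the following sketch. It assumes paragraphs are separated by blank lines; the sample data, the variable names ($in_para, $para_count, @found), and the regex are hypothetical. Only one line is held in memory at a time.

```perl
use warnings;
use strict;

# Hypothetical sample data with two paragraphs separated by blank lines.
my $data = "first:\nthis:that\nhere:there\n\n\nsecond:\nfoo:bar\n";
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

my $in_para    = 0;    # state: are we currently inside a paragraph?
my $para_count = 0;    # number of paragraphs seen so far
my @found;

while ( my $line = <$fh> ) {    # default $/ is "\n": one line per call
    chomp $line;
    # A blank line ends the current paragraph. Unlike paragraph mode,
    # this sketch also treats whitespace-only lines as blank.
    if ( $line =~ /^\s*$/ ) {
        $in_para = 0;
        next;
    }
    # The first non-blank line after a gap starts a new paragraph.
    if ( !$in_para ) {
        $in_para = 1;
        $para_count++;
    }
    # Hypothetical regex: "key:value" with a non-empty value.
    if ( $line =~ /^(\w+):(\w+)$/ ) {
        push @found, "$para_count:$1=$2";
    }
}
close $fh;

print "$_\n" for @found;
```

The trade-off is a little more bookkeeping in exchange for memory use that does not depend on paragraph size.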