svsingh has asked for the wisdom of the Perl Monks concerning the following question:
I'm trying to pull the title and h1 out of a local HTML file. I figured this is too simple to need any of the HTML parsing modules, so I'm using a simple match. The HTML file is guaranteed to have only one h1.
Here's what I'd like to do ...
$/ = '</h1>';
my $chunk = <HTMFILE>;
$chunk =~ m%<title>(.+)</title>.*<h1>(.+)</h1>%i;
... which returns a pair of undefs. If I split the match over a couple of lines, however, everything works out just fine. Here's what's working:
$/ = '</h1>';
my $chunk = <HTMFILE>;
$chunk =~ m%<title>(.+)</title>%i;
my $title = $1;
$chunk =~ m%<h1>(.+)</h1>%i;
my $heading = $1;
The best explanation I can think of is that .* only matches up to a certain number of characters. My test file has 3750 characters between </title> and <h1>. Is that what's happening here?
Thanks for your help.
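A likely explanation, offered as a sketch and not as a confirmed diagnosis of the poster's file: in Perl, `.` does not match a newline unless the `/s` modifier is given, so `.*` cannot cross the line breaks that presumably sit between `</title>` and `<h1>`. The two-step version works because each match stays on a single line. The sample below builds a hypothetical `$chunk` (the `<meta>` filler is invented for illustration and is well over 3750 characters) and runs the original pattern with and without `/s`.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the poster's file: title and heading are on
# different lines, separated by a long run of invented <meta> tags.
my $filler = join '', map { qq{<meta name="k$_" content="v$_">\n} } 1 .. 200;
my $chunk  = "<html><head>\n<title>My Page</title>\n$filler</head>\n<body>\n<h1>Welcome</h1>";

# Original pattern: without /s, "." never matches "\n", so ".*" cannot
# span the line breaks and the match fails (empty list in list context).
my @plain = $chunk =~ m%<title>(.+)</title>.*<h1>(.+)</h1>%i;

# Same pattern with /s, so "." also matches "\n". The non-greedy
# quantifiers stop each piece at the first closing tag that fits.
my ($title, $heading) = $chunk =~ m%<title>(.+?)</title>.*?<h1>(.+?)</h1>%is;

print 'without /s: ', (@plain ? 'matched' : 'no match'), "\n";
print "with /s: title='$title' heading='$heading'\n";
```

Capturing from the match in list context, as above, also avoids reading `$1` and `$2` after a match that may have failed.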
Replies are listed 'Best First'.
Re: Is there a Limit on Matching .*
by sauoq (Abbot) on Jul 15, 2003 at 00:01 UTC
by BUU (Prior) on Jul 15, 2003 at 03:43 UTC
by sauoq (Abbot) on Jul 15, 2003 at 06:37 UTC
by svsingh (Priest) on Jul 15, 2003 at 14:43 UTC
Re: Is there a Limit on Matching .*
by Elian (Parson) on Jul 15, 2003 at 03:38 UTC
Re: Is there a Limit on Matching .*
by nysus (Parson) on Jul 15, 2003 at 00:12 UTC
Re: Is there a Limit on Matching .*
by LazerRed (Pilgrim) on Jul 15, 2003 at 00:11 UTC
Re: Is there a Limit on Matching .*
by graff (Chancellor) on Jul 15, 2003 at 04:57 UTC
(jeffa) Re: Is there a Limit on Matching .*
by jeffa (Bishop) on Jul 15, 2003 at 15:09 UTC