Regex not working

imrags has asked for the wisdom of the Perl Monks concerning the following question:

Re: Regex not working by davorg (Chancellor) on Jul 17, 2009 at 09:31 UTC
Your regex is failing to take account of the newline characters. However, parsing HTML with a regex is a bad idea. You should look at using a real HTML parser. -- See the Copyright notice on my home node. Perl training courses
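A minimal sketch of the newline problem described above. The original input is not shown in the thread, so the `$content` string here is a hypothetical stand-in with embedded newlines, and the variable names are illustrative only. It shows that `.` stops at `\n` unless the `/s` modifier is given:

```perl
use strict;
use warnings;

# Hypothetical input: the real page content is not shown in the thread.
my $content = "<div class='roundedBoxBody'><p>\n<table>sample table</table>\n<p> </p></p>";

my $start = qr{<div class='roundedBoxBody'><p>};
my $end   = qr{<p> </p></p>};

# Without /s, "." does not match "\n", so (.*) cannot span lines.
my ($without_s) = $content =~ m{$start(.*)$end};

# With /s, "." matches any character, including "\n".
my ($with_s) = $content =~ m{$start(.*)$end}s;

print defined $without_s ? "no /s: matched\n"   : "no /s: no match\n";
print defined $with_s    ? "with /s: matched\n" : "with /s: no match\n";
```

This only illustrates why the match failed; it does not make regex-based HTML extraction robust.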
Re: Regex not working by prasadbabu (Prior) on Jul 17, 2009 at 09:40 UTC
Hi Raghu, As davorg suggested, it is always better to use an HTML parser for this kind of task. You missed the /s modifier, which lets . match newlines. Also use qr to quote regular expressions instead of manually backslashing everything. `use strict; use warnings; my $content = "<div class='roundedBoxBody'><p> <table>sample table</table> <p> </p></p>"; my $x = qr{<div class='roundedBoxBody'><p>}; my $y = qr{<p> </p></p>}; my ($content_out) = $content =~ m|$x(.*)$y|s; print $content_out if defined $content_out;` Prasad
Re^2: Regex not working by davorg (Chancellor) on Jul 17, 2009 at 09:44 UTC
"Also use qr to quote regular expressions instead of manually backslashing everything." That's actually not necessary here. The OP was just escaping things that didn't need escaping. `$x = "<div class='roundedBoxBody'><p>"; $y = '<p> </p></p>';` works just as well.
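A short sketch of the point about escaping, using the sample string from the reply above. The `\Q...\E` variant is an addition not mentioned in the thread; it is one way to escape interpolated text when a delimiter might contain real metacharacters:

```perl
use strict;
use warnings;

my $content = "<div class='roundedBoxBody'><p> <table>sample table</table> <p> </p></p>";

# None of < > ' = / is a regex metacharacter, so plain strings work here.
my $x = "<div class='roundedBoxBody'><p>";
my $y = '<p> </p></p>';

my ($plain) = $content =~ /$x(.*)$y/s;

# \Q...\E escapes any metacharacters in the interpolated text; useful when
# a delimiter might contain characters such as . * ? ( ) [ ] or |.
my ($quoted) = $content =~ /\Q$x\E(.*)\Q$y\E/s;

print "plain:  [$plain]\n";
print "quoted: [$quoted]\n";
```

Both patterns capture the same text for this input; they would differ only if the delimiters contained characters with special meaning in a regex.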
Re: Regex not working by Anonymous Monk on Jul 17, 2009 at 09:36 UTC
Try this: `$content =~ /$x(.*)$y/s;` And follow the above recommendation.
Re: Regex not working by imrags (Monk) on Jul 17, 2009 at 10:10 UTC
Thank you everyone, the /s was the problem: I had left it out, and that prevented the regex from working. Also, I am planning to use HTML::TreeBuilder to get the table. `<table border='1' width='50%' align='center'><tr><td><strong>Customer</td><td><strong>Total Samples</td><td><strong>SL Violations</td><td><strong>Avg Availability</td></tr><tr><td> All Customers</td><td>187556</td><td>2167</td><td>98.84</td> </td></tr></table> <br><p><strong><h2><center>Customers Below 90% Available</center></h2></p><table border='1' width='50%' align='center'><tr><td><strong>Customer</td><td><strong>Total Samples</td><td><strong>SL Violations</td><td><strong> Availability</td></tr> <tr><td>10P</td><td>1064</td><td>130</td><td>87.78 %</td></tr> <tr><td>B8S</td><td>326</td><td>34</td><td>89.57 %</td></tr> </tr></table>` I'm trying to get individual values from the table and then convert them to PDF. Would HTML::TreeBuilder be a good choice to fetch the data? Raghu
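A rough sketch of what fetching the cell values with HTML::TreeBuilder could look like. The HTML below is a simplified, well-formed stand-in for the table in the post (an assumption; the posted markup has unclosed tags and may parse differently), and it assumes the HTML-Tree distribution is installed:

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Simplified, well-formed stand-in for the table in the post (assumption).
my $html = <<'HTML';
<table border='1'>
<tr><td><strong>Customer</strong></td><td><strong>Total Samples</strong></td></tr>
<tr><td>10P</td><td>1064</td></tr>
<tr><td>B8S</td><td>326</td></tr>
</table>
HTML

my $tree = HTML::TreeBuilder->new_from_content($html);

my @rows;
for my $tr ($tree->look_down(_tag => 'tr')) {
    # Collect the trimmed text of every cell in this row.
    push @rows, [ map { $_->as_trimmed_text } $tr->look_down(_tag => 'td') ];
}

$tree->delete;    # free the tree explicitly

print join(' | ', @$_), "\n" for @rows;
```

Each element of `@rows` is an array reference holding one row's cell text, which could then be handed to whatever PDF step follows.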
Re^2: Regex not working by grinder (Bishop) on Jul 17, 2009 at 12:03 UTC
"I'm trying to get individual values from the table and then convert to pdf... Would HTML::TreeBuilder be a good choice to fetch data?" I've minimal experience with it, mainly because each time I pick it up, I find the interface cumbersome and unwieldy. It's also pretty slow, relatively speaking, although I don't consider that an important point. I find HTML::Parser much easier to use (although you have to invest some time in learning how to use it). If you install it via a package, do yourself a favour and track down the examples directory that is bundled with the distribution. You will probably find an example that you can adapt to the problem at hand. It's a complex tool that's worth mastering if you have to grovel around in HTML files. • another intruder with the mooring in the heart of the Perl
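For comparison, a minimal sketch of the event-driven style HTML::Parser uses, assuming the version 3 API and a simplified table built from values in the post above. The handler logic and variable names are illustrative, not taken from the distribution's examples directory:

```perl
use strict;
use warnings;
use HTML::Parser;

# Simplified input using values from the posted table (assumption).
my $html = "<table><tr><td>10P</td><td>1064</td></tr>"
         . "<tr><td>B8S</td><td>326</td></tr></table>";

my @rows;
my $in_cell = 0;

my $parser = HTML::Parser->new(
    api_version => 3,
    start_h => [ sub {
        my ($tag) = @_;
        push @rows, [] if $tag eq 'tr';          # start a new row
        if ($tag eq 'td') {
            push @rows, [] unless @rows;         # tolerate a <td> outside <tr>
            push @{ $rows[-1] }, '';             # start a new, empty cell
            $in_cell = 1;
        }
    }, 'tagname' ],
    text_h => [ sub {
        my ($text) = @_;
        # Text may arrive in several chunks, so append rather than assign.
        $rows[-1][-1] .= $text if $in_cell;
    }, 'dtext' ],
    end_h => [ sub {
        my ($tag) = @_;
        $in_cell = 0 if $tag eq 'td';
    }, 'tagname' ],
);

$parser->parse($html);
$parser->eof;

print join(' | ', @$_), "\n" for @rows;
```

The handlers carry a little state (`$in_cell`) between events, which is the part that takes some getting used to.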
Re^3: Regex not working by prantikd (Novice) on Jul 18, 2009 at 12:20 UTC
Hi Raghu, Since you are evaluating different Perl modules to parse HTML files, you can take a look at HTML::TokeParser. It is an alternative interface to HTML::Parser. I have used it and found it pretty helpful. - Prantik
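A small sketch of the pull-style interface HTML::TokeParser offers, assuming a single simplified row built from values in the table posted earlier. It walks from one `<td>` to the next and reads the text up to each closing tag:

```perl
use strict;
use warnings;
use HTML::TokeParser;

# One simplified row using values from the posted table (assumption).
my $html = "<table><tr><td>10P</td><td>1064</td><td>130</td><td>87.78 %</td></tr></table>";

my $p = HTML::TokeParser->new(\$html)
    or die "Cannot create parser";

my @cells;
# get_tag('td') skips ahead to the next <td> start tag; it returns undef at the end.
while ($p->get_tag('td')) {
    # Read the text up to the closing </td>, with whitespace trimmed.
    push @cells, $p->get_trimmed_text('/td');
}

print join(', ', @cells), "\n";
```

Because the caller drives the loop, there are no callbacks or state flags to manage, which may suit a small table-scraping job.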

