Re: Removing text between HTML tags

in reply to Removing text between HTML tags

The substitution below works for the sample provided, but it is the wrong way to do it. I've assumed that this is very simple HTML with nothing that would break a very simple-minded substitution (e.g., what would happen with a button whose alternative text is "Next >"?). There is a famous response to this on another site, but the Perl-specific answer is to use an HTML parsing module such as HTML::TokeParser::Simple, which helpfully has extracting the content from an HTML file as its first example.

s/<[^>]+>//g;
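
To make the "Next >" caveat concrete, here is a minimal sketch in core Perl. The markup strings are made-up samples (not taken from the original question), and strip_tags_naive is just a hypothetical wrapper name around the same substitution.

```perl
use strict;
use warnings;

# Hypothetical helper: wraps the one-line substitution so it can be reused.
sub strip_tags_naive {
    my ($html) = @_;
    $html =~ s/<[^>]+>//g;    # delete anything that looks like <...>
    return $html;
}

# Simple markup: the substitution does what was asked.
print strip_tags_naive('<p>Hello <b>world</b></p>'), "\n";

# A ">" inside a quoted attribute value ends the match early,
# so the remainder of the tag leaks into the output.
print strip_tags_naive('<input type="image" alt="Next >" src="next.png">Done'), "\n";
```

The first call prints the bare text; the second leaves the tail of the tag (everything after the ">" in the alt text) in the result, which is exactly the kind of breakage a regex cannot see coming.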
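
For comparison, a rough sketch of the parser-based approach, modelled on the module's own first example. It assumes HTML::TokeParser::Simple is installed from CPAN (it is not a core module); strip_tags_parsed and the sample markup are hypothetical names and data chosen for illustration.

```perl
use strict;
use warnings;
use HTML::TokeParser::Simple;    # CPAN module, not part of core Perl

# Hypothetical helper: returns only the text tokens of an HTML string.
sub strip_tags_parsed {
    my ($html) = @_;
    my $parser = HTML::TokeParser::Simple->new( string => $html );
    my $text   = '';
    while ( my $token = $parser->get_token ) {
        next unless $token->is_text;    # skip tags, comments, declarations
        $text .= $token->as_is;         # keep the text as it appeared in the source
    }
    return $text;
}

# Made-up sample with a ">" inside a quoted attribute value.
print strip_tags_parsed('<input type="image" alt="Next >" src="next.png">Done'), "\n";
```

Because the parser understands quoted attribute values, the ">" in the alt text should not cut the tag short. Note that as_is returns the raw text, so entities such as &amp;amp; are left undecoded; decoding them would be a separate step.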

print "Good ",qw(night morning afternoon evening)[(localtime)[2]/6]," fellow monks."

Re^2: Removing text between HTML tags by perll (Novice) on Sep 14, 2014 at 15:03 UTC
Thanks, I know about HTML::TokeParser::Simple, but I am working on my office laptop and the firewall blocks CPAN :( It will take time for me to get that module. Anyway, it is a known set of HTML and will be the same for all pages. Thank you.
Re^3: Removing text between HTML tags by Utilitarian (Vicar) on Sep 14, 2014 at 15:48 UTC
Is MetaCPAN blocked?
Re^3: Removing text between HTML tags by sundialsvc4 (Abbot) on Sep 15, 2014 at 00:23 UTC
“Without any further ado,” talk with your boss and ask him or her to arrange for you to have access to CPAN. (You can, if necessary, install all of the modules that you need locally to just your own account and machine, so there are no system-integrity risks.) There is zero doubt in my mind that there is no other business-justifiable way to get this job done. (And there undoubtedly will be more business cases like this one. You must have the Right Tools For The Job.)
Re^4: Removing text between HTML tags by perll (Novice) on Sep 23, 2014 at 09:34 UTC
Thanks for the reply. My company sucks: we have access only to the intranet, and to access the internet we have a separate bay. This is a new project still in POC, and I am working off the record to show the director I can do something :) I got HTML::TokeParser and HTML::TreeBuilder and am planning to rewrite the code.
