PerlMonks  

(need feedback) Re: HTML::LinkExtractor

by PodMaster (Abbot)
on Aug 24, 2002 at 14:56 UTC ( [id://192544]=note )


in reply to HTML::LinkExtractor

I thought about adding the following:

=head2 SNIPPET

You've just gotten a link with C<_TEXT> but you don't want the HTML crap
that comes with the text. While C<HTML::LinkExtractor> won't get rid of it
for you, it's easier than easy with C<HTML::TokeParser::Simple>

    use HTML::TokeParser::Simple;

    my $Link = { '_TEXT' => '<a href="http://perl.com/"> I am a LINK!!! </a>'};

    warn StripHTML( \$Link->{_TEXT} );
    warn StripHTML( \'<q>Turn on your love light BABY!</q>' );

    sub StripHTML {
        my $HtmlRef = shift;
        my $tp = new HTML::TokeParser::Simple( $HtmlRef );
        my $t = $tp->get_token(); # MUST BE A START TAG (@TAGS_IN_NEED)
                                  # otherwise it ain't come from LinkExtractor
        if($t->is_start_tag) {
            return $tp->get_trimmed_text( '/'.$t->return_tag );
        } else {
            die " IMPOSSIBLE!!!! ";
        }
    }

=head1 AUTHOR

But then it hit me: why not just provide this as a package method?

Or provide an option to do this automatically?

Or use get_text instead of get_trimmed_text (maybe make this an option as well)?
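
To make the get_text vs. get_trimmed_text question concrete, here is a rough sketch of what a helper with a trim switch could look like. The name strip_html and the trim option are made up for illustration and are not part of HTML::LinkExtractor. It also assumes a reasonably recent HTML::TokeParser::Simple, where the token accessor is get_tag (the snippet above uses the older return_tag).

```perl
use strict;
use warnings;
use HTML::TokeParser::Simple;

# Hypothetical helper (not an HTML::LinkExtractor method).
# Takes an HTML fragment that begins with a start tag and returns the
# text up to the matching end tag. Pass trim => 0 to keep whitespace.
sub strip_html {
    my ( $html, %opt ) = @_;
    my $trim = exists $opt{trim} ? $opt{trim} : 1;    # trim by default

    my $tp = HTML::TokeParser::Simple->new( \$html );
    my $t  = $tp->get_token;
    die "expected a start tag\n" unless $t && $t->is_start_tag;

    # Assumption: get_tag is available on tokens (newer releases).
    my $end = '/' . $t->get_tag;

    # Both methods are inherited from HTML::TokeParser; they differ
    # only in how whitespace is handled.
    return $trim
        ? $tp->get_trimmed_text($end)
        : $tp->get_text($end);
}

my $html = '<a href="http://perl.com/"> I am a LINK!!! </a>';
print '[', strip_html($html), "]\n";
print '[', strip_html( $html, trim => 0 ), "]\n";
```

With trimming on, the first line prints [I am a LINK!!!]; with trim => 0 the leading and trailing spaces inside the link survive, so the second prints [ I am a LINK!!! ]. Defaulting to the trimmed form keeps the behaviour of the snippet above.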

BTW, I'm gonna stick with HTML::TokeParser::Simple. Ovid doesn't need the publicity, but I like it. This'll be on CPAN before Monday.

Update: well, I made some changes and put it up on CPAN.

____________________________________________________
** The Third rule of perl club is a statement of fact: pod is sexy.
