
Re: (jcwren) Re: Text::Balanced woes..

by u914 (Pilgrim)
on May 27, 2002 at 03:05 UTC (#169469)

in reply to (jcwren) Re: Text::Balanced woes..
in thread Text::Balanced woes..

I see...
Thank you very much, Chris. I never would have thought to try a test excluding everything before the tag.

You're right that Text::Balanced isn't very useful like this. I can't imagine that the author meant it to be this way.

You can see what I'm trying to do (well, actually I'll be parsing <a href> links out in an effort to combat chatroom spambots). Is there another method you'd suggest?

It was looking at the docs that convinced me that Text::Balanced was the right thing for me. I'd be using the 5th (#4) element. I haven't found another lib that'll supply just the stripped URL inside the tag yet, and while Perl seems super-cool for text handling (I'm a duffer), I'd rather not reinvent the wheel.

In any case, thanks very much for your reply!
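
For readers following along, here is a minimal sketch of the behaviour discussed above, assuming the documented list-context return of extract_tagged (extracted text, remainder, prefix, opening tag, text between the tags, closing tag). The helper name first_tag_body and the sample line are made up for illustration. Note that element #4 is the text between the tags, not the value of the href attribute.

```perl
use strict;
use warnings;
use Text::Balanced qw(extract_tagged);

# Hypothetical helper: text between the first tag pair in $text, or undef.
sub first_tag_body {
    my ($text) = @_;
    # Passing '[^<]*' as the prefix pattern skips everything before the first '<'.
    my @fields = extract_tagged($text, undef, undef, '[^<]*');
    return defined $fields[0] ? $fields[4] : undef;
}

# Made-up sample input with text before the tag.
my $line = 'click <a href="http://example.com/">here</a> now';

# The default prefix is optional whitespace only, so leading text makes the match fail.
my @plain = extract_tagged($line);
print defined $plain[0] ? "matched\n" : "no match with the default prefix\n";

my $body = first_tag_body($line);
print "text between the tags: $body\n" if defined $body;
```

Supplying an explicit prefix pattern is one way around the "nothing before the tag" restriction, though it still returns the tag body and leaves the href itself unparsed.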

(jcwren) Re: Text::Balanced woes..
by jcwren (Prior) on May 27, 2002 at 03:12 UTC

    There are several packages based on HTML::Parser, such as HTML::LinkExtor, that shouldn't require you to invent too many wheels. I would take a look at that.

    I would avoid at all costs trying to extract links with a regular expression. That's just a path to problems.
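
A minimal sketch of the HTML::LinkExtor approach suggested above, assuming the module is installed from CPAN (it ships with the HTML-Parser distribution). The helper name extract_hrefs and the sample chat message are hypothetical. The callback receives the lowercased tag name followed by attribute/URL pairs.

```perl
use strict;
use warnings;
use HTML::LinkExtor;

# Hypothetical helper: return the href value of every <a> tag found in $html.
sub extract_hrefs {
    my ($html) = @_;
    my @hrefs;
    my $parser = HTML::LinkExtor->new(sub {
        my ($tag, %attr) = @_;
        # Keep only anchor links; img src, form action, etc. are ignored here.
        push @hrefs, $attr{href} if $tag eq 'a' && defined $attr{href};
    });
    $parser->parse($html);
    $parser->eof;
    return @hrefs;
}

# Made-up chat message with differently quoted and cased anchors.
my $message = q{cheap stuff <a href="http://spam.example/buy">here</a>}
            . q{ and <A HREF='http://spam.example/more'>there</A>};

print "$_\n" for extract_hrefs($message);
```

Because the parser handles quoting and tag case itself, the spam check only has to compare the returned URLs against whatever blocklist is in use.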


