http://qs321.pair.com?node_id=1161484


in reply to REGEX for url

It looks like you're just trying to extract the values of href= attributes from anchor tags (i.e. the "..." from <a href="...">) in HTML data.

I'm surprised that no one has yet mentioned that there are CPAN modules for doing exactly that, e.g. HTML::LinkExtor, among others. (I haven't had occasion to use them myself, but to do what you're doing, I'd start with one of those.)
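
For what it's worth, here is a minimal sketch of how that might look with HTML::LinkExtor, assuming the HTML is already in a string. The sample markup and the variable names ($html, @hrefs) are made up for illustration. The module reports link attributes for several kinds of tags (img src and so on), so the callback keeps only href values from <a> tags.

```perl
use strict;
use warnings;
use HTML::LinkExtor;

# Hypothetical sample input; in practice this would be the fetched page.
my $html = <<'HTML';
<p><a href="http://example.com/one">One</a>
<img src="pic.png">
<a class="x" href="/two">Two</a></p>
HTML

my @hrefs;

# The callback receives the lowercased tag name followed by
# attribute => value pairs for that tag's link attributes.
my $parser = HTML::LinkExtor->new(sub {
    my ($tag, %attr) = @_;
    push @hrefs, $attr{href} if $tag eq 'a' && defined $attr{href};
});

$parser->parse($html);
$parser->eof;    # signal end of input so any buffered text is processed

print "$_\n" for @hrefs;
```

No base URL is passed to new() here, so relative links such as /two come back unchanged; passing a base as the second argument to new() should make the module return absolute URLs instead.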

Re^2: REGEX for url
by wrkrbeee (Scribe) on Apr 25, 2016 at 21:46 UTC
    You are exactly right, extract data between anchor tags. I will try the CPAN module you mentioned. Thank you!!
      Having looked a little more at the CPAN search results, I find it odd that the man page for HTML::LinkExtor appears to be shorter and simpler than the one for HTML::SimpleLinkExtor -- I'm not sure what "Simple" is supposed to refer to in the latter module.