replacing clickable web and email addresses

markolus has asked for the wisdom of the Perl Monks concerning the following question:

Hi there -- I am currently involved in trying to do two things:

1) take input text (passed to a script as a string parameter) that may contain email and web addresses, and replace those addresses with clickable alternatives.

i.e.
www.domain.com to be replaced/rewritten as <a href="/cgi-bin/script.pl?u=www.domain.com">www.domain.com</a> and mail@domain.com to be rewritten as <a href="mailto:mail@domain.com">mail@domain.com</a>

To do the above I am working with both Email::Find and URI::Find, although up to now I have been using the s!(www.[^\s]+)!<a href="/cgi-bin/script.pl?u=$1">$1</a>!gi piece of code for web addresses.
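
For reference, the rewrite described above can be sketched in a single pass with core Perl only. This is a plain regex sketch, not the Email::Find/URI::Find approach; the helper names escape_html and linkify are made up for illustration, and the address patterns are deliberately simplified. One pass is used so that the markup generated for one kind of address is never rescanned for the other. The dot after www is escaped here, and trailing punctuation is left outside the link.

```perl
use strict;
use warnings;

# Hypothetical helper: escape the characters that are special in HTML.
sub escape_html {
    my ($s) = @_;
    $s =~ s/&/&amp;/g;
    $s =~ s/</&lt;/g;
    $s =~ s/>/&gt;/g;
    $s =~ s/"/&quot;/g;
    return $s;
}

# Hypothetical helper: turn email and www addresses into links in one pass.
sub linkify {
    my ($text) = @_;
    $text =~ s{
        ( [\w.+-]+ \@ [\w-]+ (?: \. [\w-]+ )+ )      # group 1: email address (simplified)
      | ( \b www \. [^\s<>"]+ (?<![.,;:!?)]) )       # group 2: www address, no trailing punctuation
    }{
        my ($email, $web) = ($1, $2);
        defined $email
            ? sprintf('<a href="mailto:%s">%s</a>',
                      escape_html($email), escape_html($email))
            : sprintf('<a href="/cgi-bin/script.pl?u=%s">%s</a>',
                      escape_html($web), escape_html($web));
    }gexi;
    return $text;
}

print linkify('See www.domain.com, or write to mail@domain.com.'), "\n";
```

The u= value is only HTML-escaped, matching the link format in the question; a real script would probably also want to URI-escape it.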

----....
Now the bit that is causing a bit of a headache is doing it the other way round.

i.e. converting the clickable links shown above back to their original text. Do I need a weird and wonderful regex and substitution, or is there a module that will make my life easy? I need to be able to reproduce my own HTML link format, not just a simple target=_blank href.

Any ideas?


Replies are listed 'Best First'.
Re: replacing clickable web and email addresses by dreadpiratepeter (Priest) on Mar 22, 2002 at 14:16 UTC
Try HTML::Parser. You should be able to pull the anchors out of your document and replace them with whatever you'd like. -pete "I am Jack's utter lack of disbelief"
Re: replacing clickable web and email addresses by Mask (Pilgrim) on Mar 22, 2002 at 14:50 UTC
You could look at what the people behind the "Perl Text to HTML" project have done. There is also a Perl program called txt2html; you can probably draw inspiration from its source.
Re: replacing clickable web and email addresses by AidanLee (Chaplain) on Mar 22, 2002 at 14:17 UTC
Try HTML::Parser. Using regexes to parse HTML is a nightmare, as many will attest, and HTML::Parser is usually the de facto recommendation around here when you've got HTML to munge. HTH
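
To make the HTML::Parser suggestion concrete, here is a minimal sketch of the reverse direction. It assumes the links have exactly the two href forms from the question (mailto:... and /cgi-bin/script.pl?u=...); the function name unlinkify is made up for illustration. The anchors in that format are unwrapped back to their text, and everything else, including other links, is passed through unchanged.

```perl
use strict;
use warnings;
use HTML::Parser;

# Hypothetical helper: strip our own generated anchors, keep their text.
sub unlinkify {
    my ($html) = @_;
    my $out     = '';
    my $in_link = 0;    # true while inside one of our generated anchors

    my $parser = HTML::Parser->new(
        api_version => 3,
        start_h => [ sub {
            my ($tag, $attr, $text) = @_;
            my $href = $tag eq 'a' ? $attr->{href} : undef;
            if (defined $href
                && $href =~ m{^(?:mailto:|/cgi-bin/script\.pl\?u=)}) {
                $in_link = 1;       # drop the opening tag
            }
            else {
                $out .= $text;      # any other tag is copied verbatim
            }
        }, 'tagname, attr, text' ],
        end_h => [ sub {
            my ($tag, $text) = @_;
            if ($tag eq 'a' && $in_link) { $in_link = 0 }   # drop the closing tag
            else                         { $out .= $text }
        }, 'tagname, text' ],
        text_h => [ sub {
            my ($text, $dtext) = @_;
            # Inside our links, decode entities to get the address back.
            $out .= $in_link ? $dtext : $text;
        }, 'text, dtext' ],
        default_h => [ sub { $out .= shift }, 'text' ],   # comments, declarations, etc.
    );

    $parser->parse($html);
    $parser->eof;
    return $out;
}

print unlinkify('Mail <a href="mailto:mail@domain.com">mail@domain.com</a> now'), "\n";
```

Matching on the href prefix rather than on every anchor is a judgment call: it keeps unrelated links in the text intact, at the cost of having to keep the prefix list in step with whatever format the forward conversion produces.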
