You can try a negative lookbehind assertion (in perlman:perlre), but it has to be a fixed length pattern, and so won't work if there's spaces around the '=' (so you can first fix that): s/href\s*=\s*/href=/ig;
s,(?<!(?i:href=))((?:http|ftp|news)://\S*),<a href="$1">$1</a>,g;
You might try to think of something besides '\S*' also, in case
the user puts in a period after the href, or puts it between
parenthesis or ???.
Update: Actually, if the user DID already make an href, they may or may not have put quotes around the '=' in the url, so maybe the easiest thing to do is rewrite all urls:# Now its starting to get ugly!
$str =~ s#((?i:<a\s+href\s*=\s*"?)?(?:http|ftp|news)://[^"]+)#substr($
+1,0,1) eq '<' ? $1 : qq!<a href="$1"
>$1</a>!#eg;
I think that's right now (and you still may want to replace that '[^"]' with something else) :-)
Actually, even that may not work if there's other attributes besides 'href' in the tag, but I'm
done. The rest is up to you :-) | [reply] [Watch: Dir/Any] [d/l] [select] |
You should maybe look into URI::Heuristic for completing your links. Something like this:
#!/usr/bin/perl
use strict;
use URI::Heuristic qw(uf_uristr);
my $url = uf_uristr("www.perlmonks.com");
print "<a href=$url>$url</a>\n";
If the user input is www.perlmonks.com or http://www.perlmonks.com it will give you the proper URI. In this case with html code:
<a href=http://www.perlmonks.com>http://www.perlmonks.com</a>
using www.perlmonks.com as input.
It also does ftp and others.
ok..it's a bit off what you really asked for but i just noted this module at CPAN and liked it ;o)
Update:
#!/usr/bin/perl
use URI::Heuristic qw(uf_uristr);
use strict;
my $url;
my $input = "<a href = http://www.perlmonks.com>perlmonks</a>";
#my $input = "www.perlmonks.com";
if ($input =~ /^</) {
$input =~ /<a\s+href\s*=\s*(.*)>.*<\/a>/i;
$url = uf_uristr("$1");
} else {
$url = uf_uristr("$input");
}
print "<a href=$url>$url</a>";
This works just fine, also with resolving the url to an uri.
You may also drag out the text by enclosing the last .* and drag out the value of $2.
The regex should work on it's own without using the module also, me thinks. | [reply] [Watch: Dir/Any] [d/l] [select] |
How about just adding (?:<a href = \")? and (?:<\/a>)? at the begining and end (respectively) of your regexp?
That's the best suggestion I can give with my Perl knowledge.
FouRPlaY
Learning Perl or Going To Die Trying
| [reply] [Watch: Dir/Any] [d/l] [select] |
my $newcontent = _translateLinks2($initcontent);
#
sub _translateLinks2 {
my $content = shift;
my @all = split(/ /,$content);
my $skip = 0;
my @trans;
for my $a (@all) {
# print "{ $a }\n";
if ($a =~ m#<a#i) {
$skip = 1;
}
# print "skip : $skip\n";
if ($skip == 0 && $a =~ m#http#i) {
if ($a =~ m#^(\n|\r)#) {
$a =~ s/\n|\r//gis;
$a = qq(\n<a href="$a">$a</a>);
}
elsif ($a =~ m#(\n|\r)$#) {
$a =~ s/\n|\r//gis;
$a = qq(<a href="$a">$a</a>\n);
}
else {
$a = qq(<a href="$a">$a</a>);
}
}
elsif ($skip == 0 && $a =~ m#\w+\.com|\w+\.net|\w+\.org|\w+\.gov#i
+) {
if ($a =~ m#^(\n|\r)#) {
$a =~ s/\n|\r//gis;
$a = qq(\n<a href="http://$a">http://$a</a>);
}
elsif ($a =~ m#(\n|\r)$#) {
$a =~ s/\n|\r//gis;
$a = qq(<a href="http://$a">http://$a</a>\n);
}
else {
$a = qq(<a href="http://$a">http://$a</a>);
}
}
# print "[[ $a ]]\n";
if ($a =~ m#<\/a>#i) {
$skip = 0;
}
push (@trans,$a);
}
my $translated = join (' ',@trans);
return $translated;
}
| [reply] [Watch: Dir/Any] [d/l] |