Hi guys. As you can see from my 2 points of experience, I am a rookie in your world, even though I must have been working with Perl for about 3 years now. Anyway, down to business.
I'm currently writing a content management system. No problem there, as JavaScript and HTMLPad are easy. The problem is that I would like people to put in a URL and then have the software pull that page (easy) and sort out all the links for images, background tags, etc.
I have already handled the style sheet by parsing for it, then collecting it with the UA and inserting it into the page contents with the correct tags. My problems start when someone puts in a URL like http://www.xxxxxx.co.uk/newsletter.htm
For a start, the page contains links like src="http://www.xxxxxx.com", which is really the same site under a different domain name, or the page doesn't contain http://www.xxxxx.co.uk at all, just ../../blah.gif or /blah/blah/funny.gif
Here is what I have already. As I know your wisdom is better in regexps than mine, I'm hoping you can help or point me in the direction of a Perl module. As I'm a rookie, I'm sure you will find loads wrong with the code from the word go, but here is a snippet:
#Pull the page and sort it out.. then display edit window
use LWP::UserAgent;
my $ua = LWP::UserAgent->new();
$ua->agent("");
my $content = $ua->get($fields{'url'})->content();
$fields{'url'} =~ s/http:\/\/(.*?)\/.*/$1/ig;
$content =~ s/src="/src="http:\/\/$fields{'url'}\//sig;
#Handle Styles
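# Capture in list context so a failed match leaves $styleurl undefined
# instead of picking up a stale $1 from an earlier regexp
my ($styleurl) = $content =~ m/<link href="(.*?)"/i;
my $styles = '';
if (defined $styleurl and $styleurl ne '') {
    # $fields{'url'} is only the host name by now, so put the scheme back
    $styles = $ua->get('http://'.$fields{'url'}.'/'.$styleurl)->content();
}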
$styles = '<style type="text/css"><!--'.$styles.'--></style>';
$content =~ s/<\/head>/<\/head>$styles/sig;
$tpl_inner = &gettpl($skindir,'pointblank_templateadd2.htm');
$tpl_inner =~ s/<!-- Content -->/$content/ig;
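One direction that might help with the ../../blah.gif and /blah/blah/funny.gif cases is the URI module, which is installed alongside LWP. Its new_abs constructor resolves a link against a base URL and leaves links that are already absolute alone. The following is only a rough sketch under assumptions: absolutize_links is a made-up helper name, the example.co.uk and example.com addresses are placeholders, and the regexp only copes with quoted src, href and background attributes.

```perl
use strict;
use warnings;
use URI;

# Hypothetical helper: make every quoted src, href and background value
# absolute by resolving it against the URL the page was fetched from.
sub absolutize_links {
    my ($html, $base) = @_;
    $html =~ s{\b(src|href|background)\s*=\s*(["'])(.*?)\2}{
        # Copy the captures before calling into URI, which runs its own regexps
        my ($attr, $quote, $link) = ($1, $2, $3);
        $attr . '=' . $quote . URI->new_abs($link, $base)->as_string . $quote;
    }gise;
    return $html;
}

# Placeholder page URL and markup, for illustration only
my $base = 'http://www.example.co.uk/news/2004/newsletter.htm';
my $html = join '',
    '<body background="/blah/blah/funny.gif">',
    '<img src="../../blah.gif">',
    '<img src="http://www.example.com/logo.gif">',
    '</body>';

print absolutize_links($html, $base), "\n";
```

The base has to be the full page URL, so it would need to be saved before $fields{'url'} is cut down to the host name (HTTP::Response also offers a base method that could serve here). Links that already carry their own domain, such as the .com ones, pass through unchanged, which may be enough for the "same site, different domain" case. A regexp will still miss unquoted attributes and the like, so a parser-based module such as HTML::LinkExtor is probably the sturdier route.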