PerlMonks  

Speed consideration

by SpaceAce (Beadle)
on Mar 04, 2003 at 01:07 UTC ( [id://240206]=perlquestion )

SpaceAce has asked for the wisdom of the Perl Monks concerning the following question:

I am writing a Perl program that will use SSI to insert affiliate code into web pages. Here are two examples of the source code for a web page using this program:

... surrounding HTML ...
<A HREF='http://affiliateprogram.com?id=<!--#include virtual="/cgi-bin/affiliate.pl?option=50"-->'>Link</A>
... surrounding HTML ...

... surrounding HTML ...
<A HREF='http://affiliateprogram.com?id=<!--#include virtual="/cgi-bin/affiliate.pl?option=50&mainafil=mainaffiliateid"-->'>Link</A>
... surrounding HTML ...

People will link to the page like this:
http://www.mysite.com/page.html?affiliateid. The script affiliate.pl will use the QUERY_STRING and (QUERY_STRING_UNESCAPED || DOCUMENT_ARGS) to determine whether to return the affiliate ID from the DOCUMENT_URI or the one embedded in the "include virtual" tag.
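
A minimal sketch of that decision, for illustration only. The helper name choose_affiliate_id, the precedence rule (an ID on the page URL wins over the mainafil parameter), and the pattern for a well-formed ID are all assumptions; the real rules may differ.

```perl
use strict;
use warnings;

# Hypothetical helper: pick the affiliate ID to emit for one SSI include.
# Takes a hash ref shaped like %ENV so it can be tried outside a web server.
sub choose_affiliate_id {
    my ($env) = @_;

    # Query string of the parent page, e.g. "affiliateid" in page.html?affiliateid.
    my $page_args = $env->{QUERY_STRING_UNESCAPED} || $env->{DOCUMENT_ARGS} || '';

    # Assumed rule: a well-formed ID on the page URL wins.
    return $page_args if $page_args =~ /\A[A-Za-z0-9_-]+\z/;

    # Otherwise fall back to the mainafil parameter of the include's own query string.
    my $include_args = $env->{QUERY_STRING} || '';
    for my $pair (split /[&;]/, $include_args) {
        my ($name, $value) = split /=/, $pair, 2;
        return $value
            if defined $value
            && $name eq 'mainafil'
            && $value =~ /\A[A-Za-z0-9_-]+\z/;
    }

    return '';    # nothing usable was supplied
}

# Sample environment, as it might look for the second include above.
my %sample = (
    QUERY_STRING           => 'option=50&mainafil=mainaffiliateid',
    QUERY_STRING_UNESCAPED => 'visitor42',
);
print choose_affiliate_id(\%sample), "\n";    # prints "visitor42"
```

In the real script the function would be called with \%ENV and its result printed after a Content-type header.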

My question is this: given that the SSI directives could appear quite a few times in a single document, which is better under heavy traffic? One option is a single call that processes the entire HTML page, using regular expressions to insert the affiliate code in all the right places. The other is to let the simple insertion script run multiple times per page load. The advantage of a single call is that Perl only has to run once, but the script would be much more complex, since it would process an entire web page that could be anywhere from a few K to 40K+. With the simple SSI insertion, the script is executed again and again, but all it does is return a string.

I appreciate any input.

Thanks,
SpaceAce
s>>sp>;s>..|>\u$&ace>g;print;

Replies are listed 'Best First'.
Re: Speed consideration
by dws (Chancellor) on Mar 04, 2003 at 03:12 UTC
    Given that the SSI directives could appear quite a few times in a single document, under heavy traffic loads would it be better to execute a single call that would process an entire HTML page using regular expressions to insert the affiliate code in all the right places or to go ahead and let the simple insertion script be run multiple times per page load?

    There's a third option that's close in spirit to your first choice: instead of using regular expressions to rewrite the HTML, use a templating mechanism (e.g., HTML::Template).

    Your second scheme starts a process per SSI insertion. Under load, this'll kill you.

    Don't be concerned that the page you're building is 40K or so. That's peanuts compared with the overhead of launching several processes.
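
To make the one-process-per-page idea concrete, here is a rough sketch that fills every placeholder in a page in a single pass. It is a hand-rolled stand-in that runs without any CPAN modules, not HTML::Template's API; the render_page helper and the [% name %] placeholder syntax are invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: replace every [% name %] placeholder in one pass.
# Unknown placeholder names are replaced with an empty string.
sub render_page {
    my ($template, $vars) = @_;
    (my $html = $template) =~ s{\[%\s*(\w+)\s*%\]}{
        exists $vars->{$1} ? $vars->{$1} : ''
    }ge;
    return $html;
}

# A page fragment with the same placeholder used more than once.
my $template = <<'HTML';
<A HREF="http://affiliateprogram.com?id=[% affiliate_id %]">Link</A>
<A HREF="http://affiliateprogram.com?id=[% affiliate_id %]">Another link</A>
HTML

# One call fills in every link, however many the page contains.
print render_page($template, { affiliate_id => 'visitor42' });
```

The sketch assumes the ID has already been validated. A real templating module adds loops, conditionals, and escaping on top of this basic substitution.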

Re: Speed consideration
by perrin (Chancellor) on Mar 04, 2003 at 05:13 UTC
    mod_include is not as fast as you might think. Under mod_perl, the Apache::SSI module, which is 100% Perl, runs faster. Chances are, the all-Perl option will be faster.
Re: Speed consideration
by koolade (Pilgrim) on Mar 04, 2003 at 03:57 UTC
    I would also recommend something like HTML::Template (or Template Toolkit) if you want to do some complex calculations/checking. But some of this can be done in SSI pretty simply. e.g.:
    <A HREF='http://affiliateprogram.com?id=<!--#if expr="$QUERY_STRING" --><!--#echo var="QUERY_STRING" --><!--#else -->50<!--#endif -->'>Link</A>
    You can also use SSI to check to make sure that the query string matches a specific regular expression, etc. See http://httpd.apache.org/docs/mod/mod_include.html for more info.
