mod_perl/CGI/MySQL programs are interesting applications to tune for performance. There
are so many moving parts that you need to take a more holistic approach.
There's generally a combination of several factors affecting performance. Warning: This will
be a long post (and possibly slightly OT), because I am fairly interested in this subject =)
Below is an outline of the areas I believe you need to consider when attempting to increase the performance of your perl database interface. Some may not be possible depending on
the resources/time you have available, but I wanted to outline everything I could think of
and let you choose what is possible:
Minimize Disk IO. Look at the physical hardware you are running the database
and/or webserver on. Generally, IO problems contribute the most to system slowdowns.
Look at purchasing a faster hard drive, more RAM if your machine is swapping out to the
hard drive, or - in the extreme - dedicating a server to house the database.
Look at the setup of your MySQL database. There is a great
FAQ about getting maximum performance out of MySQL.
Analyze your queries, figure out which take the most
time, then learn how to make indexes. Remember, don't just
set these and think you're done. I've been using MySQL full-time
for just over 2 years now, and I routinely make small tweaks to
indexes I thought were perfect a month earlier. Set a monthly or
bi-monthly schedule to check your indexes; it is worth it. This
is the most important of all the tips I outline in this
post. It's not uncommon for indexing to get you a 100% speed
increase from a few minutes of tweaking.
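
To make the index check concrete, here is a minimal sketch of asking MySQL for a query plan from perl. It assumes you already have a connected DBI handle; the helper name `explain_query`, the `orders` table and the `customer_id` column are made up for illustration.

```perl
use strict;
use warnings;

# Print MySQL's execution plan for a query and return the plan rows.
# $dbh is assumed to be a connected DBI handle, for example from
# DBI->connect("dbi:mysql:database=shop", $user, $pass, { RaiseError => 1 }).
sub explain_query {
    my ($dbh, $sql, @bind) = @_;

    # EXPLAIN returns one row per table in the query. Slice => {} asks
    # DBI to hand back each row as a hash reference keyed by column name.
    my $rows = $dbh->selectall_arrayref("EXPLAIN $sql", { Slice => {} }, @bind);

    for my $row (@$rows) {
        printf "table=%s type=%s key=%s rows=%s\n",
            map { defined $row->{$_} ? $row->{$_} : 'NULL' } qw(table type key rows);
    }
    return $rows;
}

# Hypothetical usage:
# explain_query($dbh, 'SELECT id, total FROM orders WHERE customer_id = ?', 42);
# $dbh->do('CREATE INDEX idx_orders_customer ON orders (customer_id)');
```

A plan row with `type=ALL` and `key=NULL` means a full table scan, which is the usual hint that an index is missing. Re-run the same EXPLAIN after adding the index to confirm it is actually being used.
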
Here's a neat one that I've read about but not used (yet): Heap
Tables. A heap table is an in-memory table, well suited to
validation-type data. You put any information in here that
is fairly small and constant, such as a list of valid countries.
When MySQL starts, it does so with the help of a start-up script.
You can change the start-up script so that it performs a series of
SQL commands after starting the server. This can be used
to automatically create a HEAP table and fill it with
data from other tables/outside sources. Queries that use the heap table
could be greatly sped up. Any time you need a lot of
speed from your "validation" tables, give this a shot.
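
As a rough sketch of the heap table idea, the statements below rebuild an in-memory lookup table from a disk-based one. The `valid_countries` and `countries` tables, their columns and both helper names are hypothetical. Older MySQL releases spell the table option `TYPE=HEAP`, newer ones `ENGINE=MEMORY`, so adjust for your version.

```perl
use strict;
use warnings;

# SQL to (re)build an in-memory lookup table from a disk-based one.
# Table and column names are hypothetical. Older MySQL releases use
# TYPE=HEAP in place of ENGINE=MEMORY.
sub heap_table_statements {
    return (
        'DROP TABLE IF EXISTS valid_countries',
        'CREATE TABLE valid_countries ('
          . ' code CHAR(2) NOT NULL PRIMARY KEY,'
          . ' name VARCHAR(64) NOT NULL'
          . ') ENGINE=MEMORY',
        'INSERT INTO valid_countries (code, name)'
          . ' SELECT code, name FROM countries',
    );
}

# Run the statements through a connected DBI handle.
sub load_heap_table {
    my ($dbh) = @_;
    $dbh->do($_) for heap_table_statements();
    return;
}
```

The contents of an in-memory table are lost when the server restarts, which is why the load has to be repeated at start-up. The same statements could go wherever your start-up process runs SQL; I believe the server's `init-file` option is one such place, but check the documentation for your version.
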
Analyze all the SQL queries not only with EXPLAIN, but also with
DBIx::Profile in your perl programs. It will give you a nice
breakdown of each SQL query that was run, and tell you which
SQL queries are taking the most time.
Never use SELECT * inside an SQL query. Only fetch
the columns & rows you need, nothing more. It's a huge waste of resources to
ask for information you throw out and never use.
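
For example, a page that only lists customer names needs two columns, not the whole row. This is a sketch with a made-up `customers` table and helper name; `$dbh` is assumed to be a connected DBI handle.

```perl
use strict;
use warnings;

# Fetch only the two columns the page displays, and only matching rows.
# The customers table and its columns are hypothetical.
sub customer_names {
    my ($dbh, $country) = @_;

    # Name the columns explicitly and let the WHERE clause do the
    # filtering, instead of SELECT * followed by filtering in perl.
    my $sth = $dbh->prepare(
        'SELECT id, name FROM customers WHERE country = ? ORDER BY name'
    );
    $sth->execute($country);

    my @customers;
    while (my ($id, $name) = $sth->fetchrow_array) {
        push @customers, { id => $id, name => $name };
    }
    return \@customers;
}
```

Naming the columns also protects the code from breaking quietly when someone later adds a large text or blob column to the table.
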
Web Server Setup. Look at Apache and see whether there is anything you
can do to speed it up. Try using Apache::DBI, which caches
the database handle. This means that when your script runs
it won't need to connect to the database, because the connection
is kept open for you. Also see if there are any modules
compiled into Apache that are not necessary to the functioning
of your website. Consider recompiling with just what you
need - go lean.
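
A typical way to switch on Apache::DBI is from the mod_perl startup file. This is only a sketch: the DSN, user name and password are placeholders, and it assumes your httpd.conf already loads the file with PerlRequire.

```perl
# startup.pl, loaded once by the Apache parent process via PerlRequire.
use strict;
use warnings;

# Apache::DBI must be loaded before DBI (and before any module that
# loads DBI) so that it can take over DBI->connect.
use Apache::DBI ();
use DBI ();

# Optional: open the connection when each child process starts rather
# than on its first request. DSN, user and password are placeholders.
Apache::DBI->connect_on_init(
    'dbi:mysql:database=shop;host=localhost',
    'webuser',
    'secret',
    { RaiseError => 1, AutoCommit => 1 },
);

1;
```

Your scripts keep calling `DBI->connect` as usual. As long as the arguments match, each Apache child reuses the handle it already has instead of opening a new connection.
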
There are also two FAQs on Apache and mod_perl performance that you should
look at.
Write the shortest, cleanest perl code you can to get the job
done. The shorter the code, the easier it is to optimize
and bug test, since you have less to keep in your head all
at once.
Use Devel::DProf and Apache::DProf to profile the actual
perl code to see where your program is spending the most time.
They give you a nice breakdown of each function in your program and
tell you how long each one took. Without doing this, you'll
just be guessing. I once read somewhere that programmers
spend most of their time optimizing the wrong section of
code. Don't fall into this trap; learn to profile.
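
If you can't run a real profiler in some environment, a hand-rolled timer gives a cruder version of the same breakdown. To be clear, this is not Devel::DProf, just a sketch using the core Time::HiRes module; the `timed` and `report` helpers and the labels are made up.

```perl
use strict;
use warnings;
use Time::HiRes qw(gettimeofday tv_interval);

my %elapsed;   # label => total wall-clock seconds
my %calls;     # label => number of calls

# Run a code block and add its wall-clock time to the running totals.
sub timed {
    my ($label, $code) = @_;
    my $start  = [gettimeofday];
    my @result = $code->();
    $elapsed{$label} += tv_interval($start);
    $calls{$label}++;
    return wantarray ? @result : $result[0];
}

# Print the labels, slowest first.
sub report {
    for my $label (sort { $elapsed{$b} <=> $elapsed{$a} } keys %elapsed) {
        printf "%-12s %3d calls %8.4fs\n",
            $label, $calls{$label}, $elapsed{$label};
    }
}

# Hypothetical usage: wrap the suspected hot spots, then call report().
my $rows = timed(query  => sub { [ map { $_ * 2 } 1 .. 1000 ] });
my $html = timed(render => sub { join ',', @$rows });
report();
```

This only measures the sections you thought to wrap, which is exactly the guessing a profiler saves you from, so treat it as a fallback.
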
Now, having said all that, do not be afraid to use a templating
system. Yes, you will get a slight slowdown when compared to
embedding the HTML right in the perl code, but the benefits are
numerous. You get cleaner code, a separation of presentation from
logic, and more avenues for extra optimizations (which I'll get into
next).
One of my personal favorites is HTML::Template, which
allows a complete separation of logic and design. You can't embed
perl code inside it; instead, there is a mini templating language you
use. There's something about mixing two different syntaxes together
(SQL and perl, HTML and perl, etc) that confuses me, which is why I like
HTML::Template. Combine this with HTML::Pager for easy paging
of database results, and it makes hard things simple.
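
Here is a small sketch of what that separation looks like with HTML::Template. The template text, the parameter names and the customer data are all invented for the example, and the template is kept in a string only so the example fits in one file.

```perl
use strict;
use warnings;
use HTML::Template;

# The template holds all of the markup. Normally it lives in its own
# file and is loaded with filename => '...' instead of scalarref.
my $template_text = <<'TMPL';
<h1><TMPL_VAR NAME="title"></h1>
<ul>
<TMPL_LOOP NAME="customers">  <li><TMPL_VAR NAME="name"></li>
</TMPL_LOOP></ul>
TMPL

my $template = HTML::Template->new(scalarref => \$template_text);

# The perl side only supplies data (hard-coded here; in practice it
# would come from a database query).
$template->param(
    title     => 'Customers',
    customers => [ { name => 'Ann' }, { name => 'Bob' } ],
);

my $page = $template->output;
print $page;
```

The designer can rework the markup without touching the perl, and the loop data is just an array of hashes, which is close to what a database fetch gives you anyway.
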
So, your database and web server are running at peak performance. Your
perl code is optimized and profiled. Want more speed? Now you need to
start looking at what happens after the information leaves your server.
Do some remote load testing. There are many great services that can do
this for you, some even free or offering free trials. A quick search on
Google turns up hundreds of related sites; two key players in this area
are Keynote and Service Metrics. I've
used both of these, with good results.
With this information you can pinpoint where the speed problem
lies. They can tell you if it's your server or your hosting company.
You can use this as ammo when negotiating for faster/better
service. Also, consider getting an SLA (Service Level Agreement)
from your hosting company to guarantee the speed of your pipe.
Don't forget the browser! The browser is an often forgotten
piece of the puzzle. You want to make sure that the browser
can download and render the HTML in the fastest possible
time. How do you do this? A simple answer is to make sure
your HTML is XHTML 1.0 compliant, a super clean version
of HTML. Now that your HTML is completely inside HTML
template files, it will be relatively painless to process them
through HTML-Tidy.
HTML-Tidy is a utility that takes any sort of HTML and outputs
cleaned up XHTML compliant code. The theory is that if the browser doesn't have to
"guess" at the sizes of images or close "p" tags itself, it can
allocate more resources to parsing the HTML, and drawing the screen faster.
Look at Apache::GZIP. I've had no experience with
using this module, but I hear it can increase download speeds for
HTML (even dynamically generated HTML); images such as GIF and JPEG are already compressed, so expect little gain there. Please be aware that it could cause performance
issues on your web server, since it needs to do a lot of extra
work compressing things on the fly. It's up to you to decide if
the extra speed for your users is worth the trade-off.
Try using HTML::Clean to filter out any extras, such as
excess whitespace. You can sometimes compress your output
by a further 10-20% using this module. Use this module with
caution; it is quite aggressive with its cleaning. I would suggest
a lower level of optimization rather than full, as it's been known
to play havoc with JavaScript.
Whew! Sorry for the length of this post. Once I started writing I couldn't
stop. Perhaps I should put this into a tutorial.
Anyone have other performance improvement suggestions?