hacker has asked for the wisdom of the Perl Monks concerning the following question:
I host a number of community projects, mailing lists, CVS repositories, and websites on two servers, one on each coast. I currently pay for all the bandwidth out of pocket, but the popularity of the sites I host is beginning to strain the budget. Recent log statistics indicate we're pushing about 10 GB/day outbound (no, it's not pr0n, filesharing, or mp3s ;).
I posted an article on Advogato about this, and someone suggested (among other things) using one of those "donation thermometers" to track progress and nudge users to donate. You've probably seen these on the lawns of high schools from time to time for various community drives. I think this would be a great idea.
I've got some very rudimentary log parsing code (below) that I hacked up just to see how much we're actually pushing from our servers, and was amazed at the results. I'll clean this up a bit soon, but this was a 2-minute hack just to get the values out of the logs:
use strict;            # of course
use File::Basename;    # limit the output names
use File::stat;        # file size/date
use File::Find;        # only in `pwd`
use Cwd;               # lint and untaint `pwd`

my ($root) = getcwd =~ /(.*)/;
my $total;

find(
    {   untaint_pattern => '.*',
        no_chdir        => 1,
        wanted          => sub {
            return unless /foo-.*\z/;
            my $v_snap_file = $File::Find::name;
            my $basefile    = basename($v_snap_file);
            my $count = `/bin/grep $basefile /var/log/squid/access.log | /usr/bin/wc -l`;
            $count =~ s/^\s+//g;

            my $v_sb       = stat("$v_snap_file");
            my $v_filesize = $v_sb->size;
            my $v_bprecise = sprintf "%.0f", ($v_filesize);
            my $v_bsize    = insert_commas($v_bprecise);
            my $v_kprecise = sprintf "%.0f", ($v_filesize/1024);
            my $v_ksize    = insert_commas($v_kprecise);
            my $v_filedate = scalar localtime $v_sb->mtime;
            my $basename_v = basename($v_snap_file);

            print "File Name...: $basename_v\n";
            print "File Size...: $v_bsize bytes ($v_ksize kb)\n";
            print "Downloads...: ", insert_commas($count);
            my $tbytes = $v_filesize * $count;
            print "Total bytes.: ", insert_commas($tbytes), "\n\n";
            $total += $tbytes;
        }
    },
    $root
);

print "Final total bytes: ", insert_commas($total), "\n\n";

sub insert_commas {
    my $text = reverse $_[0];
    $text =~ s/(\d{3})(?=\d)(?!\d*\.)/$1,/g;
    return scalar reverse $text;
}
This code works, and produces output like:
File Name...: foo_bin-1.2.tar.gz
File Size...: 1,200,837 bytes (1,173 kb)
Downloads...: 448
Total bytes.: 537,974,976

File Name...: foo-desktop-1.2.0.0.i386.rpm
File Size...: 2,022,261 bytes (1,975 kb)
Downloads...: 163,976
Total bytes.: 331,602,269,736

...

Final total bytes: 532,767,443,273
So far, so good. I can get the sizes and total size (not the most efficient way, considering the size of the logs, but it'll do in a cronjob).
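Since the script above runs grep over the whole log once per file, one possible refinement is to read the log a single time and count every file name in that pass. The following is only a sketch of that idea, not a drop-in replacement: `count_downloads` and the sample log lines are made up for illustration, and unlike the grep pipeline it counts at most one file name per log line.

```perl
use strict;
use warnings;

# Count log lines mentioning each file name, reading the log only once.
# $fh    : open filehandle on the access log
# @names : base file names to look for
sub count_downloads {
    my ($fh, @names) = @_;
    my %count = map { $_ => 0 } @names;
    return \%count unless @names;

    # One alternation regex; longest names first, so a name that is a
    # prefix of another name does not shadow it.
    my $alt = join '|',
              map  { quotemeta($_) }
              sort { length($b) <=> length($a) } @names;
    my $re  = qr/($alt)/;

    while (my $line = <$fh>) {
        $count{$1}++ if $line =~ $re;
    }
    return \%count;
}

# Demo on made-up log lines (hypothetical URLs, not real log data).
my $sample = join '', map { "$_\n" } (
    'GET http://example.org/pub/foo-1.2.tar.gz',
    'GET http://example.org/pub/foo-desktop-1.2.0.0.i386.rpm',
    'GET http://example.org/pub/foo-1.2.tar.gz',
    'GET http://example.org/index.html',
);
open my $fh, '<', \$sample or die "Cannot open in-memory log: $!";
my $counts = count_downloads($fh, 'foo-1.2.tar.gz', 'foo-desktop-1.2.0.0.i386.rpm');
close $fh;
print "$_: $counts->{$_}\n" for sort keys %$counts;
```

In the real script, the filehandle would come from opening /var/log/squid/access.log, and the names from the File::Find pass; the per-file byte totals then follow from multiplying each count by the file size as before.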
These projects also have PayPal donation buttons on them (yes, I know, PayPal... but it's convenient for the users), and I have Perl code (below) that can log into PayPal and extract the payment history as a flat HTML file.
use strict;
use LWP::UserAgent;
use LWP::Protocol::https;
use HTTP::Cookies;

# Some preset data.
my $domain    = "https://www.paypal.com/";
my $login_url = "cgi-bin/webscr?__track=_login-run:";
$login_url   .= "p/gen/login:_login-submit";
my $overview  = "cgi-bin/webscr?cmd=_history";
$overview    .= "&login_access=1234567890";

my $ua = LWP::UserAgent->new(env_proxy  => 1,
                             keep_alive => 1,
                             timeout    => 30,
                            );
# $ua->agent('Mozilla/5.0');
# $ua->protocols_allowed( [ 'http', 'https'] );

# Build a browser object, allow POST redirects.
my $browser = LWP::UserAgent->new ();
push @{$browser->requests_redirectable}, 'POST';

# Attach the browser to an empty cookie jar.
my $cookie_jar = HTTP::Cookies->new();
$browser->cookie_jar ($cookie_jar);

# Attempt to get the homepage, with us logged in.
my $login_response = $browser->post("${domain}${login_url}",[
    cmd              => '_login-submit',
    login_cmd        => '',
    login_params     => '',
    login_cancel_cmd => '',
    login_email      => 'foo@bar.org',
    login_password   => 'ItsASecret',
    ] );

my $response = $browser->get ("${domain}${overview}");

open(PPAL, ">paypal.html") or die $!;
print PPAL $response->content;
close PPAL;

# print $response->content;
Again, so far so good... My goal is to integrate these two, so I can build a bandwidth-over-donations "thermometer" that shows how much bandwidth we've consumed and how much of that consumption the donations have offset.
I'm talking in the CB right now, and there are some interesting ideas being discussed, like using Imager or GD to generate a graphical representation of the thermometer (something like this graphic or this one I found through Google's image search).
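To make the bandwidth-over-donations idea concrete, here is a rough sketch of the arithmetic such a thermometer might be based on. The function name `coverage`, the price per GB, and the donation total are all assumptions for illustration; only the byte total is taken from the sample output above.

```perl
use strict;
use warnings;

# Fraction of the bandwidth bill covered by donations, clamped to [0, 1].
# $bytes_out   : total outbound bytes for the period
# $donations   : money donated in the same period
# $cost_per_gb : what one GB (2**30 bytes) of traffic costs (assumed rate)
sub coverage {
    my ($bytes_out, $donations, $cost_per_gb) = @_;
    my $cost = ($bytes_out / 2**30) * $cost_per_gb;
    return 1 if $cost <= 0;    # nothing to pay for, so fully covered
    my $fraction = $donations / $cost;
    return $fraction > 1 ? 1
         : $fraction < 0 ? 0
         :                 $fraction;
}

# 532,767,443,273 bytes is the "Final total bytes" figure from the log run;
# the 150 in donations and the 0.50 per GB are hypothetical placeholders.
printf "%.1f%% of bandwidth cost covered\n",
    100 * coverage(532_767_443_273, 150, 0.50);
```

The clamping keeps the thermometer from overflowing when donations exceed the bill; whether to show the surplus instead is a design choice.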
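As a starting point for the GD idea, a very plain thermometer could be drawn as an outlined bar filled from the bottom. This is a minimal sketch, assuming the GD module is installed; `fill_rows` and `thermometer_png` are made-up names, and the sizes and colours are arbitrary. GD is loaded only when an image is actually rendered, so the geometry helper can be used on its own.

```perl
use strict;
use warnings;

# Pixel rows (top, bottom) of the filled part of a vertical bar.
# If nothing is filled, top is greater than bottom.
sub fill_rows {
    my ($fraction, $height, $margin) = @_;
    $fraction = 0 if $fraction < 0;
    $fraction = 1 if $fraction > 1;
    my $inner  = $height - 2 * $margin;          # usable bar height
    my $filled = int($inner * $fraction + 0.5);  # rounded to whole pixels
    my $bottom = $height - $margin - 1;
    return ($bottom - $filled + 1, $bottom);
}

# Render the bar as PNG data. Requires the GD module.
sub thermometer_png {
    my ($fraction, $width, $height, $margin) = @_;
    $width  ||= 40;
    $height ||= 200;
    $margin = 5 unless defined $margin;

    require GD;
    my $im    = GD::Image->new($width, $height);
    my $white = $im->colorAllocate(255, 255, 255);  # first colour is the background
    my $black = $im->colorAllocate(0, 0, 0);
    my $red   = $im->colorAllocate(204, 0, 0);

    my ($top, $bottom) = fill_rows($fraction, $height, $margin);
    $im->filledRectangle($margin, $top, $width - $margin - 1, $bottom, $red)
        if $top <= $bottom;
    # Outline drawn one pixel outside the fill area.
    $im->rectangle($margin - 1, $margin - 1,
                   $width - $margin, $height - $margin, $black);
    return $im->png;
}

# Usage: perl thermometer.pl 0.42 > thermometer.png
if (@ARGV) {
    binmode STDOUT;
    print thermometer_png($ARGV[0]);
}
```

The fraction passed in would be whatever ratio of donations to bandwidth cost the cron job computes; labels, tick marks, or a rounded bulb could be layered on top of the same geometry.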
Has anyone done anything like this? Pointers? Ideas? Other approaches I can take?
edited: Thu Apr 3 21:07:13 2003 by jeffa - added readmore tag
Replies are listed 'Best First'.
Re: Donation Tracking "Thermometer"
by MZSanford (Curate) on Apr 03, 2003 at 19:43 UTC
Re: Donation Tracking "Thermometer"
by Anonymous Monk on Apr 03, 2003 at 20:52 UTC
Re: Donation Tracking "Thermometer"
by shotgunefx (Parson) on Apr 03, 2003 at 22:18 UTC