http://qs321.pair.com?node_id=455046
Category: PerlMonks Related Scripts
Author/Contact Info crashtest
Description: Ever wondered if you've seen every single quip that displays on the top of the pages on this site? I was curious, so I put together a little screen-scraper to check. Hey, it was a slow Sunday afternoon!

No command-line arguments for this script, but there are three variables at the top of the code (clearly marked) that provide some crude configuration. $iterations determines how often perlmonks.org is queried, $nice sets how long to pause between HTTP requests, while $status_print_interval configures how often the script pipes up with a quick status report to show it's still alive.

After 100 iterations, I think I've "collected 'em all", but I look forward to running this script every couple of months to see what other witticisms those [id://pmdev]s come up with:
********** RESULTS **********
2 time(s): Perl Sensitive Sunglasses
7 time(s): more useful options
9 time(s): Pathologically Eclectic Rubbish Lister
3 time(s): XP is just a number
8 time(s): Welcome to the Monastery
5 time(s): Think about Loose Coupling
3 time(s): P is for Practical
4 time(s): Syntactic Confectionary Delight
4 time(s): Perl Monk, Perl Meditation.
7 time(s): Your skill will accomplish what the force of many cannot
8 time(s): "be consistent."
5 time(s): go ahead... be a heretic
8 time(s): Keep It Simple, Stupid
6 time(s): laziness, impatience, and hubris
4 time(s): Perl: the Markov chain saw
10 time(s): There's more than one way to do things.
3 time(s): Just another Perl shrine
4 time(s): good chemistry is complicated,and a little bit messy-LW
#!/usr/bin/perl

use strict;
use warnings;
use LWP::Simple;

#####################################################################
### Set how many times to run, how long (in seconds) to wait between
### page hits, and how often to print an interim status report.
my ($iterations, $nice, $status_print_interval) = (100, 5, 5);
#####################################################################

my %quips;
$SIG{INT} = sub{ print_results(\%quips); exit(0) };
my $loopcount = 0;

print "Beginning PerlMonks Quip Gatherer...\n";
while($iterations > 0){
    my $content = get('http://www.perlmonks.org');
    die "Failed to load content!\n" unless defined($content);
    
    extract_quip(\%quips, \$content);
    $iterations--;
    $loopcount++;
    print "Found ", scalar keys %quips,
        " quip(s) so far, $iterations iteration(s) left...\n"
        if ($loopcount % $status_print_interval == 0);
    sleep($nice) if ($iterations);
}
print_results(\%quips);

#####################################################################
### SUBS                                                          ###
#####################################################################

sub extract_quip{
    my ($quips, $content) = @_;
    if ($$content =~ m!<td class="monkquip"[^>]+>(.*?)</td>!s){
        my $data = $1;
        $data =~ s!\<[^>]*>!!sg;
        $data =~ s!\s{2,}!!sg;
        $quips->{$data}++;
    }
}

sub print_results{
    my $quips = shift;

    print "\n********** RESULTS **********\n";
    while (my ($key, $val) = each(%$quips)){
        print "$val time(s): $key\n";
    }
}