http://qs321.pair.com?node_id=207629
Category: Web Stuff
Author/Contact Info Joshua b. Jore aka diotalevi josh@lavendergreens.org
Description: Downloads the weekly Real Synthetic Audio radio shows from http://www.synthetic.org

Update 0: I wrote the JS reading regex wrong and have now corrected it. Please try again.

Update 1: Mr. Muskrat pointed out some other fixes at Re: Real Synthetic Audio downloader and I've incorporated them.

Update Next (as in not actually in there): It'd be even nicer if it kept track of the modification dates on the .js and .asx files so it wouldn't need to GET all the time.
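
One way the idea above could be approached is with LWP::UserAgent's mirror method, which sends an If-Modified-Since header based on the local file's modification time and leaves the file alone when the server answers 304. This is only a sketch and is not part of the script below. The helper name mirror_status and the cache file path in the usage comment are made up for illustration.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Hypothetical helper: fetch $url into $cache_file only if the
# remote copy is newer than the local one.
# Returns 'new', 'unchanged' or 'failed'.
sub mirror_status {
    my ($ua, $url, $cache_file) = @_;

    # mirror() adds If-Modified-Since when $cache_file already exists.
    my $rs = $ua->mirror( $url, $cache_file );

    return 'unchanged' if $rs->code == 304;    # 304 Not Modified
    return $rs->is_success ? 'new' : 'failed';
}

# Possible use (the cache path is an assumption):
#   my $ua = LWP::UserAgent->new;
#   my $status = mirror_status( $ua,
#       'http://synthetic.org/jscript/showlist.js',
#       '/home/josh/rsa/showlist.js' );
#   # Re-read the .asx links only when $status eq 'new'.
```

Whether this saves any traffic depends on the server sending a Last-Modified header for the .js and .asx files, which I have not checked.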


=pod

This script retrieves DJ Todd's internet radio show Real
Synthetic Audio from http://www.synthetic.org and stores
local copies of the audio. To change the local directory,
file extension or the URL pattern to match, modify
$SaveDir, $SaveExt and $SaveType. Each newly downloaded
file is printed to STDOUT, so the script works well as a
cron job. DJ Todd seems to put new shows up on or after
Sunday, so run the job once a week, say Monday night. Be
nice to his server: it's a good show and I want it to stay
available.

=cut

use strict;
use warnings;
require LWP::UserAgent;
$| = 1;

our $SaveDir = '/home/josh/rsa/';
our $SaveExt = '.wma';
our $SaveType = qr{http://.+?\.asx};
use constant DEBUG => 0;

our ($ua, $rq, $rs);

$ua = LWP::UserAgent->new;

my $downloads = get_downloads();
download_files( $SaveDir, $downloads );

sub get_downloads {
    my %downloads;
    my @js_urls = map "http://synthetic.org/jscript/${_}showlist.js",
        ('', 'previous-');
    
    JSURL: for my $js_url (@js_urls) {
        print "JS $js_url\n" if DEBUG;
        $rs = $ua -> get( $js_url );
        next JSURL unless $rs->is_success;

        my @asx_urls = $rs -> content() =~ m|$SaveType|g;
        ASXURL: for my $asx_url (@asx_urls) {
            print "ASX $asx_url\n" if DEBUG;
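            $rs = $ua -> get( $asx_url );
            next ASXURL unless $rs->is_success;
            my $wma = $rs -> content;
            $wma =~ s/\s+//g;

            # Skip anything that doesn't look like "<date>-<speed>",
            # rather than reusing stale $1/$2 from an earlier match.
            my ($date, $speed) = $wma =~ /(\d+)-(\w+)/
                or next ASXURL;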

            if (not $downloads{$date} or
                $speed eq 'isdn') {
                $downloads{$date} = $wma;
                print "\$downloads{$date} = $wma\n" if DEBUG;
            }
            else {
                print "SKIP $wma $date $speed\n" if DEBUG;
            }
        }
    }

    return \ %downloads;
}

sub download_files {
    my ($directory, $download) = @_;

    for my $base_file (sort keys %$download) {
        my $wma_url = $download -> {$base_file};
        print "$base_file: " if DEBUG;
        my $file = "$directory$base_file$SaveExt";

        if (-e $file) {
            print "SKIP\n" if DEBUG;
            next;
        }

        print "downloading " if DEBUG;

        $rq = HTTP::Request -> new( GET => $wma_url );
        $rs = $ua->request( $rq, $file );

        print $rs -> is_success() ? "OK\n" : "FAIL\n" if DEBUG;
        # Announce the file only when it was actually fetched.
        print "$file\n" if $rs->is_success and not DEBUG;
    }
}
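
The selection rule in get_downloads (one stream per show date, with the 'isdn' variant winning when several exist) can be tried on its own without touching the network. The following is a minimal sketch of that rule. The sub name pick_streams, the example.invalid URLs and the 'modem' speed label are made up for illustration; only 'isdn' comes from the script above.

```perl
use strict;
use warnings;

# Hypothetical helper: given stream URLs, keep one per show date,
# preferring the 'isdn' variant when more than one is offered.
sub pick_streams {
    my @urls = @_;
    my %pick;

    for my $url (@urls) {
        # Same pattern as the script: "<date>-<speed>".
        my ($date, $speed) = $url =~ /(\d+)-(\w+)/
            or next;

        $pick{$date} = $url
            if not exists $pick{$date} or $speed eq 'isdn';
    }

    return \%pick;
}

# Made-up sample data, not real stream addresses.
my $picked = pick_streams(
    'mms://example.invalid/rsa/20021020-modem.wma',
    'mms://example.invalid/rsa/20021020-isdn.wma',
    'mms://example.invalid/rsa/20021027-modem.wma',
);

print "$_ => $picked->{$_}\n" for sort keys %$picked;
```

Note that the first URL seen for a date is kept unless a later one is 'isdn', so the order of the show list matters for every other speed.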