Beefy Boxes and Bandwidth Generously Provided by pair Networks
We don't bite newbies here... much
 
PerlMonks  

Mirror only the installable parts of CPAN

by merlyn (Sage)
on Aug 08, 2002 at 06:10 UTC ( [id://188527]=sourcecode: print w/replies, xml ) Need Help??
Category: Utility Scripts
Author/Contact Info merlyn
Description: As noted in a parallel thread, I have this short program which can mirror a complete set of the installable modules for use with CPAN.pm.

This is for review purposes only. A final version of this code will appear in my LM column. Comments are welcome.
WARNING: As stated, this was a preliminary version of this program for comment only. While writing the column, I fixed a few bugs. Do not use the version here. Use the version there instead.

#!/usr/bin/perl -w
use strict;
$|++;

### CONFIG

# the CPAN url to fetch
my $REMOTE = "http://www.cpan.org/";
# my $REMOTE = "file://Users/merlyn/MIRROR/CPAN/";
# my $REMOTE = "http://fi.cpan.org/";
# my $REMOTE = "http://au.cpan.org/";

# the path to the local mirror
# warning: unknown files below this dir are deleted!
my $LOCAL = "/Users/merlyn/Perl/MINICPAN/";

# how verbose?  false means nothing but errors
my $TRACE = 1;

## END CONFIG

## core:
use File::Path qw(mkpath);
use File::Basename qw(dirname);
use File::Spec::Functions qw(catfile);
use File::Find qw(find);

## LWP:
use URI ();
use LWP::Simple qw(mirror RC_OK RC_NOT_MODIFIED);

## Compress::Zlib
use Compress::Zlib qw(gzopen $gzerrno);

## first, get index files
my_mirror($_) for qw(
             authors/01mailrc.txt.gz
             modules/02packages.details.txt.gz
             modules/03modlist.data.gz
            );

## now walk the packages list
my $gz = gzopen(catfile($LOCAL, "modules/02packages.details.txt.gz"), 
+"rb")
  or die "Cannot open details: $gzerrno";
my $state = 1;
while ($gz->gzreadline($_) > 0) {
  if ($state == 1) {        # in header
    $state = 2 unless /\S/;
    next;
  }
  if ($state == 2) {        # blank following header
    $state = 3;
    next;
  }

  my ($module, $version, $path) = split;
  my_mirror("authors/id/$path");
}

## finally, clean the files we didn't stick there
clean_unmirrored();

exit 0;

BEGIN {
  my %mirrored;

  sub my_mirror {
    my $path = shift;

    my $remote_uri = URI->new_abs($path, $REMOTE)->as_string;
    my $local_file = catfile($LOCAL, $path);

    return if $mirrored{$local_file}++;

    ## presume "authors/id/*" is up to date if it is present
    return if $path =~ m{^authors/id} and -f $local_file;

    print "$remote_uri -> $local_file\n" if $TRACE;

    mkpath(dirname($local_file), 1, 0711);
    my $status = mirror($remote_uri, $local_file);

    return if $status == RC_OK or $status == RC_NOT_MODIFIED;
    warn "$remote_uri: $status!\n";
  }

  sub clean_unmirrored {
    find sub {
      return unless -f and not $mirrored{$File::Find::name};
      print "removing $File::Find::name\n" if $TRACE;
      unlink $_ or warn "Cannot remove $File::Find::name: $!";
    }, $LOCAL;
  }
}
Replies are listed 'Best First'.
Re: Mirror only the installable parts of CPAN
by zentara (Archbishop) on Aug 08, 2002 at 15:57 UTC
    I would find it useful if you could select which files you wanted
    to mirror. Sort of a selected download list. Maybe if I edit the
    02packages.details.txt.gz list to only contain the modules I want.
    It would be a 1-stop download for the latest versions of all
    modules I use.
      If you only want to upgrade the modules you use, then you don't want a CPAN mirror. Just go to CPAN.pm and enter "r" to see what modules are out of date.

      Of course, you could add a regex match in the loop that mirrors individual modules if you wanted. Beware that this will then delete the other modules you may have downloaded to that tree before.

      -- Randal L. Schwartz, Perl hacker

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: sourcecode [id://188527]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others browsing the Monastery: (3)
As of 2024-04-16 14:59 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found