...This is a preview just for PerlMonks of the continuing work I'm doing with my BackPAN indexer....

MyCPAN can now create CPAN-like directories out of a directory of distributions. Run a script, then point CPAN.pm at your directory to use it as your CPAN source. This work was sponsored by a customer at the day job (talk to me if you can convince your boss that this might be something worthwhile to sponsor too).

Previously, you could do this task with a minicpan and CPAN::Mini::Inject. You kept two repositories. You updated minicpan, which undid all of your private stuff, then you re-injected everything. CPAN::Mini::Inject then updated the modules/02packages.details.txt.gz and CHECKSUMS files. That's fine if you're injecting a few things.

My task is to create a CPAN-like structure out of stuff that is mostly not on CPAN, right up to the case where nothing in the private CPAN comes from the real CPAN. We've been calling this "DPAN", for DarkPAN. You don't have to worry about what's in a distro or which author it should belong to, and you don't have parallel directories. Just dump a bunch of distros in a directory. Those might be private modules, CPAN modules, forked modules, vendor modules, and so on. DPAN doesn't care.

MyCPAN::Indexer pulls out all of the information and turns the source directory into something that the CPAN tools can understand.

You start with MyCPAN::Indexer. It's still in development, so some things are a little rough. Install it from CPAN or get it from GitHub, then install the dependencies.

Inside MyCPAN::Indexer is an examples/ directory with a bunch of junk in it. You want the dpan script.

% perl examples/dpan my_modules_dir/

With the defaults, this looks for all distributions under my_modules_dir, collects information about each and puts it in the indexer_reports/ directory. It then goes through all of the reports and collects the information it needs for the CPAN index files. Finally, in my_modules_dir/ it creates the modules/ directory with the index files the CPAN tools need and puts a CHECKSUMS file in each directory that has distributions in it. You can now point CPAN.pm to this directory and install directly from it.

There are a couple of things to watch out for:

  • It indexes everything it finds, so if you have multiple versions of a distribution, they all end up in 02packages.details.txt.gz. Fixing that is on the To Do list, but not too important for my purposes right now.
  • With CPAN.pm, you can have any directory structure you like. So far, we've had to keep the authors/id/X/XX/XXXX directory structure for CPANPLUS.
  • If you try to install a module and CPAN.pm does not find it or one of its dependencies from any source in urllist, it falls back to some internal URLs. I don't know what CPANPLUS does.
  • On Strawberry Perl, Archive::Extract complains about not being able to extract the dist when it worked just fine.
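
For anyone who hasn't built the authors/id/X/XX/XXXX layout by hand, here is a minimal sketch of how that path follows from a PAUSE ID. The sub name author_dir is made up for illustration; it is not part of MyCPAN::Indexer or the dpan script.

```perl
use strict;
use warnings;

# Hypothetical helper: build the conventional CPAN author directory
# from a PAUSE ID (first letter, first two letters, then the full ID).
sub author_dir {
    my( $pause_id ) = @_;
    my $id = uc $pause_id;
    return join '/', 'authors', 'id',
        substr( $id, 0, 1 ), substr( $id, 0, 2 ), $id;
}

print author_dir( 'BDFOY' ), "\n";
```

That prefix plus the distribution file name is what ends up, minus the authors/id/ part, in the third column of 02packages.details.txt.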

The latest version of my cpan script might help you. You can dump and load configs without fooling with the shell. The -J switch (capital J) dumps the current config to STDOUT. It's the same format as CPAN::Config:

% cpan -J > MyCPANConfig.pm

Edit that file how you like. I change the urllist.
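
As a rough illustration, the edited file might end up looking something like this fragment. The file:// path is an invented example, and every other setting that cpan -J dumped stays exactly as it was.

```perl
# Fragment of a dumped config, trimmed to the one setting changed here.
$CPAN::Config = {
    # ... all the other settings from the dump stay untouched ...

    # Assumed location of the DPAN directory; use your own path or URL.
    'urllist' => [ q[file:///usr/local/dpan/] ],
};

1;
```

Since CPAN.pm falls back to other sources when it can't find something in urllist, it is worth checking where a distribution actually came from after the first install.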

I have several versions for testing different things. If I want to install Foo::Bar with my DPAN config pointing to my DarkPAN, I load the right configuration with -j (lowercase j):

% cpan -j DPANConfig.pm Foo::Bar

Now, I've said that DPAN is for DarkPAN, but it's also for another thing I want to do: DistributedPAN. If you look in 02packages.details.txt, you'll see lines like:

Foo::Bar 1.23 B/BD/BDFOY/Foo-Bar-1.23.tgz

When I created CPAN::PackageDetails to play with this, we discovered that CPAN.pm will happily deal with absolute paths there. The distribution files could be anywhere:

Foo::Bar 1.23 /usr/local/dpan/Foo-Bar-1.23.tgz
Bar::Baz 2.45 /home/brian/dists/Bar-Baz-2.45

Once I started thinking about that, I wanted to make it so the files don't even have to be local:

Foo::Bar 1.23 /usr/local/dpan/Foo-Bar-1.23.tgz
Bar::Baz 2.45 http://www.example.com/dists/Bar-Baz-2.45
Quux 2.45 http://www.cpan.org/authors/id/B/BD/BDFOY/Quux-2.45

Once that third column handling is refactored into a general URL or file fetcher, things get more interesting. I haven't looked at what that might take in CPAN.pm though.
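
To make the three forms of the third column concrete, here is a minimal sketch that splits an index line and decides which kind of location it names. This is only an illustration under simple assumptions (whitespace-separated columns, no header handling); the sub names are hypothetical and this is not how CPAN.pm or CPAN::PackageDetails does it.

```perl
use strict;
use warnings;

# Hypothetical helper: decide how the third column would need to be
# fetched. Anything with a scheme:// prefix is a URL, a leading slash
# is an absolute local path, and the rest is the traditional form
# relative to authors/id/.
sub classify_dist_path {
    my( $path ) = @_;
    return 'url'      if $path =~ m{\A[a-z][a-z0-9+.-]*://}i;
    return 'absolute' if $path =~ m{\A/};
    return 'author';
}

# Hypothetical helper: split one three-column index line.
sub parse_index_line {
    my( $line ) = @_;
    $line =~ s/\s+\z//;    # drop the newline and trailing whitespace
    my( $module, $version, $path ) = split ' ', $line, 3;
    return {
        module  => $module,
        version => $version,
        path    => $path,
        type    => classify_dist_path( $path ),
    };
}

for my $line (
    'Foo::Bar 1.23 B/BD/BDFOY/Foo-Bar-1.23.tgz',
    'Foo::Bar 1.23 /usr/local/dpan/Foo-Bar-1.23.tgz',
    'Quux 2.45 http://www.cpan.org/authors/id/B/BD/BDFOY/Quux-2.45',
) {
    my $entry = parse_index_line( $line );
    printf "%-10s %-8s %s\n", @$entry{ qw(module type path) };
}
```

A general fetcher would then dispatch on that type: join the path to a urllist entry, read the local file directly, or download the URL.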

And, since I was writing CPAN::PackageDetails, I wanted to support another possible format. This one has a column for the author and might list the same namespace several times:

Foo::Bar BDFOY 1.23 B/BD/BDFOY/Foo-Bar-1.23.tgz
Foo::Bar BDFOY 2.01 /home/brian/dists/Foo-Bar-2.01.tgz
Foo::Bar SNUFFY 1.24 http://www.example.com/dist/Foo-Bar.tgz

Remember Synopsis 11? Perl 6 supports not only version restrictions on loading a module, but loading the same module from different authors:

use Dog:ver(Any):auth(Any);
use Dog:ver(Any):auth<cpan:BDFOY>;
use Dog:ver<1.2.1>:auth(Any);
use Dog:ver(1.2.1..1.2.3);

With a change to 02packages.details.txt, the CPAN tools can support this too.
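
As a rough sketch of what a tool could do with the four-column format, the code below groups entries by namespace and then narrows them by author. The column order follows the example lines above; the sub names and the selection rule are assumptions for illustration, not anything CPAN::PackageDetails or the CPAN clients provide today.

```perl
use strict;
use warnings;

# Hypothetical helper: index four-column lines (namespace, author,
# version, location) by namespace, keeping every entry.
sub index_by_namespace {
    my %index;
    for my $line ( @_ ) {
        my( $module, $author, $version, $path ) = split ' ', $line;
        push @{ $index{$module} }, {
            author  => $author,
            version => $version,
            path    => $path,
        };
    }
    return \%index;
}

# Hypothetical helper: list the entries for a namespace, optionally
# restricted to one author. Leaving the author undefined means "Any".
sub candidates {
    my( $index, $module, $author ) = @_;
    return grep { ! defined $author or $_->{author} eq $author }
        @{ $index->{$module} || [] };
}

my $index = index_by_namespace(
    'Foo::Bar BDFOY 1.23 B/BD/BDFOY/Foo-Bar-1.23.tgz',
    'Foo::Bar BDFOY 2.01 /home/brian/dists/Foo-Bar-2.01.tgz',
    'Foo::Bar SNUFFY 1.24 http://www.example.com/dist/Foo-Bar.tgz',
);

print "$_->{author} $_->{version} $_->{path}\n"
    for candidates( $index, 'Foo::Bar', 'BDFOY' );
```

Version restrictions would be one more filter on the same list, which is where the comparison rules would need some real thought.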

Not to worry though. That's just something fun to think about right now. Once the rest of DPAN seems stable, I can start adding cool features like that.

--
brian d foy <brian@stonehenge.com>
Subscribe to The Perl Review

In reply to A preview of DPAN by brian_d_foy
