PerlMonks  

(jeffa) Re: Public export of Perl Monks database

by jeffa (Bishop)
on Feb 21, 2003 at 17:22 UTC


in reply to Public export of Perl Monks database

You don't have to have direct access to the database to make useful things such as statistics, new searches, bible, etc. All you need is a script to fetch nodes (i recommend fetching XML versions).
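    use strict;
    use warnings;
    use XML::Simple;
    use LWP::Simple;

    our $URL  = 'http://www.perlmonks.org/index.pl';
    our $PATH = '/path/to/perlmonks/nodes';

    for my $id (0 .. 666666) {
        my $node = get "$URL?node_id=$id&displaytype=xml";
        sleep 5;    # play nice ;) (pause after every request, skipped or not)
        next unless defined $node;    # get() returns undef on failure

        my $xml = eval { XMLin($node) };
        next unless ref $xml eq 'HASH';    # skip unparsable responses
        my $title = defined $xml->{title} ? $xml->{title} : '';
        next if $title =~ /Permission\s+Denied/i;
        next if $title =~ /Not\s+found/i;

        open my $fh, '>', "$PATH/$id.xml" or die "can't write: $!";
        print {$fh} $node;
        close $fh or die "can't close: $!";
    }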
Very simple, and it could use some more work, but this will get the job done. Just be sure to run it during the weekend or other 'less busy' times. ;) I also have some code over at Node XML to HTML that transforms the XML into HTML ... it's not perfect either, but it's a start.
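
Once nodes are saved locally, a 'new search' can be as small as a loop over the files. What follows is only a minimal sketch: it assumes the files were written as "$PATH/<id>.xml" by a script like the one above, and that XMLin exposes the node title as $xml->{title}, as that script also assumes. The sub name titles_matching is hypothetical.

```perl
use strict;
use warnings;
use XML::Simple;

# Return a hashref { node_id => title } for every saved node whose title
# matches $re. titles_matching is a hypothetical helper for illustration,
# not part of any PerlMonks interface.
sub titles_matching {
    my ($dir, $re) = @_;
    my %hits;

    opendir my $dh, $dir or die "can't read $dir: $!";
    for my $name (sort readdir $dh) {
        # Only consider files named <digits>.xml
        my ($id) = $name =~ /^(\d+)\.xml$/ or next;

        my $xml = eval { XMLin("$dir/$name") };
        next unless ref $xml eq 'HASH';    # skip unparsable files

        my $title = $xml->{title};
        next unless defined $title && !ref $title;

        $hits{$id} = $title if $title =~ $re;
    }
    closedir $dh;

    return \%hits;
}
```

Something like titles_matching('/path/to/perlmonks/nodes', qr/golf/i) would then return the ids and titles of saved nodes with "golf" in the title, without touching the server again.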

jeffa

L-LL-L--L-LL-L--L-LL-L--
-R--R-RR-R--R-RR-R--R-RR
B--B--B--B--B--B--B--B--
H---H---H---H---H---H---
(the triplet paradiddle with high-hat)

Replies are listed 'Best First'.
Re: (jeffa) Re: Public export of Perl Monks database
by zby (Vicar) on Feb 21, 2003 at 18:45 UTC
    Yes, I know that technically I can do that. I did not know about that XML interface, though you can always use HTML::Parser. What I am asking is whether this is allowed. Besides that, downloading the whole database your way would generate quite some load on the server.

    I believe that doing it my way would encourage people to think up new ways to use it.

      "What I am asking is if this is allowed"

      Well ... it's not not allowed.

      "...this would generate quite some load on the server..."

      Damn spiffy it will. See up there in my post where i said "run it during the weekend or other 'less busy' times"? However, because the code fetches each node only as XML, it's not quite as much load as you might think: the server does not have to generate nodelets and such.

      "I believe that when it is done my way..."

      And that's why i posted. You might be waiting a loooong time for your idea to be implemented here, unless you want to become a god and do it yourself. :)

      For the record, i would love to have access to the database. From time to time i like to do a little history/research and that would make my life much easier. Until then, i just run a script similar to the one i cranked out above when there are very few users on the site.
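
      The "run it when the site is quiet" advice can be put into the script itself. The following is a minimal sketch only: it assumes that "quiet" simply means Saturday or Sunday in UTC, which is a guess based on the weekend advice above and not on measured traffic. Both sub names are hypothetical.

      ```perl
      use strict;
      use warnings;

      # True when $epoch (seconds since the epoch) falls on a Saturday or
      # Sunday in UTC. Equating "weekend" with "quiet" is an assumption.
      sub is_quiet_time {
          my ($epoch) = @_;
          my $wday = (gmtime($epoch))[6];    # 0 = Sunday ... 6 = Saturday
          return $wday == 0 || $wday == 6;
      }

      # Block until the quiet period begins, checking every 10 minutes.
      sub wait_for_quiet_time {
          sleep 600 until is_quiet_time(time);
      }
      ```

      Calling wait_for_quiet_time() at the top of the fetch loop would pause the crawl on weekdays and resume it on the weekend.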

      jeffa
      
        Thanks for your code anyway.
