Re^2: Crawling with Parallel::ForkManager


	PerlMonks


by listanand (Sexton)

on Aug 07, 2009 at 22:34 UTC


in reply to Re: Crawling with Parallel::ForkManager
in thread Crawling with Parallel::ForkManager

Thanks for writing.

I am using the mirror function from LWP::Simple to retrieve the PDFs. Without Parallel::ForkManager, everything works fine.



Re^3: Crawling with Parallel::ForkManager
by tokpela (Chaplain) on Aug 08, 2009 at 08:55 UTC

Just a guess here...

Have you tried downloading the PDF through the $mech object you are already using? For example:

# Fetch the PDF through the existing $mech session so its cookies are sent
$mech->get($url_to_pdf);
# Write the response body to disk
$mech->save_content( $filename );

Maybe this is a cookie issue. I believe that $mech accepts cookies by default. A separate mirror call opens a different connection that does not carry those cookies, and the web server may not allow a direct request for the PDF without one.

It might work for you in the browser since your browser would already have a cookie.
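If the cookie guess is right, one way to combine it with Parallel::ForkManager is to do the downloads through $mech inside each child. Here is a rough sketch, not a tested fix for the original crawler: fetch_all, its arguments, and the url => filename hash are made-up names for illustration. It relies on the fact that a forked child gets a copy of the parent's $mech, including the default in-memory cookie jar, so any cookies collected before the fork are still sent.

```perl
use strict;
use warnings;
use WWW::Mechanize;
use Parallel::ForkManager;

# Hypothetical helper: fetch every URL in %$jobs (url => local filename),
# running at most $max_procs child processes at a time.
sub fetch_all {
    my ($mech, $max_procs, $jobs) = @_;
    my $pm = Parallel::ForkManager->new($max_procs);

    for my $url (sort keys %$jobs) {
        $pm->start and next;    # parent: go on to the next URL

        # Child: $mech here is a copy of the parent's object,
        # so cookies set before the fork are still in its jar.
        my $ok = eval {
            $mech->get($url);
            die $mech->res->status_line, "\n" unless $mech->success;
            $mech->save_content($jobs->{$url});
            1;
        };
        warn "Failed to fetch $url: $@" unless $ok;

        $pm->finish($ok ? 0 : 1);    # child exits here
    }

    $pm->wait_all_children;     # parent blocks until all children are done
}
```

You would log in or visit the listing page with $mech first, then call something like fetch_all($mech, 5, \%pdf_for_url). Cookies a child receives during its own download are not passed back to the parent or to the other children.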


In Section Seekers of Perl Wisdom