Re^9: saveResources_future and tab in WWW::Mechanize::Chrome

by 1nelly1 (Sexton)
on May 14, 2020 at 07:48 UTC [id://11116775]


in reply to Re^8: saveResources_future and tab in WWW::Mechanize::Chrome
in thread saveResources_future and tab in WWW::Mechanize::Chrome

As I mentioned earlier, I am just testing WWW::Mechanize::Chrome in order to use it instead of WWW::Mechanize::Firefox. In WWW::Mechanize::Firefox you implemented the save_url and save_content methods, and now I am experimenting with your saveResources_future method in WWW::Mechanize::Chrome. I do this by using exactly the code from the documentation:

my $file_map = $mech->saveResources_future(
    target_file => 'this_page.html',
    target_dir  => 'this_page_files/',
    wanted      => sub { $_[0]->{url} =~ m!^https?:!i },
)->get();

I noticed that you changed the return value: $file_map now refers to a hash with the URI as key and the file name as value. In my last posting I wanted to inform you that there is only one entry in that hash, namely the URL of the currently fetched web page, and that no files are saved at all.
This seems to be because the print statements in this part of saveResources_future are never executed:
$names{ $resource->{url} } ||= File::Spec->catfile( $target_dir,
    $names{ $resource->{url} } );
my $target = $names{ $resource->{url} }
    or die "Don't have a filename for URL '$resource->{url}' ?!";
$s->log( 'debug', "Saving '$resource->{url}' to '$target'" );
open my $fh, '>', $target
    or croak "Couldn't save url '$resource->{url}' to $target: $!";
binmode $fh;
print $fh $resource->{content};
CORE::close( $fh );
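To make the reported behaviour concrete: the returned $file_map is supposed to map each fetched URL to the local file it was saved as. A minimal sketch, using only core Perl and a hypothetical example map (the URLs and file names below are illustrative stand-ins, not actual module output), that dumps such a map and checks which files really exist on disk:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical map, standing in for the hashref returned by
# saveResources_future(): URL => local file name.
my $file_map = {
    'https://metacpan.org/'          => 'this_page.html',
    'https://metacpan.org/style.css' => 'this_page_files/style.css',
};

# Dump the map and report whether each file was actually written.
for my $url ( sort keys %$file_map ) {
    my $file = $file_map->{$url};
    printf "%s => %s (%s)\n", $url, $file, -e $file ? 'saved' : 'missing';
}
```

If the bug report above is accurate, a dump like this would show only a single entry and every file as 'missing'.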

So, as it stands, the method does not work as intended for me. Kindly recheck for yourself.
Thank you and best regards
1nelly1

Replies are listed 'Best First'.
Re^10: saveResources_future and tab in WWW::Mechanize::Chrome
by Corion (Patriarch) on May 14, 2020 at 10:03 UTC

    Without example code it is much harder to find out which parts of the code don't work in your opinion.

    I've added a new test case and will release it soonish, but I don't know if it addresses your case.

      Ok. Here is my code:

      use strict;
      use warnings;
      use Log::Log4perl qw(:easy);
      use WWW::Mechanize::Chrome;

      Log::Log4perl->easy_init($ERROR);

      my $mech = WWW::Mechanize::Chrome->new();
      $mech->get('https://metacpan.org/');

      my $file_map = $mech->saveResources_future(
          target_file => 'this_page.html',
          target_dir  => 'this_page_files/',
          wanted      => sub { $_[0]->{url} =~ m!^https?:!i },
      )->get();

      The only thing that happens is that, after the page loads, a folder 'this_page_files' is created. Neither the file 'this_page.html' nor the resources are saved anywhere. I hope that information helps.
      Best regards
      1nelly1
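A quick way to double-check a report like the one above is to count what actually ended up in the target directory. A small sketch using only core Perl; the directory name is taken from the example code, so adjust it to your own target_dir:

```perl
use strict;
use warnings;

# Directory name from the example above; adjust to your own target_dir.
my $dir = 'this_page_files';

my @files;
if ( opendir my $dh, $dir ) {
    # Keep only plain files, skipping '.' and '..' and subdirectories.
    @files = grep { -f "$dir/$_" } readdir $dh;
    closedir $dh;
}
printf "%d resource file(s) under %s/\n", scalar(@files), $dir;
```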

        Yes, indeed that helps! Thank you very much!

        The code in 0.54 had the logic for the wanted subroutine partially wrong and it didn't properly pass the callback on to where it actually matters.
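        For reference, the wanted callback from the example can be exercised in isolation: it receives a hashref describing each resource and should return true only for http(s) URLs. A minimal sketch (the sample URLs are made up for illustration):

```perl
use strict;
use warnings;

# The wanted filter from the example: it gets a hashref describing a
# resource and returns true when the URL uses the http(s) scheme.
my $wanted = sub { $_[0]->{url} =~ m!^https?:!i };

for my $url ( 'https://metacpan.org/style.css', 'data:image/png;base64,AAAA' ) {
    print $wanted->({ url => $url }) ? "keep $url\n" : "skip $url\n";
}
```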

        I've released version 0.55, which downloads several files in your test case, and puts them all below the this_page_files/ directory, except for the main page.
