PerlMonks  

Re: How to have Perlscript scrape images from a URL that has Javascript?

by harangzsolt33 (Chaplain)
on Dec 16, 2019 at 07:50 UTC (id://11110220)


in reply to How to have Perlscript scrape images from a URL that has Javascript?

If you use Google Chrome or Firefox, you could open the page you want to scrape the images from and enter the following line into the address bar:

javascript:$URLS=[];for($i=0;$i<document.images.length;$i++)$URLS.push(document.images[$i].src);document.write($URLS.join("<P>"));

This is all one line with no line breaks and no spaces. When you paste this into the address bar, the "javascript:" prefix is going to disappear, and you'll have to type it in again BEFORE you hit Enter. Once you hit Enter, the page is replaced by a list of the URLs of every image that has been loaded so far. This might do exactly what you're looking for, but it's not Perl code. It's JavaScript. ;-)
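
For readability, here is roughly the same logic spread over several lines. This is only a sketch: the function name collectImageUrls is made up for illustration, and it takes the document as a parameter so it can be tried outside a browser as well.

```javascript
// Collect the src of every image in a document-like object.
// `doc` is assumed to expose an `images` collection, as the browser's
// `document` does. The name collectImageUrls is hypothetical.
function collectImageUrls(doc) {
  const urls = [];
  for (let i = 0; i < doc.images.length; i++) {
    urls.push(doc.images[i].src);
  }
  return urls;
}

// In a browser console you could then try:
//   console.log(collectImageUrls(document).join("\n"));
```

Printing to the console instead of calling document.write leaves the page itself untouched.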


Replies are listed 'Best First'.
Re^2: How to have Perlscript scrape images from a URL that has Javascript?
by soonix (Canon) on Dec 16, 2019 at 09:45 UTC
    On the plus side, it's working code. However, it requires manual interaction, which defeats the purpose (which is automation).
      "However, it requires manual interaction"

      Yes, that's true... And I just thought of something. Some websites don't even load all the images. You have to scroll down in order for the images to load.
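
      One possible workaround, sketched under a big assumption: some lazy-loading scripts keep the real image URL in a data-src attribute until the image scrolls into view. Whether a given site does this is something you'd have to check by looking at its HTML. The function name collectLazyImageUrls is made up for illustration.

```javascript
// Return a best-guess URL for each image element.
// ASSUMPTION: the page's lazy loader stores the real URL in `data-src`;
// many sites use a different attribute or load images another way entirely.
function collectLazyImageUrls(images) {
  const urls = [];
  for (const img of images) {
    const deferred = img.getAttribute("data-src");
    const url = deferred || img.src; // fall back to the normal src
    if (url) urls.push(url);
  }
  return urls;
}

// In a browser console you could then try:
//   console.log(collectLazyImageUrls(document.images).join("\n"));
```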

      If I had to automate this process, I would download AutoIt. But that's a whole different language. And it only runs on Windows.

Re^2: How to have Perlscript scrape images from a URL that has Javascript?
by Anonymous Monk on Dec 16, 2019 at 10:27 UTC
    "This might do exactly what you're looking for, but it's not perl code. It's Javascript. ;-)"
    Unlikely. The OP will want to download the images, not use JavaScript to print their src attributes.
