Getting "raw" js and css code from server using WWW::Mechanize::Chrome

by nysus (Parson)
on Apr 08, 2019 at 20:07 UTC ( [id://1232316]=perlquestion )

nysus has asked for the wisdom of the Perl Monks concerning the following question:

Chromium is injecting HTML tags into css and js files that I am getting with the WWW::Mechanize::Chrome->get() method. Example:

<html><head></head><body><pre style="word-wrap: break-word; white-space: pre-wrap;"> @font-face { font-family: 'nyt-stymie';

Worse, it is encoding all the downloaded files with HTML entities. I tried to fix this by decoding the HTML entities, but the JavaScript files don't seem to like that much. I also tried using headless Chromium, but that didn't help.

Is there any way to get the pure, unadulterated CSS and JS files using WWW::Mechanize::Chrome?

use strict;
use warnings;
use WWW::Mechanize::Chrome;
use Data::Dumper qw(Dumper);

my $mech = WWW::Mechanize::Chrome->new();
$mech->get('https://www.nytimes.com/vi-assets/static-assets/vendor-454814a0340940dc9b42.js');
my $content = $mech->content;
print Dumper $content;

Updates

So I've tried downloading the files directly to disk with $mech->get($url, ':content_file' => $file), but that didn't work. I also tried $mech->get(format => 'text'), which crashed the browser.

$PM = "Perl Monk's";
$MCF = "Most Clueless Friar Abbot Bishop Pontiff Deacon Curate Priest Vicar";
$nysus = $PM . ' ' . $MCF;

Replies are listed 'Best First'.
Re: Getting "raw" js and css code from server using WWW::Mechanize::Chrome
by ikegami (Patriarch) on Apr 09, 2019 at 08:37 UTC

    This was also posted to StackOverflow, where I posted a workaround.
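The Stack Overflow workaround is not reproduced in this thread. As a minimal sketch of one possible approach (not necessarily the one posted there), the wrapper shown in the question could be stripped and the entities decoded exactly once. `unwrap_pre` is a hypothetical helper name, and the sketch assumes Chrome always wraps the resource in a single `<pre>` element as in the example in the question. The original poster reported trouble after decoding entities, so this is untested against real files.

```perl
use strict;
use warnings;
use HTML::Entities qw(decode_entities);

# Hypothetical helper: undo the <pre> wrapper that Chrome puts around
# plain-text resources such as .js and .css files.
sub unwrap_pre {
    my ($html) = @_;

    # Capture everything between the opening <pre ...> and the last </pre>.
    my ($body) = $html =~ m{<pre\b[^>]*>(.*)</pre>}s
        or return;    # not the wrapper we expected

    # Decode the HTML entities once (&lt; &gt; &amp; and so on).
    return decode_entities($body);
}

# Sample input modelled on the wrapper shown in the question.
my $wrapped = '<html><head></head><body>'
            . '<pre style="word-wrap: break-word; white-space: pre-wrap;">'
            . 'if (a &lt; b &amp;&amp; c &gt; d) { run(); }'
            . '</pre></body></html>';

print unwrap_pre($wrapped), "\n";
```

In real use the argument would be whatever `$mech->content` returned for the asset URL.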

Re: Getting "raw" js and css code from server using WWW::Mechanize::Chrome
by Anonymous Monk on Apr 09, 2019 at 00:23 UTC
    Get the urls, then download using plain Mechanize
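A minimal sketch of that suggestion, assuming the asset URLs have already been collected from the page: fetch each one with plain WWW::Mechanize, which never passes the response through a browser and so adds no HTML wrapper. `fetch_raw` is a hypothetical helper name, and the URLs are taken from the command line purely for illustration.

```perl
use strict;
use warnings;
use WWW::Mechanize;

# Hypothetical helper: fetch a URL outside the browser and return its body.
sub fetch_raw {
    my ($url) = @_;

    # autocheck => 1 makes get() die on HTTP errors.
    my $ua  = WWW::Mechanize->new( autocheck => 1 );
    my $res = $ua->get($url);

    # Undo any Content-Encoding (e.g. gzip) but skip charset decoding,
    # so the bytes come back as the server sent them.
    return $res->decoded_content( charset => 'none' );
}

# Print the body of every URL given on the command line.
for my $url (@ARGV) {
    print fetch_raw($url);
}
```

This only covers assets that can be fetched without the browser's session state.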

      Yeah, thought about that. I would have to share the cookie jar if the WMC session is on a password-protected site. There is probably a way to do that.
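One hedged way to sketch the cookie sharing: copy the cookies from one HTTP::Cookies-style jar into a plain HTTP::Cookies jar that a non-browser user agent can use. `copy_cookies` is a hypothetical helper name. Only the standard `scan` and `set_cookie` methods of HTTP::Cookies are used, and whether the jar from WWW::Mechanize::Chrome behaves the same way is an assumption.

```perl
use strict;
use warnings;
use HTTP::Cookies;

# Hypothetical helper: copy every cookie from one HTTP::Cookies-style jar
# into a fresh, plain HTTP::Cookies jar.
sub copy_cookies {
    my ($src) = @_;
    my $dst = HTTP::Cookies->new;

    $src->scan(sub {
        my ($version, $key, $val, $path, $domain, $port,
            $path_spec, $secure, $expires, $discard, $rest) = @_;

        # scan() reports an absolute expiry time; set_cookie() wants a max-age.
        my $maxage = defined $expires ? $expires - time : undef;

        $dst->set_cookie($version, $key, $val, $path, $domain, $port,
                         $path_spec, $secure, $maxage, $discard, $rest);
    });

    return $dst;
}
```

With a logged-in WWW::Mechanize::Chrome object in `$chrome`, something like `WWW::Mechanize->new( cookie_jar => copy_cookies( $chrome->cookie_jar ) )` might then work. That `cookie_jar` returns a loaded jar supporting `scan` is an assumption to check against the module's documentation.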

