Running JavaScript from within Perl

anautismobserver has asked for the wisdom of the Perl Monks concerning the following question:

Re: Running JavaScript from within Perl (or just use the API) by hippo (Bishop) on Sep 13, 2019 at 08:17 UTC
"Can you offer guidance" Did you know that (or even consider whether) WordPress has an API? Not only that, but there is already a whole range of modules on CPAN which use it. Perhaps the ability to retrieve the follower count is available via that API and will save you all this scraping and javascripting and whatnot.
Re^2: Running JavaScript from within Perl (or just use the API) by anautismobserver (Sexton) on Sep 13, 2019 at 19:38 UTC
I tried following the A Beginner’s Guide to the WordPress REST API tutorial. It didn't work for my own (free) WordPress account, but when I used "the-art-of-autism.com" (a premium account on which I have admin privileges) in place of "yourdomain.com" I was able to follow the tutorial successfully. However, none of the Routes or Endpoints seem to give me what I want, which is the number of followers for an arbitrary WordPress account on which I don't have admin privileges. I'm encouraged by the REST API Handbook Reference page stating "The REST API provides public data accessible to any client anonymously, as well as private data only available after authentication." However, I can't find any way to determine the number of followers, or what public data is accessible anonymously. Can you help with either of those? Thanks.
Re^3: Running JavaScript from within Perl (or just use the API) by Corion (Patriarch) on Sep 13, 2019 at 19:47 UTC
It seems that the URL to use is `https://developer.wordpress.com/docs/api/1.1/get/sites/$site/stats/followers/` ... but you need to be authenticated:
`curl "https://public-api.wordpress.com/rest/v1.1/sites/the-art-of-autism.com/stats/followers"
{"error":"unauthorized","message":"user cannot view stats"}`
So, you will either have to get permission from the respective sites or you will have to continue scraping the websites.
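A script that calls this endpoint can check for the error payload before trying to read any statistics. The following is a minimal sketch, assuming the error body always has the shape quoted above; `api_error` is a hypothetical helper name, and JSON::PP is used only because it ships with Perl.

```perl
use strict;
use warnings;
use JSON::PP qw(decode_json);

# Hypothetical helper: return a readable error string if the decoded
# response carries an "error" key, or undef for a normal payload.
sub api_error {
    my ($json_text) = @_;
    my $data = decode_json($json_text);
    return unless ref $data eq 'HASH' && exists $data->{error};
    return "$data->{error}: " . ($data->{message} // 'no message given');
}

# The response body quoted in the post above.
my $body = '{"error":"unauthorized","message":"user cannot view stats"}';
my $err  = api_error($body);
print defined $err
    ? "API refused the request ($err)\n"
    : "Request succeeded\n";
```

Checking for the `error` key first avoids reading fields that are not present in a refusal.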
Re^2: Running JavaScript from within Perl (or just use the API) by anautismobserver (Sexton) on Sep 20, 2019 at 20:35 UTC
(Updated and clarified) The following endpoint: `https://public-api.wordpress.com/rest/v1/read/feed/?url=the-art-of-autism.com` contains a "feed" url: `https://public-api.wordpress.com/rest/v1/read/feed/34259929` that I want to read. The following code (based on this JSON Tutorial) gives an error "Use of uninitialized value $feedurl in print".
`use strict;
use warnings;
use Mojo::UserAgent;
my $url = 'https://public-api.wordpress.com/rest/v1/read/feed/?url=the-art-of-autism.com';
my $ua = Mojo::UserAgent->new;
my $feedurl = $ua->get( $url )->result->json->{'feeds.meta.links.feed'};
print $feedurl;`
Please tell me what I'm doing wrong. Thanks.
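One plausible reading of the warning is that `'feeds.meta.links.feed'` is looked up as a single top-level hash key, which does not exist; Perl does not split a key on dots. The sketch below shows the difference on a made-up sample. The JSON layout is an assumption (in particular, that "feeds" is an array), so the real response should be inspected first. With Mojo::UserAgent, the `json` method also accepts a JSON Pointer argument, for example `->json('/feeds/0/meta/links/feed')` under the same assumed layout.

```perl
use strict;
use warnings;
use JSON::PP qw(decode_json);

# Assumed sample: the real response layout may differ (whether "feeds"
# is an array is a guess).
my $sample = '{"feeds":[{"meta":{"links":{"feed":"https://public-api.wordpress.com/rest/v1/read/feed/34259929"}}}]}';
my $data = decode_json($sample);

# A dotted string is treated as one top-level key, which is not there:
my $wrong = $data->{'feeds.meta.links.feed'};

# Each level of nesting needs its own subscript:
my $feedurl = $data->{feeds}[0]{meta}{links}{feed};

print defined $wrong ? "dotted key found\n" : "dotted key is undef\n";
print "$feedurl\n";
```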
Re^3: Running JavaScript from within Perl (or just use the API) by marto (Cardinal) on Sep 20, 2019 at 22:49 UTC
What happened when you tried to adapt one of the previous examples you've been given?
Re^4: Running JavaScript from within Perl (or just use the API) by anautismobserver (Sexton) on Sep 20, 2019 at 23:23 UTC
Re^5: Running JavaScript from within Perl (or just use the API) by haukex (Archbishop) on Sep 21, 2019 at 12:37 UTC
Re^5: Running JavaScript from within Perl (or just use the API) by marto (Cardinal) on Sep 21, 2019 at 16:31 UTC
Re^4: Running JavaScript from within Perl (or just use the API) by anautismobserver (Sexton) on Sep 22, 2019 at 00:09 UTC
Re^5: Running JavaScript from within Perl (or just use the API) by marto (Cardinal) on Sep 22, 2019 at 02:17 UTC
Re^3: Running JavaScript from within Perl (or just use the API) by anautismobserver (Sexton) on Sep 21, 2019 at 15:17 UTC
The following code works to assign subscribers_count to $subscribers, but gives an error "Use of uninitialized value $feedurl in print" for the assignment of $feedurl.
`use strict;
use warnings;
use Mojo::UserAgent;
my $url = 'https://public-api.wordpress.com/rest/v1/read/feed/34259929';
my $ua = Mojo::UserAgent->new;
my $subscribers = $ua->get($url)->result->json->{subscribers_count};
print "Number of subscribers: $subscribers\n";
my $feedurl = $ua->get( $url )->result->json->{'meta.links.self'};
print $feedurl;`
Please tell me what I'm doing wrong. Thanks.
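If dotted paths such as `meta.links.self` are convenient to write, a small helper can walk the decoded structure one segment at a time. This is a sketch only: `dig` is a hypothetical name, the sample JSON is made up (including the subscriber number), and the real response layout is assumed rather than confirmed.

```perl
use strict;
use warnings;
use JSON::PP qw(decode_json);

# Hypothetical helper: follow a dotted path through nested hashes and
# arrays. Numeric segments index into arrays. Returns undef as soon as
# a step cannot be followed.
sub dig {
    my ($node, $path) = @_;
    for my $key (split /\./, $path) {
        if (ref $node eq 'HASH') {
            $node = $node->{$key};
        }
        elsif (ref $node eq 'ARRAY' && $key =~ /^\d+$/) {
            $node = $node->[$key];
        }
        else {
            return;
        }
    }
    return $node;
}

# Made-up sample shaped like the feed response discussed above.
my $data = decode_json(
    '{"subscribers_count":1234,"meta":{"links":{"self":"https://public-api.wordpress.com/rest/v1/read/feed/34259929"}}}'
);

print "Number of subscribers: ", scalar dig($data, 'subscribers_count'), "\n";
print "Self link: ", scalar dig($data, 'meta.links.self'), "\n";
```

Decoding the response once and reading both values from the same structure also avoids fetching the URL twice.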
Re^4: Running JavaScript from within Perl (or just use the API) by hippo (Bishop) on Sep 21, 2019 at 22:14 UTC
Re^5: Running JavaScript from within Perl (or just use the API) by anautismobserver (Sexton) on Sep 23, 2019 at 19:11 UTC
Re: Running JavaScript from within Perl by haukex (Archbishop) on Sep 13, 2019 at 05:34 UTC
"Naively it seems to me that since my browser can interpret a web page using JavaScript without any a priori information, Perl should be able to as well. Is this possible? If not, why not?" JavaScript has access to a ton of things implemented in the browser, like the HTML document's DOM, various JavaScript APIs, and so on. To run JS code correctly, Perl would need to provide all of those, essentially re-implementing a browser, which is of course incredibly complex. See also the "JavaScript" section in WWW::Mechanize::FAQ. (For the general case of running JS from Perl, there was a talk in Riga: Embedding JavaScript in Perl.)
Re: Running JavaScript from within Perl by Marshall (Canon) on Sep 13, 2019 at 02:43 UTC
Perl can't run JavaScript itself. One solution is to use WWW::Mechanize::Chrome. Previously it was possible to automate Firefox and I played with that, but unfortunately Firefox took out the interface that allowed the automation to happen. I haven't used the Chrome version yet. Anyway, the idea is to have Perl control Chrome, which will run the JavaScript code. Then you read what Chrome figured out.
Re: Running JavaScript from within Perl by harangzsolt33 (Chaplain) on Sep 14, 2019 at 05:10 UTC
The JavaScript program on a web page can dynamically modify the page, so what you see may bear little or no resemblance to the HTML source code! So, if you can scrape your web page using JavaScript, you get a peek at what's actually on the screen. Here is an example. When you click on the "View HTML" button on the example page below, you'll see one thing. Then you click on the "Change" button, which modifies the code, and then you click on View HTML again, and you'll see the code with some slight changes. The source code hasn't changed, but what's in memory has changed, and when you get to harvest that, you get the real picture. Here is the JavaScript program that harvests the HTML code: `var DATA = document.all[0].innerHTML;`
If the block of HTML code you're trying to harvest is marked with an ID attribute like this:
`<DIV ID="Part3"> ... OR <P ID="MyText"> ... OR <TABLE ID="Table2"> ...`
then you don't need to harvest the entire HTML page. All you have to do is harvest whatever is tagged. So, you would just do this: `var DATA = document.getElementById("Part3").innerHTML;`
Instead of using "innerHTML," you could also use "innerText", which gives you only the plain text without all the HTML tags and whatnot: `var DATA = document.getElementById("Part3").innerText;`
Once you have the code in the DATA variable, you can run a regex or something to get the actual number you're looking for. JavaScript regexes work much like Perl's.
`<HTML>
<BODY>
<NOSCRIPT>
<DIV STYLE="BACKGROUND-COLOR:RED; COLOR:WHITE; FONT-FAMILY:ARIAL;"><CENTER>This page requires JavaScript.</CENTER>
</DIV>
</NOSCRIPT>
<H3 ID="HEADING">Welcome</H3>
<DIV ID="CONTENT">
<P>This is a very simple HTML page.
<P><INPUT TYPE=BUTTON VALUE=" View HTML " onClick="ViewHTML();">
<INPUT TYPE=BUTTON VALUE=" Change " onClick="DoSomething();">
</DIV>
<SCRIPT>
function ViewHTML()
{
  var DATA = document.all[0].innerHTML;
  alert("This is the page content as seen from JavaScript:\n\n" + DATA);
}
function DoSomething()
{
  document.getElementById("HEADING").innerHTML = "<FONT COLOR=BLUE>DEAR VISITOR</FONT>";
  var MyCONTENT = document.getElementById("CONTENT");
  MyCONTENT.innerHTML = "<FONT COLOR=RED>" + MyCONTENT.innerHTML;
}
</SCRIPT>`
I tested the above code, and it works in Firefox 52, KMeleon 7.5, QupZilla 1.8.6, Safari 5.1.7, Google Chrome 75, Internet Explorer 6, Opera 7.5, and Vivaldi 1.0. I have also tested it with an iPhone 7, Nokia Lumia 930 Windows Phone and an old Android 6 tablet. I haven't used any "ultra modern technology" that will break your phones. Everything in this example script is pretty standard. Once you get the number you want to send back to your perl script, you could send it back by loading a picture:
`<HTML>
<BODY>
<IMG NAME=PIX6 BORDER=0 WIDTH=1 HEIGHT=1 STYLE="POSITION:ABSOLUTE; TOP:0; LEFT:0;">
<SCRIPT>
NUMBER = 90;
document.images.PIX6.src = "http://www.yourwebsite.com/yourscript.pl?" + NUMBER;
</SCRIPT>`
Here you're sending the number 90 back to your perl script. You could also signal to your perl script when somebody loads your web page with JavaScript turned off by putting a picture within the NOSCRIPT tags. Whatever you put between the NOSCRIPT tags will only appear when JavaScript is disabled on the page:
`<NOSCRIPT>
<IMG SRC="http://www.yourwebsite.com/yourscript.pl?N" BORDER=0 WIDTH=1 HEIGHT=1 STYLE="POSITION:ABSOLUTE; TOP:0; LEFT:0;">
</NOSCRIPT>`
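On the Perl side, the script behind `yourscript.pl` would read the value from the query string. The following is a rough sketch of only that parsing step, assuming a plain CGI environment where the text after `?` arrives in `$ENV{QUERY_STRING}`; `parse_report` is a hypothetical name, and the fallback value of 90 exists only so the sketch runs outside a web server.

```perl
use strict;
use warnings;

# Hypothetical receiver for requests such as yourscript.pl?90 or
# yourscript.pl?N (the NOSCRIPT case). Returns the number, the string
# 'noscript', or undef for anything unexpected.
sub parse_report {
    my ($query) = @_;
    return unless defined $query;
    return $1 + 0     if $query =~ /\A(\d+)\z/;
    return 'noscript' if $query eq 'N';
    return;
}

# Fall back to '90' when run from the command line (assumption for demo).
my $report = parse_report($ENV{QUERY_STRING} // '90');

# The browser asked for an image; an empty 204 response is one option.
print "Status: 204 No Content\r\n\r\n";
warn "received: ", $report // 'invalid', "\n";
```

Anchoring the pattern with `\A` and `\z` rejects anything other than a bare number, which matters because the query string is supplied by the visitor's browser.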
Re^2: Running JavaScript from within Perl by marto (Cardinal) on Sep 14, 2019 at 10:14 UTC
None of this addresses what OP is trying to achieve.
Re^3: Running JavaScript from within Perl by anautismobserver (Sexton) on Sep 15, 2019 at 21:30 UTC
I'm having trouble understanding the WordPress.com REST API documentation. The example given for GET /sites/$site/posts/ is `curl 'https://public-api.wordpress.com/rest/v1.1/sites/en.blog.wordpress.com/posts/?number=2'` which I couldn't figure out how to make work. By contrast, the example provided in A Beginner’s Guide to the WordPress REST API is `curl -X GET -i http://the-art-of-autism.com/wp-json/wp/v2/posts` which does work. Can you help me reconcile the two (which will hopefully help me interpret the rest of the WordPress REST API documentation)? Also, do REST API Resources only work on premium WordPress sites? I was able to execute GET /sites/$site/posts/ on the-art-of-autism.com (a premium site) but not on anautismobserver.wordpress.com (a free site). Do you know the reason for this? I really appreciate your help. You've already saved me a great deal of time and effort (and greatly increased my chances of success). Thank you ever so much.
Re^4: Running JavaScript from within Perl by marto (Cardinal) on Sep 16, 2019 at 08:48 UTC
Re^3: Running JavaScript from within Perl by anautismobserver (Sexton) on Sep 17, 2019 at 05:54 UTC
Using GET /read/feed/$feed_url_or_id I can generate a web page containing the number of followers shown as "subscribers_count". How do I read this page into a perl script? I tried HTML::TreeBuilder and got the error message: `https://public-api.wordpress.com/rest/v1/read/feed/http%3A%2F%2Fthe-art-of-autism.com%2Ffeed returned application/json not HTML`. Should I use WWW::Mechanize::Chrome, JSON, JavaScript, or something else? How do I provide them input from a URL?
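The error message indicates that the endpoint returns JSON rather than HTML, so a JSON decoder is the fitting tool and no browser or JavaScript engine is needed. The sketch below uses HTTP::Tiny and JSON::PP, both shipped with recent Perls; the sub names are hypothetical, and the sample body is made up so the block runs offline. It assumes `subscribers_count` is a top-level key, as the post describes.

```perl
use strict;
use warnings;
use HTTP::Tiny;
use JSON::PP qw(decode_json);

# Pull subscribers_count out of a JSON response body.
sub subscribers_from_json {
    my ($body) = @_;
    my $data = decode_json($body);
    return ref $data eq 'HASH' ? $data->{subscribers_count} : undef;
}

# Fetch a URL and return the count, or die with the HTTP status.
# HTTPS with HTTP::Tiny needs IO::Socket::SSL and Net::SSLeay installed.
sub fetch_subscribers {
    my ($url) = @_;
    my $res = HTTP::Tiny->new->get($url);
    die "GET $url failed: $res->{status} $res->{reason}\n"
        unless $res->{success};
    return subscribers_from_json($res->{content});
}

# Offline demonstration with a made-up body. To query the live API,
# call fetch_subscribers() with the endpoint URL from the post above.
print "Number of subscribers: ",
    subscribers_from_json('{"subscribers_count":42}'), "\n";
```

Keeping the decoding step separate from the HTTP request makes it easy to test against a saved response.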
Re^4: Running JavaScript from within Perl by marto (Cardinal) on Sep 17, 2019 at 06:24 UTC
Re^4: Running JavaScript from within Perl by bliako (Monsignor) on Sep 17, 2019 at 08:50 UTC
Re^5: Running JavaScript from within Perl by Anonymous Monk on Sep 17, 2019 at 09:16 UTC
Re^3: Running JavaScript from within Perl by anautismobserver (Sexton) on Sep 18, 2019 at 19:15 UTC
In response to your short example using Mojo::UserAgent (which I couldn't figure out how to respond to directly): I modified your code as follows to read url's from a file:
`use strict;
use warnings;
use Mojo::UserAgent;
my $filename = 'urls_Mojo.txt';
open(my $fh, '<:encoding(UTF-8)', $filename)
  or die "Could not open file '$filename' $!";
my $y = 0; # input row count
while (my $row = <$fh>) {
  $y++;
  print $y;
  print " $row";
  my $url = $row;
  # create a Mojo::UserAgent
  my $ua = Mojo::UserAgent->new;
  # use $ua to get the url and assign the value of 'subscribers_count' in the json
  # to a variable, $subscribers
  my $subscribers = $ua->get( $url )->result->json->{subscribers_count};
  # print the variable to screen
  print "Number of subscribers: $subscribers\n";
}`
It worked when the file 'urls_Mojo.txt' contained
`https://public-api.wordpress.com/rest/v1/read/feed/http%3A%2F%2Fthe-art-of-autism.com%2Ffeed`
but gave a "Can't use an undefined value as a HASH reference" error when I added a second line to 'urls_Mojo.txt' as follows:
`https://public-api.wordpress.com/rest/v1/read/feed/http%3A%2F%2Fthe-art-of-autism.com%2Ffeed
https://public-api.wordpress.com/rest/v1/read/feed/http%3A%2F%2Fanautismobserver.wordpress.com%2Ffeed`
Can you help me figure out how to apply this script to a list of url's in a file? Thanks.
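One likely cause, though not confirmed in the thread, is that each `$row` still ends with a newline once the file has more than one line, so the requested URL is not the intended one and `->json` returns undef. The sketch below shows only the clean-up step: `clean_urls` is a hypothetical helper that strips line endings and skips blank lines before the URLs are passed to the user agent.

```perl
use strict;
use warnings;

# Strip line endings and surrounding whitespace from each line and drop
# blank lines, so that every element is a clean URL.
sub clean_urls {
    my @lines = @_;
    my @urls;
    for my $line (@lines) {
        $line =~ s/^\s+|\s+$//g;    # also removes "\n" and a stray "\r"
        push @urls, $line if length $line;
    }
    return @urls;
}

# Two lines as they would come from <$fh>, each still ending in "\n".
my @raw = (
    "https://public-api.wordpress.com/rest/v1/read/feed/http%3A%2F%2Fthe-art-of-autism.com%2Ffeed\n",
    "https://public-api.wordpress.com/rest/v1/read/feed/http%3A%2F%2Fanautismobserver.wordpress.com%2Ffeed\n",
);
print "[$_]\n" for clean_urls(@raw);
```

Inside the original loop, a plain `chomp $row;` before the request would have the same effect for Unix line endings. Checking that the decoded JSON is defined before subscripting it would turn the fatal error into a skippable case.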
Re^4: Running JavaScript from within Perl by hippo (Bishop) on Sep 18, 2019 at 22:08 UTC
Re^4: Running JavaScript from within Perl by marto (Cardinal) on Sep 19, 2019 at 08:53 UTC
Re^3: Running JavaScript from within Perl by anautismobserver (Sexton) on Sep 18, 2019 at 18:24 UTC
When I type `curl --help` I get a list of options. Does that mean I have curl installed? (I don't remember installing it.) If not, please tell me how to install it from the zip file. Thanks.
Re^4: Running JavaScript from within Perl by marto (Cardinal) on Sep 18, 2019 at 18:43 UTC
Re^3: Running JavaScript from within Perl by anautismobserver (Sexton) on Sep 19, 2019 at 17:45 UTC
Do you know of a way to indent a script automatically? I currently use Notepad++ and also have Komodo IDE installed (but would happily use another editor that can indent automatically).
Re^4: Running JavaScript from within Perl (indentation) by hippo (Bishop) on Sep 19, 2019 at 18:08 UTC
Re^4: Running JavaScript from within Perl by marto (Cardinal) on Sep 19, 2019 at 17:55 UTC
Re^4: Running JavaScript from within Perl by anautismobserver (Sexton) on Sep 20, 2019 at 17:51 UTC
Re^5: Running JavaScript from within Perl by Corion (Patriarch) on Sep 20, 2019 at 17:58 UTC
Re^4: Running JavaScript from within Perl by Anonymous Monk on Sep 19, 2019 at 19:52 UTC
Re^2: Running JavaScript from within Perl by anautismobserver (Sexton) on Sep 15, 2019 at 08:31 UTC
I don't mind responses that go beyond the narrow bounds of what I asked. It's like learning a new language: sometimes it's best to immerse myself in the new culture and see what I can absorb. I like learning new software through following tutorials (though this runs the risk of learning outdated information). I'm starting to work through The Ultimate Guide To The WordPress REST API (written in September 2015 by Josh Pollock). He recommends using Vagrant, VirtualBox, and Git, which I've downloaded and installed on my computer. Is The Ultimate Guide To The WordPress REST API a good resource (obtained from here)? Do you know of any better (perhaps newer) tutorials for the WordPress REST API?
Re^3: Running JavaScript from within Perl by marto (Cardinal) on Sep 15, 2019 at 09:26 UTC
"I don't mind responses that go beyond the narrow bounds of what I asked. It's like learning a new language: sometimes it's best to immerse myself in the new culture and see what I can absorb." The method described in the post you're replying to won't help you achieve what you asked to do. If you're interested in learning about JavaScript and HTML/DOM manipulation there are better resources (from the Mojolicious docs): "All web development starts with HTML, CSS and JavaScript, to learn the basics we recommend the Mozilla Developer Network. And if you want to know more about how browsers and web servers actually communicate, there's also a very nice introduction to HTTP."
"I've downloaded and installed Vagrant, VirtualBox, and Git for Windows on my computer" What part of the problem does this solve?
"Is this a good resource? (https://wpengine.com/resources/the-ultimate-guide-to-the-wordpress-rest-api/)" I've no idea, you need to register to download an ebook.
"Do you know of any better (perhaps newer) tutorials for the WordPress REST API?" What is missing from the official WordPress documentation?
Update: see Re^3: Running JavaScript from within Perl (or just use the API) and https://developer.wordpress.com/docs/api/1.1/get/sites/%24site/stats/followers/.
Re^4: Running JavaScript from within Perl by anautismobserver (Sexton) on Sep 15, 2019 at 17:09 UTC
Re^5: Running JavaScript from within Perl by marto (Cardinal) on Sep 15, 2019 at 18:32 UTC
Re: Running JavaScript from within Perl by Anonymous Monk on Sep 13, 2019 at 02:40 UTC
You can get Perl to do all that JavaScript or you could ask WordPress how to enjoy WordPress.com in a way that is compliant with the W3C Web Content Accessibility Guidelines 2.0 and the Americans with Disabilities Act.
Re: Running JavaScript from within Perl by FreeBeerReekingMonk (Deacon) on Sep 17, 2019 at 20:20 UTC
Maybe PhantomJS would also work for you? (It includes a JavaScript interpreter.) Note that it is almost abandoned.

