Beefy Boxes and Bandwidth Generously Provided by pair Networks
Clear questions and runnable code
get the best and fastest answer
 
PerlMonks  

Querying Saints in our Book

by chacham (Prior)
on Jul 29, 2015 at 17:12 UTC ( [id://1136765]=monkdiscuss: print w/replies, xml ) Need Help??

Saints in our Book can be queried by morphing the data into a table, but i find myself much too lazy to go through all those steps it each time i want to look. Javascript can be used to find some things like ties but is not queryable. However, it can be used together to create a ready-to-execute query:

// ==UserScript== // @name Perlmonks Saints in our Book to Oracle // @namespace http://www.example.com/chacham/ // @description Turn the data imn Perlmonks Saints in our Book into an + Oracle quey with CTEs // @include http://perlmonks.org/?node_id=3559 // @version 1 // @grant none // ==/UserScript== var query = "WITH\n\tData(Ranking, Name, XP, Writeups)\nAS\n\t("; var table = document.getElementsByTagName('table')[4]; for(var row = 1; row < table.rows.length; row++) query += "\n\t SELECT " + table.rows[row].cells[0].textContent + ",\t'" + table.rows[row].cells[1].textContent + "',\t" + table.rows[row].cells[2].textContent + ",\t" + table.rows[row].cells[4].textContent.replace('None', 0) + "\t FROM + Dual UNION ALL"; // Chop off the extra UNION ALL query = query.slice(0, -10) + "\n\t),\n\tStep_XP_Writeups(Ranking, Name, XP, Writeups, XP_Diff, Wr +iteups_Diff) \ \nAS \ \n\t( \ \n\t SELECT \ \n\t\tRanking, \ \n\t\tName, \ \n\t\tXP, \ \n\t\tWriteups, \ \n\t\tXP - LEAD(XP) OVER(ORDER BY Ranking), \ \n\t\tWriteups - LEAD(Writeups) OVER(ORDER BY Ranking) \ \n\t FROM \ \n\t\tData \ \n), \ \n\tClose(Ranking, XP_Diff, Writeups_Diff) \ \nAS \ \n\t( \ \n\t SELECT \ \n\t\tRanking, \ \n\t\tXP - LEAD(XP) OVER(ORDER BY Ranking), \ \n\t\tWriteups - LEAD(Writeups) OVER(ORDER BY Ranking) \ \n\t FROM \ \n\t\tStep_XP_Writeups \ \n\t WHERE \ \n\t\tXP_Diff\tBETWEEN -30 AND 30 \ \n\t AND\tWriteups_Diff\tBETWEEN -5 AND 5 \ \n), \ \n\tMMA(XP_Diff_Min, XP_Diff_Max, XP_Diff_Avg, Writeups_Diff_Min, Writ +eups_Diff_Max, Writeups_Diff_Avg)\ \nAS \ \n\t( \ \n\t SELECT \ \n\t\tMIN(XP_Diff), \ \n\t\tMAX(XP_Diff), \ \n\t\tAVG(XP_Diff), \ \n\t\tMIN(Writeups_Diff), \ \n\t\tMAX(Writeups_Diff), \ \n\t\tAVG(Writeups_Diff) \ \n\t FROM \ \n\t\tStep_XP_Writeups \ \n\t), \ \n\tStats(Both, Close, XP, XP_Diff_Min, XP_Diff_Max, XP_Diff_Avg, Writ +eups, Writeups_Diff_Min, Writeups_Diff_Max, Writeups_Diff_Avg) \ \nAS \ \n\t( \ \n\t SELECT \ \n\t(SELECT COUNT(COUNT(*)) FROM Data GROUP BY XP, Writeups HAV +ING Count(*) > 1), \ \n\t(SELECT COUNT(*) FROM Close), \ \n\t(SELECT COUNT(COUNT(*)) FROM Data GROUP BY XP HAVING Count(*) +> 1), \ \n\tXP_Diff_Min, \ \n\tXP_Diff_Max, \ \n\tXP_Diff_Avg, \ \n\t(SELECT COUNT(COUNT(*)) FROM Data GROUP BY Writeups HAVING Cou +nt(*) > 1) Writeups, \ \n\tWriteups_Diff_Min, \ \n\tWriteups_Diff_Max, \ \n\tWriteups_Diff_Avg \ \n FROM \ \n\tMMA\n\t) \ \n--SELECT Both, Close, XP, XP_Diff_Min, XP_Diff_Max, XP_Diff_Avg, Wri +teups, Writeups_Diff_Min, Writeups_Diff_Max, Writeups_Diff_Avg FROM S +tats \ \nSELECT \ \n\tData.Ranking, \ \n\tData.Name, \ \n\tData.XP, \ \n\tData.Writeups \ \nFROM \ \n\tData, \n\tClose \ \nWHERE \ \n\tData.Ranking BETWEEN Close.Ranking AND Close.Ranking + 1 \ \nORDER BY \ \n\tData.Ranking;"; var textarea = document.createElement('textarea'); textarea.cols = 100; textarea.rows = 20; textarea.value = query; document.body.appendChild(textarea);

Ugly, though effective, it plops a textarea at the bottom of the page with a query (easily adaptable to your RDBMS.)

Added on OpenUsersJS at Perlmonks Saints in our Book to Oracle

The query, due to the primary CTE, should be relatively easy to modify for more specific purposes.

Replies are listed 'Best First'.
Re: Querying Saints in our Book
by hdb (Monsignor) on Jul 29, 2015 at 18:03 UTC

    Here is my Perl-script to monitor my own progress:

    use strict; use warnings; use LWP::UserAgent; use HTML::TableExtract; use DateTime; die "usage: perl $0 monk password seconds\n" unless $#ARGV == 2; my ( $monk, $pass, $secs ) = @ARGV; print "Querying the monastery once every $secs seconds for monk $monk\ +'s reputation.\n"; $secs = 10 if $secs < 10; while(1) { my $ua = LWP::UserAgent->new; $ua->agent("PMReputation"); my $res = $ua->post( 'http://perlmonks.org/?', {"node_id" => "3559", "op" => "login", "lastnode_id" => "3559", "user" => $monk, "passwd" => $pass, "expires" => "+10y", "length" => "500",} ); die "Cannot enter the monastery ($res->status_line).\n" unless $res- +>is_success; my $te = HTML::TableExtract->new( headers => [ '#', 'User', 'Experie +nce', 'Level', 'Writeups', 'User Since', 'Last Here' ] ); $te->parse( $res->content ); my $writeups = 0; my $exp = 0; my $level = ""; my $rank = 0; my $total = 0; foreach my $ts ($te->tables) { foreach my $row ($ts->rows) { $total = $row->[0]; if( $row->[1] eq $monk ) { ($rank, $exp, $level, $writeups) = @$row[0,2,3,4]; } } } my $now = DateTime->now( time_zone => 'Europe/Berlin' ); my $text = $now->dmy.",".$now->hms.": $level $monk has experience of + $exp and $writeups write ups, Saint $rank/$total.\n"; print $text; if( open my $log, ">>", "narcissistic.txt" ) { print $log $text; close $log; } sleep( $secs-3 ); }
      Less frequent data can be obtained by other methods described here and here.
      لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ

      Yeah, perl would be nice, however, a script is easier to install, and seems easier for ad hoc queries.

      If only there were perlscript.

Re: Querying Saints in our Book (PL/Perl)
by erix (Prior) on Jul 29, 2015 at 22:34 UTC

    PostgreSQL has stored procedures in perl (PL/Perl - Procedural Language/Perl).

    Here is one example that retrieves the saints-page and shows the columns (sorry; I just parsed the html with a regex):

    It needs plperlu (the 'u' stands for 'untrusted' as it lifts certain security limitations. It can be installed with: "CREATE LANGUAGE plperlu;").

    I guess security is out the window when you retrieve stuff from remote sites straight into your database -- but this is just a demonstration. Be careful out there.

    create function saints_in_our_book() returns table ( rank integer , userid integer , username text , xp integer , userlevel text , writeups integer , usersince timestamp , lasthere text ) language plperlu as $plp$ use strict; use warnings; use LWP::Simple; use IO::String; my $url = "http://perlmonks.org/?node_id=3559"; my $io = IO::String->new(get($url)); while ( <$io> ) { chomp; if(m, ^ <tr(?:[ ]class="highlight")?> <td[ ]align="right">([0-9]+)</td> # + rank <td> <span[ ]class="user-([0-9]+)"> # + userid <a[ ]href="[?]node_id=[0-9]+"> ([^<]+) # + username </a> </span> </td> <td[ ]align="right">([0-9]+)</td> # + xp <td>([a-zA-Z]+\s+\([0-9]+\))</td> # + userlevel <td[ ]align="right"> <a[ ]href="[?]node_id=6364;usersearch=[^"]+"> ([0-9]+) # + writeups </a> </td> <td[ ]align="right">([^<]+)</td> # + usersince <td[ ]align="right">([^<]+)</td> # + lasthere </tr> $ ,x ) { return_next ( { rank => $1 , userid => $2 , username => $3 , xp => $4 , userlevel => $5 , writeups => $6 , usersince => $7 , lasthere => $8 } ); } } return undef; $plp$;

    Now you can query that remote HTML page with SQL. It is slow, of course; I guess it only makes sense to do this if you save it into a table (see CREATE TABLE t AS ...) or a materialized view (see CREATE MATERIALIZED VIEW mv AS ...).

    The SQL query below shows the top 50 rows:

    $ echo " select rank , userid , username , xp , userlevel , writeups , usersince , lasthere from saints_in_our_book() where rank <= 50 ; " | psql -qX rank | userid | username | xp | userlevel | wr +iteups | usersince | lasthere ------+--------+----------------------+---------+-----------------+--- +-------+---------------------+---------------- 1 | 979 | vroom | 1007449 | Pope (28) | + 607 | 1999-11-12 05:53:00 | 37 weeks ago 2 | 171588 | BrowserUk | 156800 | Pope (28) | + 22020 | 2002-06-04 18:12:00 | 13 minutes ago 3 | 381608 | ikegami | 126952 | Pope (28) | + 19570 | 2004-08-10 14:55:00 | 15 hours ago 4 | 5348 | Corion | 117813 | Pope (28) | + 9369 | 2000-03-14 13:56:00 | 4 hours ago 5 | 9073 | merlyn | 71106 | Sage (25) | + 6322 | 2000-04-25 22:41:00 | 5 weeks ago 6 | 461912 | GrandFather | 70930 | Sage (25) | + 6450 | 2005-05-31 03:02:00 | 8 seconds ago 7 | 22609 | tye | 69775 | Cardinal (24) | + 7769 | 2000-07-14 22:45:00 | 13 hours ago 8 | 616540 | moritz | 65723 | Cardinal (24) | + 6415 | 2007-05-21 12:33:00 | 23 hours ago 9 | 17000 | Ovid | 60423 | Cardinal (24) | + 2990 | 2000-06-08 04:57:00 | 1 week ago 10 | 85580 | dragonchild | 58436 | Archbishop (23) | + 5988 | 2001-06-04 23:41:00 | 4 days ago 11 | 1382 | chromatic | 58205 | Archbishop (23) | + 5941 | 1999-12-24 11:10:00 | 5 weeks ago 12 | 26179 | tilly | 57174 | Archbishop (23) | + 5646 | 2000-08-04 16:59:00 | 1 year ago 13 | 131741 | zentara | 56223 | Archbishop (23) | + 7239 | 2001-12-14 00:14:00 | 40 weeks ago 14 | 281137 | davido | 55541 | Archbishop (23) | + 4698 | 2003-08-05 19:52:00 | 16 hours ago 15 | 82147 | Zaxo | 54786 | Archbishop (23) | + 3416 | 2001-05-22 05:34:00 | 6 years ago 16 | 352046 | ww | 45944 | Bishop (22) | + 3539 | 2004-05-10 13:43:00 | 13 hours ago 17 | 290905 | Old_Gray_Bear | 45468 | Bishop (22) | + 587 | 2003-09-12 02:59:00 | 11 hours ago 18 | 29008 | grinder | 43661 | Bishop (22) | + 1570 | 2000-08-22 17:56:00 | 6 years ago 19 | 198160 | CountZero | 42943 | Bishop (22) | + 4284 | 2002-09-16 05:38:00 | 3 hours ago 20 | 324763 | marto | 42094 | Bishop (22) | + 4470 | 2004-01-28 20:01:00 | 24 hours ago 21 | 622051 | toolic | 41484 | Bishop (22) | + 3483 | 2007-06-19 16:21:00 | 10 hours ago 22 | 169744 | Abigail-II | 40866 | Bishop (22) | + 4089 | 2002-05-28 11:53:00 | 6 years ago 23 | 18800 | jeffa | 40192 | Bishop (22) | + 2764 | 2000-06-19 19:14:00 | 16 hours ago 24 | 22308 | dws | 38846 | Chancellor (21) | + 2536 | 2000-07-13 05:09:00 | 35 weeks ago 25 | 180961 | Limbic~Region | 38012 | Chancellor (21) | + 2960 | 2002-07-11 06:53:00 | 1 week ago 26 | 104919 | perrin | 36242 | Chancellor (21) | + 4171 | 2001-08-15 04:13:00 | 2 years ago 27 | 114691 | Aristotle | 35627 | Chancellor (21) | + 5335 | 2001-09-26 01:49:00 | 20 weeks ago 28 | 2680 | davorg | 34015 | Chancellor (21) | + 2988 | 2000-02-02 01:01:00 | 4 weeks ago 29 | 58196 | Fletch | 33799 | Chancellor (21) | + 3608 | 2001-02-13 23:08:00 | 2 days ago 30 | 80749 | tachyon | 32350 | Chancellor (21) | + 3268 | 2001-05-16 03:26:00 | 10 years ago 31 | 186362 | adrianh | 32159 | Chancellor (21) | + 2205 | 2002-07-30 20:57:00 | 2 years ago 32 | 44715 | graff | 31661 | Chancellor (21) | + 3880 | 2000-12-04 02:04:00 | 2 weeks ago 33 | 108447 | demerphq | 31508 | Chancellor (21) | + 3269 | 2001-08-28 18:16:00 | 1 year ago 34 | 248054 | Your Mother | 30978 | Chancellor (21) | + 2536 | 2003-04-04 13:42:00 | 14 hours ago 35 | 534893 | Gavin | 30799 | Chancellor (21) | + 598 | 2006-03-07 12:15:00 | 2 hours ago 36 | 533863 | roboticus | 30531 | Chancellor (21) | + 2888 | 2006-03-02 12:31:00 | 7 hours ago 37 | 176576 | eyepopslikeamosquito | 30092 | Chancellor (21) | + 1295 | 2002-06-23 06:02:00 | 13 minutes ago 38 | 442602 | planetscape | 30055 | Chancellor (21) | + 1290 | 2005-03-27 09:28:00 | 2 days ago 39 | 524150 | syphilis | 29621 | Canon (20) | + 2714 | 2006-01-19 07:30:00 | 4 hours ago 40 | 157432 | Joost | 29453 | Canon (20) | + 3245 | 2002-04-08 13:32:00 | 3 years ago 41 | 708738 | LanX | 29231 | Canon (20) | + 4599 | 2008-09-03 12:42:00 | 3 hours ago 42 | 1936 | japhy | 29144 | Canon (20) | + 2347 | 2000-01-10 19:44:00 | 42 weeks ago 43 | 224409 | borisz | 27898 | Canon (20) | + 913 | 2003-01-05 14:23:00 | 3 weeks ago 44 | 190859 | bart | 27165 | Canon (20) | + 2611 | 2002-08-17 14:36:00 | 51 weeks ago 45 | 528646 | Khen1950fx | 26518 | Canon (20) | + 1899 | 2006-02-07 22:12:00 | 12 hours ago 46 | 510280 | shmem | 26470 | Canon (20) | + 2519 | 2005-11-20 20:58:00 | 19 hours ago 47 | 421114 | Tanktalus | 26170 | Canon (20) | + 2178 | 2005-01-10 22:07:00 | 1 day ago 48 | 123953 | strat | 26010 | Canon (20) | + 538 | 2001-11-08 03:38:00 | 2 years ago 49 | 832495 | choroba | 25807 | Canon (20) | + 3418 | 2010-04-02 15:13:00 | 16 minutes ago 50 | 194920 | diotalevi | 25716 | Canon (20) | + 3030 | 2002-09-03 22:06:00 | 3 years ago (50 rows)

    UPDATE:Some lines are missing as I apparently made some error in the regex; I haven't the time to fix it now (maybe later).

    UPDATE 2: fixed. (error was: while (<$io>) { and then $io->getline *facepalm*)

    UPDATE 3: I added 'User Since' and 'Last Here' columns, and a computed column 'away' (=lasthere cast to postgres data type interval).

    -- clear earlier stuff drop table if exists saints_in_our_book_20150801 cascade; -- create the current html page download into a table: create table saints_in_our_book_20150801 as select * from sa +ints_in_our_book(); -- create a view out of that table (and add the 'away' colum): create or replace view saints_in_our_book as select *, replac +e(lasthere, ' ago', '')::interval as away from saints_in_our_book_201 +50801; -- query the view: $ echo " select * from saints_in_our_book -- the view on top of the table where away < '4 hours' order by now() - away desc , xp desc ; " | psql -qX rank | userid | username | xp | userlevel | wr +iteups | usersince | lasthere | away ------+---------+----------------------+--------+-----------------+--- +-------+---------------------+----------------+---------- 284 | 399498 | erix | 6400 | Vicar (15) | + 362 | 2004-10-15 13:22:00 | 4 seconds ago | 00:00:04 6 | 461912 | GrandFather | 70930 | Sage (25) | + 6450 | 2005-05-31 03:02:00 | 5 seconds ago | 00:00:05 108 | 70929 | atcroft | 15133 | Monsignor (18) | + 633 | 2001-04-09 08:59:00 | 12 seconds ago | 00:00:12 80 | 401112 | johngg | 18402 | Abbot (19) | + 1682 | 2004-10-21 10:23:00 | 17 seconds ago | 00:00:17 72 | 634253 | AnomalousMonk | 20401 | Abbot (19) | + 2819 | 2007-08-22 06:27:00 | 52 seconds ago | 00:00:52 634 | 310300 | wjw | 3212 | Curate (13) | + 371 | 2003-11-26 16:34:00 | 1 minute ago | 00:01:00 2 | 171588 | BrowserUk | 156800 | Pope (28) | + 22020 | 2002-06-04 18:12:00 | 5 minutes ago | 00:05:00 49 | 832495 | choroba | 25807 | Canon (20) | + 3418 | 2010-04-02 15:13:00 | 10 minutes ago | 00:10:00 37 | 176576 | eyepopslikeamosquito | 30092 | Chancellor (21) | + 1295 | 2002-06-23 06:02:00 | 13 minutes ago | 00:13:00 287 | 1001958 | karlgoethebier | 6345 | Vicar (15) | + 925 | 2012-11-02 12:43:00 | 20 minutes ago | 00:20:00 561 | 905403 | tangent | 3558 | Curate (13) | + 363 | 2011-05-18 02:41:00 | 21 minutes ago | 00:21:00 254 | 491819 | ioannis | 6917 | Vicar (15) | + 131 | 2005-09-14 11:10:00 | 27 minutes ago | 00:27:00 302 | 885521 | hippo | 6076 | Vicar (15) | + 505 | 2011-02-01 15:37:00 | 27 minutes ago | 00:27:00 350 | 221725 | poj | 5228 | Priest (14) | + 623 | 2002-12-22 10:42:00 | 27 minutes ago | 00:27:00 325 | 749850 | VinsWorldcom | 5667 | Vicar (15) | + 326 | 2009-03-11 12:42:00 | 38 minutes ago | 00:38:00 267 | 436161 | davies | 6628 | Vicar (15) | + 446 | 2005-03-03 11:39:00 | 42 minutes ago | 00:42:00 132 | 992570 | Laurent_R | 12485 | Monsignor (18) | + 1850 | 2012-09-09 08:56:00 | 48 minutes ago | 00:48:00 110 | 200365 | Tux | 14738 | Monsignor (18) | + 1440 | 2002-09-24 13:49:00 | 1 hour ago | 01:00:00 295 | 919638 | RichardK | 6195 | Vicar (15) | + 396 | 2011-08-10 11:34:00 | 1 hour ago | 01:00:00 35 | 534893 | Gavin | 30799 | Chancellor (21) | + 598 | 2006-03-07 12:15:00 | 2 hours ago | 02:00:00 71 | 968231 | Athanasius | 20547 | Abbot (19) | + 1233 | 2012-05-01 11:13:00 | 2 hours ago | 02:00:00 147 | 747201 | afoken | 11033 | Prior (17) | + 1139 | 2009-02-28 20:01:00 | 2 hours ago | 02:00:00 184 | 341121 | dave_the_m | 8997 | Parson (16) | + 849 | 2004-03-30 22:38:00 | 2 hours ago | 02:00:00 331 | 1015412 | soonix | 5615 | Vicar (15) | + 290 | 2013-01-25 21:42:00 | 2 hours ago | 02:00:00 367 | 222702 | parv | 5011 | Priest (14) | + 690 | 2002-12-28 08:05:00 | 2 hours ago | 02:00:00 19 | 198160 | CountZero | 42943 | Bishop (22) | + 4284 | 2002-09-16 05:38:00 | 3 hours ago | 03:00:00 39 | 524150 | syphilis | 29621 | Canon (20) | + 2714 | 2006-01-19 07:30:00 | 3 hours ago | 03:00:00 41 | 708738 | LanX | 29231 | Canon (20) | + 4599 | 2008-09-03 12:42:00 | 3 hours ago | 03:00:00 74 | 295576 | ambrus | 19603 | Abbot (19) | + 2740 | 2003-10-01 11:47:00 | 3 hours ago | 03:00:00 88 | 446266 | salva | 17322 | Abbot (19) | + 2243 | 2005-04-09 17:47:00 | 3 hours ago | 03:00:00 576 | 1039428 | Eily | 3498 | Curate (13) | + 278 | 2013-06-17 17:59:00 | 3 hours ago | 03:00:00 (31 rows)

    Now I am at the top of 'Saints in our book' ;)

      Good stuff. I forgot about external tables. Though, i'm not sure PM would appreciate that too much anyway. :)

      I guess security is out the window when you retrieve stuff from remote sites straight into your database

      Every once in a while i want to query or move data between multiple environments. Normally, this would require a database link which requires authorization by the DBAs and so on. So, i installed a local database with database links. Very convenient, at least. :)

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: monkdiscuss [id://1136765]
Approved by shmem
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others studying the Monastery: (4)
As of 2024-04-20 00:51 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found