Beefy Boxes and Bandwidth Generously Provided by pair Networks
Welcome to the Monastery

Seekers of Perl Wisdom

( #479=superdoc: print w/replies, xml ) Need Help??

If you have a question on how to do something in Perl, or you need a Perl solution to an actual real-life problem, or you're unsure why something you've tried just isn't working... then this section is the place to ask.

However, you might consider asking in the chatterbox first (if you're a registered user). The response time tends to be quicker, and if it turns out that the problem/solutions are too much for the cb to handle, the kind monks will be sure to direct you here.

Post a new question!

User Questions
Seeking a way to access a network user's ID
2 direct replies — Read more / Contribute
by TorontoJim
on Jan 29, 2023 at 06:56
    Can a lady or gentleman please help me find the correct resource. I am looking for information that I can use in an app I'm building.

    It will be run on a LAN behind a firewall. It would be far more convenient if the user could be logged in automatically if they are already logged in to the network it is running on. So, I'm seeking information on accessing a network users's userid and login status. I'm not looking for specific code, just guidance to the proper keywords or terms that I need to research to achieve this end goal, if it is even possible.

    I don't mind doing the legwork, I just don't know what the specific legwork is that I must do right now.

Why don't some CPAN modules append perllocal?
3 direct replies — Read more / Contribute
by Anonymous Monk
on Jan 28, 2023 at 21:15
    perl -MExtUtils::Installed -le '$_=ExtUtils::Installed->new->modules;p +rint'
    perl -le '@_=grep/Module/,split/\n/,`perldoc -uT perllocal`;print scal +ar@_'
    perl -MExtUtils::Installed -le '@_=ExtUtils::Installed->new->modules;p +rint for sort{lc$a cmp lc$b}@_' >mod.ext perl -e '@_=grep/Module/,`perldoc -uT perllocal`;for(@_){s/.*?\|([^>]+ +)>/$1/}print for sort{lc$a cmp lc$b} @_' >mod.loc diff -y mod.ext mod.loc

    diff -y mod.ext mod.loc|grep '<'|wc -l
    diff -y mod.ext mod.loc|grep '>'|wc -l
What to test in a new module
3 direct replies — Read more / Contribute
by Bod
on Jan 28, 2023 at 17:16

    I've created a helper function for my own purposes and thought it would be useful to others. So CPAN seems a sensible place to put it so others can use it if they want to...

    It's function is simple - to go to the homepage of a website and return an array of URI's within that site, being careful not to stray outside it, that use the http or https scheme. It ignores things that aren't plain text or that it cannot parse such as PDFs or CSS files but includes Javascript files as links (thing like or document.location.href) might be lurking there. It deliberately doesn't try to follow the action attribute of a form as that is probably meaningless without the form data.

    As the Monastery has taught be that all published modules should have tests, I want to do it probably and provide those tests...

    But, given that there is only one function and it makes HTTP requests, what should I test?

    The obvious (to me) test is that it returns the right number of URIs from a website. But that number will likely change over time, so I cannot hardcode the 'right' answer into the tests. So beyond the necessary dependencies and their versions, I'd like some ideas of what should be in the tests, please.

    In case you're interested, this came about from wanting to automate producing and installing sitemap files.

Any reference about %SIG
3 direct replies — Read more / Contribute
by exilepanda
on Jan 28, 2023 at 03:56
    I want to study more on SIGNAL. But after half hour digging, there's nothing useful. I am looking some full documentation that explains

    $SIG{__DIE__} will fire when die() happen
    $SIG{INT} will fire when Ctrl-C pressed

    Is that anything like this exists?

Can't locate module in @INC when executing script over SSH
1 direct reply — Read more / Contribute
by Leudwinus
on Jan 27, 2023 at 13:55

    I have the following script (json_test) on a remote FreeBSD virtual machine which runs fine when executed directly on the VM:

    #!/usr/bin/env perl use lib "/home/user/perl5/lib"; use JSON; # line 4 print JSON->new->pretty->encode( { qw( a b ) } );

    However, I get the following error message when I try to run it via SSH from my local machine:

    leudwinus@localmachine:~$ ssh user@remotemachine './json_test' Can't locate in @INC (you may need to install the JSON module) + (@INC contains: /home/user/perl5/lib /usr/local/lib/perl5/site_perl/ +mach/5.32 /usr/local/lib/perl5/site_perl /usr/local/lib/perl5/5.32/mach /usr/loc +al/lib/perl5/5.32) at ./json_test line 4. BEGIN failed--compilation aborted at ./json_test line 4.

    Any suggestions on how I can execute this script remotely via SSH?

    On the virtual machine, I get the following information:

    user@remotemachine:~$ perldoc -l JSON /home/user/perl5/lib/perl5/ user@remotemachine:~$ perl -V Summary of my perl5 (revision 5 version 32 subversion 1) configuration +: ... @INC: /home/user/perl5/lib/perl5/5.32.1/amd64-freebsd-thread-multi /home/user/perl5/lib/perl5/5.32.1 /home/user/perl5/lib/perl5/amd64-freebsd-thread-multi /home/user/perl5/lib/perl5 /home/user/perl5/lib/perl5/5.32.1/amd64-freebsd-thread-multi /home/user/perl5/lib/perl5/5.32.1 /home/user/perl5/lib/perl5/amd64-freebsd-thread-multi /home/user/perl5/lib/perl5 /usr/local/lib/perl5/site_perl/mach/5.32 /usr/local/lib/perl5/site_perl /usr/local/lib/perl5/5.32/mach /usr/local/lib/perl5/5.32

    JSON is installed on my local Ubuntu machine at /home/leudwinus/perl5/lib/perl5/ but I didn't think that would matter.

    Thanks in advance!

CGI Webserver In Perl?
3 direct replies — Read more / Contribute
by sectokia
on Jan 26, 2023 at 18:52

    Hi monks,

    I have a large amount of legacy CGI Perl scripts, which were made to be executed by Apache modperl.

    What I want to do is move to a http server inside a perl script (like AnyEvent::HTTPD) that supports calling the perl CGI scripts - and then package the entire thing as a single exe using par. This is so users can run it without having to have either apache or perl installed.

    Now CGI seems fairly straight forward (content to STDIN, headers to ENV, and STDOUT is the HTTP response), but I figured someone has probably done this before? Or are there better ideas?

    I am essentially doing this to execute the cgi in the httpd request callback:

    { local *STDOUT; local *STDIN; #local %ENV; #$ENV{http-cookie} = $requestCookie; open(STDIN, "<", \$requestBody); open (STDOUT, '>>', \$response); do './cgi/'; }
help with "symbol lookup error" message
3 direct replies — Read more / Contribute
by Special_K
on Jan 26, 2023 at 13:20

    I have the following code:

    #!/tool/bin/perl -w use lib '/home/user_foo/perl_modules/lib/perl5'; use List::MoreUtils qw(uniq); my @test_array = qw(1 2 3 4 4 4 4); my @uniq_test_array = uniq(@test_array); printf("\@uniq_test_array = @uniq_test_array\n");

    The output is as follows (as expected):

    @uniq_test_array = 1 2 3 4

    Now if I add another module reference (File::Copy) to the top of the file like so:

    #!/tool/bin/perl -w use lib '/home/user_foo/perl_modules/lib/perl5'; use List::MoreUtils qw(uniq); use File::Copy;

    I receive the following error message:

    /tool/bin/perl: symbol lookup error: /home/user_foo/perl_modules/lib/p +erl5/x86_64-linux/auto/List/Util/ undefined symbol: Perl_gv_i +nit

    The error references the List::Util module, but doesn't occur unless I add the "use File::Copy", which is a completely separate module. What is going on here?

Script exponentially slower as number of files to process increases
5 direct replies — Read more / Contribute
by xnous
on Jan 25, 2023 at 15:04
    Hello, monks. With your help I've written a script that processes a large number of text files, efficiently. I run this script inside directories containing 1K to 10K files, usually less than 5K.

    However, I've noticed that attempting to process larger number of files, i.e. several directories at once, the script gets exponentially slower. For example, while a run on 3.5K files would takes around 4.5 seconds, on 35K files takes 90 instead of 45 seconds and on 350K files it runs for hours.

    This has baffled me, as I'm using subdirectories to organize the data, and filesystem operations shouldn't impact performance negatively; additionally, the data filenames are glob()bed into an array which is looped over and not slurped in at once and processed in bulk (although, in my tests I tried that approach which exhibited the same behavior).

    What's very interesting is that when I put a counter to stop processing at 1000 files, I got increasingly longer processing times with each subdirectory added to the list, despite only processing 1000 files from it. Also, I always copy my data to /tmp which is mounted as tmpfs to reduce SSD wear and achieve maximum read/write performance. Testing:

    wget html2text-cpp Nightfall.htm >nightfall.txt mkdir 00; for i in `seq -w 0 3456`; do head -$((RANDOM/128)) nightfall +.txt >00/data-$i; done
    This will create a directory ("00") with 3,456 random sized files inside. Perl script:
    #!/usr/bin/perl use strict; use warnings; use 5.36.0; use Env; use utf8; use POSIX "sys_wait_h"; #for waitpid FLAGS use Time::HiRes qw(gettimeofday tv_interval); use open ':std', ':encoding(UTF-8)'; my $benchmark = 1; # print timings for loops my $TMP='/tmp'; my $HOME = $ENV{HOME}; my $IN; my $OUT; my @data = glob("data-* ??/data-*"); my $filecount = scalar(@data); die if $filecount < 0; say "Parsing $filecount files"; my $wordfile="data.dat"; truncate $wordfile, 0; #$|=1; # substitute whole words my %whole = qw{ going go getting get goes go knew know trying try tried try told tell coming come saying say men man women woman took take lying lie dying die }; # substitute on prefix my %prefix = qw{ need need talk talk tak take used use using use }; # substitute on substring my %substring = qw{ mean mean work work read read allow allow gave give bought buy want want hear hear came come destr destroy paid pay selve self cities city fight fight creat create makin make includ include }; my $re1 = qr{\b(@{[ join '|', reverse sort keys %whole ]})\b}i; my $re2 = qr{\b(@{[ join '|', reverse sort keys %prefix ]})\w*}i; my $re3 = qr{\b\w*?(@{[ join '|', reverse sort keys %substring ]})\w*} +i; truncate $wordfile, 0; my $maxforks = 64; print "maxforks: $maxforks\n"; my $forkcount = 0; my $infile; my $subdir = 0; my $subdircount = 255; my $tempdir = "temp"; mkdir "$tempdir"; mkdir "$tempdir/$subdir" while ($subdir++ <= $subdircount); $subdir = 0; my $i = 0; my $t0 = [gettimeofday]; my $elapsed; foreach $infile(@data) { $forkcount -= waitpid(-1, WNOHANG) > 0 while $forkcount >= $maxfor +ks; # do { $elapsed=tv_interval($t0); print "elapsed: $elapsed\n"; die; + } if $i++ >1000; # 1000 files test $i++; # comment out if you uncomment the above line $subdir = 1 if $subdir++ > $subdircount; if (my $pid = fork) { # $pid defined and !=0 -->parent ++$forkcount; } else { # $pid==0 -->child open my $IN, '<', $infile or exit(0); open my $OUT, '>', "$tempdir/$subdir/text-$i" or exit(0); while (<$IN>) { tr/-!"#%&()*',.\/:;?@\[\\\]_{}><^)(|/ /; # no punct " s/^/ /; s/\n/ \n/; s/[[:digit:]]{1,12}//g; s/w(as|ere)/be/gi; s{$re2}{ $prefix{lc $1} }g; # prefix s{$re3}{ $substring{lc $1} }g; # part s{$re1}{ $whole{lc $1} }g; # whole print $OUT "$_"; } close $OUT; close $IN; defined $pid and exit(0); # $pid==0 -->child, must exit itself } } ### now wait for all children to finish, no matter who they are 1 while wait != -1; # avoid zombies this is a blocking operation local @ARGV = glob("$tempdir/*/*"); my @text = <>; unlink glob "$tempdir/*/*"; open $OUT, '>', $wordfile or die "Error opening $wordfile"; print $OUT @text; close $OUT; $elapsed = tv_interval($t0); print "regex: $elapsed\n" if $benchmark;
    Add more directories to process:

    for dir in $(seq -w 01 10); do cp -a 00 $dir; done

    Any help and insight will be greatly appreciated.

Handling reconnection with Mail::IMAPClient
2 direct replies — Read more / Contribute
by Discipulus
on Jan 25, 2023 at 04:43
    Hello folks,

    I'm using Mail::IMAPClient to write my own mail monitor and currently trying handling connection problems, simulated simply detaching the ethernet cable. Ideally I'd like the program to retry a connection for some times before giving definitevely up.

    With both intuitive solution like $imap->connect() if $imap->IsUnconnected and also reinstanciating the whole $imap object ( new calls connect or login on its own) I get a nasty loop of: Lost connection to ... retry 1 retry 2.. [I plug eth] Succesfully reconnected .. Lost connection to ..

    Here the code reduced to the minimum (I left the read password echo(0) to play nicely with your terminal security ;)

    use strict; use warnings; use Mail::IMAPClient; use Term::ReadKey; my $user = ''; my $server = ''; my $port = 993; my $ssl = 1; # generally ssl is used my $imap_folder = "INBOX"; my $sleep = 5; print "\n(use CTRL-C to permit logout)\n"; print "Enter the password for $user on $server\n"; my $password; ReadMode('noecho'); $password = ReadLine(0); chomp $password; ReadMode 'normal'; my $imap = Mail::IMAPClient->new( Server => $server, User => $user, password => $password, Port => $port, Ssl=> $ssl, Uid=> 1, ) or die "IMAP Failure: $@"; print "Logged in succesfully\n" if $imap->IsConnected(); # Handle Ctrl-C $SIG{INT} = sub{ print "\n\nLogging out..\n"; $imap->logout(); exit; }; # do NOT mark as read when handling messages $imap->Peek(1); # look in INBOX $imap->select( $imap_folder ) or die "IMAP Select Error for imap folde +r [$imap_folder]: $@"; my $now = time; my %seen = map { $_ => 1} $imap->sentsince($now); while (1){ ################################################################## +############## # Handling a lost connection ################################################################## +############## if ( $imap->IsUnconnected ){ print "Lost connection to $server\n"; for (1..100){ sleep 1 for 1..5; print "Connection retry $_..\n"; # option 1 # enters in a loop connected / unconnected # $imap->connect(); # option 2 # also this enters in a loop $imap = Mail::IMAPClient->new( Server => $server, User => $user, password => $password, Port => $port, Ssl=> $ssl, Uid=> 1, ); next unless $imap; last if $imap->IsConnected(); } # give up.. die "Reconnection impossible!" if $imap->IsUnconnected(); # if here we succeded print "Succesfully reconnected!\n"; } # as the above solutions do not work.. next if $imap->IsUnconnected(); ################################################################## +############## my @msgs = $imap->sentsince($now); foreach my $msg (@msgs){ # without this line I get an undef $msg entry in %seen next unless defined $msg; next if $seen{$msg}; $seen{ $msg }++; print "New message: ", $imap->get_header( $msg, "Subject" ), " from: ", $imap->get_header( $msg, "From" ),"\n"; } sleep 1 for 1..$sleep; $now = time; }

    Some idea on how reconnect safely and really?


    There are no rules, there are no thumbs..
    Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.
approximating geological problems with highway data
5 direct replies — Read more / Contribute
by Aldebaran
on Jan 25, 2023 at 02:19

    Hello Monks,

    I'm doing what people have to do in mormon country to save their sanity: study the rocks, which turns out to be, by far, the most interesting things in this region. So I came up with a script that was going to tell me something about the Bonneville flood, which I've been digging through the last 30 years of my life without having known it until last year. It is literally true that I brought the internet to Idaho in the form of walking in front of the caterpillar that ripped a channel for the fiberobtic cable from Preston to Spokane. (1989)

    For the distance calc, I used as a source. I supply a date and time that isn't relevant yet but stubbed in to be used eventually. Let me me just trot this thing out: It's behaving pretty well, I think. I don't mean to thumb my nose at the metric system by using feet and miles, but it's definitely the parlance of the locals:

    fritz@laptop:~/Documents/gitlab1$ ./ Subroutine get_logger redefined at ./ line 68. INFO: ./ INFO: pi is 3.14159265358979 INFO: Bonneville max altitude in feet: 5200 INFO: Boise 43.61 -116.2 INFO: return from the google is 821 meters INFO: Altitude in feet is 2693.56964 INFO: Difference from max Bonneville elevation is 2506.43036 ft INFO: ============== INFO: distance is 240.161359349768 miles INFO: ============== INFO: near sublett 42.3278114 -113.2104084 INFO: return from the google is 1455 meters INFO: Altitude in feet is 4773.6222 INFO: Difference from max Bonneville elevation is 426.3778 ft INFO: ============== INFO: distance is 175.16979925839 miles INFO: ============== INFO: snowville 41.9655701 -112.7105234 INFO: return from the google is 1384 meters INFO: Altitude in feet is 4540.68256 INFO: Difference from max Bonneville elevation is 659.31744 ft INFO: ============== INFO: distance is 35.8058824322112 miles INFO: ============== INFO: juniper 42.152429 -112.9842191 INFO: return from the google is 1577 meters INFO: Altitude in feet is 5173.88468 INFO: Difference from max Bonneville elevation is 26.1153199999999 ft INFO: ============== INFO: distance is 19.0729839145012 miles INFO: ============== INFO: on the outcrop 40.703684 -111.7849951 INFO: return from the google is 1752 meters INFO: Altitude in feet is 5748.03168 INFO: Difference from max Bonneville elevation is -548.03168 ft INFO: ============== INFO: distance is 117.801976807568 miles INFO: ============== INFO: Cascade way 40.7062734 -111.7941259 INFO: return from the google is 1524 meters INFO: Altitude in feet is 5000.00016 INFO: Difference from max Bonneville elevation is 199.99984 ft INFO: ============== INFO: distance is 0.510598364633098 miles INFO: ============== INFO: Mantua 41.5073303 -111.944728 INFO: return from the google is 1582 meters INFO: Altitude in feet is 5190.28888 INFO: Difference from max Bonneville elevation is 9.71111999999994 ft INFO: ============== INFO: distance is 55.8975408837833 miles INFO: ============== INFO: dry creek 41.5501001 -111.9537977 INFO: return from the google is 1810 meters INFO: Altitude in feet is 5938.3204 INFO: Difference from max Bonneville elevation is -738.3204 ft INFO: ============== INFO: distance is 2.99196958759918 miles INFO: ============== INFO: wellsville 41.6365147 -111.9288947 INFO: return from the google is 1376 meters INFO: Altitude in feet is 4514.43584 INFO: Difference from max Bonneville elevation is 685.56416 ft INFO: ============== INFO: distance is 6.1074736112501 miles INFO: ============== fritz@laptop:~/Documents/gitlab1$

    So far, so good, and I welcome any style, usage, or numeric improvements. I might have 2 significant figures, but how much width can I fill up? Let's take a look at the dependencies:

    fritz@laptop:~/Documents/gitlab1$ grep use use v5.030; # strictness implied use warnings; use feature qw[ signatures ]; use Data::Dumper; use DateTime; use DateTime::Format::ISO8601; use DateTime::TimeZone; use Log::Log4perl; use Try::Tiny; use LWP::UserAgent; use HTTP::Request; use Data::Roundtrip;

    I have to use Log4perl to keep this type of data straight, and I use this hack of Log4perl so as to not have a .conf file dependency. Otherwise heavy with a lot of software from bliako. I wouldn't have been able to get these values right in geologic time without bliako's help at this level:

    # this request was translated from Curl command-line # by [Corion]'s my $req = HTTP::Request->new( 'POST' => ' +s_Advanced_GoogleMapCoords_Component/Component/json-get-elevation', [ 'Connection' => 'keep-alive', 'Accept' => '*/*', 'Accept-Encoding' => 'gzip, x-gzip, deflate, x-bzip2, bzip2', 'Accept-Language' => 'en-US,en;q=0.5', # 'Host' => '', 'Referer' => '', 'User-Agent' => 'Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Firef +ox/78.0', 'Content-Length' => $payloadlen, 'Content-Type' => 'application/x-www-form-urlencoded; charse +t=UTF-8', 'DNT' => '1', 'Origin' => '', 'X-Requested-With' => 'XMLHttpRequest' ], $payload ); die "call to HTTP::Request has failed" unless $req;

    What's the problem? 1. These values are ultimately dependent on Google, LLC. I'm happy that I can make use of them at this heartbeat, but I'm looking for other sources.

    Q1) Does anyone know of another source for elevations on the interweb, given latitude and longitude? It seems like something more than one crappy company could pull off, in the age of satellites and lidar.

    Q2) How do I make make such queries without associating it to my IP address?

    changing tack...

    Rather than continuing in this line, let me try to get where I'm going by different methods. As I followed the path of the flood from Brigham City to Pocatello, I realized that I was on US highway 91 the entire time. If I instead had a dataset that represented that path, then I would also have a gradient for the flood. Let me try to represent what I have:,+UT/Logan,+UT/Preston,+ID +/Pocatello,+ID/@42.1742739,-113.24861,8z/data=!3m1!4b1!4m26!4m25!1m5! +1m1!1s0x87537b19bce31079:0xeeb9df2a3b92a006!2m2!1d-112.0155015!2d41.5 +102129!1m5!1m1!1s0x87547de05542a865:0xa9b33d0bcbbebcd5!2m2!1d-111.833 +8359!2d41.7369803!1m5!1m1!1s0x8754f78c7cda6c31:0xf1b3b4fc465a4a3f!2m2 +!1d-111.8766173!2d42.0963133!1m5!1m1!1s0x53554f20d5d09b0d:0x70c6a2484 +ce39b0!2m2!1d-112.4506191!2d42.8621042!3e0

    I know that very few Germans will have been to Pocatello, but I find them to be some of the greatest enthusiasts for the rifted out sections of the American west. All that nothing, but then it's also so rich because it was a pluvial lake. The potatoes love it.

    Q3) If not Highway 91, is there any other highway you can glean the points from?

    Q4) Can we thus approximate the Mosel river from Trier to Koblenz? How many points do you need to drive a car next to this river (without falling in), as I remember doing many moons ago?

    Ok, let me just throw this out there and see what I get.

    Thanks for your comment and cheers,

Add your question
Your question:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":

  • Are you posting in the right place? Check out Where do I post X? to know for sure.
  • Posts may use any of the Perl Monks Approved HTML tags. Currently these include the following:
    <code> <a> <b> <big> <blockquote> <br /> <dd> <dl> <dt> <em> <font> <h1> <h2> <h3> <h4> <h5> <h6> <hr /> <i> <li> <nbsp> <ol> <p> <small> <strike> <strong> <sub> <sup> <table> <td> <th> <tr> <tt> <u> <ul>
  • Snippets of code should be wrapped in <code> tags not <pre> tags. In fact, <pre> tags should generally be avoided. If they must be used, extreme care should be taken to ensure that their contents do not have long lines (<70 chars), in order to prevent horizontal scrolling (and possible janitor intervention).
  • Want more info? How to link or How to display code and escape characters are good places to start.
Log In?

What's my password?
Create A New User
Domain Nodelet?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others avoiding work at the Monastery: (5)
As of 2023-01-30 17:47 GMT
Find Nodes?
    Voting Booth?

    No recent polls found