PerlMonks  

Meditations


If you've discovered something amazing about Perl that you just need to share with everyone, this is the right place.

This section is also used for non-question discussions about Perl, and for any discussions that are not specifically programming related. For example, if you want to share or discuss opinions on hacker culture, the job market, or Perl 6 development, this is the place. (Note, however, that discussions about the PerlMonks web site belong in PerlMonks Discussion.)

Meditations is sometimes used as a sounding-board — a place to post initial drafts of perl tutorials, code modules, book reviews, articles, quizzes, etc. — so that the author can benefit from the collective insight of the monks before publishing the finished item to its proper place (be it Tutorials, Cool Uses for Perl, Reviews, or whatever). If you do this, it is generally considered appropriate to prefix your node title with "RFC:" (for "request for comments").

User Meditations
Let's make BBQ a Saint!
5 direct replies — Read more / Contribute
by eyepopslikeamosquito
on Aug 04, 2023 at 06:28
Perl's not dead, and neither is the community
2 direct replies — Read more / Contribute
by talexb
on Jul 21, 2023 at 11:31

    Last week, I hosted The Perl and Raku Conference (TPRC) 2023 in Toronto, Canada. We had under a hundred attendees, and we had a three day schedule of sessions with three tracks. There was also a hackathon Monday and Friday, and Dave Rolsky put on a one day course in Go on the Friday.

    I've been going to these conferences on and off for about twenty years (2000, 2001, 2002, 2012, 2019 and 2022), so I had a pretty good idea how they work. Putting on my own conference was eye-opening, but what really moved me was the impressive number of volunteers that helped out. Some were people who didn't know much about Perl but came out anyway, and I also had speakers jump in to help with A/V setup and all kinds of other details, like making up badges. It was fabulous.

    Our keynote speaker was Curtis Poe (Ovid), who talked about Cor, the new object layer that's an experimental feature in Perl 5.38 (just released). We also had Paul Evans (leonerd, the current pumpking), who gave a talk about what was new in this new version of Perl. The talks, as well as a pile of Lightning Talks, are in the process of being edited together and uploaded to YouTube. And next year's conference is already planned for Las Vegas, Nevada in June 2024.

    Yeah, Perl's an old language. But it's still alive and well. :)

    Alex / talexb / Toronto

    Thanks PJ. We owe you so much. Groklaw -- RIP -- 2003 to 2013.

EyeBall stumps BodBall (Error Handling)
4 direct replies — Read more / Contribute
by eyepopslikeamosquito
on Jul 06, 2023 at 20:18

    However, I will not call die. I find it frustrating when modules die.

    -- Bod in Re^6: STDERR in Test Results

    While I doubt Bod, hailing from (working-class) Coventry UK, would be permitted to enter the hallowed Long Room at Lords to hurl abuse at the Australian cricket team during the Ashes test match last weekend, I'm sure he won't be stumped by this meditation's title ... unlike monks from non-cricket-playing nations, doubtless unfamiliar with Bazball :).

    Bodball, you may recall I once scolded you for asking "what should I test?" long after you'd released your module. I similarly urge you to get into the habit of thinking long and hard about your module's error handling well before you release it, and for the same reasons. Like TDD, it's crucial to perform this error-handling analysis early because doing so will likely change your module's interface.

    Further to the excellent general advice you've already received from afoken, I'm interested to learn more about the errors you commonly encounter in practice when using your Business::Stripe::Webhook module. I also urge you to add an expanded "Error Handling" section to your module's documentation.

    General Error Handling Advice

    Don't fail silently. Failure is inevitable; failing to report failures is inexcusable. Failing silently causes the following problems:

    • Users wonder whether something has gone wrong. ("Why did my order not go through?")
    • Customer support wonders what caused a problem. ("The log file gave no indication of a problem")

    Embrace your software's fallibility. Assume that humans will make mistakes using your software. Try to minimize ways for people to misuse your software, but assume that you can't completely eliminate misuse. Therefore, plan error messages as you design software.

    -- General error handling advice from Google Error Messages course

    Programming Tips

    What should a function do if it cannot perform its allocated task?

    • return a value indicating failure
    • throw an exception
    • terminate the program

    Return failure when:

    • an error is normal and expected (e.g. opening a file)
    • an immediate caller can reasonably be expected to handle the failure

    Throw an exception when:

    • an error is so rare that the programmer is likely to forget to check for it
    • an error cannot be handled by the immediate caller
    • new kinds of errors are added in lower modules that higher level modules were not written to cope with
    • no suitable return path for error codes is available (e.g. semipredicate problem)
    • return path of a function is made uglier by the need to return an error indicator
    • the function that found the error was a callback
    • an error requires an "undo" action (unlike RAII say)

    This is not a black and white issue. Experience and good taste are required.
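
    To make the distinction above concrete, here is a minimal Perl sketch of the two styles. The function names (lookup_price, order_total) and the data are hypothetical, invented purely for illustration; they are not taken from any module discussed here.

```perl
use strict;
use warnings;

# Return failure: a missing item is normal and expected, so the
# immediate caller gets undef and decides what to do about it.
sub lookup_price {
    my ( $prices, $item ) = @_;
    return exists $prices->{$item} ? $prices->{$item} : undef;
}

# Throw an exception: a negative quantity is a rare programming error
# that callers are likely to forget to check for, so die with a message.
sub order_total {
    my ( $prices, $item, $qty ) = @_;
    die "order_total: quantity must be non-negative, got $qty\n" if $qty < 0;
    my $price = lookup_price( $prices, $item );
    die "order_total: unknown item '$item'\n" unless defined $price;
    return $price * $qty;
}

my %prices = ( apple => 3, pear => 5 );

# The caller of the returning function checks the value directly.
my $price = lookup_price( \%prices, 'kiwi' );
print defined $price ? "kiwi costs $price\n" : "kiwi is not stocked\n";

# The caller of the throwing function traps the exception with eval.
my $total = eval { order_total( \%prices, 'apple', -2 ) };
print "order failed: $@" unless defined $total;
```

    Which style suits a given function remains a judgment call, as the tips above say; the sketch only shows what each looks like from the caller's side.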

    Business::Stripe::Webhook Error Handling

    Though unfamiliar with your Business::Stripe::Webhook domain, I briefly browsed your module's documentation. Good to see you've already written a short "Errors and Warnings" section in its documentation; I suggest you improve and expand this section for the next release.

    AFAICT, your basic error handling strategy is for your methods to set the error property, for example:

    $vars{'error'} = 'Missing payload data'
    with the module user expected to check this error property after calling each method. Is that right?

    I think a clear statement of your overall error-handling strategy, combined with a couple of real-world examples of handling common errors you've experienced when using your module, would be invaluable to your users ... and may cause you to tweak your module's error-handling code and interface ... which is why this step is ideally performed well before release. :)
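
    As a rough illustration of the "set an error property, caller checks it" strategy described above, here is a minimal sketch. The My::Handler class, its process method and its error accessor are hypothetical stand-ins; this is not the Business::Stripe::Webhook interface, only the general pattern.

```perl
use strict;
use warnings;

package My::Handler;    # hypothetical class, for illustration only

sub new {
    my ($class) = @_;
    return bless { error => '' }, $class;
}

# Accessor for the most recent error message ('' means no error).
sub error { return $_[0]{error} }

sub process {
    my ( $self, $payload ) = @_;
    $self->{error} = '';    # clear any stale error from a previous call
    if ( !defined $payload || $payload eq '' ) {
        $self->{error} = 'Missing payload data';
        return;             # undef in scalar context
    }
    return length $payload;
}

package main;

my $h      = My::Handler->new;
my $result = $h->process('');

# The module user is expected to check the error property after each call.
if ( $h->error ) {
    print "process failed: ", $h->error, "\n";
}
```

    One design point this makes visible: the method has to clear the property at the start of every call, otherwise a stale error from an earlier call can be mistaken for a new failure.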

    See Also

    Updated: minor changes to wording were made shortly after posting. Added more references to the See Also section.

Solving the Long List is Long challenge, finally?
6 direct replies — Read more / Contribute
by marioroy
on Jul 01, 2023 at 03:59

    Chuma posted an interesting Long list is long challenge in October 2022. eyepopslikeamosquito created a separate thread one month later. Many solutions were provided by several folks.

    My goal was keeping memory consumption low no matter if running a single thread or 20+ threads. Ideally, running more threads should run faster. It turns out that this is possible. Ditto, zero merge overhead as the keys are unique. Just move the elements from all the sub-maps over to a vector for sorting and output.

    In a nutshell, the following is the strategy used for the hash-map solutions in the latest June 2023 refresh.

    1. create many sub-maps and mutexes
    2. parallel single-file via chunking versus parallel list of files
    3. create a hash-value for the key and store the value with the key
    4. determine the sub-map to use by hash-value MOD number-of-maps
    5. there are total 963 sub-maps to minimize locking contention
    6. randomness kicks in, allowing many threads to run
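
    The sub-map selection in steps 3 to 5 can be sketched in Perl as follows. This is a single-threaded illustration only: the real solutions are in C++ with a mutex per sub-map, and the hash function (a simple djb2-style hash), the helper names and the final sort order are my own assumptions, not marioroy's code.

```perl
use strict;
use warnings;

use constant NUM_MAPS => 963;    # sub-map count quoted in the list above

# Simple djb2-style string hash; a stand-in for whatever the real code uses.
sub hash_key {
    my ($key) = @_;
    my $h = 5381;
    $h = ( $h * 33 + ord($_) ) % 4294967296 for split //, $key;
    return $h;
}

my @sub_maps = map { +{} } 1 .. NUM_MAPS;

# Pick the sub-map by hash-value MOD number-of-maps, then update it.
# In threaded code, only the mutex guarding sub-map $idx would be locked.
sub add_count {
    my ( $key, $count ) = @_;
    my $idx = hash_key($key) % NUM_MAPS;
    $sub_maps[$idx]{$key} += $count;
    return $idx;
}

add_count( @{$_} ) for ( [ foo => 3 ], [ bar => 1 ], [ foo => 2 ], [ baz => 3 ] );

# A key always lands in the same sub-map, so keys are unique across
# sub-maps and "merging" is just moving the pairs into one list.
my @all = map { my $m = $_; map { [ $_, $m->{$_} ] } keys %{$m} } @sub_maps;

# Assumed ordering: count descending, then key ascending.
for my $pair ( sort { $b->[1] <=> $a->[1] || $a->[0] cmp $b->[0] } @all ) {
    print "$pair->[0]\t$pair->[1]\n";
}
```

    Because the sub-map index depends only on the key, two threads contend only when their keys happen to hash to the same sub-map, which is why a large number of sub-maps keeps locking contention low.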

    Thank you, Gregory Popovitch. He identified the last off-by-one error in my C++ chunking logic and shared a couple of suggestions. See issue 198. Thank you, eyepopslikeamosquito for introducing me to C++. Thank you, anonymous monk. There, our anon-friend mentioned the word parallel. So, we tried running parallel in C++. Eventually, chunking too. :)

Fishnet is not a color
2 direct replies — Read more / Contribute
by ambrus
on Jun 21, 2023 at 09:38
Please review documentation of my AI::Embedding module
2 direct replies — Read more / Contribute
by Bod
on Jun 02, 2023 at 17:26

    Could you please take a look at the documentation for my new module and let me know if it makes sense? I always find that I am too close to the module and know what everything is supposed to do. In short, I have the Curse of Knowledge!

    Here is the documentation

    Why is it you only find typos after publishing?
    The second raw_embedding method should read test_embedding in both the heading and the sample code. I've corrected this error now.

    Thank you greatly for helping me get this right...

    Edit:

    Changed title from "RFC - Documentation Review" to "Please review documentation of my AI::Embedding module" as considered by erzuuli

Vote For Perl
3 direct replies — Read more / Contribute
by harangzsolt33
on May 29, 2023 at 16:30
CPAN namespace for AI Embedding module
1 direct reply — Read more / Contribute
by Bod
on May 28, 2023 at 17:41

    Wise Monks...

    Having written some code that uses Embeddings to compare pieces of text, I feel this would be useful to others. The Embeddings are generated from the OpenAI API at present.

    I plan to package this up into a module for CPAN and would like some advice on the namespace for this module...

    There is already OpenAI::API::Request::Embedding, which is just a thin wrapper to the API. I don't want to use the OpenAI namespace because my module will probably allow other Embedding providers to be used. For example, Hugging Face provides a cheaper but less precise Embeddings API. This may be better suited to some users.

    As well as providing the connection to the API, my module will also have a method to allow two pieces of text to be compared. More functionality than just a thin wrapper.

    I've looked at the AI namespace - e.g. AI::XGBoost. There is also Text::AI::CRM114

    As my module will connect to several different API providers, I am thinking AI::Embedding might be the right name for it but I am not convinced and your opinions and advice would be greatly appreciated.

Risque Romantic Rosetta Roman Race
7 direct replies — Read more / Contribute
by eyepopslikeamosquito
on May 10, 2023 at 03:17

    I've finally got around to extending my long-running Perl vs C++ Performance series by timing some Roman to Decimal Rosetta PGA-TRAM code on Ubuntu.

    Generating the Test Data

    You'll need to install the Roman module from CPAN (or simply copy Roman.pm locally) to generate the test data by running:

    # gen-roman.pl
    use strict;
    use warnings;
    use Roman;
    for my $n (1..1000) {
        for my $i (1..3999) {
            my $r = int(rand(2)) ? uc(roman($i)) : lc(roman($i));
            print "$r\n";
        }
    }
    with:
    perl gen-roman.pl >t1.txt

    which will generate a test file t1.txt containing 3,999,000 Roman Numerals.

    Running the Benchmark

    With that done, you can run rtoa-pgatram.pl below (derived from Rosetta PGA-TRAM) with:

    $ perl rtoa-pgatram.pl t1.txt >pgatram.tmp
    which produced on my laptop:
    rtoa pgatram start
    read_input_files : 1 secs
    roman_to_arabic : 7 secs
    output : 0 secs
    total : 8 secs

    rtoa-pgatram.pl

    # rtoa-pgatram.pl
    # Example run: perl rtoa-pgatram.pl t1.txt >pgatram.tmp
    #
    # Convert a "modern" Roman Numeral to its arabic (decimal) equivalent.
    # The alphabetic input string may be assumed to always contain a valid Roman Numeral in the range 1-3999.
    # Roman numerals may be upper or lower case.
    # Error handling is not required.
    # For example:
    # input "XLII" should produce the arabic (decimal) value 42
    # input "mi" should produce the arabic (decimal) value 1001

    use 5.010; # Needed for state
    use strict;
    use warnings;
    use List::Util qw(reduce);

    sub read_input_files {
        my $files = shift; # in: reference to a list of files containing Roman Numerals (one per line)
        my @list_ret;      # out: reference to a list of the Roman Numerals in the files
        for my $fname ( @{$files} ) {
            open( my $fh, '<', $fname ) or die "error: open '$fname': $!";
            while (<$fh>) {
                chomp;
                push @list_ret, uc($_);
            }
            close($fh) or die "error: close '$fname': $!";
        }
        return \@list_ret;
    }

    # Function roman_to_arabic
    # Input: reference to a list of valid Roman Numerals in the range 1..3999
    # Output: reference to a list of their arabic (decimal) values
    sub roman_to_arabic {
        my $list_in = shift; # in: reference to a list of valid Roman Numerals
        my @list_ret;        # out: a list of their integer values
        state %rtoa = ( M=>1000, D=>500, C=>100, L=>50, X=>10, V=>5, I=>1 );
        for (@{$list_in}) {
            push @list_ret, reduce { $a+$b-$a%$b*2 } map { $rtoa{$_} } split//, uc($_);
        }
        return \@list_ret;
    }

    @ARGV or die "usage: $0 file...\n";
    my @rtoa_files = @ARGV;

    warn "rtoa pgatram start\n";

    my $tstart1 = time;
    my $aref1 = read_input_files( \@rtoa_files );
    my $tend1 = time;
    my $taken1 = $tend1 - $tstart1;
    warn "read_input_files : $taken1 secs\n";

    my $tstart2 = time;
    my $aref2 = roman_to_arabic($aref1);
    my $tend2 = time;
    my $taken2 = $tend2 - $tstart2;
    warn "roman_to_arabic : $taken2 secs\n";

    my $tstart3 = time;
    for my $n ( @{$aref2} ) { print "$n\n" }
    my $tend3 = time;
    my $taken3 = $tend3 - $tstart3;
    my $taken = $taken1 + $taken2 + $taken3;
    warn "output : $taken3 secs\n";
    warn "total : $taken secs\n";

    I was relieved that this ran a little faster than rtoa-roman.pl, which is just a copy of rtoa-pgatram.pl above that uses Roman's arabic function instead of rtoa-pgatram.pl's pgatram algorithm; that is with:

    push @list_ret, reduce { $a+$b-$a%$b*2 } map { $rtoa{$_} } split//, uc($_);
    above replaced with:
    use Roman;
    ...
    push @list_ret, arabic($_);

    $ perl rtoa-roman.pl t1.txt >roman.tmp
    rtoa roman start
    read_input_files : 1 secs
    roman_to_arabic : 11 secs
    output : 1 secs
    total : 13 secs
    $ diff roman.tmp pgatram.tmp
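
    For readers puzzling over the reduce expression, here is a small standalone sketch that applies the same formula to single numerals. The roman_to_int wrapper is a hypothetical helper added for illustration; the formula and the lookup table are taken from rtoa-pgatram.pl above. As far as I can tell, the trick is that when the next digit $b is larger than the one before it, $a % $b recovers that smaller digit, which was already added once and so has to be subtracted twice.

```perl
use strict;
use warnings;
use List::Util qw(reduce);

my %rtoa = ( M => 1000, D => 500, C => 100, L => 50, X => 10, V => 5, I => 1 );

# Hypothetical single-numeral wrapper around the formula from the post.
sub roman_to_int {
    my ($roman) = @_;
    return reduce { $a + $b - $a % $b * 2 } map { $rtoa{$_} } split //, uc $roman;
}

# Trace for "XLII" (digit values 10, 50, 1, 1):
#   10 + 50 - (10 % 50) * 2 = 40
#   40 +  1 - (40 %  1) * 2 = 41
#   41 +  1 - (41 %  1) * 2 = 42
for my $r (qw(XLII mi MCMXCIV)) {
    printf "%-8s => %d\n", $r, roman_to_int($r);
}
```

    Like the original, this assumes valid input; an invalid character would yield an undefined table lookup rather than a clean error.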

    Please feel free to reply with alternative Perl roman_to_arabic subroutines, especially if they are faster. Roman to Arabic subroutines in other languages are also welcome.

My first Perl Conference
2 direct replies — Read more / Contribute
by stevieb
on Apr 13, 2023 at 01:38

    I've been a registered Monk for 14+ years, was a lurker for eight years before that, I have 60 CPAN distributions published, over 100 Open Source projects published, am co-author on a book about programming the Raspberry Pi with Perl, and for the first time, I'm booked to attend my first Perl conference!

    I'm very excited. A client of mine asked if I'd be attending this year's Toronto Perl conference in a call today, and I thought... yeah, I think I will.

    Tickets booked, hotel booked, flights booked, I'm on my way.

    Best part is that Toronto is my hometown, so it'll double as a time that I can visit a bunch of people while I'm there.

    I hope to see some of my fellow Monks there!!!

    Update: Email me via my addresses found on my Github or CPAN page if you're going to attend and want to try to hook up.

    -stevieb

I failed today
5 direct replies — Read more / Contribute
by erickp
on Apr 12, 2023 at 18:40
    Today, I failed to convince my team at work to use Perl instead of Python for a new scripting project on Linux...They went with Python2 instead, imagine that.

    Oh yes and my work is pushing most of us Linux deployers to start using Windows now for our client machines, imagine that.

    What is this world coming to? Maybe I should just go flipping burgers for a living.

    : (
Automatic documentation of match operations?
1 direct reply — Read more / Contribute
by LanX
on Apr 09, 2023 at 13:46
    motivation

    I was faced with presenting a condensed explanation of the use cases and side effects of regex matching in Perl

     LHS = ($string =~ /RE(G)EX/MOD)

    which depend on:

    • context: SCALAR vs LIST
    • returned value(s): Boolean, Count or Capture-List
    • (capture groups) inside REGEX
    • presence of /g modifier in MOD
    for comparison, these use cases had to be implemented with multiple methods in JS
    meditation

    Wouldn't it be nice to create automatic tests for all cases to create a nice lookup-table/cheat-sheet?

    This table would have a worst case of 4 dimensions (context, result, capture, /g), which means many possible projections onto a 2D table. Here is an attempt at how it might look if the result is coded into the cell's value:

    (NB: this is untested guesswork and turned out to be wrong; I leave it to you to spot the errors)

    +------------+-----------+----------+-----------+-------------+
    |            | SCALAR / VOID        | LIST                    |
    | m/REGEX/   +-----------+----------+-----------+-------------+
    |            |           | (groups) |           | (groups)    |
    +------------+-----------+----------+-----------+-------------+
    | /          | ->BOOLEAN | ->COUNT  | ->BOOLEAN | ->CAPTURES  |
    | /g         | ->BOOLEAN | ->COUNT  | ->BOOLEAN | ->CAPTURES  |
    +------------+-----------+----------+-----------+-------------+

    So in an outbreak of ADHD, hubris, and too much time b/c of Easter holidays I started to write - well hack - test code:

    These are my results so far, only as data structures.

    Imperfect because I wanna leave the house still during daylight and risks are high that the code will remain unfinished on my disk without being shared. And it even seems to be wrong. The task to condense it into a meaningful table which avoids unnecessary repetitions is even farther away...

    hack

    Edit

    Hmm, the bug may be related to pos not being properly reset. Must check when back home :)

    update

    yeah, that was it, resetting the string is fixing the issue. Next version must monitor the side effects.
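
    For anyone who wants to poke at the cases themselves, here is a small standalone sketch of how the match operator behaves per context, including the pos side effect mentioned in the update. It is not the test code from this post, just a minimal illustration with a made-up test string.

```perl
use strict;
use warnings;

my $str = "a1b2c3";

# Scalar context without /g: a plain true/false answer.
my $matched = ( $str =~ /\d/ ) ? "yes" : "no";

# List context without /g: the captures of the first match,
# or the list (1) when the pattern has no capture groups.
my @first_caps = $str =~ /([a-z])(\d)/;     # ("a", "1")
my @no_groups  = $str =~ /\d/;              # (1)

# List context with /g: all matches, or all captures when groups exist.
my @all_matches = $str =~ /\d/g;            # ("1", "2", "3")
my @all_caps    = $str =~ /([a-z])(\d)/g;   # ("a", "1", "b", "2", "c", "3")

# Counting matches: assign to an empty list, read the result as a scalar.
my $count = () = $str =~ /\d/g;             # 3

# Scalar context with /g: one match per call, with pos() set as a side effect.
my $ok        = $str =~ /\d/g;
my $pos_after = pos($str);                  # 2

# A following list-context /g match continues from pos(), not from the start.
my @rest = $str =~ /\d/g;                   # ("2", "3")

# Resetting pos() before the list match restores the full result.
$ok = $str =~ /\d/g;
pos($str) = undef;
my @again = $str =~ /\d/g;                  # ("1", "2", "3")

print "matched=$matched count=$count pos=$pos_after\n";
print "first_caps=@first_caps no_groups=@no_groups\n";
print "all=@all_matches caps=@all_caps\n";
print "rest=@rest again=@again\n";
```

    The @rest case is the kind of leftover-pos effect that can make a table of results look wrong when the same string is reused across tests.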

    Cheers Rolf
    (addicted to the 𐍀𐌴𐍂𐌻 Programming Language :)
    Wikisyntax for the Monastery

[OT] Reminder: SSDs die silently
4 direct replies — Read more / Contribute
by afoken
on Apr 04, 2023 at 05:01

    Yesterday, one of the SSDs in my main computer suddenly died, from one second to the next. It simply disappeared from the system, leaving two very confused virtual machines behind that lost access to their virtual disks stored on that SSD. This way, I lost about one hour of work. That would have been annoying, but could have been fixed easily. Shut down, rip out the SSD, replace it with a fresh SSD or a harddisk, and restore the backup.

    But: That SSD was added at the beginning of the Covid-19 pandemic, as a quick hack to have room for the VMs needed for working from home. It was never intended to work for more than a few weeks, and so I simply forgot to include that disk in the configuration of the backup software.

    I tried for about an hour to read the dead SSD using two other computers, but it is dead. It identifies correctly, but reports junk when reading SMART data, and reads not a single bit of user data. I reassembled my computer, added a temporary HDD, ordered a replacement SSD, and started a 17-hour copy job to get the required VMs as huge ZIP files from work to home. It will take another hour or two to unpack and reconfigure the VMs for the new environment. And one or two hours to resync some work data from a cloud service.

    This is totally my fault, having no backup for that disk was stupid, period.

    So, take this as a warning if you are - like me - used to getting an audible warning from a failing disk. SSDs die silently and suddenly. You won't get those nasty metal workshop sounds you know from failing hard disks.

    Check your backups, and check your backup configuration.

    Updates:

    Changed some wording.

    https://www.backblaze.com/blog/ssd-edition-2022-drive-stats-review/ does not look very promising for using SMART monitoring. SSD SMART data is messy at best:

    [L]et’s talk about SSD SMART stats. [...] we’ve been wrestling with SSD SMART stats for several months now, and one thing we have found is there is not much consistency on the attributes, or even the naming, SSD manufacturers use to record their various SMART data. For example, terms like wear leveling, endurance, lifetime used, life used, LBAs written, LBAs read, and so on are used inconsistently between manufacturers, often using different SMART attributes, and sometimes they are not recorded at all.

    Alexander

    --
    Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)
ZARN - security code analysis for perl
2 direct replies — Read more / Contribute
by Discipulus
on Apr 03, 2023 at 04:15
    Hello folks,

    thanks to perl.social today I've stumbled upon this article about zarn: "a lightweight static code security analysis for Modern Perl Applications"

    Have you used it? Do you use other similar tools for static analysis of your perl programs?

    L*

    There are no rules, there are no thumbs..
    Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.
Diving Data without autovivification
1 direct reply — Read more / Contribute
by LanX
on Mar 28, 2023 at 10:42
    I was involved in an SO discussion, where the OP wanted to find the emails from certain users, but from a problematic data structure:

    [
        {
            foo => {
                browser => "firefox",
                contact => [{ email => "foo\@example.org" }, { phone => 2125551212 }],
                lang => "en",
            },
        },
        {
            bar => {
                browser => "lynx",
                contact => [{ email => "bar\@example.com" }, { phone => 9125551212 }],
                lang => "fr",
            },
        },
    ];

    One of the problems is to avoid autovivification.
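
    For anyone unfamiliar with the term, here is a minimal core-only sketch of what autovivification does when reading through a missing key. The hashes and variable names are made up for illustration and are unrelated to the OP's data.

```perl
use strict;
use warnings;

my %users = ( foo => { lang => "en" } );

# Merely reading a nested value through a missing key creates that key:
my $contact = $users{bar}{contact};
print exists $users{bar} ? "bar was autovivified\n" : "bar untouched\n";

# Guarding each level keeps the structure unchanged.
my %clean = ( foo => { lang => "en" } );
my $safe  = exists $clean{bar} ? $clean{bar}{contact} : undef;
print exists $clean{bar} ? "bar was autovivified\n" : "bar untouched\n";

# Defined-or with an empty fallback also avoids touching the original.
my $also_safe = ( $clean{bar} // {} )->{contact};
print exists $clean{bar} ? "bar was autovivified\n" : "bar untouched\n";
```

    The first lookup silently adds an empty bar hash to %users, which is exactly the kind of change a read-only search should not make.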

    The usual answers imply

    but - to my surprise - none of them is core.

    after some meditation I came up with the following solution.

    It looks so idiomatic and generic to me that I'm asking myself if I missed something:

    use v5.12.0;
    use warnings;
    use Test::More;
    use Data::Dump;

    sub orig_data {
        return [
            {
                foo => {
                    browser => "firefox",
                    contact => [{ email => "foo\@example.org" }, { phone => 2125551212 }],
                    lang => "en",
                },
            },
            {
                bar => {
                    browser => "lynx",
                    contact => [{ email => "bar\@example.com" }, { phone => 9125551212 }],
                    lang => "fr",
                },
            },
        ];
    }

    my $data = orig_data();

    my @emails;

    # ------ short version

    @emails =
      map { $_->{email} // () }
      map { @{ $_->{contact} // [] } }
      map { $_->{bar} // () }
      @$data;

    is_deeply(\@emails, ["bar\@example.com"], "result ok");
    is_deeply($data, orig_data(), "no autovivification");

    # ------ long symmetrical version

    @emails =
      map { $_->{email} // () }
      map { @$_ }
      map { $_->{contact} // () }
      my @users =               # only necessary to check uniqueness
      map { $_->{bar} // () }
      map { @$_ }
      $data;

    is_deeply(\@emails, ["bar\@example.com"], "result ok");
    is_deeply($data, orig_data(), "no autovivification");

    # --------- this would vivify new elements
    # map { $_->{bar}{contact} // () }
    # map { @$_ }
    # $data;

    # is_deeply($data, orig_data ,"no autovivification");

    done_testing;

    OUTPUT:

    ok 1 - result ok
    ok 2 - no autovivification
    ok 3 - result ok
    ok 4 - no autovivification
    1..4

    NB: as an extra benefit I can check if @emails and @users have only one element to catch glitches in the data. Those arrays of one element hashes like

    • [{ email => "bar\@example.com" }, { phone => 9125551212 }],
    are prone to inconsistencies.

    Thoughts???

    Cheers Rolf
    (addicted to the 𐍀𐌴𐍂𐌻 Programming Language :)
    Wikisyntax for the Monastery

