If you've discovered something amazing about Perl that you just need to share with everyone, this is the right place.

This section is also used for non-question discussions about Perl, and for any discussions that are not specifically programming related. For example, if you want to share or discuss opinions on hacker culture, the job market, or Perl 6 development, this is the place. (Note, however, that discussions about the PerlMonks web site belong in PerlMonks Discussion.)

Meditations is sometimes used as a sounding-board — a place to post initial drafts of perl tutorials, code modules, book reviews, articles, quizzes, etc. — so that the author can benefit from the collective insight of the monks before publishing the finished item to its proper place (be it Tutorials, Cool Uses for Perl, Reviews, or whatever). If you do this, it is generally considered appropriate to prefix your node title with "RFC:" (for "request for comments").

User Meditations
Optimizing with Caching vs. Parallelizing (MCE::Map)
5 direct replies — Read more / Contribute
by 1nickt
on Apr 05, 2020 at 11:17

    Mon cher ami Laurent_R recently blogged about his solution to the "extra credit" problem in the Perl Weekly Challenge #54. He showed a solution using memoizing, or caching, to reduce the number of repeated calculations made by a program.

    I wondered about the strategy. Obviously calculating the sequences for numbers up to 1,000,000 without some optimization would be painfully or maybe unworkably slow. But the task looks computation-intensive, so I wanted to see whether more cycles would be more beneficial than caching.

    Here is the solution presented by Laurent:

    This runs on my system pretty quickly:

    real    0m22.596s
    user    0m21.530s
    sys     0m1.045s

    Next I ran the following version using mce_map_s from MCE::Map. mce_map_s is an implementation of the parallelized map functionality provided by MCE::Map, optimized for sequences. Each worker is handed only the beginning and end of the chunk of the sequence it will process, and workers communicate amongst themselves to keep track of the overall task. When using mce_map_s, pass only the beginning and end of the sequence to process (also, optionally, the step interval and format).

    use strict; use warnings; use feature 'say';
    use Data::Dumper;
    use MCE::Map;

    my @output = mce_map_s {
        my $input = $_;
        my $n = $input;
        my @result = $input;
        while ( $n != 1 ) {
            $n = $n % 2 ? 3 * $n + 1 : $n / 2;
            push @result, $n;
        }
        return [ $input, scalar @result ];
    } 1, 1000000;

    MCE::Map->finish;

    @output = sort { $b->[1] <=> $a->[1] } @output;

    say sprintf('%s : length %s', $_->[0], $_->[1]) for @output[0..19];

    This program, with no caching, runs on my system about five times faster (I have a total of 12 cores):

    real    0m4.322s
    user    0m27.992s
    sys     0m0.170s

    Notably, reducing the number of workers to just two still ran the program in less than half the time of Laurent's single-process memoized version. Even running with one process, with no cache, was faster. This is no doubt due to the fact that MCE uses chunking by default. Even with one worker, the list of one million numbers was split by MCE into chunks of 8,000.

    Next, I implemented Laurent's cache strategy, but using MCE::Shared::Hash. I wasn't really surprised that the program then ran much slower than either previous version. The reason, of course, is that this task pretty much only makes use of the CPU, so while throwing more cycles at it is a huge boost, sharing data among the workers - precisely because the task is almost 100% CPU-bound - only slows them down. Modern CPUs are very fast at crunching numbers.

    I was curious about how busy the cache was in this case, so I wrapped the calls to assign to and read from the hash in Laurent's program in a sub so I could count them. The wrappers look like:

    my %cache;
    my $sets = my $gets = 0;
    sub cache_has { $gets++; exists $cache{$_[0]} }
    sub cache_set { $sets++; $cache{$_[0]} = $_[1] }
    sub cache_get { $gets++; $cache{$_[0]} }

    The result:

    Sets: 659,948
    Gets: 16,261,635
    That's a lot of back and forth.
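
    To make the counting concrete, here is a minimal sketch of how wrappers like the ones above might sit inside a memoized sequence-length function. This is not Laurent's program: the collatz_length helper and the small $limit are assumptions made purely for illustration, so the counts it prints will not match the figures quoted above.

```perl
use strict;
use warnings;

# Counting wrappers in the style of the ones shown in the post.
my %cache;
my ( $sets, $gets ) = ( 0, 0 );
sub cache_has { $gets++; exists $cache{ $_[0] } }
sub cache_set { $sets++; $cache{ $_[0] } = $_[1] }
sub cache_get { $gets++; $cache{ $_[0] } }

# Hypothetical helper: number of terms in the Collatz sequence starting
# at $start, reusing cached lengths when the walk reaches a known number.
sub collatz_length {
    my ($start) = @_;
    my @path;
    my $n = $start;
    until ( $n == 1 or cache_has($n) ) {
        push @path, $n;
        $n = $n % 2 ? 3 * $n + 1 : $n / 2;
    }

    # Length from $n onward: 1 for the terminal 1, else the cached value.
    my $length = $n == 1 ? 1 : cache_get($n);

    # Walk back up the path, caching the length for every number visited.
    for my $m ( reverse @path ) {
        cache_set( $m, ++$length );
    }
    return $length;
}

my $limit = 1000;    # assumption: kept small so the sketch runs instantly
collatz_length($_) for 1 .. $limit;
printf "Sets: %d Gets: %d\n", $sets, $gets;
```

    Every step of every walk costs at least one cache lookup, which is why the number of gets dwarfs the number of sets.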

    So the moral of the story is that while caching is often useful when you are going to make the same calculations over and over, sometimes the cost of the caching exceeds the cost of just making the calculations repeatedly.

    Hope this is of interest!

Perl joke heard on television
3 direct replies — Read more / Contribute
by Anonymous Monk
on Mar 28, 2020 at 07:41
    Stacy Herbert: You know the flu is way more complicated than this corona virus. I think it's like four strands of RNA. It's so simple apparently the code for it fits on one single page, and this simple little tiny virus is taking down our hyper-complex globalized just-in-time system.

    Max Keiser: Yeah I think the COVID-19 is written in Perl, and the flu is written in C++.

    Keiser Report E1520 Gold: Problems with Exchange for Physical
Perl Automateaching -- part 1: brainstorming
2 direct replies — Read more / Contribute
by Discipulus
on Mar 15, 2020 at 13:20
    Hello monks!

    I have been away from active Perl coding since last June, but I have this idea haunting me and, being forced to stay at home like many other Europeans, maybe this is a good occasion to start coding on it again.

    The main goal is to produce a module able to ask Perl questions and evaluate the answers given by the user. The code provided by the user will not arrive on STDIN; instead I would opt for a document-based approach: the user is provided with a file to edit and the possibility to submit it for review, multiple times if needed.

    The module name will be something like Perl::Tutor or Camel::Tamer or Camel::Trainer or Perl::Teacher or I can use the automateaching word.. but for the moment let's assume the main object will be $tutor for brevity.

    The module will provide some general methods to build up the configuration (path to perl executable, folder to save works..), others to ask questions and to read user's input, but these are trivial and I don't want to bother you with such details (for the moment ;).

    The part where I want to hear from you is the judging process of the provided perl document. I imagine something like this:

    $tutor->assignement(
        question => "Create an array with 5 elements and fill it with first 5 letters of the English alphabet.\n".
                    "Then remove the first and last one elements using two perl list operator.\n".
                    "Join these two removed elements and assign the result to a scalar named \$result\n".
                    "Both the array and the scalar have to be lexically scoped.\n",
        file => '',
        initial_content => '#nothing atm. strict and warnings and other content more on',
        hints => [
            'declare variables using "my"',
            'see shift and pop documentation'
        ],
        documentation => [
            ' +ions/my.html',
            ' +l#Perl-Functions-by-Category',
            ' +ions/shift.html',
            ' +ions/pop.html',
        ],
        solution => 'some text to provide when the task is successfully completed and to add as comment to the resulting script',
        tests => \@basics_tests, \@tests,
    );

    And hic sunt leones: in fact the hard part is how @tests is constructed and how tests are run. This first meditation is a general brainstorming on which tools to use and how to build up the process of judging. My ideas:

    1) PPI will be very useful and is the main reason for the $tutor being document oriented. This also has the plus that the user will end up with a lot of recipes to review. PPI is able to find every kind of statement inside a Perl program, with PPI::Statement::Variable for example.

    2) Perl-Critic, which I must confess I didn't love, probably because I don't know it well, can be handy because, if I understand it correctly, $tutor can apply standard and custom policies to the user-provided Perl program.

    3) Simple scripts can be inspected just by means of their output and general functionality: in this light Test::Script can be the right tool.

    4) Testing.. I'd love to use the powerful Perl testing framework as a whole, but that will be problematic with standalone scripts rather than modules. This is a problem that has been haunting me for years.. More complex tasks given to the pupil can possibly be modules, but not at the start. The code provided can be copied into a temporary file and modified to be a Modulino, but this seems a complicated and fragile solution. Well, PPI can be used to extract all subs of a script and to wrap the rest into a main one, but that is maybe too artificial.
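
    For readers who have not met the Modulino idea mentioned in point 4, here is a minimal sketch of the pattern. The package name My::Exercise and the remove_ends sub are hypothetical, chosen only to echo the sample assignment; this is not how $tutor would necessarily rewrite a pupil's script.

```perl
package My::Exercise;    # hypothetical name for a pupil's script
use strict;
use warnings;

# The logic to be tested lives in an ordinary sub ...
sub remove_ends {
    my @letters = @_;
    my $first   = shift @letters;
    my $last    = pop @letters;
    return ( $first . $last, @letters );
}

# ... and what used to be the script body is wrapped in main().
sub main {
    my ( $result, @rest ) = remove_ends( 'a' .. 'e' );
    print "result=$result rest=@rest\n";
    return;
}

# Run main() only when the file is executed as a script. When it is
# loaded with require or do, caller() is true and nothing runs, so a
# test file can call remove_ends() directly.
main() unless caller;

1;
```

    The same file then works both as a runnable script and as something a test file can load and exercise sub by sub.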

    I want to hear from you about the above idea and its possible pitfalls. I still don't know how to implement it: any suggestion will be welcome! Help on Perl-Critic and PPI and on testing implementation in the module will be very welcome!!


    There are no rules, there are no thumbs..
    Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.
Code challenge: Route planning on a 2D grid
2 direct replies — Read more / Contribute
by bliako
on Mar 13, 2020 at 13:52

    There are few things more exciting than a code challenge. So here is a problem similar to Highest total sum path problem. I say similar because the initial problem constraints were a bit fuzzy.

    The problem is to plan an orthogonal route on a 2D rectangular grid in order to maximise point collection from the cells of the grid and minimise the distance travelled from a specified starting cell to a finishing cell. Points can be positive or negative integers (or zero). The added twist is that moving from A to B can be done in a "normal" mode where points are collected from intermediate cells (A+1 , .., B-1). Alternatively, moving can be done in a "sliding" mode where the points on said squares are not collected (perhaps to avoid negative points which will reduce the total score). But the distance counts.
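
    As a rough illustration of the two movement modes, the sketch below scores a single straight move. The score_move name, the [row, column] cell representation and the use of Manhattan distance are all assumptions; the rules above leave open how the start and end cells themselves are scored, so only the intermediate cells are counted here.

```perl
use strict;
use warnings;

# Score one straight (orthogonal) move from cell $from to cell $to.
# Returns ( points collected from intermediate cells, distance travelled ).
sub score_move {
    my ( $grid, $from, $to, $mode ) = @_;
    my ( $fi, $fj ) = @$from;
    my ( $ti, $tj ) = @$to;
    die "move must be orthogonal\n" unless $fi == $ti or $fj == $tj;

    # The distance counts in both modes.
    my $distance = abs( $ti - $fi ) + abs( $tj - $fj );
    return ( 0, $distance ) if $mode eq 'sliding';

    # Normal mode: sum the cells strictly between the two endpoints.
    my $points = 0;
    my $di = $ti <=> $fi;    # per-axis step of -1, 0 or +1
    my $dj = $tj <=> $fj;
    my ( $i, $j ) = ( $fi + $di, $fj + $dj );
    while ( $i != $ti or $j != $tj ) {
        $points += $grid->[$i][$j];
        $i += $di;
        $j += $dj;
    }
    return ( $points, $distance );
}

my $grid = [ [ 1, -5, 3, 2 ], [ 0, 4, -1, 7 ] ];
for my $mode (qw( normal sliding )) {
    my ( $points, $distance ) = score_move( $grid, [ 0, 0 ], [ 0, 3 ], $mode );
    print "$mode: points=$points distance=$distance\n";
}
```

    In this tiny grid the normal move picks up -5 and 3, while sliding collects nothing but travels the same distance.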

    If people want to modify these initial rules either because they can make the problem more generic, more useful or aid the original poster in his/her quest, please feel free to make a suggestion.

    I am not sure how to post a grid here, so I will assume that all our Perls' native RNGs will produce the same sequence given the seed 42. So, here is a grid and a path (again, if people know a better way then suggest it):

    srand 42;
    my $W = 1024;
    my $H = 1024;
    my $maxscore = 10;
    my $Grid = [];
    for(my $i=0;$i<$W;$i++){
        $Grid->[$i] = [(0)x$H];
        for(my $j=0;$j<$H;$j++){
            $Grid->[$i]->[$j] = $maxscore - int(rand(2*$maxscore+1))
        }
    }
    # now add a highscore to stand out for just 1 cell in each column
    my $highscore = 21;
    for(my $i=0;$i<$W;$i++){
        $Grid->[$i]->[int(rand($H))] = $highscore;
    }

    bw, bliako

Is Perl dead ? YAIPD Thread
2 direct replies — Read more / Contribute
by ait
on Mar 07, 2020 at 21:17

    Of course not! But for those who insist on it:

    It made my day seeing Perl so high on that chart!!

2010-2019 From Perl to ?
3 direct replies — Read more / Contribute
by jeffa
on Feb 18, 2020 at 17:29

    Greetings. It has been a long long while. I mostly stopped professional software development with Perl around 2013. Since then i mostly have worked in the DevOps field, using a number of dynamic languages to create pipelines for various teams. I did manage to accept a Perl gig in 2018 but that turned out to be quite possibly THE worst job i have ever worked. The environment was incredibly oppressive and my cow-orkers were either evil, incompetent or lethargic. Before i took that gig however, i re-discovered my love for synthesizers and electronic music and i had already acquired a large number of desktop modules, effects pedals and a nice 24 track mixer. After that awful job in 2018 however, i had it in my mind to stop programming and do something else. I bought a soldering iron and a few kits. A few kits led to a few more kits including a ring modulator and delay pedal. Those kits led to errors, and to correct those errors i needed education. I bought some components and breadboards. I bought a dual power supply kit and successfully put it together without getting shocked and/or getting deathed. I read Art of Electronics and the Forrest M. Mims III field notebooks. I watched tons of Colin's Lab and EEVBlog videos on YouTube. I built my own mixer, power amps, MIDI thru boxes, a MIDI sequencer (with Arduino) and a MIDI synth, a dual distortion+delay pedal and DC to DC distribution box. All in all, this previous decade was a long slow ride to the bottom -- and once there i used it as an opportunity to learn new skills. Not sure how much Perl programming in 2020 and beyond will offer me ... but here we still are. I am going to use this opportunity to relearn computers and programming from the ground up this time. :)


    (the triplet paradiddle with high-hat)
Perl in data science: could a grant from Perl foundation be useful?
2 direct replies — Read more / Contribute
by zubenel0
on Feb 18, 2020 at 14:15

    Recently I was thinking about whether it is possible to make Perl a more attractive option for data science. I know that some great initiatives exist, like RFC: 101 Perl PDL Exercises for Data Analysis or RFC: 100 PDL Exercises (ported from numpy). On my part, I will try to write a blog post with a particular machine learning task I have chosen. Nevertheless, as Ovid wrote, falling short in the data science field is a significant drawback of Perl. How to fix this?

    What I thought about as a way to proceed could be a grant from the Perl Foundation. It could work only if it were possible to find someone interested in a project related to Perl and data science and capable of doing it. IMO one of the solutions that could help would be to write a book on How to use Perl in Data Science. Again, this idea is not mine, as it was mentioned in perlblogs as a desire to have a new PDL book. Maybe with help from the Perl Foundation such a project could encompass even more than PDL and include several other modules suited for data science.

    Another interesting idea that I have encountered was to create a Perl/XS graphics backend, as there is a need for a graphics library which can create 2D/3D charts easily - see the comments on perlblogs. Unfortunately, I know very little about this but I guess that it might be a very hard task... So these are just a couple of examples, but actually the main issue is whether it is feasible in general to have a grant for data science using Perl. What do you think? Do you know someone that could be interested in it? Or do you think that this approach is flawed and have some other suggestions?

Let's finish Imager::GIF
1 direct reply — Read more / Contribute
by Anonymous Monk
on Feb 06, 2020 at 16:42
    Imager::GIF - a handy module for animated GIF processing - is a nice thought, with one semi-working method and problematic documentation (Re^2: Imager::GIF seems broken), that needs some help, as the docs say:


      Implement the rest of the transformations (cropping, rotating etc).

    I needed to non-proportionally scale animated GIFs and implemented type=>nonprop in the scale method. Other desirable features include crop, watermark, and sharpening. Please share your mods and methods here.



    Your local file:

    perl -MImager::GIF -le 'for (keys %INC) { print $INC{$_} if /GIF\.pm/ }'
    My scale method:
    sub scale {
        my ($self, %args) = @_;
        my $ratio = $args{scalefactor} // 1;
        my $qtype = $args{qtype} // 'mixing'; # add qtype support

        $self->_mangle(sub {
            my $img = shift;
            my $ret = $img->scale(%args, qtype => $qtype);
            my $h = $img->tags(name => 'gif_screen_height');
            my $w = $img->tags(name => 'gif_screen_width');

            # add non-proportional scaling
            if ( $args{xpixels} and $args{ypixels} and
                 $args{type} and $args{type} eq 'nonprop') {
                my $xratio = defined $args{xpixels} ? $args{xpixels} / $w : $ratio;
                # the vertical ratio is relative to the screen height, not the width
                my $yratio = defined $args{ypixels} ? $args{ypixels} / $h : $ratio;
                $ret->settag(name => 'gif_left',
                             value => int($xratio * $img->tags(name => 'gif_left')));
                $ret->settag(name => 'gif_top',
                             value => int($yratio * $img->tags(name => 'gif_top')));
                $ret->settag(name => 'gif_screen_width', value => int($xratio * $w));
                $ret->settag(name => 'gif_screen_height', value => int($yratio * $h));
            }
            else { # proportional scaling, from the original
                unless ($ratio) {
                    if (defined $args{xpixels}) { $ratio = $args{xpixels} / $w; }
                    if (defined $args{ypixels}) { $ratio = $args{ypixels} / $h; }
                }
                $ret->settag(name => 'gif_left',
                             value => int($ratio * $img->tags(name => 'gif_left')));
                $ret->settag(name => 'gif_top',
                             value => int($ratio * $img->tags(name => 'gif_top')));
                $ret->settag(name => 'gif_screen_width', value => int($ratio * $w));
                $ret->settag(name => 'gif_screen_height', value => int($ratio * $h));
            }
            return $ret;
        });
    }
    Thank you!
Looking for testers who use Microsoft compilers
2 direct replies — Read more / Contribute
by syphilis
on Feb 03, 2020 at 06:10

    As this is essentially a request for some testing to be done, I thought "Meditations" was probably the best place for it.
    I'm not actively seeking wisdom with this post but, of course, receiving wisdom is always fine, even if it hasn't been requested ;-)

    If you have a perl that you've built using a Microsoft Compiler, I'd be most interested to learn of any problems or failures involved in running:
    cpan -i List::Uniqnum
    In fact, feel free to provide feedback for any build of perl that you have.
    The cpantesters smokers have been happily chewing on List-Uniqnum-0.04 for a couple of days, but there are very few Windows smokers out there.
    And, AFAIK, none of those smokers employ Microsoft compilers.

    Of course, Darwin and Solaris are probably also missing from those cpantesters systems - so join in with them, too .... or anything else that takes your fancy.

    I released List::Uniqnum to test changes that I want to make to the dual-life module List::Util's uniqnum() function - in order to improve that function's portability.
    A new release of List::Util (Scalar-List-Utils-1.54) hit cpan over the weekend. It still doesn't utilize the changes I was hoping would be included.
    If you run cpan -i List::Util you'll probably find that it passes all tests and installs cleanly.

    List-Util-1.54's uniqnum function actually works correctly on Linux, unless perl was built with -Duselongdouble or -Dusequadmath - in which case the test suite still passes, but only because it doesn't run tests that will reveal the problem.
    1.54 works fine on Windows, too, but again only if perl's nvtype is double.
    For it to work correctly on Windows if perl's ivtype is long long, it also requires that perl was built with __USE_MINGW_ANSI_STDIO, which only started happening with the release of 5.26.0.
    Thankfully, Strawberry Perl 5.26 onwards is built with __USE_MINGW_ANSI_STDIO defined.
    Try cpan -i List::Util on a 64-bit-integer build of Strawberry perl-5.24.0 or earlier, and you'll see a test failure.

    If you want to know what's failing with your particular installation of List::Util's uniqnum function, here is something you can run:
    use Config; # for test 5
    use strict;
    use warnings;
    use List::Util qw(uniqnum);
    #use List::Uniqnum qw(uniqnum);

    my $count;

    # test 1
    if(1.4142135623730951 != 1.4142135623730954) {
        $count = uniqnum(1.4142135623730951, 1.4142135623730954);
        print "test 1 failed (returned $count)\n" unless $count == 2;
    }

    # test 2
    if(10.770329614269008063 != 10.7703296142690080625) {
        $count = uniqnum(10.770329614269008063, 10.7703296142690080625);
        print "test 2 failed (returned $count)\n" unless $count == 2;
    }

    # test 3
    if(1005.1022829201930645202916159776901 != 1005.10228292019306452029161597769015) {
        $count = uniqnum(1005.1022829201930645202916159776901,
                         1005.10228292019306452029161597769015);
        print "test 3 failed (returned $count)\n" unless $count == 2;
    }

    # test 4
    $count = uniqnum(0, -0.0);
    print "test 4 failed (returned $count)\n" unless $count == 1;

    # test 5
    if($Config{ivsize} == 8) {
        # These 2 (the first is an IV, the second is an NV)
        # both exactly represent the value 762939453127 * (2 ** 21)
        $count = uniqnum(100000000000262144, 1.00000000000262144e+17);
        print "test 5 failed (returned $count)\n" unless $count == 1;
    }
    It only announces failures. If there's no output, then everything is good.
    If you install List::Uniqnum, you can then modify the script to test List::Uniqnum.
    If you do that, and it produces some output, please let me know.

Artificial Intelligence experiment
4 direct replies — Read more / Contribute
by PerlGuy(Tom)
on Feb 03, 2020 at 00:05
    I'm not really sure why, life experience I guess, but while studying and practicing Perl programming, an idea for artificial intelligence flashed into my mind.
    Bot 2
RFC: List of first day of each month
7 direct replies — Read more / Contribute
by TieUpYourCamel
on Jan 31, 2020 at 10:17
    My task here is to make a list of all of the "first day of the month" days between a start date and an end date. I'm interested to hear thoughts about my solution:
    use strict;
    use warnings;
    use Time::Piece;
    use feature 'say';

    my $date = Time::Piece->strptime( localtime->year() - 4 . " 01 01", "%Y %m %d" );
    my $endDate = localtime();
    while ( $date <= $endDate ) {
        say $date;
        # Get the next first of the month by going to the end of the
        # month and add one day
        my $year = $date->year;
        my $mon = $date->mon;
        my $day = $date->month_last_day;
        $date = Time::Piece->strptime( "$year $mon $day", "%Y %m %d" );
        $date += Time::Seconds::ONE_DAY;
    }
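
    One alternative worth comparing, offered only as a sketch: since every result is the first of a month, the list can be produced by stepping a plain month counter, with no date arithmetic at all. The first_days helper and its "YYYY-MM-01" output format are assumptions, not part of the poster's code.

```perl
use strict;
use warnings;

# Hypothetical helper: list the first day of every month from
# ($start_year, $start_month) through ($end_year, $end_month), inclusive.
sub first_days {
    my ( $start_year, $start_month, $end_year, $end_month ) = @_;
    my @days;

    # Count months since year 0 so that stepping is a plain increment.
    my $month_index = $start_year * 12 + ( $start_month - 1 );
    my $last_index  = $end_year * 12 + ( $end_month - 1 );

    while ( $month_index <= $last_index ) {
        my $year  = int( $month_index / 12 );
        my $month = $month_index % 12 + 1;
        push @days, sprintf '%04d-%02d-01', $year, $month;
        $month_index++;
    }
    return @days;
}

print "$_\n" for first_days( 2019, 11, 2020, 2 );
```

    The strings can be handed to Time::Piece->strptime afterwards if date objects are needed.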
RFC: Peer to Peer Conceptual Search Engine
2 direct replies — Read more / Contribute
by PerlGuy(Tom)
on Jan 28, 2020 at 05:37

    I consider myself a very novice Perl programmer, though I've been studying and using Perl for, I don't even know how many years. 30 maybe. Still so much to learn and so little time.

    I got into programming because I wanted a better search engine than any that were available, back in the day, say 1995. (I still think a better search engine is needed and possible).

    But all the programmers I approached about my idea said

Reinventing Moops
No replies — Read more | Post response
by tobyink
on Jan 14, 2020 at 05:20

    It seems every few years, I come up with some kind of weird syntax extension for doing OO programming in Perl. Moops was the most recent but while it's cool, it's built on some shaky foundations.

    I've been working on this thing MooX::Press for a little while now. It allows you to define a bunch of classes in one use statement. Like:

    use MooX::Press (
        prefix => 'MyApp',
        role => [
            'Livestock',
            'Pet',
            'Milkable' => {
                can => [
                    'milk' => sub { print "giving milk\n"; },
                ],
            },
        ],
        class => [
            'Animal' => {
                has => [
                    'name'   => { type => 'Str' },
                    'colour',
                    'age'    => { type => 'Num' },
                    'status' => { enum => ['alive', 'dead'], default => 'alive' },
                ],
                subclass => [
                    'Panda',
                    'Cat' => { with => ['Pet'] },
                    'Dog' => { with => ['Pet'] },
                    'Cow' => { with => ['Livestock', 'Milkable'] },
                    'Pig' => { with => ['Livestock'] },
                ],
            },
        ],
    );

    my $porky = MyApp->new_pig(name => 'Porky');
    print $porky->status, "\n";

    It's designed to be as declarative as possible; with the exception of coderefs for defining your methods, it's pretty much just a big hash that could be serialized as JSON or YAML or whatever. Indeed, I've written portable::loader as a way of loading MooX::Press classes/roles from JSON or TOML and deciding their package namespace at runtime.

    It's also very opinionated about how your classes and roles should be interacted with. Although MyApp::Pig->new works, you are encouraged to use MyApp->new_pig instead. And if a Panda object needs to create a Pig (because that happens in nature, right?) then it should call $self->FACTORY->new_pig to do the business. MyApp is the factory package, and objects get created via that; objects can find their factory package using $self->FACTORY. There are ways to override some of MooX::Press's opinions, but it steers you in this direction.

    Anyway, recently I started looking at how to combine this with Keyword::Declare to create something Moops-like. This is the syntax I have currently got working:

    use v5.14;
    use strict;
    use warnings;
    use Data::Dumper;

    use MooX::Press::Declare prefix => 'MyApp', toolkit => 'Moo';

    class Quux {
        version 3.1;
        extends Quuux;
        with Xyzzy;

        has foo : ( is => ro, type => 'Foo' );
        has bar : ( type => 'Barrr' );
        has nooo!;   # exclamation mark means required

        constant yeah = 42;

        method say_stuff {
            my $self = shift;
            say $self->yeah + $self->nooo;
        }
    }

    my $obj = MyApp->new_quux(
        foo => MyApp->new_foo,
        bar => MyApp->new_bar_baz,
        nooo => 1,
    );

    print Dumper($obj);
    $obj->say_stuff;

    # Note the order you define stuff mostly doesn't matter.
    # We used these classes above and define them now.
    class Quuux;
    role Xyzzy;
    class Foo;
    class Bar::Baz {
        type_name Barrr;
    }

    It's still early days, but it's coming along pretty nicely too. I'm impressed with how easy Keyword::Declare makes syntax extensions.

    Still to do: method signatures, method modifiers (before, around, after), type coercions, and custom factory methods. (These are all supported by MooX::Press, but not by the declarative syntax yet.)

Mini-Tutorial: Formats for Packing and Unpacking Numbers
1 direct reply — Read more / Contribute
by ikegami
on Jan 06, 2020 at 03:49

    pack and unpack are useful tools for generating strings of bytes for interchange and extracting values from such strings respectively. What follows is a table that represents the relevant formats in a convenient form.

    Category                          Type                   Byte Order                                    Mnemonic
                                                             Native     Little-Endian (<)  Big-Endian (>)
                                      Unsigned               C                                             "C" for char
                                      Signed                 c
                                      Unsigned               S          S< or v            S> or n         "S" for short
                                      Signed                 s          s< or v!           s> or n!
                                      Unsigned               L          L< or V            L> or N         "L" for long
                                      Signed                 l          l< or V!           l> or N!
                                      Unsigned               Q          Q<                 Q>              "Q" for quad
                                      Signed                 q          q<                 q>
    Types Used By This Build of perl  UV (unsigned integer)  J          J<                 J>              "J" is related to "I"
                                      IV (signed integer)    j          j<                 j>
                                      NV (floating-point)    F          F<                 F>              "F" for float
    C Types for This Build of perl    unsigned short int     S!         S!<                S!>             "S" for short
                                      signed short int       s!         s!<                s!>
                                      unsigned int           I! or I    I!< or I<          I!> or I>       "I" for int
                                      signed int             i! or i    i!< or i<          i!> or i>
                                      unsigned long int      L!         L!<                L!>             "L" for long
                                      signed long int        l!         l!<                l!>
                                      float                  f          f<                 f>              "f" for float
                                      double                 d          d<                 d>              "d" for double
                                      long double            D          D<                 D>              A bigger double


    • < and > indicate byte order. The small end of the bracket is at the least significant end of the number. (< for little-endian byte order, and > for big-endian byte order.) Can't be used with N/n and V/v.
    • For integers, ! signifies using the C types of this build of perl. N/n and V/v excepted.
    • For integers, uppercase indicates unsigned, and lowercase indicates signed. N/n and V/v excepted.
    • N and n are used for network (i.e. internet) byte order (BE), with the uppercase letter being used for the larger bitsize.
    • V and v are used for VAX byte order (LE), with the uppercase letter being used for the larger bitsize.
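
    A few of the formats from the table in action, as a small sketch. The hex_bytes helper is made up for display purposes; the pack and unpack templates themselves are the standard ones listed above.

```perl
use strict;
use warnings;

# Hypothetical display helper: a byte string as space-separated hex pairs.
sub hex_bytes { join ' ', map { sprintf '%02x', ord } split //, $_[0] }

# 16-bit unsigned in both byte orders; n and v are equivalent spellings.
print hex_bytes( pack 'S>', 0x1234 ), "\n";    # 12 34
print hex_bytes( pack 'n',  0x1234 ), "\n";    # 12 34
print hex_bytes( pack 'S<', 0x1234 ), "\n";    # 34 12
print hex_bytes( pack 'v',  0x1234 ), "\n";    # 34 12

# 32-bit big-endian: the same four bytes read as signed and as unsigned.
print hex_bytes( pack 'N', 1 ), "\n";          # 00 00 00 01
my ($signed)   = unpack 'l>', "\xFF\xFF\xFF\xFE";
my ($unsigned) = unpack 'L>', "\xFF\xFF\xFF\xFE";
print "$signed $unsigned\n";                   # -2 4294967294
```

    Spelling out the byte order in the template keeps the result independent of the machine the code runs on, which is what matters for interchange formats.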
"exists $hash{key}" is slower than "$hash{key}"
3 direct replies — Read more / Contribute
by swl
on Jan 05, 2020 at 19:23

    UPDATE 2020-01-10: Actually, it's not. See subthread starting at 11111117.


    I decided to run some benchmarking on hash exists after some code profiling showed a reasonable amount of time spent on lines with next if exists $hash{$key}.

    This is largely in the context of code structured like the (very contrived) example below which uses the common idiom of skipping slow code if it has already been done or is not needed based on a tracking hash.

    my %done;
    my @data = (1..100);
    for (1..100) {
        push @data, int (rand() * 100);
    }

    for my $item (@data) {
        next if exists $done{$item};
        # do something time consuming
        # ...
        $done{$item}++;
    }

    The code below tries combinations of exists and value checking. Assignment to variables is used to avoid "Useless use of hash element in void context" warnings, and the assignment to globals is to get a sense of how much the timings are related to bookkeeping of lexicals. I could disable warnings but it's the relative timing differences that are useful here, not the absolute times.

    Code was run using Strawberry perl 5.28.0, and the results are given in the table below (see code for key explanation).

    The main take home is that the value checks (v prefix) are all faster than the exists checks (e prefix). Assigning to global is faster, presumably because there is less bookkeeping involved, but it will be rare that one would use such a construct anyway.

                 Rate evksvl evkrvl ecksvl eckrvl evksvg evkrvg vvksvl vvkrvl vckrvl vcksvl vvksvg vvkrvg
    evksvl 10733145/s     --    -5%    -7%   -12%   -15%   -18%   -29%   -31%   -32%   -33%   -41%   -48%
    evkrvl 11290643/s     5%     --    -2%    -7%   -10%   -14%   -25%   -27%   -28%   -29%   -38%   -45%
    ecksvl 11570664/s     8%     2%     --    -5%    -8%   -12%   -23%   -25%   -27%   -27%   -36%   -44%
    eckrvl 12176232/s    13%     8%     5%     --    -3%    -8%   -19%   -21%   -23%   -23%   -33%   -41%
    evksvg 12572221/s    17%    11%     9%     3%     --    -5%   -17%   -19%   -20%   -21%   -31%   -39%
    evkrvg 13168623/s    23%    17%    14%     8%     5%     --   -13%   -15%   -17%   -17%   -28%   -36%
    vvksvl 15082826/s    41%    34%    30%    24%    20%    15%     --    -2%    -4%    -5%   -17%   -27%
    vvkrvl 15461840/s    44%    37%    34%    27%    23%    17%     3%     --    -2%    -3%   -15%   -25%
    vckrvl 15777625/s    47%    40%    36%    30%    25%    20%     5%     2%     --    -1%   -13%   -23%
    vcksvl 15909705/s    48%    41%    38%    31%    27%    21%     5%     3%     1%     --   -13%   -23%
    vvksvg 18207860/s    70%    61%    57%    50%    45%    38%    21%    18%    15%    14%     --   -12%
    vvkrvg 20580512/s    92%    82%    78%    69%    64%    56%    36%    33%    30%    29%    13%     --

    So why is it that exists is slower than checking the value? My starting assumption was that exists should be faster, as getting a value requires checking that it exists first. However, looking at the source code, most of the hash key and value calls are passed through the same function, hv_common. So far as I can tell from reading the code, and based on my limited comprehension of the details, hv_common prioritises getting values over checking key existence and value assignment.

    So does this all matter and should code that uses exists $hash{$key} be changed to use $hash{$key}? Given that even the slowest of the benchmark snippets is running more than 10,000,000 per second, it does not matter at all for most use cases. One would need to be running hundreds of millions of calls for such a change to start to make a meaningful difference, and some would quite reasonably argue that billions of calls are needed.
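
    Independently of speed, the two checks are not interchangeable when a stored value can be false. The sketch below, with made-up keys, shows where they disagree; in the tracking-hash idiom above the values are always incremented to at least 1, so there the two tests happen to give the same answer.

```perl
use strict;
use warnings;

# Hypothetical tracking hash holding a mix of true and false values.
my %done = ( seen_zero => 0, seen_undef => undef, seen_one => 1 );

for my $key (qw( seen_zero seen_undef seen_one missing )) {
    my $by_exists = exists $done{$key} ? 1 : 0;    # is the key present?
    my $by_value  = $done{$key}        ? 1 : 0;    # is the value true?
    print "$key: exists=$by_exists value=$by_value\n";
}
```

    The keys holding 0 and undef exist but test false by value, so a value check would treat them as not yet done.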

    Maybe the perl source code could be optimised so exists is not slower, but whether this justifies any additional maintenance burden is not something I can answer.
