http://qs321.pair.com?node_id=11143129

Perl's optimizations are pretty basic, and it compiles fast, which is great for utility scripts, but sometimes it feels like a waste of potential. A lot of Perl code actually runs persistently: behind mod_perl, a Plack server, a Mojolicious server, or just as a daemon.

I've been thinking: what if we had a switch (an executable flag?) that would turn on an additional set of compile-time optimizations? It could hypothetically be a big performance win without affecting programs that need to boot fast. I have close to no experience with XSUBs, so that's a big if.

For example, one of my projects has, I think, close to 100 .pm files and compiles in 800 ms. It will then run for a long time (in production) and require performance. Other than testing and development, I wouldn't mind spending 10 seconds extra if that would mean some crucial parts would run 50% faster. Or maybe I would no longer need to care about using small subroutines in tight loops, as they would get inlined.

Comments? Was it attempted already?

Re: Trading compile time for faster runtime?
by dave_the_m (Monsignor) on Apr 21, 2022 at 08:23 UTC
    The easy bit is adding a compile-time switch to say "do more optimisations". The harder bit is deciding what new optimisations could be added under that umbrella. The hardest bit is actually implementing those optimisations in a way that doesn't break everything once you take all the quirks of perl into account, such as tying, magic, tainting and overloading. The impossible bit is finding people with enough knowledge of those quirks and of the perl internals to implement them. There's currently only a handful of people in the world with those skill sets, and we're already occupied with other stuff, or coping with depression, or whatever.

    As an example, a few years ago I added the 'multiconcat' operator, which merges a series of concatenations (such as $x .= "-$y-") into a single op. Since the op can see the whole picture, it can be a lot more efficient - such as just allocating a final string buffer of the right size once, rather than repeatedly growing and reallocating the string. It should have been simple, but turned out to be really hard, and broke a whole bunch of CPAN modules. The runtime implementation of the multiconcat operator is about 700 lines of C code - it turns out that concatenating strings in perl is non-trivial.
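    (If you want to see this in action: dumping the op tree shows whether a chain of concatenations got merged. Multiconcat arrived in perl 5.28, so on anything newer, a one-liner like the following should show a single 'multiconcat' op rather than a series of 'concat' ops.)

    perl -MO=Concise -e '$x = "a" . $y . "b" . $z . "c"'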

    Dave.

      This is an excellent answer, thanks Dave. I suspected it might be hard, but not that hard. Yes, perl is very dynamic, and I imagine a lot of checks need to be repeated every time (so they aren't really optimizable) just to make sure nothing has changed, or else stuff breaks.

Re: Trading compile time for faster runtime?
by stevieb (Canon) on Apr 20, 2022 at 18:06 UTC
    Other than testing and development, I wouldn't mind spending 10 seconds extra if that would mean some crucial parts would run 50% faster.

    You first need to do some major profiling to find out exactly what you need/want to speed up.

    No sense trying to optimize away things in perl when your slowdowns are coming from reading from disk or a DB, or something else along those lines, or from re-instantiating objects in loops, etc. Profile, then fix the code (cache objects and frequently used data, reduce calls to disk, etc.).
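    A minimal sketch of the "re-instantiating objects in loops" point (JSON::PP here is just a stand-in for any object that is expensive to construct):

    use strict;
    use warnings;
    use JSON::PP;

    my @records = ({ a => 1 }, { b => 2 });

    # Slow: a new encoder object is built on every pass through the loop.
    for my $rec (@records) {
        my $enc = JSON::PP->new->canonical;
        print $enc->encode($rec), "\n";
    }

    # Faster: construct once, reuse inside the loop.
    my $encoder = JSON::PP->new->canonical;
    for my $rec (@records) {
        print $encoder->encode($rec), "\n";
    }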

Re: Trading compile time for faster runtime?
by hv (Prior) on Apr 20, 2022 at 20:59 UTC

    Certainly much has been done in this area. Dave Mitchell did a lot of work in recent years (some of it under a targeted "make perl faster" grant from Booking.com, I believe) that included creating new compound opcodes for certain common patterns. However, it is very hard to identify patterns that would benefit from this _in the general case_, and each one requires a lot of work followed by a lot more debugging.

    For specific cases, as stevieb says, the starting point is benchmarking to identify where the time is going. Once you have the benchmarks, it is worth spending a lot of time thinking about different algorithms that could improve things - these are the changes that can have effects of orders of magnitude, though they might require some reengineering of the code - before starting to look at micro-optimizations at the perl level, or maybe rewriting certain core loops in C.
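    For the benchmarking step, the core Benchmark module is often enough. A minimal sketch, with two made-up candidate implementations of the same count:

    use strict;
    use warnings;
    use Benchmark qw(cmpthese);

    my @data = map { int rand 1000 } 1 .. 10_000;

    cmpthese(-3, {                      # run each candidate for ~3 CPU seconds
        grep_style => sub { scalar grep { $_ > 990 } @data },
        loop_style => sub {
            my $n = 0;
            for my $x (@data) { $n++ if $x > 990 }
            $n;
        },
    });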

Re: Trading compile time for faster runtime?
by swl (Parson) on Apr 21, 2022 at 07:22 UTC

    Others have already noted profiling as the way forward. Devel::NYTProf is the go-to for that.
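    For reference, a typical NYTProf run looks like this (the script name is of course a placeholder):

    perl -d:NYTProf yourscript.pl    # writes ./nytprof.out
    nytprofhtml                      # turns it into an HTML report in ./nytprof/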

    The best optimisation is to use an algorithm that avoids doing much of the work in the first place, but sometimes you just need faster implementations.

    If you are on a recent-ish perl then the refaliasing feature can be used to avoid repeatedly dereferencing array items inside loops (see also Data::Alias). It is only really worthwhile when there are very large numbers of derefs to be avoided. It is also still experimental, if that is a concern.
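    A minimal refaliasing sketch (perl 5.22+; the no-warnings line is needed precisely because the feature is experimental):

    use strict;
    use warnings;
    use feature 'refaliasing';
    no warnings 'experimental::refaliasing';

    my $matrix = [ [ 1, 2, 3 ], [ 4, 5, 6 ] ];

    for my $i (0 .. $#$matrix) {
        \my @row = $matrix->[$i];    # @row aliases the inner array - no copying
        $_ *= 2 for @row;            # updates $matrix in place, no repeated derefs
    }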

    Data::Recursive has some fast methods for merging data structures (the difference is a few percent, so it is more useful for larger data sets or frequent merges). It depends on some complex modules, so installation does not always go smoothly, and hence it is not safe to assume it is available on end-user machines. This means fallback code is needed, which adds maintenance load.
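    The fallback pattern looks roughly like this (a sketch; my_pure_perl_merge is a hypothetical stand-in for whatever slow path you maintain, and check Data::Recursive's docs for its exact merge interface):

    # Use the fast XS merge when available, else fall back to pure Perl.
    my $merge = eval { require Data::Recursive; 1 }
        ? \&Data::Recursive::merge
        : \&my_pure_perl_merge;      # hypothetical pure-Perl fallback

    $merge->($target, $source);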

    I have not tested these next few, but they might be useful. Devel::GoFaster speeds up some common ops and is in the spirit of your question. There are also a few modules from PEVANS whose contents might make their way into future perl releases: Faster::Maths speeds up some mathematical processing, and List::Keywords provides faster versions of some List::Util subs (but currently fails tests on Windows).
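    Both of those PEVANS modules are opt-in and lexically scoped, so trying them is cheap (assuming they install on your platform):

    use strict;
    use warnings;

    use Faster::Maths;               # arithmetic in this scope compiles to faster ops

    my ($x, $y) = (3, 4);
    my $dist = sqrt($x * $x + $y * $y);
    print "$dist\n";                 # 5

    use List::Keywords 'any';        # 'any' as a real block keyword

    my @values = (3, 8, 14);
    print "found\n" if any { $_ > 10 } @values;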

    Version 5.36 will also have an option to disable taint checking at build time. This will apparently speed up a lot of processing. I haven't seen any benchmarking yet, but the current development release includes it as a Configure option: https://metacpan.org/release/SHAY/perl-5.35.11/view/pod/perldelta.pod#Configuration-and-Compilation.

      I actively use NYTProf; it is a great tool. But the others you mentioned are new to me and seem really useful! I will try them out later - this is pretty much what I asked for.

      But actually, you have been able to disable taint checking for some time now; it just wasn't a Configure option, which 5.36 will fix. I have heard it gives something like +5% speed, and together with building without threads support (~ +15%?) it gives even more incentive to build your own perl.
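      (For the record, the pre-5.36 way was to pass the C macro in by hand when building - roughly like this, though check INSTALL for your version:)

      sh Configure -des -Accflags='-DSILENT_NO_TAINT_SUPPORT'
      make && make test && make install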

        The speedup you might get from not supporting threads very much depends on your compiler, your OS, your OS thread support (Linux, Windows, VMS, z/OS, HP-UX, AIX, …) and the configuration of your perl. There is no single number that can be pinned to how much slower/faster perl will be (at runtime) when built with or without a configuration option.

        For the systems I run smoke-tests on, the gains are roughly as follows. See https://tux.nl/perl5/smoke/index.html for the smoke results, with a summary of the performance differences at the bottom of that page:

        threaded  is  6.1% slower than non-threaded
        DEBUGGING is  2.5% slower than non-DEBUGGING
        gcc/g++   is 29.7% slower than cc (cc vs. gcc on non-Linux and gcc vs. g++ on Linux combined)
        stdio     is  8.5% slower than perlio

        Enjoy, Have FUN! H.Merijn
Re: Trading compile time for faster runtime?
by LanX (Saint) on Apr 20, 2022 at 14:16 UTC
    You should be more specific about which kind of compile-time optimization you are missing; I doubt there is much potential left.

    Perl has dynamic typing, which means higher performance is only possible by JIT-ing once the types are (statistically) known, i.e. "code paths" are recorded at run-time.

    The other way is explicit typing of variables by the author, which would allow crucial subroutines to be optimized at compile time.

    JS (JIT) and TypeScript (typing), respectively, can do both now.

    But as I said, please show us a way to improve ahead-of-time compilation of vanilla Perl without types.

    edit

    Furthermore, I'm not sure Perl could even emit such optimized machine code without being bundled with a C compiler.

    Theoretically, one could use typed variables to create Inline::C blocks. But I'm ignorant about other possibilities here...
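    To sketch what I mean by the Inline::C route (hand-written here rather than auto-generated; assumes Inline::C, a C compiler, and perl 5.26+ for the indented heredoc):

    use strict;
    use warnings;
    use Inline C => <<~'END_C';
        int sum_to(int n) {
            int total = 0;
            for (int i = 1; i <= n; i++)
                total += i;
            return total;
        }
        END_C

    print sum_to(100), "\n";    # prints 5050; the loop runs as compiled C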

    Cheers Rolf
    (addicted to the Perl Programming Language :)
    Wikisyntax for the Monastery

      I'm not talking about compiling perl into machine code. I'm talking about spotting common patterns and replacing them with optimized versions that do the same thing - still with perl OPs.

      Perl already does this. Subroutines with an empty prototype and a constant return value are inlined (not actually called at runtime). If-statements with a constant false condition are not included in the internal compiled structure at all. The currently existing optimizations are pretty conservative - mostly constant folding, I think - to keep program startup fast. What I mean is having more of those, with less focus on fast compile time, behind an executable flag for easy on/off. Yes, the potential gains may not be worth it, or there might be technical reasons why this isn't viable.
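      For concreteness, the two existing optimizations I mean look like this:

      use strict;
      use warnings;
      use constant DEBUG => 0;

      sub ANSWER () { 42 }        # empty prototype + constant body => inlined

      my $n = ANSWER + 1;         # constant-folded to 43 at compile time
      print "$n\n";

      if (DEBUG) {                # constant false: block never enters the op tree
          print "this branch is compiled away\n";
      }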

        could you show an example, please?

        perl -MO=Concise FILE will show you the OP tree.

        NB: the tree layout doesn't reflect the order of execution; you have to follow the sequence numbers.

        C:\>perl.exe -MO=Concise -e"map {$_++} 1..3"
        d  <@> leave[1 ref] vKP/REFC ->(end)
        1     <0> enter ->2
        2     <;> nextstate(main 1 -e:1) v:{ ->3
        7     <|> mapwhile(other->8)[t5] vK ->d
        6        <@> mapstart K ->7
        3           <0> pushmark s ->4
        -           <1> null lK/1 ->4
        -              <1> null lK/1 ->7
        c                 <@> leave lKP ->7
        8                    <0> enter l ->9
        9                    <;> nextstate(main 2 -e:1) v:{ ->a
        b                    <1> postinc[t2] sK/1 ->c
        -                       <1> ex-rv2sv sKRM/1 ->b
        a                          <#> gvsv[*_] s ->b
        5           <1> rv2av lKPM/1 ->6
        4              <$> const[AV ] s ->5
        -e syntax OK

        more options in B::Concise

        Cheers Rolf
        (addicted to the Perl Programming Language :)
        Wikisyntax for the Monastery

Re: Trading compile time for faster runtime?
by hippo (Bishop) on Apr 20, 2022 at 16:29 UTC
    Was it attempted already?

    Perhaps. It does sound like you might be interested in RPerl.


    🦛

      Thanks, I know about RPerl's existence. AFAIK it is a subset of perl that compiles into C++ - so CPAN doesn't work. It surely gets nice performance, though.

        What performance does it produce for numerical processing, compared to PDL? I've always wondered.
Re: Trading compile time for faster runtime?
by cavac (Parson) on Apr 22, 2022 at 08:51 UTC

    I don't know anything about your exact requirements or what your scripts do, but I have a feeling you are doing a lot of data munching. I assume you have tried putting your data in a database like PostgreSQL and implementing the time-critical parts in SQL?

    Example: For a year now I have had performance problems on my DNS server, which is written in Perl. They had to do with white/blacklisting of domains. Basically, it had to do about a million string matches for every request, plus a few thousand regexp matches. I moved the whole matching algorithm into PostgreSQL. It isn't all that optimized yet, but it runs in a fraction of the time:

    CREATE OR REPLACE FUNCTION pagecamel.nameserver_isforcenx(search_domain_name text)
     RETURNS boolean
     LANGUAGE plpgsql
    AS $function$
    DECLARE
        tempvar boolean := false;
    BEGIN
        -- Check whitelist (non-regex)
        SELECT INTO tempvar EXISTS(SELECT 1 FROM pagecamel.nameserver_forcenxdomain_whitelist
            WHERE is_regex = false AND search_domain_name = domain_match);
        IF tempvar = true THEN
            -- whitelisted
            RETURN FALSE;
        END IF;

        -- Check whitelist (regex)
        SELECT INTO tempvar EXISTS(SELECT 1 FROM pagecamel.nameserver_forcenxdomain_whitelist
            WHERE is_regex = true AND search_domain_name ~* domain_match);
        IF tempvar = true THEN
            -- whitelisted
            RETURN FALSE;
        END IF;

        -- Check blacklist (non-regex)
        SELECT INTO tempvar EXISTS(SELECT 1 FROM pagecamel.nameserver_forcenxdomain
            WHERE is_regex = false AND search_domain_name = domain_match);
        IF tempvar = true THEN
            -- blacklisted
            RETURN TRUE;
        END IF;

        -- Check blacklist (regex)
        SELECT INTO tempvar EXISTS(SELECT 1 FROM pagecamel.nameserver_forcenxdomain
            WHERE is_regex = true AND search_domain_name ~* domain_match);
        IF tempvar = true THEN
            -- blacklisted
            RETURN TRUE;
        END IF;

        -- Neither, so NOT blacklisted
        RETURN false;
    END;
    $function$;

    Perl is quite good in general. But when it comes to handling large amounts of data, no "normal" scripting language comes even close to a modern SQL database engine like PostgreSQL. The people on those projects have spent the last few decades optimizing away every last tenth of a percent of performance.

    Yeah, there is probably a way to optimize that function further using a WITH clause and/or some special type of index on the table that is somehow suited to regular-expression matching.

    perl -e 'use Crypt::Digest::SHA256 qw[sha256_hex]; print substr(sha256_hex("the Answer To Life, The Universe And Everything"), 6, 2), "\n";'