PerlMonks  

Re^2: Substitution unexpectedly very slow, in Strawberry

by syphilis (Archbishop)
on Aug 07, 2023 at 02:13 UTC ( [id://11153752] )


in reply to Re: Substitution unexpectedly very slow, in Strawberry
in thread Substitution unexpectedly very slow, in Strawberry

There was a possibly related thread here a few months ago about character repetitions and COW but I can't locate it right now.

It would be interesting to compare with a perl that was built without COW.
How might that be achieved on Windows? (In fact, how is it even done on Linux? IIRC it's fairly easily doable on *nix but I can't find the relevant documentation.)
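
For comparing builds, it helps to confirm first whether a given perl actually has COW enabled. As far as I recall, perl5200delta says COW can be disabled at build time by running Configure with -Accflags=-DPERL_NO_COW; how best to pass that define in the Windows makefiles is not covered in this thread, so treat that part as an open question. The sketch below only inspects an existing perl. The two sub names are made up for illustration, and it assumes that a COW-enabled perl lists PERL_COPY_ON_WRITE among its compile-time options (the same list that "perl -V" prints).

```perl
use strict;
use warnings;
use Config;

# Hypothetical helper: true if this perl lists PERL_COPY_ON_WRITE among
# the compile-time options reported by the Config module.
sub built_with_cow {
    my @opts = ( Config::bincompat_options(), Config::non_bincompat_options() );
    return scalar grep { $_ eq 'PERL_COPY_ON_WRITE' } @opts;
}

# Hypothetical helper: true if the compiler flags recorded at build time
# include -DPERL_NO_COW (assumes the define was passed via ccflags).
sub built_with_no_cow_flag {
    return ( $Config{ccflags} // '' ) =~ /-DPERL_NO_COW\b/ ? 1 : 0;
}

printf "perl %vd: PERL_COPY_ON_WRITE %s, -DPERL_NO_COW %s\n",
    $^V,
    built_with_cow()         ? 'listed'  : 'not listed',
    built_with_no_cow_flag() ? 'present' : 'absent';
```

Running this under both the 64-bit and the 32-bit perl would at least rule out the two builds differing in their COW setting.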

On Windows, I'm seeing the same thing, irrespective of whether perl-5.38.0 is built with a mingw-w64 port of gcc or with Microsoft's Visual Studio 2022. Unthreaded builds of perl fare just as poorly as the multi-threaded builds.
The one thing that does make a big difference is to use a 32-bit build of perl.
For example, with a 64-bit MSVC-built perl-5.38.0:

    >perl time.pl
    6.13913488388062 0.530965089797974 v5.38.0

For a 32-bit perl-5.38.0 built using the same MSVC compiler (VS2022) in 32-bit mode:

    >perl time.pl
    0.0664510726928711 0.0533270835876465 v5.38.0

But even with that 32-bit build of perl, the same issue becomes evident when "1e6" is changed to "2e6":

    >perl time.pl
    6.67774796485901 1.07769107818604 v5.38.0

Cheers,
Rob

Replies are listed 'Best First'.
Re^3: Substitution unexpectedly very slow, in Strawberry
by swl (Parson) on Aug 07, 2023 at 02:43 UTC

    It also might not be related to COW and instead be something in the regex engine that is specific to Windows. It's probably worth flagging with p5p.

      It also might not be related to COW and instead be something in the regex engine that is specific to Windows.

      I wouldn't yet assume that it doesn't afflict Linux.
      Sure, there's nothing much in the figures that hippo and choroba posted earlier, but I wonder if the same thing might become evident on Linux if "1e6" is increased to "2e6" or beyond.

      It's probably worth flagging with p5p

      Probably, yes.
      The 5.38.0 perlvar docs still warn about the potential performance hits of using $&, but then they also say (regarding $`, $& and $')
      In Perl 5.20.0 a new copy-on-write system was enabled by default, which finally fixes most of the performance issues with these three variables, and makes them safe to use anywhere.
      But note that it says "most of the performance issues".
      And I'm still a bit curious to know whether a Windows build of perl-5.38.0 without COW would suffer the same slowdown.

      Cheers,
      Rob

        There doesn't seem to be a threshold value at which it kicks in. Does this suggest it's not COW related?

        Modifying the code to take an argument to use as a power of 2 gives these results on Strawberry Perl 5.38. Code is below inside readmore tags.

        Doubling n more than doubles the time on my machine. One would need to run it many times to be confident in the true rate of change, but these numbers are close to a linear relationship when plotted with time on a log scale. (For those interested in such things, the power function fitted using MS Excel is t = 8E-28 * n**21.743.)

        Edit: n in this case is the argument to the script, so the x-axis is n, not 2**n.

        Edit 2: And a polynomial function gives a better fit when using actual number of repetitions: t = 2E-11*n**2 - 5E-06*n + 0.387, with R^2 = 0.9989 (indicative-only given the sample size).

        C:\user\perlmonks>perl 11153747.pl 15
        32768 0.0403292179107666 0.0388741493225098 v5.38.0

        C:\user\perlmonks>perl 11153747.pl 16
        65536 0.0916790962219238 0.0764880180358887 v5.38.0

        C:\user\perlmonks>perl 11153747.pl 17
        131072 0.361366987228394 0.142138957977295 v5.38.0

        C:\user\perlmonks>perl 11153747.pl 18
        262144 1.22035908699036 0.37365198135376 v5.38.0

        C:\user\perlmonks>perl 11153747.pl 19
        524288 4.05149698257446 0.628846883773804 v5.38.0

        C:\user\perlmonks>perl 11153747.pl 20
        1048576 21.3440728187561 3.00999999046326 v5.38.0
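
        The growth rate can be read more directly from the ratio between successive runs: with the repetition count doubling each time, linear scaling would give a ratio near 2 and quadratic scaling a ratio near 4. A minimal sketch follows, using the first timing of each run as posted above and the quadratic fit quoted in Edit 2. The sub names are made up for illustration, and the fit coefficients are the rounded ones from the post, so the predictions are indicative only.

```perl
use strict;
use warnings;

# [repetitions, first timing in seconds], copied from the runs above.
my @runs = (
    [   32768,  0.0403292179107666 ],
    [   65536,  0.0916790962219238 ],
    [  131072,  0.361366987228394  ],
    [  262144,  1.22035908699036   ],
    [  524288,  4.05149698257446   ],
    [ 1048576, 21.3440728187561    ],
);

# Ratio of each timing to the previous one; repetitions double per step.
sub doubling_ratios {
    my @t = map { $_->[1] } @_;
    return map { $t[$_] / $t[ $_ - 1 ] } 1 .. $#t;
}

# Quadratic fit quoted in Edit 2; n is the actual number of repetitions.
sub poly_fit {
    my ($n) = @_;
    return 2e-11 * $n**2 - 5e-06 * $n + 0.387;
}

printf "%8d  measured %10.4f s  fit %10.4f s\n",
    $_->[0], $_->[1], poly_fit( $_->[0] )
    for @runs;

printf "ratio per doubling: %s\n",
    join ', ', map { sprintf '%.2f', $_ } doubling_ratios(@runs);
```

        Every ratio comes out above 2, which is consistent with the "more than doubling" observation, though six single runs are too few to pin down the exponent.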

        Code: