http://qs321.pair.com?node_id=276567


in reply to Naughty match variables in CPAN?

Curious question: exactly how big an impact do those variables actually have? I never use them, since it's been hammered into me that I shouldn't because of performance issues. This makes sense, and one rarely needs them anyway. But I'm curious about the size of the performance hit we are talking about here: microseconds, seconds, minutes?

Also, perlre says: once you've used them once, use them at will, because you've already paid the price. If I read that right, it means that the performance hit is only triggered once, the first time one uses them.

I could make a point here about coding for simplicity instead of (unnecessary) performance, but mainly, I am just curious. Is this advice given as a knee-jerk response because we all want great performance, or is the impact really so large that it matters in typical cases?

All that aside, I do agree that it is a bad idea to use them in any module that might be used by someone else - there is no telling what the performance considerations might be for that script. If you are using English, performance is probably not what you are looking for. But there are probably other examples.


You have moved into a dark place.
It is pitch black. You are likely to be eaten by a grue.

Re: Re: Naughty match variables in CPAN?
by sauoq (Abbot) on Jul 22, 2003 at 02:41 UTC
    If I read that right, it means that the performance hit is only triggered once, the first time one uses them.

    No. What that means is that once you use them, all matches will incur the overhead of using them whether or not you actually do. It's not a one-time hit but it is all-or-nothing.
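For anyone who wants the same information without touching the three match variables at all, here is a minimal sketch based on the offset equivalents documented in perlvar. The sub name `match_parts` and the sample string are made up for illustration; only `@-`, `@+` and `substr` are real.

```perl
use strict;
use warnings;

# Offset-based stand-ins for prematch, match and postmatch, built from
# @- and @+ (see perlvar). match_parts is a hypothetical helper name.
sub match_parts {
    my ($str, $re) = @_;
    return unless $str =~ $re;                        # empty list on failure
    my $pre   = substr($str, 0, $-[0]);               # text before the match
    my $match = substr($str, $-[0], $+[0] - $-[0]);   # the matched text
    my $post  = substr($str, $+[0]);                  # text after the match
    return ($pre, $match, $post);
}

my ($pre, $match, $post) = match_parts('foo=123;bar', qr/\d+/);
print "pre='$pre' match='$match' post='$post'\n";
```

Since the match variables never appear in the source, nothing here should switch on the program-wide overhead described above.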

    (This is my 1000th post! :-)

    -sauoq
    "My two cents aren't worth a dime.";
    
      Gotcha! I guess it is too late over here to read documentation. :)

      But I still wonder how much of a penalty there is.



        It's apparently the same penalty incurred on a per-regex basis by using capturing. While working with Metacode::Reader I was able to get some very nice speedups by removing everything that copies data around. So instead of /H=(\d+)$/ I had substr $_, $-[0] + 2, $+[0] - $-[0] - 3. This is much less clear and not nice to read. In my case I was writing a high-volume database filter, so the obfuscation was OK. I don't at all advocate taking this sort of step outside of really critical code that Devel::DProf has already highlighted as being slow.
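A rough sketch of that trade, with a made-up input line, might look like the following. The exact offsets depend on the pattern and the data; in this toy version the length is simply the end of the match minus the position where the digits start, so it is not a copy of the expression quoted above.

```perl
use strict;
use warnings;

my $line = 'W=12 H=4711';   # hypothetical input, not from the original filter

# Version 1: ordinary capturing group.
my ($captured) = $line =~ /H=(\d+)$/;

# Version 2: no capture; slice the digits out by offset instead.
my $sliced;
if ($line =~ /H=\d+$/) {
    my $start = $-[0] + 2;                            # skip the literal "H="
    $sliced = substr($line, $start, $+[0] - $start);  # up to the match end
}

print "$captured $sliced\n";
```

Both variables end up holding the same digits; whether the second form is actually faster is something to confirm with a profiler on the real workload.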

Re: Re: Naughty match variables in CPAN?
by jsprat (Curate) on Jul 22, 2003 at 09:39 UTC
    How big is the impact? Better than 10x in a simple test (5.8.0 on Win2K).

    Just for fun, here's the benchmark. It took some guesswork to get it to run the subs in the right order - clean first, then use English;, then naughty. If anyone uncomments the print statements to test the order, use 1 as an argument so you don't have to wait forever. Here are the results:

    use strict;
    use Benchmark qw/cmpthese/;

    my $time = shift || -5;
    my $text = 'x' x 10_000;

    sub clean {
        # print "clean";
        $text =~ m/^x/;
    }

    sub make_dirty {
        # print "md";
        eval "use English;";
    }

    sub naughty {
        # print "naughty";
        $text =~ m/^x/;
    }

    my %hash = (
        clean     => 'clean',
        naughtify => 'make_dirty',
        sawamp    => 'naughty',
    );

    cmpthese ( $time, {
        clean     => 'clean',
        naughtify => 'make_dirty',
        sawamp    => 'naughty',
    });

    __END__
    results:
    C:\s\pldir>naughty.pl -5
                  Rate naughtify    sawamp     clean
    naughtify    433/s        --      -98%     -100%
    sawamp     24153/s     5481%        --      -92%
    clean     300603/s    69366%     1145%        --

    Someone with more benchmark-fu may correct me on this, but it looks right to me.