in reply to Perl regexp matching is slow??
The author is clearly an idiot :)
The Perl porters have long known about pathological expressions that will continue to run after the heat-death of the universe. This doesn't mean those patterns should be banished, but with a little care, and a little thought, Perl probably offers the informed programmer techniques for supplying hints to the regular expression engine to avoid such exponential blowups.
I also find it interesting that Perl allowed me to write a very small amount of code in order to explore the problem. And the solution here is to not use greedy matching.
Change 'aaa' =~ /a?a?a?aaa/ to /a??a??a??aaa/ and you have to push n out to about 100000 before you begin to notice any appreciable slowdown in the matching. Take the following for a spin:
use strict; use Time::HiRes 'time'; my $n = shift || 3; my $opt = shift || '?'; my $str = 'a' x $n; my $re = ("a$opt" x $n) . $str; my $begin = time; print $str =~ $re, ' ', time-$begin, "\n";
Try running perl rematcher 20 and then perl rematcher 20000 ??. I get a better than three orders of magnitude improvement. Not bad for five minutes thought. Seeing how the Thomson graph climbs slowly but steadily from the beginning, I would not be surprised if Perl continued to outperform it all the way out until things begin to swap.
update: after finishing the article, I retract the statement that the author is an idiot. He does know his stuff, however, I still don't agree with the conclusion. He says that DFAs are better because you don't run into grief with patterns like /(.*)(.*)(.*)(.*)(.*)/ since it doesn't go exponential. This is just silly. If anything, it encourages sloppy programming.
Backreferences allow one to solve a much larger range of problems that DFAs cannot. That's why NFAs are in such widespread use. And if it takes a little education and experience to wield NFAs with assurance, then so be it.
• another intruder with the mooring in the heart of the Perl
|
---|
Replies are listed 'Best First'. | |
---|---|
Re^2: Perl regexp matching is slow??
by Aristotle (Chancellor) on Feb 07, 2007 at 10:05 UTC | |
Re^2: Perl regexp matching is slow??
by Anonymous Monk on Sep 27, 2012 at 21:57 UTC |