http://qs321.pair.com?node_id=87274


in reply to Help on using alternation grouping star versus dot star.

Although your application is in JavaScript, the question itself is about regular expressions, so I benchmarked the three patterns in Perl. Benchmark reports (output edited for brevity, code at the end):
    Benchmark: timing 100000 iterations of 1, 2, 3...
    1: 22 wallclock secs @ 4553.73/s
    2: 13 wallclock secs @ 7575.76/s
    3: 15 wallclock secs @ 6765.90/s
So it seems that method 2 is the fastest, but not by much: method 3 follows closely (about 12% fewer iterations per second), and method 1 is a distant third (about 40% fewer than method 2). Almost the opposite of what you originally thought!

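I'll let the experts (if they want) explain why this is, in terms of how the regex engine operates, but my guess is that 2 is faster because most characters are not opening angle brackets, so the [^<]* absorbs a whole run of them in one pass through the group, where method 1 goes around the alternation once per character.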
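To make that guess a little more concrete, here is a rough sketch in JavaScript (the language of the original application) that counts how many "pieces" each group would consume inside the first div of the sample string. It is only a model of the idea, not a measurement of what the Perl engine actually does internally. The names divBody, perCharTokens and perRunTokens are made up for this illustration, and [^<]+ is used in place of [^<]* so that the tokenizer never produces empty matches.

```javascript
// Body of the first <div class="blockqte"> in the sample string from this post.
const divBody = "a block <span>quote</span>";

// Method 1 style: each pass consumes one non-"<" character, or one "<s" / "</s".
const perCharTokens = divBody.match(/[^<]|<\/?s/g);

// Method 2 style: each pass consumes a whole run of non-"<" characters.
// Assumption: [^<]+ stands in for [^<]* to avoid empty matches while tokenizing.
const perRunTokens = divBody.match(/[^<]+|<\/?s/g);

console.log(perCharTokens.length); // 23
console.log(perRunTokens.length);  // 5
```

On this input the per-character version needs 23 passes where the per-run version needs 5, which fits the direction of the benchmark, although it says nothing about the actual cost of each pass.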

I also have the feeling that the comparison may change when confronted with real data, because of the length and contents of whatever is inside the <div> tags.

Update: and of course, all of this only applies to the Perl implementation of regular expressions, so these results are probably mostly useless, because different JavaScript implementations could do things differently.
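If you want to check this in your own JavaScript engine, something like the following sketch should do. It is a straight port of the three Perl substitutions, not tested against any particular browser. The names sample, patterns, convert and timePatterns are my own, and Date.now() is a coarse timer, so treat the numbers as a rough indication only.

```javascript
// Sample input copied from the Perl benchmark in this post.
const sample =
  'this is some text <div class="blockqte">a block <span>quote</span></div> ' +
  'some more text <b>maybe</b> some other tags ' +
  '<div class="blockqte">another block</div> we end here';

// The three patterns, ported from Perl's s@...@...@g to JavaScript regex literals.
const patterns = {
  1: /<div class="blockqte">((?:[^<]|<\/?s)*)<\/div>/g,
  2: /<div class="blockqte">((?:[^<]*|<\/?s)*)<\/div>/g,
  3: /<div class="blockqte">((?:.|\n)*?)<\/div>/g,
};

// Replace every blockqte div with a <pre> block, keeping the inner text.
function convert(text, pattern) {
  return text.replace(pattern, "<pre>$1</pre>");
}

// Run each pattern `iterations` times and return elapsed milliseconds per method.
function timePatterns(iterations) {
  const results = {};
  for (const [name, pattern] of Object.entries(patterns)) {
    const start = Date.now();
    for (let i = 0; i < iterations; i++) {
      convert(sample, pattern);
    }
    results[name] = Date.now() - start;
  }
  return results;
}

console.log(convert(sample, patterns[1]));
console.log(timePatterns(10000));
```

All three patterns should produce the same converted string on this sample. The timings will depend on the engine and on the iteration count you pick.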

--ZZamboni

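    use strict;
    use Benchmark;

    # Sample input: two blockqte divs, the first one containing a nested <span>.
    my $str=q(this is some text <div class="blockqte">a block <span>quote</span></div> some more text <b>maybe</b> some other tags <div class="blockqte">another block</div> we end here);

    # Method 1: one non-"<" character, or one "<s" / "</s", per pass through the group.
    sub method1 {
        my $s=shift;
        $s=~s@<div class="blockqte">((?:[^<]|<\/?s)*)<\/div>@<pre>$1</pre>@g;
    }

    # Method 2: a whole run of non-"<" characters per pass through the group.
    sub method2 {
        my $s=shift;
        $s=~s@<div class="blockqte">((?:[^<]*|<\/?s)*)<\/div>@<pre>$1</pre>@g;
    }

    # Method 3: non-greedy match of anything (including newlines) up to the first </div>.
    sub method3 {
        my $s=shift;
        $s=~s@<div class="blockqte">((?:.|\n)*?)<\/div>@<pre>$1</pre>@g;
    }

    timethese(100000, {
        '1' => sub { method1($str) },
        '2' => sub { method2($str) },
        '3' => sub { method3($str) },
    });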