ZZamboni
Although your application is in JavaScript, the question itself
is about regular expressions. [Benchmark] reports (output edited for
brevity, code at the end):
<code>
Benchmark: timing 100000 iterations of 1, 2, 3...
1: 22 wallclock secs @ 4553.73/s
2: 13 wallclock secs @ 7575.76/s
3: 15 wallclock secs @ 6765.90/s
</code>
So it seems that method 2 is the fastest, followed fairly closely
by method 3, with method 1 a distant third. Almost
the opposite of what you originally thought!<p>
I'll let [japhy|the experts] (if they want) explain why this is,
in terms of how the regex engine operates, but my guess is
that 2 is faster because most characters are not opening angle
brackets, so the <tt>[^<]*</tt> absorbs a whole run of them in one
step instead of one character per pass through the group.<p>
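To make that guess a bit more concrete, here is a rough sketch in JavaScript (since that is what the application uses). It counts how many pieces the body of the first div breaks into when consumed one character at a time versus one run at a time. This is only a proxy for the number of passes through the group, not a measurement of what any regex engine really does; the names <tt>content</tt>, <tt>perChar</tt> and <tt>perRun</tt> are made up for the illustration, and <tt>[^<]+</tt> stands in for <tt>[^<]*</tt> so that the global match never yields empty pieces.

```javascript
// Body of the first blockqte div from the benchmark input.
const content = "a block <span>quote</span>";

// Method 1 style: each pass through the group takes one character
// or one "<s" / "</s" tag start.
const perChar = content.match(/[^<]|<\/?s/g).length;

// Method 2 style: each pass takes a whole run of non-"<" characters.
// Assumption: [^<]+ replaces [^<]* here to avoid empty matches.
const perRun = content.match(/[^<]+|<\/?s/g).length;

console.log(`one character per pass: ${perChar} pieces`); // 23
console.log(`one run per pass:       ${perRun} pieces`);  // 5
```

Fewer passes does not automatically mean faster matching, but it is consistent with the benchmark numbers above.<p>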
I also have the feeling that the comparison may change when
confronted with real data, because of the length and contents
of whatever is inside the <tt>&lt;div&gt;</tt> tags.
<p><b>Update:</b> of course, all of this applies only to the Perl implementation
of regular expressions, so these results may well not carry over, because
different JavaScript implementations could do things differently.
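<p>If you want to repeat the comparison in your own JavaScript environment, something along these lines might do. It is a minimal sketch, not a careful benchmark: the three patterns are a direct port of the Perl ones, <tt>timeMethod</tt> is a hypothetical helper written for this example, and <tt>Date.now()</tt> only has millisecond resolution, so treat the numbers as a rough indication.

```javascript
// Same sample input as the Perl benchmark.
const str = [
  "this is some text",
  '<div class="blockqte">a block <span>quote</span></div>',
  "some more text <b>maybe</b> some other tags",
  '<div class="blockqte">another block</div>',
  "we end here",
].join("\n");

// Direct ports of the three Perl patterns (assumed equivalent for this input).
const methods = {
  1: /<div class="blockqte">((?:[^<]|<\/?s)*)<\/div>/g,
  2: /<div class="blockqte">((?:[^<]*|<\/?s)*)<\/div>/g,
  3: /<div class="blockqte">((?:.|\n)*?)<\/div>/g,
};

// Hypothetical helper: run one substitution repeatedly and return elapsed ms.
function timeMethod(re, input, iterations) {
  const start = Date.now();
  for (let i = 0; i < iterations; i++) {
    input.replace(re, "<pre>$1</pre>");
  }
  return Date.now() - start;
}

const iterations = 100000;
for (const [name, re] of Object.entries(methods)) {
  const ms = timeMethod(re, str, iterations);
  console.log(`${name}: ${ms} ms for ${iterations} iterations`);
}
```

Run it in each engine you care about, and preferably with real data rather than this short sample.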
<p>--<A HREF="/index.pl?node=ZZamboni&lastnode_id=1072">ZZamboni</A>
<code>
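use strict;
use Benchmark;
# Sample input: two blockqte divs, the first with a nested <span>.
my $str=q(this is some text
<div class="blockqte">a block <span>quote</span></div>
some more text <b>maybe</b> some other tags
<div class="blockqte">another block</div>
we end here);
# Each sub works on its own copy of the input, so $str is never modified.
# Method 1: one non-"<" character (or one "<s" / "</s") per pass through the group.
sub method1 { my $s=shift;
$s=~s@<div class="blockqte">((?:[^<]|<\/?s)*)<\/div>@<pre>$1</pre>@g;
}
# Method 2: a whole run of non-"<" characters per pass through the group.
sub method2 { my $s=shift;
$s=~s@<div class="blockqte">((?:[^<]*|<\/?s)*)<\/div>@<pre>$1</pre>@g;
}
# Method 3: non-greedy match of any character, newlines included, up to </div>.
sub method3 { my $s=shift;
$s=~s@<div class="blockqte">((?:.|\n)*?)<\/div>@<pre>$1</pre>@g;
}
# Run each method 100000 times and report the timings.
timethese(100000, {'1' => sub { method1($str) },
'2' => sub { method2($str) },
'3' => sub { method3($str) }});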
</code>