Although your application is in JavaScript, the question itself
is about regular expressions.
Benchmark reports the following (output edited for
brevity; code at the end):
Benchmark: timing 100000 iterations of 1, 2, 3...
1: 22 wallclock secs @ 4553.73/s
2: 13 wallclock secs @ 7575.76/s
3: 15 wallclock secs @ 6765.90/s
So it seems that method 2 is the best, but not by much: method 3
follows closely (roughly 11% fewer iterations per second), and
method 1 is a distant third (roughly 40% fewer). Almost
the opposite of what you originally thought!
I'll let the experts (if they want) explain why this is,
in terms of how the regex engine operates, but my guess is
that method 2 is faster because most characters are not opening
angle brackets, so the [^<]* absorbs a whole run of them in one
step instead of one character per pass through the alternation.
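One rough way to probe that guess is to count how often each pattern passes through its first alternative. This is only a sketch: the counters $steps1 and $steps2 and the embedded (?{ ... }) code blocks are my own additions, not part of the benchmark, and the counts apply to this one sample string only.

```perl
use strict;
use warnings;

# Same sample text as in the benchmark script.
my $str = q(this is some text
<div class="blockqte">a block <span>quote</span></div>
some more text <b>maybe</b> some other tags
<div class="blockqte">another block</div>
we end here);

# Package variables (illustrative names) updated by the embedded code blocks.
our ($steps1, $steps2) = (0, 0);

# Method 1: the first alternative consumes one non-'<' character per pass.
(my $s1 = $str) =~
    s@<div class="blockqte">((?:[^<](?{ $steps1++ })|</?s)*)</div>@<pre>$1</pre>@g;

# Method 2: the first alternative consumes a whole run of non-'<' characters.
(my $s2 = $str) =~
    s@<div class="blockqte">((?:[^<]*(?{ $steps2++ })|</?s)*)</div>@<pre>$1</pre>@g;

print "method 1: $steps1 passes through the first alternative\n";
print "method 2: $steps2 passes through the first alternative\n";
```

If the guess holds, the count for method 1 should come out well above the count for method 2 on this input. Keep in mind that pass counts are not timings, so this supports the guess without proving it.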
I also suspect that the ranking may change when the patterns
are confronted with real data, because it depends on the length
and contents of whatever is inside the <div> tags.
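Contents matter for what gets matched as well as for speed. A small sketch, assuming a hypothetical second input with a <b> tag inside the div (the convert helper, the %pattern hash, and $other are names made up for this example): methods 1 and 2 only allow tags starting with "s" inside the block, so they should leave that div alone, while method 3 converts it.

```perl
use strict;
use warnings;

# The three patterns from the benchmark, precompiled.
my %pattern = (
    1 => qr{<div class="blockqte">((?:[^<]|</?s)*)</div>},
    2 => qr{<div class="blockqte">((?:[^<]*|</?s)*)</div>},
    3 => qr{<div class="blockqte">((?:.|\n)*?)</div>},
);

# Hypothetical helper: apply one pattern and return the converted text.
sub convert {
    my ($re, $text) = @_;
    $text =~ s{$re}{<pre>$1</pre>}g;
    return $text;
}

# Sample text from the benchmark script.
my $sample = q(this is some text
<div class="blockqte">a block <span>quote</span></div>
some more text <b>maybe</b> some other tags
<div class="blockqte">another block</div>
we end here);

# Assumed extra input: a tag that does not start with "s" inside the div.
my $other = q(<div class="blockqte">a <b>bold</b> quote</div>);

for my $m (sort keys %pattern) {
    my $same    = convert($pattern{$m}, $sample) eq convert($pattern{3}, $sample);
    my $changed = convert($pattern{$m}, $other) ne $other;
    printf "method %s: sample %s method 3; <b> input %s\n",
        $m,
        $same    ? 'agrees with' : 'differs from',
        $changed ? 'converted'   : 'left unchanged';
}
```

So before comparing speeds on real data, it is worth checking that the patterns being compared accept the same inputs.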
Update: of course, all of this applies only to the Perl implementation
of regular expressions, so these results are probably mostly useless for
your case, because different JavaScript implementations could do things
differently.
--ZZamboni
use strict;
use Benchmark;
my $str=q(this is some text
<div class="blockqte">a block <span>quote</span></div>
some more text <b>maybe</b> some other tags
<div class="blockqte">another block</div>
we end here);
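# Method 1: consume the div contents one non-'<' character at a time,
# also allowing tags that start with "s" (such as <span>).
sub method1 { my $s=shift;
$s=~s@<div class="blockqte">((?:[^<]|<\/?s)*)<\/div>@<pre>$1</pre>@g;
}
# Method 2: as method 1, but consume whole runs of non-'<' characters.
sub method2 { my $s=shift;
$s=~s@<div class="blockqte">((?:[^<]*|<\/?s)*)<\/div>@<pre>$1</pre>@g;
}
# Method 3: non-greedy match of anything (including newlines) up to </div>.
sub method3 { my $s=shift;
$s=~s@<div class="blockqte">((?:.|\n)*?)<\/div>@<pre>$1</pre>@g;
}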
timethese(100000, {'1' => sub { method1($str) },
'2' => sub { method2($str) },
'3' => sub { method3($str) }});