Re^3: Regex and question of design

A bunch of simple matches can be more efficient than a single, more complicated, match.

True, but I must take you up on two issues.

Firstly, you are paying for the cost of constructing the assembled pattern each time through the loop. In practice one would do this only once per run. Hoisting that out of the benchmarked code would make the figures more accurate.
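To make the hoisting point concrete, here is a minimal sketch using only core Perl. The pattern list, the target strings and the count_hits helper are invented for illustration and are not the code that was benchmarked. The combined pattern is built and compiled once with qr//, so the loop pays only for the match itself.

```perl
use strict;
use warnings;

# Hypothetical pattern list, for illustration only.
my @patterns = qw(bin bat bar bong);

# Build and compile the combined pattern once, outside any timed loop.
my $joined   = join '|', map { quotemeta($_) } @patterns;
my $combined = qr/(?:$joined)/;

# The loop body reuses the precompiled pattern; nothing is rebuilt here.
sub count_hits {
    my ($re, @targets) = @_;
    my $hits = 0;
    for my $target (@targets) {
        $hits++ if $target =~ $re;
    }
    return $hits;
}

# 'bingo', 'rebar' and 'bongo' match; 'bone' does not.
my $hits = count_hits($combined, qw(bone bingo rebar bongo));
print "$hits\n";
```

In a benchmark, only the call to count_hits (or its equivalent) would go inside the timed code; the join and the qr// would stay outside it.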

Secondly, I wouldn't bother with such an approach for two patterns. It only starts to come into its own for a larger number. Where the sweet spot lies, I don't know... my educated guess is more than 10, less than 20.

But even when you have as few as ten patterns, you have to start worrying about putting /foobar/ before /foo/. Failing to do so will result in 'foobar' never being matched ('foo' will succeed instead). If you have /bin/, /bat/, /bar/, /bong/, ... it is rather wasteful to match against all four and still have it fail just because the target string happens to be 'bone'. That is what I meant when I talked of efficiency.
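A small sketch of both points, in plain core Perl. The variable names are mine, and the factored pattern is written by hand purely as an illustration of sharing the common 'b' prefix; I am not claiming it is what any particular module would generate. Alternation is tried left to right, so the order of the alternatives decides which one wins.

```perl
use strict;
use warnings;

my $target = 'foobar';

# The first alternative that matches at a given position wins.
my ($short_first) = $target =~ /(foo|foobar)/;   # 'foobar' is never reached
my ($long_first)  = $target =~ /(foobar|foo)/;

# Sorting by descending length is one simple way to get a safe order.
my @words    = qw(foo foobar);
my $sorted   = join '|', map { quotemeta($_) }
               sort { length($b) <=> length($a) } @words;
my ($by_len) = $target =~ /($sorted)/;

print "$short_first $long_first $by_len\n";

# Hand-factored equivalent of /bin|bat|bar|bong/: the shared 'b' is tested once.
my $factored = qr/b(?:a[rt]|in|ong)/;
my @agree = grep {
    ($_ =~ /bin|bat|bar|bong/ ? 1 : 0) == ($_ =~ $factored ? 1 : 0)
} qw(bin bat bar bong bone);
print scalar(@agree), "\n";
```

The two patterns agree on all five sample strings, including the failing 'bone'; the difference is only in how much work is done before the failure is known.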

- another intruder with the mooring in the heart of the Perl


Re^4: Regex and question of design
by Anonymous Monk on Apr 14, 2005 at 13:20 UTC

Firstly, you are paying for the cost of constructing the assembled pattern each time through the loop. In practice one would do this only once per run. Hoisting that out of the benchmarked code would make the figures more accurate.

If you have /bin/, /bat/, /bar/, /bong/, ... it is rather wasteful to match against all four and still have it fail just because the target string happens to be 'bone'.

        Rate regex    ra
regex 9.90/s    --  -30%
ra    14.2/s   43%    --

        Rate    ra regex
ra    5.25/s    --  -49%
regex 10.2/s   95%    --

        Rate    ra regex
ra    4.45/s    --  -56%
regex 10.1/s  126%    --

        Rate    ra regex
ra    2.53/s    --  -48%
regex 4.89/s   93%    --

        Rate    ra regex
ra    1.74/s    --  -35%
regex 2.69/s   54%    --

Now, I'm not claiming that a bunch of simpler regexes is always faster, not at all. All I'm saying is that the trade-off isn't as clear-cut as you presented it.
