PerlMonks
I couldn't leave well enough alone on this...

The regexp solution benefits from the very efficient regexp engine, but it is built on an algorithm whose running time grows polynomially with the length of the string. If we expand the problem to checking uniqueness in strings made up of three-character-wide groups of alphabetical characters, that gives us a lot of room to grow the dataset while still keeping every group unique (there are 26^3 = 17,576 possible lowercase groups).

The hash solution grows at O(n), since each hash insert takes O(1) on average. I can't quite figure out how bad the regular expression approach gets as the string grows, but it's probably something like O(n^2) or worse. For short test strings the raw speed of the regexp engine wins over the overhead of the hashing algorithm. But for longer strings, there's literally no comparison.

Here's some test code:
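A minimal sketch of a benchmark along these lines, not the original listing. The sub names, the exact regexp, and the one-second run time are assumptions made for illustration. The range 'aaa' .. 'caa' is chosen because it yields 1353 unique groups. Both subs assume the string length is a multiple of three. The regexp anchors at the start and only ever skips whole groups, so the backreference stays aligned to group boundaries; on a string with no duplicates it has to try every pair of groups before it can fail, which is where the roughly quadratic growth would come from.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Regexp approach (assumed form): skip whole groups lazily, capture one
# group, skip more whole groups, then demand the same group again.
# Returns 1 when no aligned three-character group repeats.
sub unique_regexp {
    my ($str) = @_;
    return $str !~ /^(?:...)*?(...)(?:...)*?\1/s ? 1 : 0;
}

# Hash approach: walk the string one group at a time and stop at the
# first group that has been seen before.
sub unique_hash {
    my ($str) = @_;
    my %seen;
    while ( $str =~ /(...)/gs ) {
        return 0 if $seen{$1}++;
    }
    return 1;
}

# 'aaa' .. 'caa' expands to 1353 distinct groups (2 * 26 * 26 + 1).
my $unique = join '', 'aaa' .. 'caa';

# Negative count: run each sub for at least that many CPU seconds.
# Raise it for longer strings, where the regexp gets very slow.
cmpthese( -1, {
    Regexp => sub { unique_regexp($unique) },
    Hash   => sub { unique_hash($unique) },
} );
```

Both subs test the worst case here, a string with no repeated group at all, so neither can return early.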
And the results on my slow Pentium-II laptop:
At first I thought my eyes were deceiving me. 1.84e-002 iterations per second? That's horrible. But then I realized that the regexp solution was so slow that Benchmark had switched to showing seconds per iteration. So with a test string of 1353 groups, the regexp approach takes 1.15 seconds per iteration in my test example, and the hash approach takes the blink of an eye (1.84e-002 seconds per iteration).

Try testing 'aaa' .. 'faa'. You'll have to increase the testing time to about a minute to even get reliable results out of Benchmark at that point, because the regexp approach becomes so sluggish.

Of course this is a contrived example, but aren't they all? ;) And I did have to modify the RE a little so that it would maintain proper framing. But the discussion caught my attention and I just had to prove to myself what I already suspected.

Dave

In reply to Re: Determining uniqueness in a string. by davido