PerlMonks  

Re^2: Matching data against non-consecutive range

by bart (Canon)
on Jan 27, 2005 at 22:07 UTC ( [id://425787]=note )


in reply to Re: Matching data against non-consecutive range
in thread Matching data against non-consecutive range

Using Regex::PreSuf, you can convert that list of numbers into a regex (or at least, a string you can use as a regex). Like this:
use Regex::PreSuf;
my $re = presuf(10..374, 376..379, 382..385, 388..499, 530..534,
                541..543, 618, 619, 700..704, 707..709);
print "Regex: /$re/\n";
This prints:
Regex: /(?:1(?:0[0123456789]|1[0123456789]|2[0123456789]|3[0123456789]|4[0123456789]|5[0123456789]|6[0123456789]|7[0123456789]|8[0123456789]|9[0123456789]|[0123456789])|2(?:0[0123456789]|1[0123456789]|2[0123456789]|3[0123456789]|4[0123456789]|5[0123456789]|6[0123456789]|7[0123456789]|8[0123456789]|9[0123456789]|[0123456789])|3(?:0[0123456789]|1[0123456789]|2[0123456789]|3[0123456789]|4[0123456789]|5[0123456789]|6[0123456789]|7[012346789]|8[234589]|9[0123456789]|[0123456789])|4(?:0[0123456789]|1[0123456789]|2[0123456789]|3[0123456789]|4[0123456789]|5[0123456789]|6[0123456789]|7[0123456789]|8[0123456789]|9[0123456789]|[0123456789])|5(?:3[01234]|4[123]|[0123456789])|6(?:1[89]|[0123456789])|7(?:0[01234789]|[0123456789])|8[0123456789]|9[0123456789])/

Wow, that's big: length($re) is 746 bytes. It won't hurt, though; you can just use it to match:

$zip = '34';
if ($zip =~ /\b$re\b/o) {
    print "Got a match for $zip\n";
}
which prints:
Got a match for 34

Considered by jmanning2k - Break Long Line
Unconsidered by castaway - Keep/Edit/Delete: 7/7/0 - Get a working browser, see Re^2: Monastery Gates page is too wide (causes)

Note by bart: I use Firefox as my standard browser, and it renders correctly for me. I asked on the Chatterbox, and both castaway and corion said it looked normal to them, too. They suggested it could be caused by a difference in site settings for wrapping... *shrug* I don't know any more.

Replies are listed 'Best First'.
Re^3: Matching data against non-consecutive range
by demerphq (Chancellor) on Jan 28, 2005 at 14:17 UTC

    IMO it's a good idea to attach "don't do this" notices to a solution like this. It's hopelessly inefficient in comparison to the hash lookup.

    ---
    demerphq
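
For comparison, the hash lookup mentioned above might look something like the following. This is only a minimal sketch: the helper name in_range is made up for illustration, and it assumes the value to test is already isolated as a whole string (it does not search inside longer text the way the regex does).

```perl
use strict;
use warnings;

# The same numbers that were passed to presuf() in the parent post.
my @list = (10..374, 376..379, 382..385, 388..499, 530..534,
            541..543, 618, 619, 700..704, 707..709);

# Build the lookup table once; every valid number becomes a key.
my %valid = map { $_ => 1 } @list;

# Hypothetical helper: true if $zip is exactly one of the listed numbers.
sub in_range {
    my ($zip) = @_;
    return exists $valid{$zip} ? 1 : 0;
}

for my $zip (qw(34 375 618 9)) {
    printf "%s: %s\n", $zip, in_range($zip) ? 'match' : 'no match';
}
```

Hash keys are compared as strings, so an input such as '034' would not match here.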

      Have you benchmarked it? I haven't... it might not be that bad, and it's most likely a lot better than just
      $re = join '|', @list;
      anyway...
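
A rough way to answer the benchmarking question is the core Benchmark module. The sketch below compares the hash lookup against the plain join '|' alternation; the sample inputs and the names count_hash and count_regex are arbitrary choices for illustration, and the Regex::PreSuf pattern could be dropped in as a third candidate if the module is installed. No timings are claimed here, since they depend on the perl version and the machine.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my @list = (10..374, 376..379, 382..385, 388..499, 530..534,
            541..543, 618, 619, 700..704, 707..709);

# Candidate 1: hash lookup on the whole string.
my %valid = map { $_ => 1 } @list;

# Candidate 2: the plain alternation, anchored to the whole string.
my $alt = join '|', @list;
my $re  = qr/\A(?:$alt)\z/;

# Inputs chosen arbitrarily: some inside the ranges, some outside.
my @samples = qw(34 375 380 499 500 618 705 9999);

# Each candidate counts how many samples it accepts.
sub count_hash {
    my $n = 0;
    for my $s (@samples) { $n++ if exists $valid{$s} }
    return $n;
}

sub count_regex {
    my $n = 0;
    for my $s (@samples) { $n++ if $s =~ $re }
    return $n;
}

# Run each candidate for at least one CPU second and print a comparison table.
cmpthese(-1, {
    hash  => \&count_hash,
    regex => \&count_regex,
});
```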

      I could try and optimize it a little by making sure it doesn't even attempt to match if it doesn't see a digit:

      /\b(?=\d)$re\b/o

      Ideally, I'd think this would be an excellent problem to solve using a regexp assertion in Perl code: match all the digits first, then test whether the number is within range with a hash, all inside the regexp. If only assertions weren't that hard to implement.
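
One possible shape for that idea, as a sketch only: a conditional pattern whose condition is a (?{ ... }) code block. The name find_zip is hypothetical, and the hash is declared with our on the assumption that lexical variables inside code blocks may misbehave on older perls.

```perl
use strict;
use warnings;

# Package variable rather than a lexical (see the assumption above).
our %valid = map { $_ => 1 } (10..374, 376..379, 382..385, 388..499,
                              530..534, 541..543, 618, 619,
                              700..704, 707..709);

# \b(\d+)\b grabs a whole run of digits, then the conditional runs Perl code.
# If the captured number is a key in %valid, the empty "yes" branch succeeds;
# otherwise (?!) forces a failure and the engine moves on to the next number.
my $re = qr/\b(\d+)\b(?(?{ exists $valid{$1} })|(?!))/;

# Hypothetical helper: return the first in-range number found in $text,
# or undef if there is none.
sub find_zip {
    my ($text) = @_;
    return $text =~ $re ? $1 : undef;
}

for my $text ('order 375, then 34', 'zip 618', 'nothing here: 9 1000') {
    my $hit = find_zip($text);
    print "$text => ", defined $hit ? $hit : 'no match', "\n";
}
```

The regex only isolates digit runs, and the range test stays in the hash, so changing the ranges means changing the list rather than regenerating a pattern.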

      p.s. grinder has yet another alternative solution to using Regex::PreSuf, it's currently up on his scratchpad, but for some otherwise noble but in this case silly reason of policy (because he frontpaged the thread) he doesn't dare to post it. Please urge him to post it, it'd be a waste to the site if he didn't. He won't listen to me. :)

      It's probably time-efficient enough for most purposes. And the hash lookup is probably memory-efficient enough for most purposes.
