PerlMonks  

Re: Regex simplification

by mephit (Scribe)
on Aug 26, 2002 at 20:00 UTC ( [id://192994]=note )


in reply to [untitled node, ID 192753]

Hmm, isn't substr usually faster than a regex? If so, how about the following approach:

  • Use rindex to find the indices of the last and second-to-last spaces, as the OP requires.
  • Take the difference between those indices, minus one, as the length of the desired string, and pass that length to a substr call that starts one character past the second-to-last space to get the required data.
Well, I'm sure that it would work, but would it be faster? I'll probably benchmark this myself sometime when I have the time to create data and code to test.
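For what it's worth, the two steps above could be sketched roughly like this. The sub name between_last_two_spaces and its undef-on-failure behaviour are my own assumptions for illustration, not something the OP asked for; only rindex and substr are core Perl.

```perl
use strict;
use warnings;

# Hypothetical helper (name and error handling are assumptions):
# return the text between the second-to-last and last spaces of $str,
# or undef if $str contains fewer than two spaces.
sub between_last_two_spaces {
    my ($str) = @_;

    my $end = rindex($str, ' ');              # index of the last space
    return undef if $end < 1;                 # no space, or nothing before it

    my $start = rindex($str, ' ', $end - 1);  # index of the second-to-last space
    return undef if $start < 0;

    # Start one past the second-to-last space; the length excludes both spaces.
    return substr($str, $start + 1, $end - $start - 1);
}

my $name = between_last_two_spaces('<!-- USER 20 - donkey_pusher_6 -->');
print defined $name ? "$name\n" : "no match\n";
```

Note the "- 1" in the length: without it the result keeps the trailing space.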

Anyway, that's my (Not-So-)Good Idea for the day.

Update: I just ran some benchmarks on a few of the methods suggested. Here are my code and results:

    use Benchmark;    # provides timethese

    my $str = '<!-- USER 20 - donkey_pusher_6 -->';
    my $data;
    my $re = qr/--\s*USER\s+\d+\s*-\s*(\w+)/;
    my ($start, $end);

    # Anchored regex using atomic groups (?>...) so the engine cannot backtrack.
    sub by_re_noback {
        ($data) = ($str =~ /
            ^ (?>\s*) <!-- (?>\s+) USER (?>\s+) (?>\d+) (?>\s+) -
            (?>\s+) (\S+?) (?>\s+) --> (?>\s*) $
        /ix);
    }

    # Plain inline regex.
    # NOTE: this matches against $line, which is never set (the data is in $str),
    # so the match fails immediately; that probably explains the very fast
    # "re" timing in the results.
    sub by_re {
        ($data) = ($line =~ m/<!-- USER \d+ - (\S+)/i);
    }

    # Regex precompiled with qr//.
    sub by_re_comp {
        ($data) = ($str =~ $re);
    }

    # rindex/substr approach: locate the last two spaces, take what lies between.
    sub by_substr {
        $end   = rindex($str, ' ');
        $start = rindex($str, ' ', $end - 1);
        # Length is $end - $start - 1 so the trailing space is not included.
        $data  = substr($str, $start + 1, $end - $start - 1);
    }

    timethese (100000, {
        subst     => \&by_substr,
        re_comp   => \&by_re_comp,
        re        => \&by_re,
        re_noback => \&by_re_noback,
    });

    --results--
    Benchmark: timing 100000 iterations of re, re_comp, re_noback, subst...
           re:  1 wallclock secs ( 0.46 usr +  0.00 sys =  0.46 CPU) @ 217391.30/s (n=100000)
      re_comp:  4 wallclock secs ( 4.35 usr +  0.00 sys =  4.35 CPU) @ 22988.51/s (n=100000)
    re_noback:  6 wallclock secs ( 6.27 usr +  0.00 sys =  6.27 CPU) @ 15948.96/s (n=100000)
        subst:  1 wallclock secs ( 1.40 usr +  0.00 sys =  1.40 CPU) @ 71428.57/s (n=100000)

--

There are 10 kinds of people -- those that understand binary, and those that don't.
