PerlMonks  

Re: parsing question

by kilinrax (Deacon)
on May 28, 2003 at 08:50 UTC ( [id://261256]=note )


in reply to parsing question

Unfortunately your question isn't terribly clear, so I'm not entirely sure what you're looking for.
However, one thing I would suggest: if you want to match after the last occurrence of something, it may be easier to apply a regex to a reversed string, e.g.:
my $reverse = reverse $line;
$reverse =~ s| \w* ; \s* > |>|x;
$line = reverse $reverse;
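
To make the reversed-string idea concrete, here is a minimal sketch that wraps the snippet above in a sub and runs it on a made-up line. The sample data and the name strip_after_last are assumptions for illustration only, since the original question's input is not shown here.

```perl
use strict;
use warnings;

# Hypothetical helper: strips the last ";word" that follows a ">".
sub strip_after_last {
    my ($text) = @_;

    # Scalar-context reverse flips the string character by character.
    my $reverse = reverse $text;

    # Without /g only the first match is replaced; in the reversed
    # text that corresponds to the last occurrence in the original.
    $reverse =~ s| \w* ; \s* > |>|x;

    # Flip the result back into its original orientation.
    return scalar reverse $reverse;
}

# Hypothetical sample line; the earlier ";keep" is left untouched.
my $line = "<a>;keep <b> ;strip_me";
print strip_after_last($line), "\n";    # prints: <a>;keep <b>
```

Note that the pattern itself has to be written back to front as well (word, semicolon, whitespace, then ">"), which is the main readability cost of this approach.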

Re: Re: parsing question
by Chady (Priest) on May 28, 2003 at 09:30 UTC

    or maybe a greedy regex?

    $line =~ s/^(.*>) ;.*/\1;/;

    He who asks will be a fool for five minutes, but he who doesn't ask will remain a fool for life.

    Chady | http://chady.net/

      In a word, no.

      Matching against the reversed string is much faster.
      Have a look at these benchmarks:

      #!/usr/bin/perl -w
      use strict;
      use Benchmark;

      my $string = "<<HTML>;nbsp dont_strip_me</HTML>> <xyzfdgfghgf> ;strip_me";

      sub reversed {
          my $reverse = reverse(shift);
          $reverse =~ s| \w* ; \s* > |>|x;
          return scalar reverse $reverse;
      }

      sub greedy {
          my $line = shift;
          $line =~ s|^ (.*>) \s* ; \w* |$1|x;
          return $line;
      }

      print "Reversed: ", reversed($string), "\n";
      print "Greedy: ", greedy($string), "\n";

      timethese( -10, {
          reversed => sub { reversed( $string ) },
          greedy   => sub { greedy( $string ) },
      } );

      Output:

      Reversed: <<HTML>;nbsp dont_strip_me</HTML>> <xyzfdgfghgf>
      Greedy: <<HTML>;nbsp dont_strip_me</HTML>> <xyzfdgfghgf>
      Benchmark: running greedy, reversed, each for at least 10 CPU seconds...
          greedy: 10 wallclock secs ( 9.98 usr + 0.02 sys = 10.00 CPU) @ 78480.80/s (n=784808)
        reversed: 11 wallclock secs (10.46 usr + 0.00 sys = 10.46 CPU) @ 167660.04/s (n=1753724)

      As you can see, the reversed version runs at over twice the rate (about 167,660/s versus 78,481/s). On longer strings, the difference would be even greater.

      Also, your regex is wrong. Read through perldoc:perlre (specifically, the section marked 'Warning on \1 vs $1') to discover why.
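
For readers who do not want to dig through perlre, here is a rough sketch of the point that section makes. The replacement side of s/// is treated like a double-quoted string, so a captured group should be written as $1 there, and \1 is only tolerated as a legacy habit (it draws a "better written as $1" warning when warnings are on). The sub names and sample inputs below are assumptions for illustration, and the first sub only swaps \1 for $1 in the regex quoted above without claiming it does what the original poster wanted.

```perl
use strict;
use warnings;

# Hypothetical helper: the greedy substitution with $1 in the
# replacement instead of \1.
sub keep_through_last_tag {
    my ($line) = @_;
    $line =~ s/^(.*>) ;.*/$1;/;
    return $line;
}

# Hypothetical helper showing why the distinction matters: when
# digits follow the capture, \1000 would be ambiguous, while
# ${1}000 clearly means "group 1, then three zeros".
sub append_zeros {
    my ($number) = @_;
    $number =~ s/(\d+)/${1}000/;
    return $number;
}

print keep_through_last_tag("<a> <b> ;strip_me"), "\n";    # prints: <a> <b>;
print append_zeros("42"), "\n";                            # prints: 42000
```

Inside the pattern itself, \1 remains the correct way to write a backreference; the advice applies only to the replacement side.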
