http://qs321.pair.com?node_id=184228



Okay, I know it's because I'm dumb, but I still don't get it. Please don't yell at me, but why does the \K anchor keep .* from matching .jkl? And if it backtracks like normal, then where does the speed come from? I think that may be the essence of my confusion - why is this faster?

If I don't get it this time, I'll give up and just trust it ;-).

Cheers,
Erik

Light a man a fire, he's warm for a day. Catch a man on fire, and he's warm for the rest of his life. - Terry Pratchett

Re: Re: Re: Re: New regex trick...
by japhy (Canon) on Jul 22, 2002 at 20:49 UTC
    Oh, \K doesn't stop .* from matching the entire string. Perl is smart enough to back off to the last "." when the \. node comes up.

    What \K is doing is faking WHERE in the string (and the pattern) the regex started to match. Compare:

    $str = "Match 9 the 1 last 6 digit 2 blah";
    $str =~ /.*\d/;
    print "[$`] [$&] [$']\n";
    $str =~ /.*\K\d/;
    print "[$`] [$&] [$']\n";
    __END__
    [] [Match 9 the 1 last 6 digit 2] [ blah]
    [Match 9 the 1 last 6 digit ] [2] [ blah]

    See, \K tells $& that THIS is where it begins. This is useful in substitutions:

    # you go from this:
    s/(saveme)deleteme/$1/;
    # to this:
    s/saveme\Kdeleteme//;

    And you save time on replacing "saveme" with itself.
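
    For anyone who wants to try the two substitutions side by side, here is a minimal sketch. It assumes a perl where \K is built in (5.10 or later); the sample string and the variable names are made up for illustration and are not from the thread. It only shows that both forms leave the same result, it does not measure speed.

```perl
use strict;
use warnings;

# Hypothetical sample data, not taken from the thread.
my $with_capture = "xx savemedeleteme yy";
my $with_keep    = $with_capture;

# Capture the part to keep, then write it back in the replacement.
$with_capture =~ s/(saveme)deleteme/$1/;

# \K drops "saveme" from the matched text, so only "deleteme" is replaced.
$with_keep =~ s/saveme\Kdeleteme//;

print "capture: [$with_capture]\n";   # capture: [xx saveme yy]
print "keep:    [$with_keep]\n";      # keep:    [xx saveme yy]
```

    Both strings end up as "xx saveme yy"; the \K form simply never has to copy "saveme" back in.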

    _____________________________________________________
    Jeff[japhy]Pinyan: Perl, regex, and perl hacker, who'd like a job (NYC-area)
    s++=END;++y(;-P)}y js++=;shajsj<++y(p-q)}?print:??;

      Ah, excellent explanation. That clears things up nicely. ++ to erikharrison for asking the dumb questions that I needed answered.
Re: Re: Re: Re: New regex trick...
by shelob101 (Sexton) on Jul 23, 2002 at 20:11 UTC
    Erik, the bit about .* not matching .jkl is just plain old regex engine rules. That is to say, the match wouldn't succeed unless the last literal period (\.) is followed by 0 or more things (.*), eh? I'm assuming you're talking about the first .*, not the second one.
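
    A rough sketch of that rule in action, assuming a string like "abc.def.ghi.jkl" and a pattern of the form .*\..* (the thread's exact string and pattern are guesses here, and the variable names are hypothetical). It assumes a perl with \K built in (5.10 or later).

```perl
use strict;
use warnings;

# Hypothetical file name; the original string in the thread is assumed.
my $name = "abc.def.ghi.jkl";

# The greedy first .* grabs everything, then backs off until \. can match.
# It therefore stops just before the LAST literal period.
my ($before, $after) = $name =~ /(.*)\.(.*)/
    or die "no match";
print "[$before] [$after]\n";    # [abc.def.ghi] [jkl]

# Same backtracking with \K: only the part after \K counts as the match,
# so the substitution removes ".jkl" and leaves the rest untouched.
(my $stripped = $name) =~ s/.*\K\..*//;
print "[$stripped]\n";           # [abc.def.ghi]
```

    The first .* never ends up holding ".jkl", because the overall match needs a literal period after it.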
    Paul

    When there is no wind, row.