Beefy Boxes and Bandwidth Generously Provided by pair Networks
go ahead... be a heretic
 
PerlMonks  

comment on

( [id://3333]=superdoc: print w/replies, xml ) Need Help??

I understand (I think) why this is fast. What I don't understand is how it works. I'll admit that I am not much of regex hacker, but how does the variable length look behind now how far to look, in the case of quantifiers, especially greedy ones? It seems that the goal of the anchor is to prevent backtracking (hence the speed gains) but how does it now when to stop? If the \K anchor tells the regex engine to watch for the oncoming regex and stop matching there, whats the difference between \K and a minimal match, or negated character class? If the engine doesn't stop at the first instance of, say '.' (as in your example) how do you keep it from backtracking (which is where I'm presuming the speed gains come from)? Or is \K an optimization using sexegers?

Now, japhy I have absolute faith in your pattern - foo, so I take it on faith that this works. What I'm curious about is how. Am I missing something in thinking that the performance gains here are related to backtracking?

Cheers,
Erik

Light a man a fire, he's warm for a day. Catch a man on fire, and he's warm for the rest of his life. - Terry Pratchet


In reply to Re: New regex trick... by erikharrison
in thread New regex trick... by japhy

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":



  • Are you posting in the right place? Check out Where do I post X? to know for sure.
  • Posts may use any of the Perl Monks Approved HTML tags. Currently these include the following:
    <code> <a> <b> <big> <blockquote> <br /> <dd> <dl> <dt> <em> <font> <h1> <h2> <h3> <h4> <h5> <h6> <hr /> <i> <li> <nbsp> <ol> <p> <small> <strike> <strong> <sub> <sup> <table> <td> <th> <tr> <tt> <u> <ul>
  • Snippets of code should be wrapped in <code> tags not <pre> tags. In fact, <pre> tags should generally be avoided. If they must be used, extreme care should be taken to ensure that their contents do not have long lines (<70 chars), in order to prevent horizontal scrolling (and possible janitor intervention).
  • Want more info? How to link or How to display code and escape characters are good places to start.
Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others drinking their drinks and smoking their pipes about the Monastery: (4)
As of 2024-04-19 04:14 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found