laziness, impatience, and hubris | |
PerlMonks |
Matching against a partially known stringby Anonymous Monk |
on Sep 19, 2003 at 17:44 UTC ( [id://292728]=perlquestion: print w/replies, xml ) | Need Help?? |
Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question: Here I have a RE which I will call $pattern; it begins with a ^. Over there is a string which I will call $fullstring. I would like to know if $pattern matches $fullstring; that is: $fullstring =~ /$pattern/. The challenge is that I only know $buff, a prefix of $fullstring. I want a programmatic way to answer the following question: Can I be certain that, no matter what I add to $buff, that $pattern will not match? Asked again, more formally and in the inverse (I don't care what $remainder is, I just want to know if it's possible for $pattern to match with some oracle-given $remainder.): Does there exist a string $remainder such that ($buff.$remainder) =~ /$pattern/? Please ponder the above for a minute before I clutter your mind with the practical bits and my particular ideas for a solution. Okay, now I'll tell a bit more about the practical side. I am breaking a stream coming over a network into tokens, each token matching one of a set of patterns. I want to be able to eliminate patterns that can't match the next token, so that when I have eliminated all the patterns I recognize that there has been a protocol error. Most of the patterns are quite simple, usually fixed-length and fixed-position:
Ideally, I'd like to get some kind of hook into the RE engine, such that when the engine reaches the end of $buff, rather than backtracking, it should immediately report a (potential) match. Alas, nothing in man pages, my Nutshell books, CPAN, Usenet searches, or Perl Monks has hinted at a way to call my code from within the RE engine (In particular, yes, I've read man overload and "Creating custom RE engines" in man perlre.) Is there a way to programmatically transform $pattern into another RE that I can apply against $buff and get an answer to my formal question? One way to transform $pattern would be to replace every node (named $node) of its AST with ( $ | $node ). After such a transformation, the RE engine would claim a match whenever it hit the end of $buff. I can't think of an elegant way to perform such a transformation without making my own RE parser. Can I do something with $^R, (?{ code }), and/or (??{ code })? The important thing for me right now is that it be correct. Once I get that, I'll worry about making it work efficiently.
Back to
Seekers of Perl Wisdom
|
|