PerlMonks  

Re^3: Applying regexes to streams: Perl enhancement idea

by Aristotle (Chancellor)
on Jan 08, 2003 at 08:15 UTC ( [id://225214] )


in reply to Re^2: Applying regexes to streams: Perl enhancement idea (not that easy)
in thread Applying regexes to streams: Perl enhancement idea

How would I go about telling /z to wrap it up and accept the end of string as end of match? There are really two things you are asking of the engine: to continue where it left off last time, and to fail without forgetting where it's at when it hits the end of string. You need a way to ask for the former without the latter. Otherwise, as a silly example (but let's pretend it isn't), /.+/z would always fail, even at the end of my input stream, where I'd want it to match successfully at end of string.

Makeshifts last the longest.

Re^4: Applying regexes to streams: Perl enhancement idea (bug+fix)
by tye (Sage) on Jan 08, 2003 at 17:06 UTC

    Good point. My original example code didn't handle that case correctly in part because it started out as an example of using a regular expression to match record terminators and in part because I had not fully considered the effect of //z on greedy matches until I replied to theorbtwo's node.

    We already have a separate "continue where it left off last time" feature for regular expressions: //g in a scalar context and pos(). So my example is easy to fix by dropping /z once I've found end-of-stream. I'll update it shortly to reflect this.
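For readers unfamiliar with that feature, here is a minimal sketch of //g in scalar context together with pos(). The buffer contents and variable names are made up for illustration; \G and the /c modifier are standard Perl.

```perl
use strict;
use warnings;

my $buf = "alpha beta gamma";
my @seen;

# In scalar context, each //g match resumes at pos($buf); the /c
# modifier keeps pos() where it is when a match finally fails.
while ($buf =~ /\G\s*(\w+)/gc) {
    push @seen, [$1, pos($buf)];
}

printf "%-5s ends at offset %d\n", @$_ for @seen;

# pos() is assignable, so a scan can be restarted from any offset.
pos($buf) = 6;
my $word = $buf =~ /\G(\w+)/gc ? $1 : undef;
print "from offset 6: $word\n";
```

Nothing here depends on a new modifier: the "continue where it left off" state lives entirely in pos().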

    Note that my example fetches pos() in order to strip stuff from the front of the buffer, so each match is performed with pos() at 0. If you were instead matching record terminators, for example, you would fetch pos() in order to restore it before the next match (since the sysread updates the contents of the buffer, which also resets its pos()).
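The save-and-restore variant might look roughly like the sketch below. It is only an illustration: the @chunks list is a hypothetical stand-in for successive sysread() results, and it assumes a single-character terminator so that a terminator can never straddle two chunks.

```perl
use strict;
use warnings;

# Hypothetical input: each chunk stands in for one sysread() result.
my @chunks = ("rec1\nre", "c2\nrec", "3\n");
my $buf    = '';
my $start  = 0;     # offset where the current record begins
my @records;

for my $chunk (@chunks) {
    my $resume = pos($buf) // 0;   # remember where scanning stopped
    $buf .= $chunk;                # modifying the buffer resets pos()
    pos($buf) = $resume;           # so put it back before matching

    # /c keeps pos() at the last terminator when no further one is found.
    while ($buf =~ /\n/gc) {
        push @records, substr($buf, $start, pos($buf) - 1 - $start);
        $start = pos($buf);
    }
}

print "$_\n" for @records;
```

The buffer is never trimmed here, which keeps the offsets simple; a long-running reader would want to discard consumed data and adjust $start and pos() accordingly.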

    Thanks,
                    - tye

      But how does the regex engine know that the search sans /z is supposed to be the finishing search of the previous series of /z searches, as opposed to an entirely new pattern to be applied? What I'm talking about is analogous to using /gc and then finally only /c to conclude the series of matches. If there were no /g, the presence or absence of /c alone would be ambiguous.

      Makeshifts last the longest.

        Because, as I wrote above:

            We already have a separate "continue where it left off last time" feature for regular expressions: //g in a scalar context and pos().

        and that is what we'd use. //z would not imply "start where you left off last time". There is no reason for it to because we already have //g and pos(). Why do you think I've been using //g and messing with pos() in my example?

        //z would only tell the regex engine to fail [and set pos()] if it looks at the end of the string/buffer. It wouldn't tell it to start where it left off and it wouldn't keep track of where it left off other than by setting pos().
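Since Perl has no /z modifier, the behaviour can only be approximated in user code. The sketch below is one rough way to do so, not the proposal itself: take_match and its $eof flag are hypothetical names, and the check only notices matches that consume text up to the end of the buffer. A pattern that merely looks at the end (a lookahead, \z or $) would slip past it, which is exactly why engine support was being proposed.

```perl
use strict;
use warnings;

# Hypothetical helper: refuse a match that touches the end of the buffer
# unless the caller says the stream is finished ($eof true), which plays
# the role of "dropping /z once end-of-stream is found".
sub take_match {
    my ($bufref, $re, $eof) = @_;
    return unless $$bufref =~ /$re/;
    # The match might grow once more input arrives, so do not trust it yet.
    return if !$eof && $+[0] == length $$bufref;
    my $match = substr($$bufref, $-[0], $+[0] - $-[0]);
    substr($$bufref, 0, $+[0], '');   # strip everything up to the match end
    return $match;
}

my $buf = "abc";
my $m1  = take_match(\$buf, qr/.+/, 0);   # buffer may still grow: no match
$buf   .= "def";
my $m2  = take_match(\$buf, qr/.+/, 1);   # end of stream: accept the match

print defined $m1 ? "early: $m1\n" : "early: need more input\n";
print "final: $m2\n";
```

With the greedy /.+/ example from earlier in the thread, the first call declines to match and the second one, made after end of stream, returns the whole input.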

                        - tye
