Re: Regexes on Streams (a partial solution?)

by BrowserUk (Patriarch)
on Oct 12, 2003 at 02:14 UTC ( [id://298578]=note )


in reply to Regexes on Streams

Starting with tilly's idea, and attempting to generalise it, I came up with this.

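    #! perl -slw
    use strict;
    use re 'eval';

    sub Re_Stream {
        my( $re_user, $extend ) = @_;
        die "Usage: Re_Stream( regex, coderef )"
            unless defined $re_user and ref $extend eq 'CODE';

        # At the end of the buffer (\Z), call the extender: the empty
        # yes-branch succeeds if it returns true, otherwise (?!) forces
        # failure. Everywhere else, the user's regex is tried.
        return qr[
            (?: \Z (?(?{ $extend->() })|(?!) }) )
            |
            $re_user
        ]x;
    }

    my $buf = 'abcdefghijklmnopqrstuvwxyz';
    my $c = 'A';

    # Test extender: appends 100 copies of the current letter, and returns
    # false once $c has rolled over from 'Z' to 'AA'.
    sub extend{ $buf .= ($c++) x 100; return length $c < 2 }

    my $re_stream = Re_Stream( qr[(..)(...)], \&extend );
    print $re_stream;

    my $i = 0;
    print "${ \++$i }: $1|$2" while $buf =~ m[$re_stream]g;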

The sub Re_Stream() takes a regex and a coderef. The regex can be any regex (in theory :), and the coderef should be a function that extends the stream beyond its current limit. This function should return true if it has extended the stream, and false if there is no more to come.
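To make the extender contract concrete, here is a minimal sketch of a coderef that pulls data from a filehandle rather than generating letters. The helper name make_extender, its arguments and the chunk size are assumptions for illustration, not part of the code above. Unlike the test extender in the post, it returns false only when nothing more could be appended.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the post): returns a closure that appends up
# to $chunk_size characters from $fh to the buffer referenced by $buf_ref.
sub make_extender {
    my ( $fh, $buf_ref, $chunk_size ) = @_;
    return sub {
        my $got = read( $fh, my $chunk, $chunk_size );
        return 0 unless $got;      # read() gives 0 at EOF and undef on error
        $$buf_ref .= $chunk;
        return 1;                  # the stream was extended
    };
}

# Usage: an in-memory filehandle stands in for a real stream.
my $data = 'abcdefghij';
open my $fh, '<', \$data or die "Cannot open in-memory handle: $!";

my $buf    = '';
my $extend = make_extender( $fh, \$buf, 4 );
my $calls  = 0;
$calls++ while $extend->();
print "buffer is '$buf' after $calls successful extensions\n";
```

Because the closure takes no arguments, it should be usable wherever Re_Stream() expects its coderef.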

As coded, the while loop running the regex will continue to match against the stream until the extender function returns false. I'm not sure if this is progress. The upside is that you no longer have to inspect the user's regex in order to work out where to insert the code block to extend the buffer. In fact, you don't have to modify the user regex at all. However, there are a few problems with it as it stands.
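The piece doing the work is the conditional construct (?(condition)yes|no) with a code block as its condition. Here is a small sketch of that idiom on its own, assuming a plain flag variable ($allow, a made-up name) in place of the extender call:

```perl
use strict;
use warnings;

my $allow;    # flag consulted by the code block each time the regex runs

# If the code block returns true, the (empty) yes-branch matches;
# otherwise the no-branch (?!) always fails, so the whole match fails.
my $gated = qr/ab(?(?{ $allow })|(?!))/;

$allow = 1;
print "allowed: ", ( 'abc' =~ $gated ? 'match' : 'no match' ), "\n";

$allow = 0;
print "blocked: ", ( 'abc' =~ $gated ? 'match' : 'no match' ), "\n";
```

The code block here is written literally in the pattern, so this sketch should not need use re 'eval'. That pragma comes into play when a pattern containing code blocks is interpolated at run time.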

  1. If the match crosses the boundary of the buffer being extended, a null match is returned.
  2. Any attempt I made to pre-truncate the string, i.e. to discard some part of the front of the string that had already been processed, seemed to "confuse" the regex.
  3. As is, it requires use re 'eval'; which may or may not be a problem.
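For comparison, here is a rough sketch of a different technique that sidesteps the problems listed above by keeping the buffer changes outside the regex engine. It is not a fix for Re_Stream(). When a match attempt fails, the loop discards the consumed prefix, asks for more data and retries. The name match_stream and its calling convention (the extender receives a reference to the buffer) are assumptions for illustration. It also assumes a pattern that never matches the empty string and cannot match "too early" at the buffer edge. A pattern like \d+ could still be cut short there.

```perl
use strict;
use warnings;

# Hypothetical helper, not from the original post.
#   $re       - compiled regex (qr//) to apply repeatedly
#   $extend   - coderef given a reference to the buffer; appends data and
#               returns true, or returns false when the stream is exhausted
#   $on_match - coderef called with the capture groups of each match
sub match_stream {
    my ( $re, $extend, $on_match ) = @_;
    my $buf = '';
    my $pos = 0;                        # where the next attempt starts
    while (1) {
        pos($buf) = $pos;               # set explicitly before every attempt
        if ( $buf =~ /$re/g ) {
            die "pattern matched the empty string" if $+[0] == $-[0];
            # Copy the captures out before calling any other code.
            my @caps = map {
                defined $-[$_] ? substr( $buf, $-[$_], $+[$_] - $-[$_] ) : undef
            } 1 .. $#+;
            $pos = pos($buf);
            $on_match->(@caps);
        }
        else {
            substr( $buf, 0, $pos, '' );    # drop what was already consumed
            $pos = 0;
            $extend->( \$buf ) or last;     # no more data: stop
        }
    }
    return;
}

# Usage: feed the stream in 4-character chunks.
my @chunks = ( 'abcd', 'efgh', 'ijkl' );
my @found;
match_stream(
    qr/(..)(...)/,
    sub { my $buf_ref = shift; return 0 unless @chunks; $$buf_ref .= shift @chunks; return 1 },
    sub { push @found, join '|', @_ },
);
print "$_\n" for @found;
```

The trade-off is that a failed attempt is retried from the same position after each extension, so some text may be scanned more than once.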

I've only made a half-hearted attempt at fixing these so far, but thought that I would throw it open to see if anyone else can take it further, or dismiss it as unworkable.


Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"Think for yourself!" - Abigail
Hooray!
