
Re^2: Why are 5.10's named captures read only?

by blazar (Canon)
on Oct 20, 2008 at 12:45 UTC ( [id://718207]=note )


in reply to Re: Why are 5.10's named captures read only?
in thread Why are 5.10's named captures read only?

First on why numbered captures are read-only. I'm not quite sure about the reason, but I think it's a performance issue. Captures aren't actually stored as copies, but as indexes into the string. This saves a lot of copying, and gives better performance.

I personally believe this pretty much explains it all, and in a reasonable and perfectly acceptable manner. I still think that one could perhaps retain the performance while allowing modification of either numbered or named captures through "copy-on-modification": a capture would stay a pair of offsets into the string until someone assigns to it, and only then be turned into a real copy. (But then I admit I don't have the slightest idea of the difficulties that may arise in the actual implementation, so apologies in advance to those who hack down there, should they find it offensive that I put it in such simple terms...)
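A minimal sketch of what is being discussed, using only the documented @- and @+ offset arrays and %+. The sample string and variable names are made up for illustration, and the explicit copy at the end is the manual stand-in for the copy-on-modification idea, not something perl does by itself:

```perl
use strict;
use warnings;

my $str = "named captures";
$str =~ /(?<first>\w+)\s+(?<second>\w+)/ or die "no match";

# @- and @+ hold the start and end offsets of each group in $str,
# so the text of $1 can be rebuilt from them with substr.
my $from_offsets = substr($str, $-[1], $+[1] - $-[1]);
print "rebuilt from offsets: $from_offsets\n";

# Assigning to a capture dies at run time, so trap it with eval.
my $numbered_ok = eval { $1 = "x"; 1 };
my $named_ok    = eval { $+{first} = "x"; 1 };
print "numbered capture writable? ", ($numbered_ok ? "yes" : "no"), "\n";
print "named capture writable? ",    ($named_ok    ? "yes" : "no"), "\n";

# An explicit copy is an ordinary hash and can be changed freely.
my %copy = %+;
$copy{first} = uc $copy{first};
print "modified copy: $copy{first} $copy{second}\n";
```

The copy costs exactly what the read-only design avoids, but only for the matches where you ask for it.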

That reasoning I don't understand. How does using named captures give you more control? If numbered captures can be clobbered, how come named captures don't?

Well, I must confess that it's part of a thinko. I was reasoning mostly in the context of regexp-like operations in the substitution part of an s/// operator, which admittedly is not something you do every day. Yesterday I posted an example which would in fact be more of a counter-example. If I had a modifiable %+, then the code there would become:

    s/ ^ \b (?<head> [ \w \s \[ \] ]+ \s+ \( ) $
       (?<body> .*? ^\)$ )
     / $+{body} =~ s|QUALIFIED|| unless $+{head} ~~ m|^\w+?clk\[\d\]|;
       $+{head} . $+{body}
     /gemsx;
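For comparison, a sketch of how the same substitution can be written against the read-only %+ that 5.10 actually has: copy the two captures into lexicals before doing anything else in the replacement part, then work on the copies. The sample $text is invented for illustration, and =~ is used in place of ~~:

```perl
use strict;
use warnings;

# Made-up input: two blocks, only the first has a "clk[N]" head.
my $text = <<'END';
sys_clk[0] (
  QUALIFIED a
)
data bus (
  QUALIFIED b
)
END

$text =~ s{
    ^ \b (?<head> [ \w \s \[ \] ]+ \s+ \( ) $
    (?<body> .*? ^\)$ )
}{
    # Copy first: the inner matches below replace the contents of %+.
    my $head = $+{head};
    my $body = $+{body};
    $body =~ s/QUALIFIED// unless $head =~ /^\w+?clk\[\d\]/;
    $head . $body;
}gemsx;

print $text;
```

With this input the QUALIFIED in the sys_clk[0] block is kept and the one in the other block is removed.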

It's clear that in my s/// with the modifiable %+ I also naively expect %+ not to be clobbered by the matches inside the replacement part, which would hardly be the case, since as you say named captures are nothing but other names for numbered ones! (I still think it would be a nice thing if that code could work as expected.) Indeed, from the UI POV, if %+ were retained across match operations, then you would have more control. In fact you may want to do... [/me's thinkering of some not too convoluted example...]

    doit if $x ~~ / (?<x1> \w+)\s+(?<x2> \w+) /x
        and $y ~~ / (?<y1> \w+)\s+(?<y2> \w+) /x
        and $+{x1} . $+{y2} eq $+{y1} . $+{x2};

(Please, do not point out OWTDI!) The point here is that if I didn't have named captures, necessarily the second match's $1 and $2 would clobber the first ones'. Now, this was a thinko because both numbered captures and %+ (its implementation's details apart) must be reset or else the latter would grow indefinitely across the program...
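The resetting can be seen directly. A small sketch with made-up strings, using =~ rather than ~~, showing that after the second successful match both $1 and %+ describe that match only:

```perl
use strict;
use warnings;

my ($x, $y) = ("foo bar", "baz qux");

$x =~ /(?<x1>\w+)\s+(?<x2>\w+)/ or die "no match on \$x";
my $first_before = $1;         # capture 1 of the first match
my $x1_before    = $+{x1};     # same text, reached by name

$y =~ /(?<y1>\w+)\s+(?<y2>\w+)/ or die "no match on \$y";

# Both views now belong to the second match.
my $first_after = $1;
my $x1_survives = defined $+{x1} ? 1 : 0;

print "before: \$1=$first_before, x1=$x1_before\n";
print "after:  \$1=$first_after, x1 still defined? $x1_survives\n";
```

So the names x1 and x2 are no more durable than $1 and $2: once the match on $y succeeds they are simply gone from %+.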


Re the last point of the previous paragraph, one crazy idea I'm having now is that occasionally it would be nice to have that behaviour, as in the previous code example, and that it may be triggered by a lexical %+, by analogy with the new lexical $_. Thus

    {
        my %+;
        doit if $x ~~ / (?<x1> \w+)\s+(?<x2> \w+) /x
            and $y ~~ / (?<y1> \w+)\s+(?<y2> \w+) /x
            and $+{x1} . $+{y2} eq $+{y1} . $+{x2};
    }

would do what I mean, and restore the "normal" behaviour upon exiting the lexical scope. How 'bout this idea?
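As far as I know, my %+ is not something perl accepts, so the block above stays a wish. A sketch of how the same test can be approximated today by snapshotting %+ into ordinary lexical hashes; named_captures is a hypothetical helper written for this example, not an existing function, and the input strings are invented:

```perl
use strict;
use warnings;

# Hypothetical helper: match $str against $re and return a reference
# to a plain-hash copy of %+ on success, or nothing on failure.
sub named_captures {
    my ($str, $re) = @_;
    return unless $str =~ $re;
    my %copy = %+;
    return \%copy;
}

my ($x, $y) = ("ab cd", "ab cd");
my $pair = qr/ (?<w1> \w+) \s+ (?<w2> \w+) /x;

# Each snapshot lives until its lexical goes out of scope,
# regardless of what later matches do to %+.
my $cx = named_captures($x, $pair);
my $cy = named_captures($y, $pair);

my $same = $cx && $cy
    && $cx->{w1} . $cy->{w2} eq $cy->{w1} . $cx->{w2};

print $same ? "doit\n" : "skip\n";
```

The lexical scope of $cx and $cy plays the role the lexical %+ would have played: the captures are retained for as long as the block needs them and no longer.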

--
If you can't understand the incipit, then please check the IPB Campaign.
