Lexical %+ %- and more?

This is basically a repost of the last part of a reply of mine which apparently went unnoticed and in any case didn't get any answer; indeed, it went slightly OT WRT the main thread.

So we have named captures, and we haven't had them for long, but I am already asking for them not to be read-only: what people tell me, basically, is that there are implementation details that force them to be read-only. Now I go one step further, so this may well be taken as sci-fi, but nevertheless I think it exposes an interesting idea: from 5.10 onward we have a lexical $_, so I wonder whether we could have lexical %+ and %- such that:

their entries would not be reset across matches until the end of the lexical scope they live in;
(they would be modifiable - I still insist on that!)

Thus one may have the following example (which explains the whole thing better than many abstract descriptions...) working as naively expected:

{
    my %+;
    doit if $x ~~ / (?<x1> \w+)\s+(?<x2> \w+) /x and
            $y ~~ / (?<y1> \w+)\s+(?<y2> \w+) /x and
            $+{x1} . $+{y2} eq $+{y1} . $+{x2};
}

Assuming e.g.:

$x = 'fo  ar!';
$y = '?foob obar  ar';

at the end of the scope, if I printed Data::Dumper's Dumper \%+ I would get

$VAR1 = {
          'y1' => 'foob',
          'x2' => 'ar',
          'y2' => 'obar',
          'x1' => 'fo'
        };
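For comparison, here is a minimal sketch of how the same accumulation can be approximated in current Perl 5 by copying %+ into an ordinary lexical hash after each successful match. The name %seen is hypothetical and only stands in for the proposed lexical %+; this illustrates the intended semantics and is not the proposed feature itself.

```perl
use strict;
use warnings;
use Data::Dumper;

my $x = 'fo  ar!';
my $y = '?foob obar  ar';

# %seen is a hypothetical stand-in for the proposed lexical %+:
# named captures are merged into it after each successful match.
my %seen;

my $ok = $x =~ / (?<x1> \w+)\s+(?<x2> \w+) /x
      && do { %seen = (%seen, %+); 1 }
      && $y =~ / (?<y1> \w+)\s+(?<y2> \w+) /x
      && do { %seen = (%seen, %+); 1 }
      && $seen{x1} . $seen{y2} eq $seen{y1} . $seen{x2};

$Data::Dumper::Sortkeys = 1;
print "doit would run\n" if $ok;
print Dumper \%seen;
```

Being a plain hash, %seen is also modifiable, which is the other half of the wish; the price is the explicit copy after every match.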

Please don't point out that wrt the example above there are tons of other WTDI: it's obvious that there are - we're talking about Perl anyway! I just think we could have one more, and with a very clear syntax too. Also, the idea sprang up in the context of that other thread dealing with %+ and %-, but there may be other special variables that would allow a lexical incarnation with modified semantics associated with it.

--
~~If you can't understand the incipit, then please check the IPB Campaign.~~


Re: Lexical %+ %- and more? by TimToady (Parson) on Oct 22, 2008 at 17:56 UTC
The original design of `@+` and `@-` was a complete botch, and 5.10 extends that botch to the use of hashes. Perl 5 should move toward the Perl 6 model of a single lexical variable containing all the information from the last match, and then any variables like `$1` are just aliases into that structure. Parallel global arrays and hashes are madness, even if I could keep straight which one is the beginning and which one is the end, which I can't. And parallel hashes force you to do the hash lookup twice. Madness...
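As a reminder of which parallel variable is which, here is a small sketch in current Perl 5. The %match hash at the end is a hypothetical illustration of bundling everything from one match into a single lexical structure; it is a plain snapshot, not a built-in feature and not the Perl 6 match object.

```perl
use strict;
use warnings;

my $s = 'hello world';
$s =~ /(?<first>\w+)\s+(?<second>\w+)/ or die "no match";

# @- holds the START offsets and @+ the END offsets; index 0 is the
# whole match, index N is capture group N.
my $second = substr $s, $-[2], $+[2] - $-[2];

# %+ gives one value per name, %- an array ref of all groups with that
# name, so consulting both means two separate hash lookups.
my $first_plus  = $+{first};
my $first_minus = $-{first}[0];

# Hypothetical single structure: a lexical snapshot that survives
# later matches.
my %match = (
    start => [@-],
    end   => [@+],
    named => { %+ },
);

print "$second $first_plus $first_minus\n";
print "group 2 spans $match{start}[2]..$match{end}[2]\n";
```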
Re: Lexical %+ %- and more? by JavaFan (Canon) on Oct 22, 2008 at 11:41 UTC
my %+;
my %-;
'foo' =~ /(?<w>\w+)/;
'bar' =~ /(?<w>\w+)/;
use YAML;
print Dump \%+;
print Dump \%-;
__END__

What should that print? What if the last match was `'--' =~ /(?<w>\w+)/`? What if `%+` is lexical, but `%-` isn't?

And if lexical `%-` and `%+` works as you want, should this work as well?

'foo' =~ /(?<w>\w+)/ && 'foofoo' =~ /\g{w}\g{w}/;

But that begs the question, what about:

'oo' =~ /(?<w>\w+)/ && 'oo' =~ /\g{w}\g{w}/;
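For reference, a sketch of what the existing global %+ and %- hold in the same situation, without the `my` declarations (which current Perl 5 does not accept for punctuation variables). It uses Data::Dumper instead of YAML only to stay within core modules; the copies %plus and %minus are hypothetical names introduced for the illustration.

```perl
use strict;
use warnings;
use Data::Dumper;
$Data::Dumper::Sortkeys = 1;

my $m1 = 'foo' =~ /(?<w>\w+)/;
my $m2 = 'bar' =~ /(?<w>\w+)/;

# Snapshot the tied globals into plain hashes.
my %plus  = %+;                                     # only the last match
my %minus = map { $_ => [ @{ $-{$_} } ] } keys %-;  # one entry per group named w

print Dumper \%plus, \%minus;

# A failed match leaves the captures of the last successful one in place.
my $failed        = '--' =~ /(?<w>\w+)/;
my $after_failure = $+{w};
print "after failed match: $after_failure\n";
```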
Re^2: Lexical %+ %- and more? by blazar (Canon) on Oct 22, 2008 at 13:38 UTC
"What should that print?"

I personally believe:

---
w: bar
---
w:
  - bar
  - bar

The same "variable" is used and thus it is natural for it to be clobbered: if I didn't want that, then I would have used a different one, especially since "now" it is so easy, whereas it wouldn't be an option were it only for numbered captures.

"What if the last match was `'--' =~ /(?<w>\w+)/`?"

I beg your pardon, but... I don't see the difference! Maybe I'm just tired...

"What if `%+` is lexical, but `%-` isn't?"

Well, they should behave independently, although of course this would be very inconsistent if one needs both. (But I bet some hacker would find a cool way to exploit it for something weird and insane! ;)

"And if lexical %- and %+ works as you want, should this work as well?"

'foo' =~ /(?<w>\w+)/ && 'foofoo' =~ /\g{w}\g{w}/;

I don't see any reason why it shouldn't.

"But that begs the question, what about:"

'oo' =~ /(?<w>\w+)/ && 'oo' =~ /\g{w}\g{w}/;

Well, this should plainly fail. I think you're asking me what should become of `%+` and `%-` after this, right? Well: no named captures are attempted in the second match, so they should stay like:

%+ = ( w => 'oo');
%- = ( w => ['oo']);

But if it were

'oo' =~ /(?<w>\w+)/ && 'oo' =~ /(?<w>\g{w}\g{w})/; # Which I think is possible!

then they would become

%+ = ( w => undef); # or not existing at all? I'm half hearted...
%- = ( w => []);

--
If you can't understand the incipit, then please check the IPB Campaign.
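Current Perl 5 resolves `\g{w}` only against groups in the same pattern, so the cross-match backreference discussed above has to be spelled out by hand. A minimal sketch, assuming a hypothetical helper doubled_match that copies the capture and interpolates it into the second pattern:

```perl
use strict;
use warnings;

# Hypothetical helper: does $target contain the first word of $source
# twice in a row? Emulates a \g{w} that refers to an earlier match.
sub doubled_match {
    my ($source, $target) = @_;
    return 0 unless $source =~ /(?<w>\w+)/;
    my $w = $+{w};    # copy before the next match clobbers %+
    return $target =~ /\Q$w\E\Q$w\E/ ? 1 : 0;
}

my $r1 = doubled_match('foo', 'foofoo');   # looks for 'foofoo' in 'foofoo'
my $r2 = doubled_match('oo',  'oo');       # looks for 'oooo' in 'oo'
print "$r1 $r2\n";
```

The second call fails for the reason given above: the doubled capture is longer than the string it is matched against.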

