http://qs321.pair.com?node_id=62419


in reply to Perl is psychic?!

Excellent question!

Since everybody else seems to have missed your (subtle) point by quoting irrelevant documentation that you clearly understood in great detail, allow me to repeat it. Perl is supposed to have an important optimization: if you never use $&, $`, or $' anywhere in your script, Perl is not supposed to calculate them, ever. This matters because it makes matches against long strings an order of magnitude faster. If you ever use them, they are calculated for every match from then on. Caveat programmer. (I don't use them, ever. I wish I could make attempting to use them optionally fatal just to smoke out people who use them, but I can't.)
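For anyone wondering how to get by without them, here is a minimal sketch that recovers the same three pieces from the match offsets in @- and @+. It assumes a Perl that has those arrays (5.6 and later), and the helper name split_around_match is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (prematch, match, postmatch) for the first
# match of $re in $str, without ever mentioning $`, $& or $'.
# Returns an empty list if the pattern does not match.
sub split_around_match {
    my ($str, $re) = @_;
    return unless $str =~ $re;

    # $-[0] is the offset where the whole match starts, $+[0] where it ends.
    my ($start, $end) = ($-[0], $+[0]);
    return (
        substr($str, 0, $start),              # what $` would hold
        substr($str, $start, $end - $start),  # what $& would hold
        substr($str, $end),                   # what $' would hold
    );
}

my ($pre, $match, $post) = split_around_match('string', qr/ri/);
print "pre=$pre match=$match post=$post\n";   # pre=st match=ri post=ng
```

Since the script never names the three special variables, it should not switch on the copying for any other match in the program.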

With this optimization there should be no way that the above code will work since when you do the match, Perl is dealing with a script that has no $&, $`, or $' in it. And so when it goes to display the answer, the necessary data should not exist yet. But you run it and it does.

For the record I ran it under 5.004, and got the output that you describe. I ran it under 5.005 and got no output at all as you would expect. I ran it under a slightly modified 5.6 and got a segmentation fault. (Not good, but in this case understandable.) A slight modification of your code to test $' and $` had similar results. With 5.005 when I look at perldelta I see that there were a number of changes to the RE engine including the following:

Changes in Perl code using RE engine:
    More optimizations to s/longer/short/;
    study() was not working;
    /blah/ may be optimized to an analogue of index() if $& $` $' not seen;
    Unneeded copying of matched-against string removed;
    Only matched part of the string is copying if $` $' were not seen;

The last 2 items sound like the behaviour fix. I guess that the optimization wasn't really being done in 5.004, or it was done but not done as fully as it was done later.

For the record I was seriously impressed with Ruby's optimization for this case. What they did is lazily calculate $&, $', and $` as needed. You only pay on the matches where you use those, or in cases where you try to modify, in place, a string that you matched against before you go on to match again. Don't use them in one place, and you pay no price there even if you use them elsewhere. I tried, but couldn't find a way to break it. I suspect that this approach (which is much cleaner) would be harder to do in Perl. Still it was a nice surprise...

UPDATE
This seems to be very, very specific to the code. I actually assumed I knew what should happen and wanted to check $` and $' as well, so I changed the code to

'string' =~ /ri/; print eval <STDIN>;
for my tests. As confirmed on several platforms in chatter, the behaviour switches between versions of Perl. But the original code snippet always seems to work, and I have not a clue how or why.
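If you want to repeat the test without typing at STDIN, here is a minimal self-contained sketch. It assumes that a string eval of a literal behaves the same way as eval <STDIN> for this purpose (I have not verified that on every version), and the sub name probe_match_vars is made up. The three variables appear only inside single-quoted strings, so the compiler has not seen them when the match runs.

```perl
use strict;
use warnings;

# Hypothetical probe: do the match first, then ask for $`, $& and $'
# only through string eval, so none of them is compiled before the match.
sub probe_match_vars {
    'string' =~ /ri/;

    my $pre   = eval '$`';
    my $match = eval '$&';
    my $post  = eval q{$'};

    # Show undefined results explicitly instead of warning about them.
    return map { defined $_ ? $_ : '(undef)' } ($pre, $match, $post);
}

print join('|', probe_match_vars()), "\n";
```

What this prints depends on the version of Perl, which is the whole point of the exercise.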