PerlMonks  

Re: Substitution with regex and memory consumption

by dave_the_m (Monsignor)
on Feb 29, 2020 at 20:27 UTC ( [id://11113586]=note )


in reply to Substitution with regex and memory consumption

In general, after a successful match (or the match part of a substitution), the regex engine keeps a copy of the original string so that it can dynamically generate values for $1, $2, $&, $` etc. on demand. This copy needs to be kept for at least as long as the surrounding scope, i.e. the scope of $1 etc. The details are far more complex, but internally perl's Copy-on-Write mechanism often (but not always) avoids having to do a real copy. Even so, the result isn't always the most efficient use of memory.
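To illustrate why the original string has to be retained, here is a minimal sketch. The helper name `capture_after_subst` and the sample text are made up for the example. It shows that the match variables still describe the pre-substitution text after s/// has already modified the target in place.

```perl
use strict;
use warnings;

# Hypothetical helper, for illustration only: run a substitution and
# report what the match variables hold afterwards.
sub capture_after_subst {
    my ($text) = @_;

    # The target string is modified in place here...
    $text =~ s/(\w+)=(\w+)/[redacted]/;

    # ...yet $1, $2 and $& still refer to the text as it was before the
    # change, which is only possible if the engine kept the original around.
    my ($first, $second, $matched) = ($1, $2, $&);
    return ($text, $first, $second, $matched);
}

my ($new, $key, $value, $whole) = capture_after_subst('set key=value now');
print "new=$new key=$key value=$value whole=$whole\n";
```

The captures are copied into lexicals straight away; as noted above, the values of $1 etc. are generated on demand and are tied to the enclosing scope.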

Dave.

Re^2: Substitution with regex and memory consumption
by k-mx (Scribe) on Mar 01, 2020 at 09:19 UTC

    Okay, thank you! Some assumptions; please correct me if I'm wrong:

    1. Prior to 5.18.0, s/// copies the original only if one of $&, $`, $' has been seen. The interpreter sets the global PL_sawampersand flag, which can't be cleared later. m// is also affected by this flag.
    2. Between 5.18.0 and 5.20.0, Perl can track usage of these variables separately and copy only the requested part of the string.
    3. From Perl 5.20.0 on, a successful s/// match always changes the string, so the COW mechanism always has to copy the original.

    So, before 5.20 we had a choice: avoid $&, $`, $' and use the /p modifier to explicitly request the copy for the ${^*MATCH} variables (${^PREMATCH}, ${^MATCH}, ${^POSTMATCH}). Now we can use $&, $`, $' freely and m// doesn't suffer from PL_sawampersand anymore, but s/// will always copy the original string regardless of the PL_sawampersand state, and there is nothing we can do about that.
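For reference, the /p approach mentioned above can be sketched as follows. The sample string and variable names are invented for the example. On perls before 5.20 the /p modifier is what makes these variables available; on 5.20 and later, per perlvar, /p does nothing and the variables behave like $`, $& and $'.

```perl
use strict;
use warnings;

my $line = 'foo=bar;baz';   # made-up sample input
my @parts;

# /p asks the engine to preserve the matched string for this match only,
# without touching the global $&, $`, $' variables.
if ($line =~ /=\w+/p) {
    @parts = (${^PREMATCH}, ${^MATCH}, ${^POSTMATCH});
}

print join('|', @parts), "\n";   # prints foo|=bar|;baz
```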

      That's roughly it, yes.

      Dave.

        Isn't this behavior incorrect? Why must we copy the variable on each substitution?

        I think we could respect PL_sawampersand and the /p flag for substitution, and keep only one copy of the string (the last one) in the current scope.
