PerlMonks  

Re: Memory leak in unicode substitution

by vr (Curate)
on Aug 30, 2019 at 09:09 UTC ( [id://11105298]=note )


in reply to Memory leak in unicode substitution

"x" =~ / [\x{1234}] /x for 0 .. 100_000;
"x" =~ /(?: \x{1234} | \x{1234} )/x for 0 .. 100_000;
"\x{4321}" =~ / \x{1234} /x for 0 .. 100_000;

Curiously, the bug doesn't bite if the character is put in a class or a dummy alternation. Most importantly, there's no bug if the target string is itself UTF-8. That's why, I think, it wasn't found sooner: Unicode in regexes most often means Unicode in the text being matched, too.
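The difference between the first two one-liners and the third comes down to how the target string is stored internally. A minimal sketch for checking that, assuming nothing beyond the core utf8:: functions: flag_state is a hypothetical helper, and the utf8::upgrade step is only an assumption about what might sidestep the problem on an affected perl, not something confirmed in this thread.

```perl
use strict;
use warnings;

# Hypothetical helper: report whether a string carries Perl's internal UTF-8 flag.
sub flag_state { return utf8::is_utf8($_[0]) ? 'utf8' : 'bytes' }

my $plain = "x";          # ASCII-only literal, like the first two one-liners
my $wide  = "\x{4321}";   # code point above 0xFF, like the third one-liner

print "plain:    ", flag_state($plain), "\n";
print "wide:     ", flag_state($wide),  "\n";

# Assumption: upgrading an ASCII target gives it the same internal
# representation as the "target is utf8 itself" case.
my $upgraded = $plain;
utf8::upgrade($upgraded);
print "upgraded: ", flag_state($upgraded), "\n";

# Same bare pattern as the third one-liner, run against the upgraded target.
$upgraded =~ / \x{1234} /x for 0 .. 100_000;
```

Watching the process size (with top or similar) while varying the target is left to the reader, since memory reporting is platform-specific.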

Replies are listed 'Best First'.
Re^2: Memory leak in unicode substitution
by holli (Abbot) on Aug 30, 2019 at 17:06 UTC
    Good catch, can confirm. Surprisingly, simply adding use utf8; doesn't fix it, though. Aren't all string literals supposed to be Unicode when utf8 is in effect?


    holli

    You can lead your users to water, but alas, you cannot drown them.
      No. use utf8; just means that UTF-8 is used in the source code (in both string literals and identifiers). But even use feature qw( unicode_strings ); doesn't help here.

      map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]
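The point made in the last reply can be checked directly. A small sketch, with is_flagged as a hypothetical helper: under both pragmas, an ASCII-only literal should still come out without the internal UTF-8 flag, since use utf8; only changes how the source text is decoded and unicode_strings only changes semantics, not storage.

```perl
use strict;
use warnings;
use utf8;                            # source code is UTF-8
use feature qw( unicode_strings );   # Unicode semantics for string operations

# Hypothetical helper: 1 if the string carries the internal UTF-8 flag, else 0.
sub is_flagged { return utf8::is_utf8($_[0]) ? 1 : 0 }

my $ascii = "x";          # ASCII-only literal
my $wide  = "\x{1234}";   # literal with a code point above 0xFF

printf "ascii literal flagged: %d\n", is_flagged($ascii);
printf "wide literal flagged:  %d\n", is_flagged($wide);
```

So neither pragma turns "x" into the kind of target string that avoids the bug.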
