
Hi All,

I've recently written some parser code and today discovered what appears to be a memory leak somewhere in the code. Debugging and stripping down the code, I've managed to construct a toy example that illustrates the problem. Here's the code:

#!/usr/bin/perl
use strict;
use warnings;

open(my $fh, '<', 'EXAMPLE.TXT') or die "Cannot open EXAMPLE.TXT: $!";

my $regexp = qr/(?<value1>\d+)\s+(?<value2>\d+)/;

while(<$fh>)
{
    next unless /$regexp/;
    my $value1 = $+{value1};
    my $value2 = $+{value2};
    print "GOT $value1 $value2\n";
}

The code simply uses two named capture buffers in a regexp to parse out numeric values.

'EXAMPLE.TXT' is just a text file with a pair of numbers on each line. I used:

1 2
3 4
5 6
7 8
...

And so on, for about 100,000 lines, though it doesn't really have to be that long.

I'm working in ActiveState Perl v5.10.0 on WinXP x86, and using the Task Manager to observe how much memory perl uses as it parses the file. Memory usage increases steadily until the script finishes. For this toy example it's not much of an issue, but in my actual project it gets out of hand rather fast.

I've noticed that switching over to $1 and $2 rather than $+{value1} and $+{value2} eliminates the problem, but I prefer using the named capture buffers for clarity as things get big & hairy.
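For reference, here is a minimal sketch of that workaround. It keeps the named groups in the pattern for readability but reads the captures through $1 and $2, copying them into named lexicals right away. The helper name parse_pair and the in-memory sample lines are only for illustration; they are not part of my real parser.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Same pattern as above. Named groups are still numbered,
# so $1 and $2 refer to value1 and value2 respectively.
my $regexp = qr/(?<value1>\d+)\s+(?<value2>\d+)/;

# parse_pair is a hypothetical helper used only for this sketch.
# It returns the two captured numbers, or an empty list if the
# line does not match. %+ is never read.
sub parse_pair {
    my ($line) = @_;
    return unless $line =~ $regexp;
    return ($1, $2);
}

# In-memory stand-in for the lines of EXAMPLE.TXT (assumed data).
for my $line ("1 2", "3 4", "no numbers here") {
    my ($value1, $value2) = parse_pair($line) or next;
    print "GOT $value1 $value2\n";
}
```

The named lexicals give back some of the clarity of %+, though the mapping from group number to name has to be kept in sync by hand if the pattern changes.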

My question is...why? I was assuming that the my-scoped variables within the loop would go out of scope each iteration and free up any references to %+'s elements. I'm aware that %+ is a tied hash, but am not familiar enough with the details of tied hashes to figure out what's going wrong.

Thanks
-Maph

In reply to Memory Leaks and %+ by MaphsterB
