I ran into a gotcha the other day; I noticed data "laundered" through a regex suddenly took more space:
use Test::More "no_plan";
use Devel::Size "total_size";
my $key = "aa";
my $val = "a00";
my %hash1;
my %hash2;
while (length($key) == 2) {
    $hash1{$key} = $val;
    "$key$val" =~ /(..)(...)/ and $hash2{$1} = $2;
    ++$key;
    ++$val;
}
is(keys(%hash1), keys(%hash2), "same number of keys");
is_deeply(\%hash1, \%hash2, "is_deeply same");
is(total_size(\%hash1), total_size(\%hash2));
__END__
ok 1 - same number of keys
ok 2 - is_deeply same
not ok 3
# Failed test (sizer.pl at line 18)
# got: '39316'
# expected: '58244'
1..3
# Looks like you failed 1 test of 3.
and had to thump myself with a cluestick when I realized why. $2 is a magic variable that fetches the second capture group from the last successful match in scope. That magic comes at a cost in storage: as vaguely shown in
http://search.cpan.org/perldoc/B#SV-RELATED_CLASSES, a magic variable (a PVMG or subclass thereof) has fields to store not only a string but also an integer and a floating point value, plus a pointer to the list of magic that applies to the variable. When perl does an assignment, the target scalar is upgraded so that it can store at least as much as the source scalar could, whether or not it actually needs to. So in my example all the values in %hash2 are PVMGs too, though only their PV (string) fields are actually used.
Changing $2 to "$2" in the assignment copies from a plain temporary string instead of the magic variable, which makes the hash values simple PVs and makes all tests pass.
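If you'd rather look at the scalar types directly than infer them from sizes, here is a minimal sketch using the core B module. The helper name sv_class and the two hash names are made up for illustration, and I'm not claiming any particular output, since the internal type you get may differ between perl versions.

```perl
use strict;
use warnings;
use B qw(svref_2object);

# Hypothetical helper: return the B class name (for example "B::PV" or
# "B::PVMG") that describes the internal type of the referenced scalar.
sub sv_class {
    my ($ref) = @_;
    return ref svref_2object($ref);
}

my (%direct, %stringified);
if ("aaa00" =~ /(..)(...)/) {
    $direct{$1}      = $2;      # copied straight from the magic variable
    $stringified{$1} = "$2";    # copied from a plain interpolated string
}

# Print the internal class of each stored value for comparison.
printf "direct:      %s\n", sv_class(\$direct{aa});
printf "stringified: %s\n", sv_class(\$stringified{aa});
```

Both hashes hold the same string, so any difference in the reported class comes from how the value was assigned rather than from what it contains.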