http://qs321.pair.com?node_id=1217269

pudge has asked for the wisdom of the Perl Monks concerning the following question:

Long story short: we want to mark strings so that later we can do something with them, even if they get embedded in other strings. So we figured, hey, let's try overloading. It is pretty neat. I can do something like:
    my $str = str::new('<encode this later>');
    my $html = "<html>$str</html>";
    print $html;          # <html><encode this later></html>
    print $html->encode;  # <html>&lt;encode this later&gt;</html>
It does this by overloading the concatenation operator so that the result is a new array-based object holding the plain string "<html>", the object wrapping "<encode this later>", and the plain string "</html>". These can nest arbitrarily. On encode, it leaves the plain strings alone but encodes the object strings. If you stringify the object, it just spits everything out as plain strings.

This works well, except that in some cases it stringifies for no apparent reason. The script below shows the behavior, which I've reproduced on 5.10 through 5.22.
    #!/usr/bin/perl
    use strict;
    use warnings;
    use 5.010;
    use Data::Dumper; $Data::Dumper::Sortkeys=1;

    my $str1 = str::new('foo');
    my $str2 = str::new('bar');

    my $good1 = "$str1 $str2";
    my $good2;
    $good2 = $good1;
    my($good3, $good4);
    $good3 = "$str1 a";
    $good4 = "a $str1";

    my($bad1, $bad2, $bad3);
    $bad1 = "a $str1 a";
    $bad2 = "$str1 $str2";
    $bad3 = "a $str1 a $str2 a";

    say Dumper { GOOD => [$good1, $good2, $good3], BAD => [$bad1, $bad2, $bad3] };

    $bad1 = ''."a $str1 a";
    $bad2 = ''."$str1 $str2";
    $bad3 = ''."a $str1 a $str2 a";

    say Dumper { BAD_GOOD => [$bad1, $bad2, $bad3] };

    package str;
    use Data::Dumper; $Data::Dumper::Sortkeys=1;
    use strict;
    use warnings;
    use 5.010;
    use Scalar::Util 'reftype';
    use overload (
        '""' => \&stringify,
        '.'  => \&concat,
    );

    sub new {
        my($value) = @_;
        bless((ref $value ? $value : \$value), __PACKAGE__);
    }

    sub stringify {
        my($str) = @_;
        #say Dumper { stringify => \@_ };
        if (reftype($str) eq 'ARRAY') {
            return join '', @$str;
        } else {
            $$str;
        }
    }

    sub concat {
        my($s1, $s2, $inverted) = @_;
        #say Dumper { concat => \@_ };
        return new( $inverted ? [$s2, $s1] : [$s1, $s2] );
    }

    1;
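The script above never defines or calls encode. For reference, here is a minimal sketch of how the encode step described earlier might walk such a nested structure. Everything in it is an assumption for illustration: the package name MarkedStr stands in for str, the encode method and its three-character escaping are hypothetical, and the object is built by hand rather than through the overloaded operator so that the sketch does not depend on the behavior in question.

```perl
use strict;
use warnings;

package MarkedStr;    # hypothetical stand-in for the "str" package
use Scalar::Util qw(blessed reftype);

sub new {
    my ($value) = @_;
    return bless((ref $value ? $value : \$value), __PACKAGE__);
}

# Hypothetical encode: plain strings pass through untouched,
# wrapped leaf strings get HTML-escaped, arrays are walked recursively.
sub encode {
    my ($self) = @_;
    if (reftype($self) eq 'ARRAY') {
        my $out = '';
        for my $part (@$self) {
            $out .= (blessed($part) && $part->isa(__PACKAGE__))
                ? $part->encode
                : $part;
        }
        return $out;
    }
    my $text = $$self;
    $text =~ s/&/&amp;/g;    # escape & first so later entities stay intact
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

package main;

# Built by hand: the structure the concatenation overload is meant to produce.
my $html = MarkedStr::new([
    '<html>', MarkedStr::new('<encode this later>'), '</html>',
]);
print $html->encode, "\n";    # <html>&lt;encode this later&gt;</html>
```

Only the wrapped string is escaped here, which is why it matters that the concatenation result stays an object: once it has been flattened to a plain string, there is nothing left to tell the two kinds of text apart.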
I want all of the values above to be dumped as objects, not strings. But the "BAD" examples are all stringified. Every "BAD" example assigns the result of an interpolating concatenation to a variable that was declared earlier. If I declare the variable in the same statement, or concatenate the strings beforehand, or add an extra concatenation on top of the interpolated one, it works fine. This is nuts. The result of the script:
    $VAR1 = {
      'BAD' => [
        'a foo a',
        'foo bar',
        'a foo a bar a'
      ],
      'GOOD' => [
        bless( [
          bless( [
            bless( do{\(my $o = 'foo')}, 'str' ),
            ' '
          ], 'str' ),
          bless( do{\(my $o = 'bar')}, 'str' )
        ], 'str' ),
        $VAR1->{'GOOD'}[0],
        bless( [
          $VAR1->{'GOOD'}[0][0][0],
          ' a'
        ], 'str' )
      ]
    };

    $VAR1 = {
      'BAD_GOOD' => [
        bless( [
          '',
          bless( [
            bless( [
              'a ',
              bless( do{\(my $o = 'foo')}, 'str' )
            ], 'str' ),
            ' a'
          ], 'str' )
        ], 'str' ),
        bless( [
          '',
          bless( [
            bless( [
              $VAR1->{'BAD_GOOD'}[0][1][0][1],
              ' '
            ], 'str' ),
            bless( do{\(my $o = 'bar')}, 'str' )
          ], 'str' )
        ], 'str' ),
        bless( [
          '',
          bless( [
            bless( [
              bless( [
                bless( [
                  'a ',
                  $VAR1->{'BAD_GOOD'}[0][1][0][1]
                ], 'str' ),
                ' a '
              ], 'str' ),
              $VAR1->{'BAD_GOOD'}[1][1][1]
            ], 'str' ),
            ' a'
          ], 'str' )
        ], 'str' )
      ]
    };
The behavior makes no sense to me. I'd like to understand why it works this way, and I'd like to find a workaround.
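Besides the leading ''. trick shown in the script, one possible stopgap, assuming giving up string interpolation is acceptable, is to build the object with an explicit function call so that no concatenation operator is involved at all. The sketch below is only an illustration: the package name MarkedStr and the build helper are hypothetical and not part of the script, and it does not explain why the interpolated form misbehaves.

```perl
use strict;
use warnings;

package MarkedStr;    # hypothetical stand-in for the "str" package
use Scalar::Util 'reftype';
use overload '""' => \&stringify;

sub new {
    my ($value) = @_;
    return bless((ref $value ? $value : \$value), __PACKAGE__);
}

# Hypothetical helper: wrap a list of parts without using "." at all.
sub build {
    return new([@_]);
}

sub stringify {
    my ($str) = @_;
    return reftype($str) eq 'ARRAY' ? join('', @$str) : $$str;
}

package main;

my $str1 = MarkedStr::new('foo');

my $joined;                                      # declared first, assigned later
$joined = MarkedStr::build('a ', $str1, ' a');   # same shape as the "a $str1 a" case

print ref($joined), "\n";    # MarkedStr
print "$joined\n";           # a foo a
```

Because the parts are passed as an ordinary argument list, the result stays an object even when it is assigned to a previously declared variable. The cost is losing the convenience of writing "a $str1 a" directly.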