Long story short: we want to mark strings so that later we can do something with them, even if they get embedded in other strings.
So we figured, hey, let's try overloading. It is pretty neat. I can do something like:
my $str = str::new('<encode this later>');
my $html = "<html>$str</html>";
print $html; # <html><encode this later></html>
print $html->encode; # <html>&lt;encode this later&gt;</html>
It does this by overloading the concatenation operator so that each concatenation builds a new array-backed object holding the pieces: the plain string "<html>", the object wrapping "<encode this later>", and the plain string "</html>". These objects can nest arbitrarily. On encode, the plain strings are left alone and only the object strings are encoded. But if you stringify the object, it just spits everything out as one plain string.
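The script further down leaves out encode, so here is a minimal sketch of the whole idea as described above. The `_escape` helper and its HTML-entity escaping are assumptions for illustration only (the real encoding step could be anything), and `encode` is just one plausible way to walk the nested pieces.

```perl
#!/usr/bin/perl
use strict;
use warnings;

package str;
use Scalar::Util qw(blessed reftype);
use overload (
    '""' => \&stringify,
    '.'  => \&concat,
);

# Wrap a plain string (as a scalar ref) or an array ref of pieces.
sub new {
    my ($value) = @_;
    return bless((ref $value ? $value : \$value), __PACKAGE__);
}

# Keep both operands as separate pieces instead of joining them.
sub concat {
    my ($s1, $s2, $inverted) = @_;
    return new($inverted ? [$s2, $s1] : [$s1, $s2]);
}

# Flatten everything to one plain string, with no encoding.
sub stringify {
    my ($str) = @_;
    return reftype($str) eq 'ARRAY' ? join('', @$str) : $$str;
}

# Hypothetical helper: escape HTML-special characters (an assumption).
sub _escape {
    my ($text) = @_;
    my %entity = ('&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;');
    $text =~ s/([&<>"])/$entity{$1}/g;
    return $text;
}

# Encode wrapped strings, recurse into nested objects,
# and pass plain strings through untouched.
sub encode {
    my ($str) = @_;
    return _escape($$str) unless reftype($str) eq 'ARRAY';
    return join '', map { blessed($_) ? $_->encode : $_ } @$str;
}

package main;

my $str  = str::new('<encode this later>');
my $html = "<html>$str</html>";   # declared and assigned in one statement
print "$html\n";
print $html->encode, "\n";
```

As long as $html is still an object at that point, the first print shows the raw text and the second shows the wrapped part escaped while the surrounding tags stay as they are.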
This works well, except that in some cases the result gets stringified for no apparent reason. The script below shows the behavior, which I've reproduced on Perl 5.10 through 5.22.
#!/usr/bin/perl
use strict;
use warnings;
use 5.010;
use Data::Dumper; $Data::Dumper::Sortkeys=1;
my $str1 = str::new('foo');
my $str2 = str::new('bar');
my $good1 = "$str1 $str2";
my $good2;
$good2 = $good1;
my($good3, $good4);
$good3 = "$str1 a";
$good4 = "a $str1";
my($bad1, $bad2, $bad3);
$bad1 = "a $str1 a";
$bad2 = "$str1 $str2";
$bad3 = "a $str1 a $str2 a";
say Dumper { GOOD => [$good1, $good2, $good3], BAD => [$bad1, $bad2, $bad3] };
$bad1 = ''."a $str1 a";
$bad2 = ''."$str1 $str2";
$bad3 = ''."a $str1 a $str2 a";
say Dumper { BAD_GOOD => [$bad1, $bad2, $bad3] };
package str;
use Data::Dumper; $Data::Dumper::Sortkeys=1;
use strict;
use warnings;
use 5.010;
use Scalar::Util 'reftype';
use overload (
    '""' => \&stringify,
    '.' => \&concat,
);
sub new {
    my($value) = @_;
    bless((ref $value ? $value : \$value), __PACKAGE__);
}
sub stringify {
    my($str) = @_;
    #say Dumper { stringify => \@_ };
    if (reftype($str) eq 'ARRAY') {
        return join '', @$str;
    }
    else {
        $$str;
    }
}
sub concat {
    my($s1, $s2, $inverted) = @_;
    #say Dumper { concat => \@_ };
    return new( $inverted ? [$s2, $s1] : [$s1, $s2] );
}
1;
I want all of these to be dumped as objects, not strings. But the "BAD" examples are all stringified. Every "BAD" example interpolates an object into a string with more than one concatenation and assigns the result directly to a variable that was declared earlier. If I declare the variable in the same statement, or build the concatenation beforehand and assign that, or add an extra concatenation in front (beyond the ones from the interpolated string), then it works fine. A single concatenation, as in $good3, also survives.
This is nuts.
The result of the script:
$VAR1 = {
'BAD' => [
'a foo a',
'foo bar',
'a foo a bar a'
],
'GOOD' => [
bless( [
bless( [
bless( do{\(my $o = 'foo')}, 'str' ),
' '
], 'str' ),
bless( do{\(my $o = 'bar')}, 'str' )
], 'str' ),
$VAR1->{'GOOD'}[0],
bless( [
$VAR1->{'GOOD'}[0][0][0],
' a'
], 'str' )
]
};
$VAR1 = {
'BAD_GOOD' => [
bless( [
'',
bless( [
bless( [
'a ',
bless( do{\(my $o = 'foo')}, 'str' )
], 'str' ),
' a'
], 'str' )
], 'str' ),
bless( [
'',
bless( [
bless( [
$VAR1->{'BAD_GOOD'}[0][1][0][1],
' '
], 'str' ),
bless( do{\(my $o = 'bar')}, 'str' )
], 'str' )
], 'str' ),
bless( [
'',
bless( [
bless( [
bless( [
bless( [
'a ',
$VAR1->{'BAD_GOOD'}[0][1][0][1]
], 'str' ),
' a '
], 'str' ),
$VAR1->{'BAD_GOOD'}[1][1][1]
], 'str' ),
' a'
], 'str' )
], 'str' )
]
};
The behavior makes no sense to me. I'd like to understand why it works this way, and I'd like to find a workaround.