I did a bit of studying (DBM-Deep-1.0016).
- write_value uses class DBM::Deep::Engine::Sector::Scalar for everything but references and undef.
- ::Scalar::_init receives the value and passes it to print_at.
- print_at expects a string of bytes, but it's getting a string that contains characters above 255.
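To make the last point concrete, here is a minimal standalone sketch (no DBM::Deep involved) of the difference between a string of characters and a string of bytes. The helper is_byte_string is a hypothetical name used only for illustration; utf8::encode is the core function that converts characters to UTF-8 bytes in place.

```perl
use strict;
use warnings;

# A character string: "smile " plus U+263A, which is above 255.
my $chars = "smile \x{263A}";

# Hypothetical helper: true if every character fits in one byte,
# i.e. the string is safe to hand to a byte-oriented writer.
sub is_byte_string { return $_[0] !~ /[^\x00-\xFF]/ }

# Work on a copy, because utf8::encode modifies its argument in place.
my $bytes = $chars;
utf8::encode($bytes);

printf "chars: length %d, byte string? %s\n",
    length($chars), is_byte_string($chars) ? 'yes' : 'no';   # 7, no
printf "bytes: length %d, byte string? %s\n",
    length($bytes), is_byte_string($bytes) ? 'yes' : 'no';   # 9, yes
```

U+263A takes three bytes in UTF-8, which is why the encoded copy is two longer than the original.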
No encoding is done anywhere, as far as I've seen. Definitely a major bug. Two possible fixes:
- Have DBM::Deep::Engine::Sector::Scalar's _init encode values.
- Add another Sector type for strings with UTF8=1.
The latter should be simpler and more efficient, and it preserves the UTF8 flag. Basically, adjust write_value and add:
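package DBM::Deep::Engine::Sector::Unicode;

use 5.006_000;

use strict;
use warnings FATAL => 'all';
no warnings 'recursion';

use base qw( DBM::Deep::Engine::Sector::Scalar );

# New signature constant; it has to be defined in the engine
# alongside the existing SIG_* constants.
sub type { $_[0]{engine}->SIG_UNICODE }

# Encode the characters to UTF-8 bytes before the parent class
# writes them out.
sub _init {
    my $self = shift;

    utf8::encode( $self->{data} )
        if $] >= 5.008 && defined($self->{data});

    $self->SUPER::_init();
}

# Decode the stored UTF-8 bytes back to characters on the way out.
sub data {
    my $self = shift;

    my $data = $self->SUPER::data();

    utf8::decode( $data )
        if $] >= 5.008;

    return $data;
}

1;
__END__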
And that's just for values. A separate fix is needed for the keys, I believe.
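As a rough illustration of what "adjust write_value" could amount to, here is a standalone sketch that picks the representation from the UTF8 flag and round-trips the value. It doesn't touch DBM::Deep at all: pack_value, unpack_value, and the one-character tags are made-up stand-ins for write_value, the sector classes, and their signatures.

```perl
use strict;
use warnings;

# Hypothetical type tags standing in for the sector signatures.
use constant { TAG_BYTES => 'S', TAG_UNICODE => 'U' };

# Flagged strings are UTF-8 encoded and tagged as Unicode;
# everything else is stored untouched.
sub pack_value {
    my ($value) = @_;
    return TAG_BYTES . $value unless utf8::is_utf8($value);
    utf8::encode($value);    # changes our copy, not the caller's
    return TAG_UNICODE . $value;
}

# Decode only what was encoded on the way in.
sub unpack_value {
    my ($stored) = @_;
    my $tag  = substr $stored, 0, 1;
    my $data = substr $stored, 1;
    utf8::decode($data) if $tag eq TAG_UNICODE;
    return $data;
}

my $text = "caf\x{E9} \x{263A}";    # character string, UTF8 flag on
my $raw  = "\xE2\x98\xBA";          # three plain bytes, flag off

my $text_back = unpack_value( pack_value($text) );
my $raw_back  = unpack_value( pack_value($raw) );

print "text round-trips\n"  if $text_back eq $text;
print "bytes round-trip\n"  if $raw_back eq $raw;
```

Byte strings never go through encode/decode, so binary data that merely happens to look like UTF-8 comes back unchanged.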