PerlMonks  

md5_hex changes its argument

by tinita (Parson)
on Aug 15, 2007 at 09:30 UTC ( [id://632683]=perlquestion )

tinita has asked for the wisdom of the Perl Monks concerning the following question:

hello monks,

i recently wondered why some of my utf8 strings had lost their utf8 flag. i tracked it down to the point where they were passed as arguments to Digest::MD5::md5_hex.

$ perl -wle'
use Digest::MD5 qw(md5_hex);
use Devel::Peek;
use Encode;
my $string = "äöü";
Encode::_utf8_on($string);
Dump $string;
my $md5 = md5_hex($string);
Dump $string
'
SV = PV(0x8153b00) at 0x8153684
  REFCNT = 1
  FLAGS = (PADBUSY,PADMY,POK,pPOK,UTF8)
  PV = 0x8174d48 "\303\244\303\266\303\274"\0 [UTF8 "\x{e4}\x{f6}\x{fc}"]
  CUR = 6
  LEN = 8
SV = PVMG(0x81ee3e0) at 0x8153684
  REFCNT = 1
  FLAGS = (PADBUSY,PADMY,SMG,POK,pPOK)
  IV = 0
  NV = 0
  PV = 0x8174d48 "\344\366\374"\0
  CUR = 3
  LEN = 8
  MAGIC = 0x81cbca0
    MG_VIRTUAL = &PL_vtbl_utf8
    MG_TYPE = PERL_MAGIC_utf8(w)
    MG_LEN = 3

shouldn't the function leave its arguments alone?
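
For anyone who wants to check this without Devel::Peek, here is a minimal sketch that only looks at the flag via utf8::is_utf8. It assumes a string built with utf8::upgrade rather than Encode::_utf8_on (the internal state should be the same: six UTF-8 bytes, flag on). What it prints for the second value will depend on whether your installed Digest::MD5 still has the behaviour described above.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Three characters in the 0x80-0xFF range, stored internally as UTF-8
# so that the utf8 flag is on (assumed equivalent to the _utf8_on setup).
my $string = "\x{e4}\x{f6}\x{fc}";
utf8::upgrade($string);

my $before = utf8::is_utf8($string) ? 1 : 0;
my $md5    = md5_hex($string);
my $after  = utf8::is_utf8($string) ? 1 : 0;

print "flag before: $before, flag after: $after\n";
print "digest: $md5\n";
```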

Replies are listed 'Best First'.
Re: md5_hex changes its argument
by Joost (Canon) on Aug 15, 2007 at 10:14 UTC
      this was reported on rt.cpan.org over a year ago
      oh thanks =)
      i should have looked there myself.
      too bad though the bug doesn't seem to get solved.
        In the meantime you could use my $md5 = md5_hex("$string"); to work around the problem.
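
A minimal sketch of that workaround, assuming the same kind of test string as in the question (built here with utf8::upgrade). The idea is that "$string" hands md5_hex a fresh temporary copy, so any downgrade happens to the copy and not to the original variable.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

my $string = "\x{e4}\x{f6}\x{fc}";
utf8::upgrade($string);    # utf8 flag on, as in the original report

# Interpolating into "..." produces a new temporary scalar; md5_hex
# only ever sees that temporary, never $string itself.
my $md5 = md5_hex("$string");

print utf8::is_utf8($string) ? "flag still on\n" : "flag lost\n";
```

Note that the digest is unchanged by this: md5_hex still works on the downgraded (Latin-1) bytes of the copy.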
        too bad though the bug doesn't seem to get solved.

        "Patches speak louder than words", as the saying goes...

        I have looked into the thing and made up a tentative patch:

        that seems to fix it. There are still problems with two failing tests. One of the fails can be fixed in a straightforward manner, the other one I haven't followed up yet.

        If you want the fix quickly you're invited to pick it up for submission at p5p. You can mail me through my berlin.pm address for more details if you want to. Otherwise I'll come back to it later, which may be much later.

        Anno

Re: md5_hex changes its argument
by graff (Chancellor) on Aug 16, 2007 at 04:48 UTC
    Until the bug is fixed, you might want to consider a small change in how you use the "md5_hex" function. There are a variety of ways to do this, depending on your preference, but they would all boil down to something like:
    my $md5 = md5_hex( encode( 'utf8', $string ));
    (update: the right function to use here is "encode", not "decode" as originally posted -- sorry for the confusion)

    That will pass a copy of the original string to md5_hex, and the copy will have the utf8 flag already turned off.

    (update: probably the best way to do this is to write your own "wrapper" module for Digest::MD5 -- the functions in "MyMD5.pm" would check the string being passed in, and only call encode() if the utf8 flag is on. Then you just need to change the module name in the scripts that run md5 on utf8 strings.)
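
A rough sketch of what such a wrapper could look like, kept in one file so it runs as-is. "MyMD5" is just the hypothetical name from the post, and only md5_hex is wrapped here; the other Digest::MD5 functions would follow the same pattern.

```perl
use strict;
use warnings;

package MyMD5;    # hypothetical wrapper package, not a CPAN module
use Digest::MD5 ();
use Encode qw(encode);

sub md5_hex {
    my @bytes;
    for my $arg (@_) {
        # Only flagged strings are encoded; encode() returns a new
        # byte string and leaves the caller's variable untouched.
        push @bytes, utf8::is_utf8($arg) ? encode('UTF-8', $arg) : $arg;
    }
    return Digest::MD5::md5_hex(@bytes);
}

package main;

my $string = "\x{e4}\x{f6}\x{fc}";
utf8::upgrade($string);    # utf8 flag on

my $md5 = MyMD5::md5_hex($string);
print "$md5\n";
print utf8::is_utf8($string) ? "flag still on\n" : "flag lost\n";
```

One thing to be aware of: this digests the UTF-8 encoded bytes, so for characters in the 0x80-0xFF range the result differs from what a plain md5_hex($string) returns (that one works on the downgraded Latin-1 bytes). It also means the same text gives different digests depending on whether the flag happens to be on, so whether this design is acceptable depends on what you compare the digests against.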

Node Type: perlquestion [id://632683]
Approved by clinton
Front-paged by Old_Gray_Bear