MD5 Peculiarities

by skazat (Chaplain)
on Jun 22, 2003 at 06:43 UTC

skazat has asked for the wisdom of the Perl Monks concerning the following question:

I'm having a time banging my head on this one...

Is there anything that would cause:

sub create_checksum {
    my $self = shift;
    my $data = shift;
    my $foo  = $$data;
    use Digest::MD5;
    my $ctx = Digest::MD5->new;
    my $cs  = $ctx->md5_hex($foo);
    warn "data: " . $foo;
    warn "checksum: " . $cs;
    return $cs;
}

to actually give me a different checksum each time it's run?

This seems to be what I'm seeing. If I create a small script that calls just this method in this Module, it seems that everything works correctly, but when I call it from a larger (too large to post) script, it gives me a different checksum!

The only thing I can fathom is that something in the script is mucking about with something in Perl. The only thing out of the ordinary in the script is that it's calling many time-related functions: time(), localtime(), etc.

Is there anything I should look out for when creating Digest::MD5 checksums? I'm on FreeBSD 4.5, perl 5.8, MD5 version 2.22. I've also seen these results using just Digest::Perl::MD5. I'm totally stumped, and I know that this isn't much to go by, but that's my problem too.

Is there something I'm just totally missing? I'm not really a green thumb with this Perl thing...

Cheers,

 

-justin simoni
!skazat!

Replies are listed 'Best First'.
Re: MD5 Peculiarities
by gmax (Abbot) on Jun 22, 2003 at 10:11 UTC

    Debugging exercise

    Here is how I would tackle the problem.

    Finding a suspect.

    I can see that the output of Digest::MD5 in OO mode is incorrect.

    By comparing with some independent programs, you can see that the output of md5sum and mysql is the same as that of the functional interface of Digest::MD5:

    $ perl -e 'use Digest::MD5; print Digest::MD5->new->md5_hex("foobarbaz") ,$/'
    e05e07ceb87ddb19ccba8a51a57ac120
    $ perl -e 'use Digest::MD5 qw(md5_hex); print md5_hex("foobarbaz"),$/'
    6df23dc03f9b54cc38a0fc1483df6e21
    $ echo -n foobarbaz | md5sum
    6df23dc03f9b54cc38a0fc1483df6e21 *-
    $ mysql -e "select md5('foobarbaz')"
    +----------------------------------+
    | md5('foobarbaz')                 |
    +----------------------------------+
    | 6df23dc03f9b54cc38a0fc1483df6e21 |
    +----------------------------------+

    Getting the evidence

    Now, let's apply a constitutional principle, according to which everybody is innocent until proven guilty. So I would say that the OO interface is innocent, and look at the docs. They say that this is the regular format of the OO interface:

    use Digest::MD5;
    $md5 = Digest::MD5->new;
    $md5->add('foo', 'bar');
    $md5->add('baz');
    $digest = $md5->hexdigest;

    This is different from what you have done. Let's try it the "official" way.

    sub create_checksum {
        my $self = shift;
        my $data = shift;
        my $foo  = $$data;
        my $ctx  = Digest::MD5->new;
        $ctx->add($foo);
        my $cs = $ctx->hexdigest();
        return $cs;
    }

    This works fine. Proof of concept:

    $ perl -e 'use Digest::MD5; my $x= Digest::MD5->new; $x->add("foobarbaz"); print $x->hexdigest,$/'
    6df23dc03f9b54cc38a0fc1483df6e21

    The verdict

    The OO interface is innocent.

    Your implementation is guilty. :)
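
    As a minimal sketch of the comparison above (the variable names are illustrative and not taken from the original script), the OO interface collects data with add() and produces the digest with hexdigest(), and the result should match the functional md5_hex():

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Functional interface: hash a string in one call.
my $functional = md5_hex("foobarbaz");

# OO interface: feed data in with add(), then ask for the digest.
my $ctx = Digest::MD5->new;
$ctx->add("foo", "bar");
$ctx->add("baz");
my $oo = $ctx->hexdigest;

# hexdigest() resets the context, so the same object can be reused.
$ctx->add("foobarbaz");
my $reused = $ctx->hexdigest;

print "functional: $functional\n";
print "oo:         $oo\n";
print "reused:     $reused\n";
```

    All three values should be the 6df23dc0... digest that md5sum and mysql report for "foobarbaz".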

     _  _ _  _  
    (_|| | |(_|><
     _|   
    
      Excellent investigation, gmax. In fact, the problem is that md5_hex is purely a standalone function, never a method. By calling it as a method, the original poster passed the object $ctx as the first argument, so the checksum covered the scalar representation of that object, likely something similar to Digest::MD5=SCALAR(0x804c340), along with the data. That representation can of course be different each time.
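
      A small sketch of that effect, assuming md5_hex() simply concatenates all of its arguments (which is how the Digest::MD5 documentation describes the functional interface). Calling it as a method puts the object first in the argument list, so the result should equal the digest of the stringified object followed by the data:

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

my $ctx = Digest::MD5->new;

my $as_method;
{
    # Digest::MD5 may warn that the function was "probably called as
    # method"; the warning is silenced here to keep the output clean.
    no warnings;
    $as_method = $ctx->md5_hex("foobarbaz");
}

# What was actually hashed: "Digest::MD5=SCALAR(0x...)" plus the data.
my $equivalent = md5_hex("$ctx" . "foobarbaz");

# What the caller wanted.
my $intended = md5_hex("foobarbaz");

print "object stringifies as: $ctx\n";
print "as method:  $as_method\n";
print "equivalent: $equivalent\n";
print "intended:   $intended\n";
```

      Because the address inside SCALAR(0x...) depends on where the object happens to live in memory, the first two values can change from run to run, while the last one never does.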

      --isotope

      gmax,

      It is over and above your monastery duties to post this and I have to tell you that I am grateful. Thanks for pointing out the woes of my ways. I am one step closer to enlightenment.

      Was it negligent of me to assume that the OO interface exactly mirrors the functional interface? In this case, yes.

       

      -justin simoni
      !skazat!

Re: MD5 Peculiarities
by cchampion (Curate) on Jun 22, 2003 at 07:57 UTC
    You are calculating the MD5 of a reference, not the data.

    my $foo = $$data;
    #         ^
    #         |

    Therefore, if you run a simple script several times in a row, the result is likely to be the same. In a complicated one, the reference to that variable may more easily change. Thus, the varying results you get.

      I'm passing a reference to the method (although this isn't apparent); doubling up the '$' dereferences the variable, no?

      Anyways, in the actual MD5 method call, I have "foobarbaz".

      Anyways again, I switched to the non-OO way of doing things in Digest::MD5 and everything works out OK.

      File under, "Why did this stupid thing waste my saturday night?"
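
      On the dereferencing question, a minimal sketch (variable names are illustrative): $$data does yield the string that the reference points to, so hashing $$data hashes the data, whereas hashing the reference itself would hash a string such as SCALAR(0x...).

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

my $string = "foobarbaz";
my $data   = \$string;        # a reference to the scalar

# Dereferencing gives back the original string.
my $of_data = md5_hex($$data);

# Stringifying the reference gives "SCALAR(0x...)", not the data.
my $ref_as_string = "$data";
my $of_ref        = md5_hex($ref_as_string);

print "dereferenced value: $$data\n";
print "reference as text:  $ref_as_string\n";
print "digest of data:     $of_data\n";
print "digest of ref text: $of_ref\n";
```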

       

      -justin simoni
      !skazat!

Re: MD5 Peculiarities
by antirice (Priest) on Jun 22, 2003 at 07:09 UTC

    Are you always calling create_checksum() from an instance? If you call it as a plain function, either after exporting it or as My::Module::create_checksum("value"), then it will return the checksum of undef (which happens to be d41d8cd98f00b204e9800998ecf8427e). Ways to fix it: create an instance and call it from that instance, call it as My::Module->create_checksum("value"), call it as create_checksum(undef,"value"), or remove the $self = shift; line.

    Of course, you should also use warnings; to alert you when you get undef values.

    UPDATE: Late night and I put strict for detecting undef values rather than warnings. DOH!
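
    A sketch of the calling-convention point, using a stand-in My::Module whose create_checksum() takes a plain string (not the poster's actual module, which takes a reference). An undefined argument is hashed as the empty string here to avoid a warning:

```perl
use strict;
use warnings;

package My::Module;
use Digest::MD5 qw(md5_hex);

sub create_checksum {
    my $self = shift;    # swallows the first argument, whatever it is
    my $data = shift;
    return md5_hex(defined $data ? $data : '');
}

package main;

# Plain function call: "value" is consumed as $self, $data is undef.
my $as_function = My::Module::create_checksum("value");

# Class method call: the class name becomes $self, "value" is $data.
my $as_class_method = My::Module->create_checksum("value");

# Function call with an explicit placeholder for $self.
my $with_placeholder = My::Module::create_checksum(undef, "value");

my $expected = Digest::MD5::md5_hex("value");

print "as function:      $as_function\n";
print "as class method:  $as_class_method\n";
print "with placeholder: $with_placeholder\n";
print "expected:         $expected\n";
```

    The first call should print d41d8cd98f00b204e9800998ecf8427e, the digest of empty input, while the other two should match the expected digest of "value".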

    antirice    
    The first rule of Perl club is - use Perl
    The ith rule of Perl club is - follow rule i - 1 for i > 1

      Of course, you should also use strict; to alert you when you get undef values.
      strict doesn't warn for undef values. warnings does, in many circumstances (though not where using an undefined value is perfectly fine, such as boolean context, where undef is a legitimate false value). Alternatively, simply use the -w command line switch, for example in the shebang line:
      #!/usr/local/bin/perl -w
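
      A short sketch of that behaviour, capturing warnings with a $SIG{__WARN__} handler so they can be counted (the handler and variable names are illustrative):

```perl
use strict;
use warnings;

# Collect warnings instead of printing them to STDERR.
my @caught;
local $SIG{__WARN__} = sub { push @caught, $_[0] };

my $undef;

# Using undef in string concatenation triggers an "uninitialized" warning.
my $message = "value: " . $undef;

# Using undef in boolean context is fine and produces no warning.
my $flag = $undef ? 1 : 0;

print "warnings caught: ", scalar(@caught), "\n";
print "first warning:   $caught[0]" if @caught;
```

      strict plays no part in this: removing the use strict line would not change how many warnings are caught.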
Re: MD5 Peculiarities
by skazat (Chaplain) on Jun 22, 2003 at 07:12 UTC

    To make this even simpler to understand, here is a module called "CheckSum.pm":

    package MOJO::MailingList::Schedules::CheckSum;
    sub new {
        my $class = shift;
        my %args = (@_);
        my $self = {};
        bless $self, $class;
        return $self;
    }
    use Digest::MD5;
    sub checksum {
        my $self = shift;
        my $data = shift;
        my $ctx = Digest::MD5->new;
        return $ctx->md5_hex("foobarbaz");
    }
    1;

    (NOTE: it will always make the checksum from "foobarbaz".) Here is a script that tests this module:

    require MOJO::MailingList::Schedules::CheckSum;
    my $cs = MOJO::MailingList::Schedules::CheckSum->new();
    print $cs->checksum("foobarbaz") ."\n";

    That script prints out, "d79e1dc81fb80407e0824bf7854b7544",

    When almost the exact code is put into another script:

    sub create_checksum {
        my $self = shift;
        my $data = shift;
        require MOJO::MailingList::Schedules::CheckSum;
        my $cs = MOJO::MailingList::Schedules::CheckSum->new();
        return $cs->checksum("foobarbaz") ."\n";
    }

    this returns all sorts of things, a few of the checksums I've received:

    203156774c6b5b2300971c5abed8f0d6
    46b8a55039b48cac0955106b0987a3d
    6f0e7e2d63ce536cc381698a9f6a9524

    I'm pretty stumped on why I'm receiving this. Oy.. It must be because of something else in that module, but what? and why? Again, I would post the module, but it's almost 800 lines long and the script that calls it is 1500.

     

    -justin simoni
    !skazat!

Re: MD5 Peculiarities
by gellyfish (Monsignor) on Jun 22, 2003 at 09:14 UTC

    I'm not absolutely sure of the reason for this but if you change your program to use Digest::MD5 in the functional style then you will find it returns the same digest each time - i.e.:

    use Digest::MD5 qw(md5_hex);
    #...
    my $cs = md5_hex($foo);

    I'm guessing it's something to do with the way that the module keeps state in OO usage, but I can't be sure without looking at the code.

    /J\
    
Re: MD5 Peculiarities
by Zaxo (Archbishop) on Jun 22, 2003 at 16:16 UTC

    Perl 5.8 with PerlIO gives a neat encapsulation of md5 digests with PerlIO::via::MD5:

    use PerlIO::via::MD5;
    my ($string, $digest) = 'foobarbaz';
    {
        open my $fh, '<:via(MD5)', \$string;
        $digest = <$fh>;
    }
    That is obviously more useful for getting digests of files.
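
    For comparison, a sketch of getting a file's digest with plain Digest::MD5, using its addfile() method rather than PerlIO::via::MD5. The temporary file is created only so the example is self-contained:

```perl
use strict;
use warnings;
use Digest::MD5;
use File::Temp qw(tempfile);

# Write some known content to a temporary file.
my ($out, $path) = tempfile(UNLINK => 1);
binmode $out;
print {$out} "foobarbaz";
close $out or die "cannot close $path: $!";

# Read it back in binary mode and let addfile() consume the handle.
open my $in, '<', $path or die "cannot open $path: $!";
binmode $in;
my $digest = Digest::MD5->new->addfile($in)->hexdigest;
close $in;

print "digest of file: $digest\n";
```

    binmode matters on platforms that translate line endings; without it the bytes hashed may not be the bytes on disk.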

    After Compline,
    Zaxo

Node Type: perlquestion [id://267926]
Approved by antirice
Front-paged by gmax