PerlMonks  

Re: Using MD5 and the theory behind it

by lhoward (Vicar)
on Jan 10, 2001 at 06:19 UTC


in reply to Using MD5 and the theory behind it

MD5 and other one-way hash functions (SHA-1, for example) are designed to take in a string of any length and convert it to a short, fixed-length string, a kind of fingerprint of the original string. (CRC32 often gets lumped in with them, but it is a checksum for catching accidental corruption, not a one-way hash.) Different one-way hash functions produce fingerprints of different lengths; MD5's is 128 bits. The following criteria should hold for all good one-way hash functions:
  • you cannot learn anything about the input string by examining its fingerprint, except for the fact that it has that fingerprint
  • a small change (even a single bit) in the input string should cause a dramatic change in the output of the hash function
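
The second criterion is easy to see for yourself. Here is a minimal sketch, assuming Digest::MD5 is installed; the two sample strings and the helper differing_hex_digits are made up for illustration. It hashes two inputs that differ by one trailing character and counts how many of the 32 hex digits of the fingerprints differ.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Two illustrative inputs that differ only by a trailing period.
my $first  = "The quick brown fox jumps over the lazy dog";
my $second = "The quick brown fox jumps over the lazy dog.";

my $digest_first  = md5_hex($first);
my $digest_second = md5_hex($second);

# Hypothetical helper: count the positions at which two
# equal-length hex strings disagree.
sub differing_hex_digits {
    my ($left, $right) = @_;
    return scalar grep { substr($left, $_, 1) ne substr($right, $_, 1) }
        0 .. length($left) - 1;
}

print "$digest_first\n$digest_second\n";
print differing_hex_digits($digest_first, $digest_second),
    " of 32 hex digits differ\n";
```

For a good hash function you would expect most of the digits to change, even though the inputs are nearly identical.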

I deal with a good bit of datacomm and file transfers. I use MD5 to identify when I have received a suspected duplicate file. I keep a DB table with the MD5 values of all the files that have been transmitted to me. Whenever I get a new file, I compare its MD5 value to those stored in the table. If the value is not in the table, I process the file and store its MD5 value in the table. If the value is in the table, I set the file aside for special handling and notify an operator.
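
A rough sketch of that duplicate check, not the actual production code: a plain Perl hash (%seen_digest) stands in for the DB table, and the names file_md5_hex and classify_file are invented for this example. Operator notification and the real processing step are left out.

```perl
use strict;
use warnings;
use Digest::MD5;

# Stand-in for the DB table of digests already received (assumption:
# the real thing would be a database table, not an in-memory hash).
my %seen_digest;

# Return the hex MD5 digest of a file's contents.
sub file_md5_hex {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Unable to open $path: $!\n";
    binmode($fh);    # hash the raw bytes, with no newline translation
    my $digest = Digest::MD5->new->addfile($fh)->hexdigest;
    close($fh);
    return $digest;
}

# Return 'new' and record the digest the first time a given content
# is seen; return 'duplicate' for any later file with the same digest.
sub classify_file {
    my ($path) = @_;
    my $digest = file_md5_hex($path);
    return 'duplicate' if exists $seen_digest{$digest};
    $seen_digest{$digest} = $path;
    return 'new';
}
```

A caller would process the file when classify_file returns 'new' and set it aside for an operator when it returns 'duplicate'.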

If you really want to learn exactly how MD5 (and other hash algorithms) work, I recommend checking out Applied Cryptography by Bruce Schneier.

Replies are listed 'Best First'.
Re: Re: Using MD5 and the theory behind it
by r.joseph (Hermit) on Jan 10, 2001 at 06:43 UTC
    You say that you 'compare its MD5 value' to the values in a table. How do you get an MD5 value for a file? What exactly do you mean by this process? (I believe that this process is very similar to the one that I am attempting.) Thanks for the help!

      For reasonable-sized files (ones that fit comfortably in system memory): load the file's contents into a perl scalar, say $foo. Then $fingerprint = md5($foo);
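
      Spelled out, the load-into-a-scalar approach might look like the sketch below. The sub name slurp_md5_hex is made up for illustration, and md5_hex is used in place of md5 so the fingerprint is printable hex rather than raw bytes.

      ```perl
      use strict;
      use warnings;
      use Digest::MD5 qw(md5_hex);

      # Read a whole file into a scalar and return its hex MD5 fingerprint.
      # Only sensible for files that fit comfortably in memory.
      sub slurp_md5_hex {
          my ($path) = @_;
          open(my $fh, '<', $path) or die "Unable to open $path: $!\n";
          binmode($fh);                        # read raw bytes
          my $foo = do { local $/; <$fh> };    # undef $/ reads the whole file
          close($fh);
          return md5_hex($foo);
      }
      ```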

      If you look through the documentation you have for Digest::MD5, you'll get some advice on other methods, e.g. the object-oriented version:

      my $file = "/file/to/hash";
      my $md5  = Digest::MD5->new();
      $md5->addfile($file);
      $md5->add("seekrit passwerd");    # not the best choice for one, but ...
      my $digest = $md5->digest;

      I got this straight out of the docs, more or less. HTH

      Philosophy can be made out of anything. Or less -- Jerry A. Fodor

        Small correction:
        my $file = "/file/to/hash";
        my $md5  = Digest::MD5->new();
        open(MD5, $file) || die "Unable to open file: $!\n";
        binmode(MD5);
        $md5->addfile(*MD5);
        $md5->add("seekrit passwerd");    # tee hee
        my $digest = $md5->digest;
        Your original code will not work with the latest Digest::MD5, producing the error "Not a valid filehandle." I know this because I'm currently writing a utility script that uses MD5 to verify downloaded files (for the Slackware distrib, actually) and I tried it your way to no avail. =)

        'kaboo
