Beefy Boxes and Bandwidth Generously Provided by pair Networks
Pathologically Eclectic Rubbish Lister
 
PerlMonks  

Re^2: On showing the weakness in the MD5 digest function and getting bitten by scalar context

by halley (Prior)
on Aug 27, 2004 at 18:37 UTC ( [id://386462]=note: print w/replies, xml ) Need Help??


in reply to Re: On showing the weakness in the MD5 digest function and getting bitten by scalar context
in thread On showing the weakness in the MD5 digest function and getting bitten by scalar context

No doubt. There are a LOT of namby-pamby Chicken Littles running around crying about MD5's weaknesses.

It's a HASH, for crying out loud. It's not meant to be provably perfect at identifying unique data streams.

  • It is clear that the MD5 digest cannot be unique for every data stream, since it is potentially far shorter than the original data;
  • It is clear that an endless variety of data streams will have the SAME digest, because you can construct pretty much any data stream you want;
  • It is clear that some yahoo might even be able to make a couple of "falsified" data streams which produce the same MD5 hash;
  • It is clear that the MD5 hash is as mathematically valid as it ever has been, since the algorithm hasn't changed at all.

Say you were expecting message M, with hash H. You instead get message N which also happens to hash to H.

  • If you were expecting M to be about 20 MB, what's the chances of N being exactly, or even approximately the same size?
  • If you were expecting M to be a tarball or other application data, what's the chances of N being uncompressable or otherwise parsable? That is, the falsified data also happens to conform to the protocol?
  • If you were expecting M to be executable, what's the chances of N being executable? That is, no introduction of obviously broken execution flaws?

You're worried about MD5 digests for showing falsification of data, right? Where some attacker alters the message? I contend that it will be pretty darned hard to find a useful attack on a message while maintaining MD5 integrity.

To Allied Commanders: Raiders Expected on Supply Lines in Sector 5. Keep on guard. --HQ
To Allied Commanders: No Raids Reported on Supply Lines for Sector 5, 6, or 8. Let Freedom Reign. --HQ

Until someone shows that you can (1) take any arbitrary data set M, (2) falsify it to data set N, by (3) modifying a limited portion of M in an application-useful way and (4) adding less than a gigabyte of additional data, and (5) still come out with M=>H and N=>H hash equivalence, I'll trust MD5, thanks.

--
[ e d @ h a l l e y . c c ]

  • Comment on Re^2: On showing the weakness in the MD5 digest function and getting bitten by scalar context
  • Select or Download Code

Replies are listed 'Best First'.
Re^3: On showing the weakness in the MD5 digest function and getting bitten by scalar context
by Anonymous Monk on Aug 27, 2004 at 19:10 UTC
    It's a HASH, for crying out loud. It's not meant to be provably perfect at identifying unique data streams.
    It IS meant to be computationally infeasible to find two messages with the same hash. If someone has found a way to do that, MD5 has failed at its original purpose.
    Until someone shows that you can (1) take any arbitrary data set M, (2) falsify it to data set N, by (3) modifying a limited portion of M in an application-useful way and (4) adding less than a gigabyte of additional data, and (5) still come out with M=>H and N=>H hash equivalence, I'll trust MD5, thanks.
    This is because you are ignorant. "I can't imagine an attack, therefore there is no attack."
      t IS meant to be computationally infeasible to find two messages with the same hash.

      Total bollocks!

      It is mathematically IMPOSSIBLE to reduce every 32 byte message, to a 16-byte hash and and not have duplicates. If you believe otherwise I have an infinite compression algorithm that you might like to fund development of?

      The only ignorance is that on behalf of those that don't understand simple logic--like you apparently.


      Examine what is said, not who speaks.
      "Efficiency is intelligent laziness." -David Dunham
      "Think for yourself!" - Abigail
      "Memory, processor, disk in that order on the hardware side. Algorithm, algorithm, algorithm on the code side." - tachyon
        Which part of "computationally infeasible" did you not understand? But I think it's better stated, "meant to be computationally infeasible to find a different message with the same hash", with two subcategories of infeasibility: where the original message is known, and where it is not.
        You are missing the point of the attack. One of the main properties of a cryptographic hash is that is hard to FIND a collision. There are certainly collisions, but with a secure hash function and a large enough size, it is practically impossible to find another input that produces the same hash. This is very important for many cryptographic protocols.

        This attack takes finding an MD5 hash collision from difficult brute force to relatively easy. This doesn't matter too much for many uses of MD5 where security isn't important.

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://386462]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others romping around the Monastery: (2)
As of 2024-04-24 17:47 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found