
Do the security problems with MD5 make it bad as a hashing algorithm?


Not unless hash collisions are also a security issue (see http://bugs.python.org/issue13703). But it's much slower than a traditional hashtable hashing function and not much faster than a SHA-family hash.

Maybe that's the sweet spot for your project, but I think most non-security uses of MD5 happen because it already exists in most languages and is pretty universal.


Depends. Can an attacker introduce files? If so, he can probably introduce collisions. In most uses of a hash, that's an issue—a performance (DoS) issue if nothing else.
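
To make the DoS point concrete, here is a toy sketch in Python. It is not MD5-specific: the `Key` class and its fixed hash value of 42 are made up for illustration, standing in for attacker-chosen inputs that all collide. It just counts how many equality checks a dict performs when every key lands in the same bucket chain.

```python
class Key:
    """Hypothetical key type that counts the equality checks a dict makes."""
    eq_calls = 0

    def __init__(self, value, colliding):
        self.value = value
        self.colliding = colliding

    def __hash__(self):
        # Colliding keys all report the same hash, as attacker-chosen inputs would.
        return 42 if self.colliding else self.value

    def __eq__(self, other):
        Key.eq_calls += 1
        return self.value == other.value


def count_comparisons(n, colliding):
    """Insert n distinct keys into a dict and return the number of __eq__ calls."""
    Key.eq_calls = 0
    table = {}
    for i in range(n):
        table[Key(i, colliding)] = i
    return Key.eq_calls


normal = count_comparisons(500, colliding=False)
flooded = count_comparisons(500, colliding=True)
print("well-distributed keys:", normal, "comparisons")
print("all-colliding keys:   ", flooded, "comparisons")
```

With colliding keys, each insert has to be compared against the keys already in the chain, so the work grows roughly quadratically with the number of inserts instead of linearly.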

If the application was just looking for a longer CRC/Adler32/etc. and not depending on the hash to be strong against attack, then MD5 wasn't an appropriate choice in the first place, as it's much slower than it needs to be. In that case, though, the security problems are irrelevant.
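
For reference, a minimal sketch of the non-cryptographic options mentioned here next to MD5, using only the Python standard library. The sample payload is arbitrary, and on some locked-down (FIPS) builds `hashlib.md5` may be unavailable.

```python
import hashlib
import zlib

# Arbitrary sample payload, purely for illustration.
data = b"example payload" * 1000

crc = zlib.crc32(data)        # 32-bit checksum, catches accidental corruption only
adler = zlib.adler32(data)    # 32-bit checksum, cheaper and weaker than CRC-32
md5_hex = hashlib.md5(data).hexdigest()  # 128-bit digest, shown as 32 hex characters

print(f"crc32   = {crc:08x}")
print(f"adler32 = {adler:08x}")
print(f"md5     = {md5_hex}")
```

None of these three should be relied on against someone deliberately crafting inputs; the difference is only in output length and speed.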


Do you know the contexts your hashing requirement is likely to be used in, and/or may be adapted to?

The worst and most persistent security problems emerge when someone defends half-assery as acceptable because, take your pick, it's a quick hack / it's a personal project / it's a small project / it's not used for critical infrastructure.

Until eventually one or more of the above are violated.

The primary advantage of MD5 is that its hashes are (slightly) shorter than those of the SHA family: 32 hex characters, versus 40 for SHA-1 and 64 for SHA-256. This makes it slightly more convenient to manually compare or transmit hashes.
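
A quick way to see the size difference, sketched with Python's `hashlib` (the helper name `hex_lengths` is made up for this example):

```python
import hashlib


def hex_lengths(names=("md5", "sha1", "sha224", "sha256", "sha384", "sha512")):
    """Return the hex digest length, in characters, for each named algorithm."""
    return {name: len(hashlib.new(name, b"").hexdigest()) for name in names}


for name, length in hex_lengths().items():
    print(f"{name:>6}: {length} hex characters ({length * 4} bits)")
```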

My problem is that I happen to have used md5sum so often and for so long that it's wired into my own wetware and muscle-memory, and I'm not sure which of the alternative SHA sums I should use, and which of those are widely available. I honestly don't know the answer to that off the top of my head, and "what to use instead of md5sum" as a DDG or Google search doesn't turn up a clearly useful answer. "sha1 sha256 sha384 sha512 which to use" does better.

And locally I've got utilities for SHA-1, SHA-224, SHA-256, and SHA-512 installed, as far as I can tell.

Looks as if SHA-2 and SHA-3, which offer digest lengths of 224 to 512 bits, are considered secure:

http://en.wikipedia.org/wiki/Secure_Hash_Algorithm

Then there are some openssl utilities. But let's not go there.
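
If a scriptable stand-in for md5sum is all that's needed, something like the following Python sketch would probably do. The function name `file_digest_hex`, the SHA-256 default, and the 64 KiB chunk size are my own choices here, not anything standard.

```python
import hashlib


def file_digest_hex(path, algorithm="sha256", chunk_size=65536):
    """Hash a file in fixed-size chunks so large files need not fit in memory."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        # iter() with a sentinel keeps reading until read() returns b"" at EOF.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()
```

Calling `file_digest_hex("some_file")` should give the same hex string as the first column of `sha256sum some_file`; passing `algorithm="md5"` reproduces the old md5sum value for comparison against existing records.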


Generally the tuple (<length>,<hash>) is unique since the attacks on MD5 all involve changing the number of octets in the hashed source. That said, it isn't secure if you can change the length independently of changing the hash. Hence the challenge of using it cryptographically. If you're confident you know the correct 'length' value (acts as a sort of openly shared secret in this case) then you trust the hash.


I believe this is wrong, and in this case it's very dangerous information. See, for example, the Wikipedia article, which shows two chunks of data that differ only in a few bits (not in length!) and hash to the same MD5 hash: https://en.wikipedia.org/wiki/MD5#Collision_vulnerabilities


Not necessarily, but if that's your use case (and the hashing speed isn't extremely critical), why not use a better algorithm with a larger output?



