war story: parallel(1) command

Eric B gyre-Lmt0BfyYGMw at public.gmane.org
Tue Jul 30 22:33:44 UTC 2013


> On Tue, Jul 30, 2013 at 01:40:34PM -0400, D. Hugh Redelmeier wrote:
>> Oh, but it does.  In just the way that Eric pointed out.
>>
>> People will have created files with MD5 collisions to demonstrate the
>> problem.  Those files *might* end up in your filesystem for some
>> reason.
>
> Do you KNOW what a collision is?  All it is, is that they managed to
> create a file that has the same checksum as another file that they had.
> Being able to create a file with a specific checksum is very interesting,
> if the only check of a file's integrity is the checksum.  The idea
> of a good hash function of course is that you are not supposed to be
> able to create a file to get a specific checksum, and the fact this has
> been done in the case of md5 means that it is no longer good enough for
> ensuring a file hasn't been tampered with.  It is still perfectly fine
> for detecting if files are likely the same and whether they are likely
> to have been changed.

Whether files are "likely the same" depends on context.
I agree with what you say above for random file corruption,
or for files whose contents are effectively random bits.

In Hugh's case, he wants to hash all the files in a real filesystem
to find real differences, and a real filesystem may contain files
that someone deliberately crafted to collide under MD5.

If one calculates a SHA-N hash (SHA-256, say) for each file, that
answers the question ("Are these files the same or different?")
with virtual certainty.  There is NO need for an additional
byte-by-byte compare when two files have the same hash.
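
As a rough illustration of the approach, here is a minimal Python sketch
that walks a directory tree and groups files by SHA-256 digest.  The
function names, the 1 MiB chunk size, and the choice of SHA-256 are my
own assumptions for the example, not anything Hugh specified.

```python
import hashlib
import os
from collections import defaultdict


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large files need not fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def group_by_digest(root):
    """Map each digest to the list of regular files under root having it."""
    groups = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.isfile(path) and not os.path.islink(path):
                groups[sha256_of_file(path)].append(path)
    return groups
```

Any digest that maps to more than one path marks files treated as
identical; no second byte-by-byte pass is made.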

When an event is astronomically improbable,
it never happens in reality.
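
To put a number on "astronomically", here is a small sketch of the
standard birthday (union) bound on an accidental collision among n files
under a b-bit hash.  It assumes the hash outputs behave like uniform
random values, so it says nothing about deliberately crafted collisions
such as the MD5 ones discussed above.  The function name and the file
count of one billion are made up for the example.

```python
def collision_probability_upper_bound(n_files, hash_bits):
    """Union bound: P(any two of n files collide) <= C(n, 2) / 2**bits.

    Assumes digests are uniformly random; not valid for crafted inputs.
    """
    pairs = n_files * (n_files - 1) / 2
    return pairs / 2.0 ** hash_bits


# Hypothetical filesystem with one billion files, 256-bit digests.
p = collision_probability_upper_bound(10**9, 256)
print(p)
```

For a billion files and a 256-bit hash the bound comes out on the order
of 1e-60.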



--
The Toronto Linux Users Group.      Meetings: http://gtalug.org/
TLUG requests: Linux topics, No HTML, wrap text below 80 columns
How to UNSUBSCRIBE: http://gtalug.org/wiki/Mailing_lists




