war story: parallel(1) command

D. Hugh Redelmeier hugh-pmF8o41NoarQT0dZR+AlfA at public.gmane.org
Sun Jul 28 18:29:28 UTC 2013


| From: Mauro Souza <thoriumbr-Re5JQEeQqe8AvxtiuMwx3w at public.gmane.org>

| I would say you should use crc32 instead of md5sum. But before saying that,
| I made a simple test, hashing a 64MB video file:

Interesting.

I didn't know that there was a crc32 command.  It wasn't installed on
my Fedora 19 system.  It turns out that it is in the package
perl-Archive-Zip-1.30-11.fc19.noarch

It is written in Perl.  That may make it slow.  But it does process
32k at a time, so the perl overhead might not be a problem.
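
To illustrate why block-at-a-time processing keeps interpreter overhead
low, here is a minimal sketch in Python (not the Perl script itself).
The name crc32_stream and the CHUNK_SIZE constant are made up for this
example; zlib.crc32 is the standard library call, and its second
argument carries the running CRC from one block to the next.

```python
import io
import zlib

CHUNK_SIZE = 32 * 1024  # 32 KiB per read (illustrative, matching "32k at a time")


def crc32_stream(stream, chunk_size=CHUNK_SIZE):
    """Return the CRC-32 of a binary stream, reading it one block at a time."""
    crc = 0
    while True:
        block = stream.read(chunk_size)
        if not block:
            break
        # The interpreter runs this loop once per block; the per-byte
        # work happens inside zlib's compiled code.
        crc = zlib.crc32(block, crc)
    return crc & 0xFFFFFFFF


# Demo on an in-memory stream; a file opened with open(path, "rb") works the same way.
data = b"x" * 100_000
print(f"{crc32_stream(io.BytesIO(data)):08x}")
```

Because the function takes any binary stream, passing sys.stdin.buffer
would let it read standard input as well.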

I don't like the fact that "man crc32" yields nothing.

I don't like this misleading output:
	$ crc32 --help
	/usr/bin/crc32: No such file or directory
Note: there is a file /usr/bin/crc32.  There is no file "--help", and
that's what the message is about.

I don't like the fact that crc32, with no arguments, does not process
standard input.

I used to use CRC for hashing
<https://groups.google.com/d/msg/net.sources/4ERvPT6oxdA/nY6T761u2h0J>
That code made it into a couple of IETF RFCs.  I originally wrote it
to check transfers from my Altair -- I had a version that was part of
the ROM I wrote.  But there are some unfortunately easy collisions.
For example, initial zero bytes make no difference to the result.
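
The leading-zero collision can be demonstrated with a small sketch.
This is not the code from the posting linked above; it is a generic
bitwise CRC-16 with polynomial 0x1021 and a zero initial register (the
parameters commonly called CRC-16/XMODEM), chosen here as an assumption
because any CRC that starts from zero behaves this way.  The function
name crc16_zero_init is made up for the example.  For contrast,
zlib.crc32 starts from an all-ones register, so a leading zero byte
does change its result.

```python
import zlib


def crc16_zero_init(data: bytes) -> int:
    """Bitwise CRC-16, polynomial 0x1021, initial value 0, no final XOR."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


msg = b"hello"
padded = b"\x00" + msg  # same message with one leading zero byte

# A zero register fed a zero byte stays zero, so the padding is invisible.
print(crc16_zero_init(msg) == crc16_zero_init(padded))  # True

# zlib's CRC-32 starts from a non-zero register, so the padding matters.
print(zlib.crc32(msg) == zlib.crc32(padded))  # False
```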

| My explanation: reading the data is way more slow than hashing on md5 or
| crc32. But sha256 was indeed slower.

There are reasons to want a cryptographic hash to be intrinsically
slow.  In particular, you want this to reduce the effectiveness of
brute force attacks.  By "intrinsically slow", I mean not just one
implementation, but all possible implementations must be slow.

For my application, slow is not a virtue.
--
The Toronto Linux Users Group.      Meetings: http://gtalug.org/
TLUG requests: Linux topics, No HTML, wrap text below 80 columns
How to UNSUBSCRIBE: http://gtalug.org/wiki/Mailing_lists




