war story: parallel(1) command

D. Hugh Redelmeier hugh-pmF8o41NoarQT0dZR+AlfA at public.gmane.org
Sun Jul 28 19:01:30 UTC 2013


| From: William Muriithi <william.muriithi-Re5JQEeQqe8AvxtiuMwx3w at public.gmane.org>

| For your information, git handles this the same way: all files are hashed
| and only one copy is kept if any share a hash. They use SHA-1, apparently
| because it's more collision-resistant without being too CPU-intensive.

SHA-1 is probably more resistant than MD5 to adversaries creating
collisions.  If you aren't in an adversarial situation, that doesn't
matter.
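
For anyone who wants to see the hash-and-keep-one-copy idea concretely,
here is a minimal sketch in Python.  git names a file's contents (a
"blob") by the SHA-1 of a short header followed by the bytes themselves.
The function names and the in-memory dict of files are made up for
illustration; this is not how git is implemented internally.

```python
import hashlib


def git_blob_id(data):
    """Return the SHA-1 hex id git gives a blob with these contents."""
    # git hashes "blob <length>\0" followed by the raw bytes.
    header = b"blob %d\0" % len(data)
    return hashlib.sha1(header + data).hexdigest()


def group_by_hash(files):
    """Map each content hash to the names that share it.

    `files` is a hypothetical {name: bytes} dict standing in for a
    directory tree; names landing in the same group are duplicates.
    """
    groups = {}
    for name, data in files.items():
        groups.setdefault(git_blob_id(data), []).append(name)
    return groups


if __name__ == "__main__":
    demo = {"a.txt": b"same\n", "b.txt": b"same\n", "c.txt": b"other\n"}
    for blob_id, names in group_by_hash(demo).items():
        print(blob_id, names)
```

Swapping hashlib.sha1 for hashlib.md5 gives the md5sum flavour of the
same grouping; only the resistance to deliberate collisions differs.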

| > When I did this, I soon found that there was plenty of CPU left over,
| > so apparently md5sum is disk-bound on my machine (Core Quad Duo 6600,
| > 2.5" external drive connected via USB 3.0).

Sorry, the CPU is a Core 2 Quad Q6600.  I accidentally garbled the
name.
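
One rough way to check the disk-bound guess quoted above is to compare
wall-clock time with CPU time while hashing a file.  This is only a
sketch: hash_timing and its chunk size are my own choices, not something
md5sum does, and a file already in the page cache will look CPU-bound
no matter how slow the drive is.

```python
import hashlib
import time


def hash_timing(path, chunk_size=1 << 20):
    """Return (md5 hex digest, wall seconds, CPU seconds) for one file."""
    md5 = hashlib.md5()
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    with open(path, "rb") as f:
        # Read in fixed-size chunks so large files need not fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return md5.hexdigest(), wall, cpu
```

If the CPU seconds come out as a small fraction of the wall seconds, the
process spent most of its time waiting for the drive rather than
computing.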

| Interesting, I would have guessed it's CPU-bound too. Goes a long way to
| show it doesn't help to buy a cutting-edge CPU now unless it's for energy
| efficiency.

I still foolishly think of this as cutting edge.  But of course it was
released six years ago!  Things aren't getting faster very fast.

| What filesystem is on the USB drive? NTFS by any chance? I have found that
| git seems more responsive on Linux than on Windows. Either the Windows port
| sucks or NTFS is just too slow compared to ext4.

My impression (not measured) is that NTFS on Linux is sluggish.  I
think that it goes through a userland process (using FUSE), perhaps
for patent reasons.  I try to avoid it because I don't trust it to be
problem-free on Linux.  That is likely a no-longer-justified fear.

This filesystem was ext3 on a 2.5" external USB 3.0 drive.  2.5" drives
are slow, but ext3 and USB 3.0 should be fast (untested opinions).
--
The Toronto Linux Users Group.      Meetings: http://gtalug.org/
TLUG requests: Linux topics, No HTML, wrap text below 80 columns
How to UNSUBSCRIBE: http://gtalug.org/wiki/Mailing_lists




