war story: parallel(1) command
D. Hugh Redelmeier
hugh-pmF8o41NoarQT0dZR+AlfA at public.gmane.org
Sun Jul 28 19:01:30 UTC 2013
| From: William Muriithi <william.muriithi-Re5JQEeQqe8AvxtiuMwx3w at public.gmane.org>
| For your information, git handles this the same way: all files are hashed
| and only one copy is kept if any share a hash. They use SHA-1, apparently
| because it's more collision-resistant without being too CPU-intensive.
SHA-1 is probably more resistant than MD5 to adversaries creating collisions.
If you aren't in an adversarial situation, that doesn't matter.
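For illustration, here is a minimal Python sketch of the "hash every file,
keep one copy per hash" idea. It is not git's implementation (git hashes a
small header along with the contents, so its object IDs differ from plain
SHA-1 sums), and the names file_digest and group_by_digest are made up for
this example. The algorithm is a parameter, so "md5" can be substituted where
no adversary is involved.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, algorithm="sha1", chunk_size=1 << 20):
    """Return the hex digest of a file, reading it in fixed-size chunks."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def group_by_digest(root, algorithm="sha1"):
    """Map digest -> sorted list of paths, keeping only digests shared by 2+ files."""
    groups = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.isfile(path) and not os.path.islink(path):
                groups[file_digest(path, algorithm)].append(path)
    return {d: sorted(paths) for d, paths in groups.items() if len(paths) > 1}
```

Each list in the result holds files with identical digests; all but one of
them are candidates for removal or hard-linking, ideally after a byte-for-byte
comparison.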
| > When I did this, I soon found that there was plenty of CPU left over,
| > so apparently md5sum is disk-bound on my machine (Core Quad Duo 6600,
| > 2.5" external drive connected via USB 3.0).
Sorry, the CPU is a Core 2 Quad Q6600. I accidentally garbled the
name.
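One rough way to check the disk-bound guess is to compare wall-clock time with
CPU time while hashing a file. This is only a sketch under assumptions: the
function name hash_with_timing is made up, and it uses Python's hashlib rather
than md5sum itself. If the CPU time is much smaller than the wall time, the
process spent most of its time waiting, most likely on I/O.

```python
import hashlib
import time


def hash_with_timing(path, algorithm="md5", chunk_size=1 << 20):
    """Hash a file; return (hex digest, wall seconds, CPU seconds)."""
    h = hashlib.new(algorithm)
    wall_start = time.perf_counter()
    cpu_start = time.process_time()  # user + system CPU time of this process
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return h.hexdigest(), wall, cpu
```

A second run over the same file may be served from the page cache and look
CPU-bound, so the measurement is only meaningful on files that have not been
read recently.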
| Interesting, I would have guessed it's CPU-bound too. It goes a long way to
| show it doesn't help to buy a cutting-edge CPU now unless it's for energy
| efficiency.
I still foolishly think of this as cutting edge. But of course it was
released six years ago! Things aren't getting faster very fast.
| What filesystem is on the USB drive? NTFS by any chance? I have found that
| git seems more responsive on Linux than on Windows. Either the Windows port
| sucks or NTFS is just too slow compared to ext4.
My impression (not measured) is that NTFS on Linux is sluggish. I
think that it goes through a userland process (using FUSE), perhaps
for patent reasons. I try to avoid it because I don't trust it to be
problem-free on Linux. That is likely a no-longer-justified fear.
This filesystem was ext3 on a 2.5" external USB 3.0 drive. 2.5" drives
are slow, but ext3 and USB 3.0 should be fast (untested opinions).
--
The Toronto Linux Users Group. Meetings: http://gtalug.org/
TLUG requests: Linux topics, No HTML, wrap text below 80 columns
How to UNSUBSCRIBE: http://gtalug.org/wiki/Mailing_lists