Utility for finding duplicate files?

Lennart Sorensen lsorense-1wCw9BSqJbv44Nm34jS7GywD8/FfD2ys at public.gmane.org
Mon Jun 21 14:27:21 UTC 2010


On Sun, Jun 20, 2010 at 06:31:30PM -0400, Walter Dnes wrote:
>   Last week, after my main machine's hard drive started making ominous
> noises, I copied over just about all data from the machine.  There's a
> ton of duplication with the major backups on my backup USB drive.  I
> could do something like...
> 
> #!/bin/bash
> for file1 in *
> do
> if [ -f "$file1" ] && diff -q "$file1" "../dir2/$file1" > /dev/null; then
>   echo "rm \"../dir2/$file1\"" >> removelist
> fi
> done
> 
> ...and then source removelist
> 
>   Is there a utility program already written that can generate a list of
> duplicate files?

Package: fdupes
Priority: optional
Section: utils
Installed-Size: 80
Maintainer: Sandro Tosi <matrixhasu-Re5JQEeQqe8AvxtiuMwx3w at public.gmane.org>
Architecture: i386
Version: 1.50-PR2-1
Depends: libc6 (>= 2.7-1)
Filename: pool/main/f/fdupes/fdupes_1.50-PR2-1_i386.deb
Size: 17536
MD5sum: 157fc2684c6c169ae4cd4c967af7f48d
SHA1: 5d75b0eae0128496e49fb1c59603f7c51acdf58a
SHA256: e1b5ccc9fda20a0f8a610cdb8fd1413c32950ca8f0d2c86a4c285a36d2c5cbf4
Description: identifies duplicate files within given directories
 FDupes uses md5sums and then a byte by byte comparison to find
 duplicate files within a set of directories. It has several useful
 options including recursion.
Homepage: http://netdial.caribe.net/~adrian2/programs/fdupes.html
Tag: implemented-in::c, interface::text-mode, role::program, scope::utility, use::searching, works-with::file
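The hash-then-compare approach the description mentions can be sketched in plain shell, grouping files by MD5 so that anything sharing a hash is a duplicate candidate (directory names are whatever you pass in; GNU md5sum and uniq assumed; unlike fdupes, this sketch stops at the hash and skips the confirming byte-by-byte comparison):

```shell
#!/bin/sh
# dupes.sh - list groups of files with identical MD5 sums.
# Usage: ./dupes.sh dir1 dir2 ...
# Note: a (very unlikely) MD5 collision would show as a false positive;
# fdupes guards against that with a byte-by-byte comparison afterwards.
find "$@" -type f -exec md5sum {} + \
  | sort \
  | uniq -w32 --all-repeated=separate
```

Here -w32 tells uniq to compare only the first 32 characters of each line, i.e. the MD5 hex digest, and --all-repeated=separate prints each group of matching files separated by a blank line.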

I love the -H (hardlink) option.  It hardlinks the identical files
together, which can save a large amount of space if you have lots of
data that is duplicated between directories.
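The space saving works because hardlinked names share one inode, so the data is stored on disk only once.  You can see the same effect directly with ln (the file names below are just an illustration, not anything fdupes produces):

```shell
#!/bin/sh
# Two names, one inode: this is what hardlinking duplicates achieves.
cd "$(mktemp -d)"
echo "same data" > a
ln a b          # link b to a's data, as hardlinking a duplicate would
ls -li a b      # both lines show the same inode number and link count 2
```

Deleting either name leaves the data intact until the last link is gone; the flip side is that editing the file through one name changes it under the other as well.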

-- 
Len Sorensen
--
The Toronto Linux Users Group.      Meetings: http://gtalug.org/
TLUG requests: Linux topics, No HTML, wrap text below 80 columns
How to UNSUBSCRIBE: http://gtalug.org/wiki/Mailing_lists
