finding same files across hardrives
D. Hugh Redelmeier
hugh-pmF8o41NoarQT0dZR+AlfA at public.gmane.org
Sat Nov 29 19:38:36 UTC 2008
| From: Jose <jtc-vS8X3Ji+8Wg6e3DpGhMbh2oLBQzVVOGK at public.gmane.org>
| I've been trying to find files with the same name
[Some of your typos make it a bit harder to understand what you are
asking.]
| Is there any linux rpm or souce to compile utility that may help to do this?
This kind of thing is easy to do with a shell script. For that reason
I've never investigated if there are utilities to make this easier.
Looking for matching names is a bit scary to me. I'd prefer to look
for duplicate contents.
To find non-obvious matching files, I run md5sum or sha1sum on each
file and then look for files with identical hashes. Being paranoid, I
actually do a cmp before I'm sure that they match (the chance of
cryptographic hashes matching but the contents differing is VERY
slight).
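A rough sketch of that hash-then-compare idea, assuming GNU coreutils
(uniq's -w and --all-repeated options are GNU extensions). The function
names dup_hashes and same_contents are made up for illustration, and file
names containing newlines or backslashes are not handled.

```shell
# Sketch only. dup_hashes and same_contents are hypothetical helper names.

# Print the sha1sum lines of all files under directory $1 whose hash
# is shared with at least one other file, one blank line between groups.
dup_hashes() {
    # Hash every regular file, sort so equal hashes become adjacent,
    # then keep only lines whose first 40 characters (the SHA-1) repeat.
    find "$1" -type f -exec sha1sum {} + |
    sort |
    uniq -w 40 --all-repeated=separate
}

# The paranoid step: succeed only if the two files are byte-identical.
same_contents() {
    cmp -s -- "$1" "$2"
}
```

Something like "dup_hashes /some/dir" lists the candidates; running
same_contents on each pair within a group confirms the match before
anything gets deleted.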
Note: a lot of files are empty: the fact that all of them have
identical contents really doesn't say that they are the "same" file in
a semantic sense.
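One way to keep empty files out of the comparison is to have find skip
them. This is only a sketch; nonempty_files is a hypothetical helper
name.

```shell
# Sketch only. nonempty_files is a hypothetical helper name.
# List regular files under directory $1 that contain at least one byte.
nonempty_files() {
    # -size +0 matches files larger than zero 512-byte blocks (sizes are
    # rounded up), so zero-length files are the only ones left out.
    find "$1" -type f -size +0 -print
}
```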
Are your duplicates systematically placed?
Here is a shell script that I just whipped up WITHOUT TESTING.
It requires that the file contents match, not just the name.
Since I don't actually know what you want, I don't know whether this
script could be useful.
================================================================
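# stop if anything goes wrong, or if an unset variable is used
set -ue
# good directory (the copies to keep):
GD=$HOME/good
# bad directory (the copies to delete):
BD=/somewhere/else
cd "$GD"
# walk every regular file under the good directory
find . -type f -print |
while IFS= read -r p
do
	# only act if the bad tree has a regular file at the same relative
	# path and its contents are byte-for-byte identical
	if [ -f "$BD/$p" ] && cmp -s "$p" "$BD/$p"
	then
		#### after testing this, change this to
		#### actually rm
		echo rm "$BD/$p"
	fi
done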
================================================================
--
The Toronto Linux Users Group. Meetings: http://gtalug.org/
TLUG requests: Linux topics, No HTML, wrap text below 80 columns
How to UNSUBSCRIBE: http://gtalug.org/wiki/Mailing_lists