finding same files across hard drives
Jose
jtc-vS8X3Ji+8Wg6e3DpGhMbh2oLBQzVVOGK at public.gmane.org
Sat Nov 29 20:03:00 UTC 2008
D. Hugh Redelmeier wrote:
> | From: Jose <jtc-vS8X3Ji+8Wg6e3DpGhMbh2oLBQzVVOGK at public.gmane.org>
>
> | I've been trying to find files with the same name
>
> [Some of your typos make it a bit harder to understand what you are
> asking.]
>
> | Is there any linux rpm or source-to-compile utility that may help to do this?
>
> This kind of thing is easy to do with a shell script. For that reason
> I've never investigated if there are utilities to make this easier.
>
> Looking for matching names is a bit scary to me. I'd prefer to look
> for duplicate contents.
>
> To find non-obvious matching files, I do a md5sum or sha1sum of each
> file and then find files with identical hashes. Being paranoid, I
> actually do a cmp before I'm sure that they match (the chance of
> cryptographic hashes matching but the contents differing is VERY
> slight).
>
> Note: a lot of files are empty: the fact that all of them have
> identical contents really doesn't say that they are the "same" file in a
> semantic sense.
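[Editor's note: the hash-then-verify approach described above can be
sketched as a short pipeline. This is an illustration, not from the
original post, and it assumes GNU coreutils (uniq -w and
--all-repeated are GNU extensions); it also skips empty files, which
would otherwise all "match" each other.]

```shell
# List groups of files with identical md5sums under the current
# directory. -size +0c skips empty files; uniq -w32 compares only the
# first 32 characters of each line (the md5 hash), and
# --all-repeated=separate prints each duplicate group separated by a
# blank line. A cmp between group members then confirms a true match.
find . -type f -size +0c -print0 |
    xargs -0 md5sum |
    sort |
    uniq -w32 --all-repeated=separate
```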
>
> Are your duplicates systematically placed?
>
> Here is a shell script that I just whipped up WITHOUT TESTING.
> It requires that the file contents match, not just the name.
> Since I don't actually know what you want, I don't know whether this
> script could be useful.
>
> ================================================================
> # stop if anything goes wrong
> set -ue
>
> # good directory:
> GD=$HOME/good
> # bad directory:
> BD=/somewhere/else
>
> cd "$GD"
> find . -type f -print |
> while IFS= read -r p
> do
>     if [ -f "$BD/$p" ] && cmp -s "$p" "$BD/$p"
>     then
>         #### after testing this, change "echo rm" to
>         #### an actual rm
>         echo rm "$BD/$p"
>     fi
> done
> ================================================================
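[Editor's note: the script hinges on cmp -s, which is silent and
reports purely through its exit status: 0 for identical contents,
non-zero when the files differ or one is missing. A standalone
illustration, using throwaway files in a temporary directory:]

```shell
# Demonstrate the cmp -s exit-status idiom the script relies on.
d=$(mktemp -d)
printf 'abc\n' > "$d/f1"
printf 'abc\n' > "$d/f2"
printf 'xyz\n' > "$d/f3"

# && runs only on exit status 0 (identical contents)
cmp -s "$d/f1" "$d/f2" && echo "f1 and f2 match"
# || runs only on non-zero status (contents differ)
cmp -s "$d/f1" "$d/f3" || echo "f1 and f3 differ"

rm -r "$d"
```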
> --
> The Toronto Linux Users Group. Meetings: http://gtalug.org/
> TLUG requests: Linux topics, No HTML, wrap text below 80 columns
> How to UNSUBSCRIBE: http://gtalug.org/wiki/Mailing_lists
>
Hi Hugh,
Thanks for the script. Not having tape backups, I ended up copying the
same files onto different hard drives as "backups". Now that I have a
proper backup solution, I would like to consolidate a single copy of
the data and back it up properly.
Thanks again,
Jose