On Sun, 04 Dec 2005 03:09:13 GMT, haphzrd <> wrote:
>In alt.computer Boscoe Pertwee <> wrote:
>> Try this freeware utility called HDCleaner. It is programmed to look
>> for the commonest type of junk files and space wasters and it also
>> identifies duplicate files.
>> http://home.tiscali.de/zdata/hdcleaner_e.htm
>
>I'm curious whether you know how this program, and others like it,
>detect duplicate files.
>
>I use a program called fdupes which compares file sizes and md5 sums to
>detect a dupe, and acts accordingly. Unfortunately, even the slightest
>change between files will produce a different sum and won't meet the
>criteria for a dupe. So essentially, a dupe is only the same file with
>possibly different filenames.
>
>I'm assuming most dupe finding programs act this way, though maybe there
>is one slightly more "smart" out there?
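
For reference, the size-plus-checksum approach described in the quoted
text can be sketched roughly as below. This is only an illustration of
the idea, not how fdupes itself is implemented; the function names
md5_of and find_exact_dupes are made up for the example.

```python
import hashlib
import os
from collections import defaultdict


def md5_of(path, chunk_size=65536):
    """Return the MD5 hex digest of a file, read in chunks."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_exact_dupes(root):
    """Group files under root that share both size and MD5 sum."""
    # Pass 1: bucket by size, which is cheap and rules out most files.
    by_size = defaultdict(list)
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    # Pass 2: only hash files whose size collides with another file.
    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue
        by_sum = defaultdict(list)
        for path in paths:
            by_sum[md5_of(path)].append(path)
        groups.extend(sorted(g) for g in by_sum.values() if len(g) > 1)
    return sorted(groups)
```

As the quoted text says, a single changed byte gives a different sum, so
this only ever reports byte-identical files under different names.
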
The few such programs that I have been able to examine
properly worked by comparing the information available in
the file summaries and deleting files that matched the set
criteria. The most common criteria were a match in the
file name, or in the file name and file size, or in the
file name, file size and file creation date.
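
A minimal sketch of that kind of metadata-only matching might look like
this. The names metadata_key and find_metadata_matches are hypothetical,
and the modification time stands in for the creation date, since a true
creation date is not available portably. It only reports matches and
deletes nothing.

```python
import os
from collections import defaultdict


def metadata_key(path, use_size=True, use_date=False):
    """Build the comparison key: name, optionally size and timestamp."""
    info = os.stat(path)
    key = [os.path.basename(path)]
    if use_size:
        key.append(info.st_size)
    if use_date:
        # Assumption: modification time used in place of creation date.
        key.append(int(info.st_mtime))
    return tuple(key)


def find_metadata_matches(root, use_size=True, use_date=False):
    """Group files under root whose metadata keys are equal."""
    groups = defaultdict(list)
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path):
                key = metadata_key(path, use_size, use_date)
                groups[key].append(path)
    return sorted(sorted(g) for g in groups.values() if len(g) > 1)
```

Note that the file contents are never read, so two different files that
happen to share a name and size would be reported as a match.
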
I have heard of one, very expensive, that works with
text-based files: it compares the text in defined areas and,
when a match is found, deletes the file with the older
creation date. It is intended for cleaning out directories of
text files such as scripts. This one actually examined a set
part of the file (e.g. the 10th to 20th lines of text) and
ignored the file name and file size. Sorry, but I forget the
name; it was being used at a place where they wrote lots of
program scripts and code, and they wanted to get rid of older
versions etc.
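
I never saw how that program worked inside, but the idea of comparing
only a fixed range of lines could be sketched something like this. All
the names here (read_line_range, same_region, older_of) are invented for
the example, the 10-to-20 range is just the one mentioned above, and
modification time is assumed as a stand-in for creation date.

```python
import os


def read_line_range(path, first=10, last=20):
    """Return lines first..last (1-based, inclusive) without newlines."""
    with open(path, "r", encoding="utf-8", errors="replace") as f:
        lines = f.readlines()
    return [line.rstrip("\r\n") for line in lines[first - 1:last]]


def same_region(path_a, path_b, first=10, last=20):
    """True if both files have identical text in the given line range."""
    region_a = read_line_range(path_a, first, last)
    region_b = read_line_range(path_b, first, last)
    # An empty region (file too short) is not treated as a match.
    return bool(region_a) and region_a == region_b


def older_of(path_a, path_b):
    """Return whichever path has the earlier modification time."""
    if os.path.getmtime(path_a) <= os.path.getmtime(path_b):
        return path_a
    return path_b
```

A caller would check same_region(a, b) first and only then look at
older_of(a, b) as the candidate for removal. The sketch stops at
reporting, since deleting on a partial match is easy to get wrong.
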