Tim Hunter wrote:
> For finding dups, I wonder if it's useful to compare checksums unless
> you've already computed them in advance. I notice that Ruby's own
> FileUtils.install checks filea == fileb by simply comparing the files
> until it finds a difference or gets to EOF.
It depends. If you want to find duplicates in a set of files, then using
the digest as a hash key can make finding duplicates much faster, because
each file is read once instead of being compared against every other
file. OTOH, if you can narrow down the candidates by looking at cheaper
attributes first (size, mtime...), then the additional overhead of the
checksum calculation might slow things down.
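
To make the trade-off concrete, here is a rough sketch of combining both ideas: group by size first, and compute digests only for files that share a size. The method name find_duplicates and the choice of SHA256 are my own assumptions for illustration, not anything from the code discussed earlier in the thread.

```ruby
require 'digest'

# Hypothetical helper: returns groups (arrays) of paths with identical content.
def find_duplicates(paths)
  # Cheap filter first: files of different sizes cannot be duplicates.
  candidates = paths.group_by { |path| File.size(path) }
                    .values
                    .select { |group| group.size > 1 }

  # Only the remaining candidates are read and digested;
  # the digest serves as the hash key.
  candidates.flat_map do |group|
    group.group_by { |path| Digest::SHA256.file(path).hexdigest }
         .values
         .select { |same| same.size > 1 }
  end
end
```

Whether the digest step pays off depends on how many files survive the size filter. With only two candidates, a direct comparison is probably cheaper.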
Btw, I don't see a reason to use sysread in this scenario. read will do.
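
For the direct comparison Tim describes, a minimal sketch using plain read could look like this. The name same_content? and the chunk size are assumptions of mine, and the standard library's FileUtils.compare_file already does much the same job.

```ruby
CHUNK_SIZE = 64 * 1024 # arbitrary choice for this sketch

# Hypothetical helper: true if both files have byte-identical content.
def same_content?(path_a, path_b)
  # Different sizes mean different content, so skip reading entirely.
  return false unless File.size(path_a) == File.size(path_b)

  File.open(path_a, 'rb') do |a|
    File.open(path_b, 'rb') do |b|
      loop do
        chunk_a = a.read(CHUNK_SIZE)
        chunk_b = b.read(CHUNK_SIZE)
        # Stop at the first difference.
        return false unless chunk_a != chunk_b ? false : true
        # read(n) returns nil at EOF; both files ended together.
        return true if chunk_a.nil?
      end
    end
  end
end
```

Buffered read keeps this simple, and the loop stops as soon as the files differ.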
Kind regards
robert