"Jürgen Exner" <> wrote:
>Forget File::Find, you don't need it
>because you already have the comprehensive list of all directories.
Sorry I didn't make that part clear. I know the odd-ball directory and I know the
parent directory of the other directories of interest. However, I do not know, a
priori, what their names are.
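
For reference, a minimal sketch of finding those unknown directory names at run time by reading the known parent directory. The helper name subdirs_of and the demo directory names are assumptions for illustration only; the demo runs against a throwaway temp directory rather than any real path.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: return the full paths of the immediate
# subdirectories of $parent, sorted by name.
sub subdirs_of {
    my ($parent) = @_;
    opendir(my $dh, $parent) or die "cannot open $parent: $!";
    my @entries = grep { $_ ne '.' && $_ ne '..' } readdir($dh);
    closedir($dh);
    my @dirs = grep { -d $_ } map { "$parent/$_" } @entries;
    return sort @dirs;
}

# Demo on a temporary directory holding two subdirectories and one plain file.
my $parent = tempdir(CLEANUP => 1);
for my $name (qw(alpha beta)) {
    mkdir("$parent/$name") or die "cannot create $parent/$name: $!";
}
open(my $fh, '>', "$parent/plain.txt") or die "cannot create file: $!";
close($fh);

my @dirs = subdirs_of($parent);
print "$_\n" for @dirs;    # lists only the two subdirectories
```

The odd-ball directory could then simply be pushed onto the same list.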
>For your purposes a file consists of the name including the full path, the
>file size, and the date.
Makes sense.
>The obvious data structure would be an array of hash where each hash
>contains three items, namely the qualified file name, the size, and the
>date.
I thought that a hash matched a single key with a single value. What would you have as
the key? Would I have the value be an array reference with the array holding the other
two? Or, am I as confused as I think I am?
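
For reference, one way to read "array of hash", as a minimal sketch: each array element is a reference to a small hash with three fixed keys, so the keys are field names rather than file names or dates. The key names (name, size, mtime) and the sample paths and numbers are assumptions, not anything from the quoted post.

```perl
use strict;
use warnings;

# One hash per file; the array holds references to those hashes.
# Key names and sample values are made up for illustration.
my @files = (
    { name => '/data/a.dat', size => 51_200, mtime => 1_000_000_300 },
    { name => '/data/b.dat', size => 76_800, mtime => 1_000_000_100 },
    { name => '/data/c.dat', size => 99_000, mtime => 1_000_000_100 },
);

# Adding one more record, as would happen while collecting files.
push @files, { name => '/data/d.dat', size => 60_000, mtime => 1_000_000_200 };

# Total used space is the sum of the size fields.
my $total = 0;
$total += $_->{size} for @files;

print "$files[0]{name} is $files[0]{size} bytes\n";
print "total: $total bytes in ", scalar(@files), " files\n";
```

Each hash still maps one key to one value; there are just three key/value pairs per file.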
>In step two you simply add all the sizes to determine your total used space.
>Or you can do that while collecting the files in step 1 already.
Yes, during collection makes sense to me.
>Then sort the array by the date element.
Perhaps when I better understand how you are picturing the data structure this will
become clearer. It sounds like the date is the hash key. I'm thinking that if this is
the case, I'll want to use the "raw" UNIX style seconds-since-epoch date value. But, I
think I'll still need to be careful of potential collisions, where multiple files have
the same modification date. This should happen rarely, and if I just increment the date
value of the colliders until the date is unique, that won't be a problem. Maybe there's
no reason why the date has to be the key, though. The full pathname of each file is
already unique, and could probably be the key just as well. I'm still confused about
having two values for each key in the hash, though.
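
For reference, a sketch of sorting such records by date, assuming the array-of-hash layout with hypothetical keys name, size, and mtime. Because the date is a field inside each record rather than a hash key, two files with the same modification time cause no collision. Breaking ties by name is my own choice here, only to make the order predictable.

```perl
use strict;
use warnings;

# Sample records; b.dat and c.dat deliberately share an mtime.
my @files = (
    { name => '/data/a.dat', size => 51_200, mtime => 1_000_000_300 },
    { name => '/data/c.dat', size => 99_000, mtime => 1_000_000_100 },
    { name => '/data/b.dat', size => 76_800, mtime => 1_000_000_100 },
);

# Oldest first, comparing raw seconds-since-epoch values numerically;
# equal dates fall back to a string comparison of the names.
my @by_age = sort {
    $a->{mtime} <=> $b->{mtime}
        or $a->{name} cmp $b->{name}
} @files;

print "$_->{mtime} $_->{name}\n" for @by_age;
```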
>And then beginning with the oldest file delete files (you got the fully
>qualified name in the hash) until the added size of all deleted files is
>larger than the difference between desired size and actual size as
>determined in step 2.
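
For reference, a sketch of that last step under the same assumed record layout (keys name, size, mtime). The helper name pick_victims is hypothetical. It only selects files and never touches the disk, and it stops as soon as the freed size reaches the excess, which is a slight simplification of "larger than". The actual unlink is left commented out.

```perl
use strict;
use warnings;

# Hypothetical helper: return the oldest records whose combined size
# covers the excess of the total over $limit. Returns nothing if the
# total is already within the limit.
sub pick_victims {
    my ($files, $limit) = @_;
    my $total = 0;
    $total += $_->{size} for @$files;
    my $excess = $total - $limit;

    my @victims;
    my $freed = 0;
    for my $f (sort { $a->{mtime} <=> $b->{mtime} } @$files) {
        last if $freed >= $excess;
        push @victims, $f;
        $freed += $f->{size};
    }
    return @victims;
}

# Made-up sizes and dates: 300 units in use, 150 allowed.
my @files = (
    { name => '/data/a.dat', size => 100, mtime => 30 },
    { name => '/data/b.dat', size => 100, mtime => 10 },
    { name => '/data/c.dat', size => 100, mtime => 20 },
);

my @victims = pick_victims(\@files, 150);
for my $f (@victims) {
    print "would delete $f->{name}\n";
    # unlink($f->{name}) or warn "cannot delete $f->{name}: $!";
}
```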
Speaking of size -- I think the size that matters here is the number of Kbytes that the
file is actually taking up on the drive, which is likely slightly larger than its
length might imply. On the other hand, if that's a real pain, I can pretty easily
ignore that slop, as this does not have to be completely exact. If I leave a few of the
files lying around an extra day, it's no problem.
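
For reference, a sketch of getting both numbers from Perl's stat: element 7 is the length in bytes and element 12 is the count of blocks allocated on disk. The 512-byte block unit is an assumption that holds on many UNIX systems but not all, and the blocks field can be empty on some platforms, in which case this falls back to the plain length. The helper name sizes_for is hypothetical.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: return (length in bytes, approximate bytes on disk),
# or an empty list if the file cannot be stat'ed.
sub sizes_for {
    my ($path) = @_;
    my @st = stat($path) or return;
    my ($size, $blocks) = @st[7, 12];
    # ASSUMPTION: blocks are 512 bytes each; this is system-specific.
    my $on_disk = (defined $blocks && $blocks ne '') ? $blocks * 512 : $size;
    return ($size, $on_disk);
}

# Demo on a temporary 1000-byte file.
my ($fh, $path) = tempfile(UNLINK => 1);
print {$fh} 'x' x 1000;
close($fh) or die "close failed: $!";

my ($len, $disk) = sizes_for($path);
print "length: $len bytes, on disk: about $disk bytes\n";
```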
A couple other things I failed to mention earlier that may be useful to know -- The
typical size of each of these files will be in the 50-100 Kbyte realm. We're talking
about keeping around a configurable total size of these files, with the default being 250
Megabytes.
Thanks!