finding directory sizes

 
 
Zebee Johnstone
      08-23-2004
I want to archive directories to CD. I have many of them in
various places, I don't care if one from /data/web is on the
same CD as one from /home as long as the specified directory is
not split any further.

The important point is that there are things I need to exclude, such as
log files.

I'm currently getting the size by running du through a piped open and
munging the result. Is there a better way?

open (DU,"find $snapshot -type d -maxdepth 1 -exec du -sk --exclude=access_log* --exclude=error_log* {} \\;|") || die "can't do find for $snapshot $!\n";

I did think of using stat to add up every file, but if I'm talking
a few hundred per directory, is that wise? And how would I exclude
files, considering that each main directory set has more than one file
pattern to exclude? (this has 2, others have 3 or 4)

Zebee
 
Damian James
      08-23-2004
On Mon, 23 Aug 2004 06:48:43 GMT, Zebee Johnstone said:
>...
>open (DU,"find $snapshot -type d -maxdepth 1 -exec du -sk --exclude=access_log* --exclude=error_log* {} \\;|") || die "can't do find for $snapshot $!\n";
>
>I did think of using stat to add up every file, but if I'm talking
>a few hundred per directory, is that wise? And how would I exclude
>files, considering that each main directory set has more than one file
>pattern to exclude? (this has 2, others have 3 or 4)


I'd suggest using File::Find with an appropriate callback sub. It's in
the standard distribution, and the docs have a few recipes.

Cheers,
Damian
 
Zebee Johnstone
      08-23-2004
In comp.lang.perl.misc on 23 Aug 2004 07:07:30 GMT
Damian James <(E-Mail Removed)> wrote:
> On Mon, 23 Aug 2004 06:48:43 GMT, Zebee Johnstone said:
>>...
>>open (DU,"find $snapshot -type d -maxdepth 1 -exec du -sk --exclude=access_log* --exclude=error_log* {} \\;|") || die "can't do find for $snapshot $!\n";
>>
>>I did think of using stat to add up every file, but if I'm talking
>>a few hundred per directory, is that wise? And how would I exclude
>>files, considering that each main directory set has more than one file
>>pattern to exclude? (this has 2, others have 3 or 4)

>
> I'd suggest using File::Find with an appropriate callback sub. It's in
> the standard distribution, and the docs have a few recipes.


I'm not sure what you mean by "appropriate callback sub".

Do you mean use File::Find recursively to run stat on every file?

As far as I know, if you do that, you can't pass parameters to
the sub that's processing the files, so suddenly everything's global?

and as I say, is running stat on every file in dirs that have hundreds
of files the right way to go? and how to exclude ones you don't want?
I know the patterns I want to exclude, how do I pass those to the
File::Find subroutine?

Zebee
 
 
Joe Smith
      08-23-2004
Zebee Johnstone wrote:

> and as I say, is running stat on every file in dirs that have hundreds
> of files the right way to go?


If you run `du` on a directory with hundreds of files, it is going
to stat() every file in the directory and all its subdirectories.

> and how to exclude ones you don't want?


Use the global $prune variable.

> I know the patterns I want to exclude, how do I pass those to the
> File::Find subroutine?


sub wanted {
    return($File::Find::prune = 1) if /unwanted|directory/;
    ...
}

-Joe
 
 
Brian McCauley
      08-23-2004


Zebee Johnstone wrote:
> In comp.lang.perl.misc on 23 Aug 2004 07:07:30 GMT
> Damian James <(E-Mail Removed)> wrote:
>
>>On Mon, 23 Aug 2004 06:48:43 GMT, Zebee Johnstone said:
>>
>>>...
>>>open (DU,"find $snapshot -type d -maxdepth 1 -exec du -sk --exclude=access_log* --exclude=error_log* {} \\;|") || die "can't do find for $snapshot $!\n";
>>>
>>>I did think of using stat to add up every file, but if I'm talking
>>>a few hundred per directory, is that wise? And how would I exclude
>>>files, considering that each main directory set has more than one file
>>>pattern to exclude? (this has 2, others have 3 or 4)

>>
>>I'd suggest using File::Find with an appropriate callback sub. It's in
>>the standard distribution, and the docs have a few recipes.

>
>
> I'm not sure what you mean by "appropriate callback sub".
>
> Do you mean use File::Find recursively to run stat on every file?
>
> As far as I know, if you do that, you can't pass parameters to
> the sub that's processing the files, so suddenly everything's global?


Do not have an irrational fear of using package variables and local().
(Have only a rational fear). Some time ago someone motivated by
irrational fear actually modified File::Find itself not to use package
variables for its global variables but instead to use file-scoped
lexicals (still global in the programming sense). Because local()
doesn't work on lexicals this person just unthinkingly removed all the
local()s. In so doing they, of course, broke the re-entrancy of File::Find.

However, that said, you only need to use global variables (meaning
file-scoped lexicals or package-scoped variables) if you want the
callback to be a named subroutine. If you use an anonymous subroutine
then it acts as a closure meaning it can see lexically scoped variables
that were in scope where the anonymous sub was defined.

use File::Find;

sub do_find {
    my $foo = 'something';
    my $wanted = sub {
        # do stuff with $foo, which this closure can see
    };
    find($wanted, '/foo', '/bar');
}
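
A minimal sketch of how the closure approach might be applied to the original question, assuming the goal is the total byte size of the plain files under a directory with several filename patterns excluded. The helper name dir_size() and the patterns in the usage loop are hypothetical, not part of File::Find. Note that summing file sizes gives apparent size in bytes, whereas du -sk reports disk blocks in kilobytes, so the two figures will not match exactly.

```perl
use strict;
use warnings;
use File::Find;

# dir_size() is a hypothetical helper, not part of File::Find.
# It returns the total size in bytes of the plain files under $dir,
# skipping files whose basename matches any regex in @exclude.
sub dir_size {
    my ($dir, @exclude) = @_;
    my $total = 0;

    # The anonymous sub is a closure: it sees $total and @exclude
    # directly, so no package variables are needed.
    my $wanted = sub {
        my $name = $_;                    # basename of the current entry
        lstat $name or return;            # skip entries that cannot be stat'ed
        return unless -f _;               # plain files only, not symlinks
        return if grep { $name =~ $_ } @exclude;
        $total += (-s _) || 0;            # apparent size, not disk blocks
    };

    find($wanted, $dir);
    return $total;
}

# Usage sketch: perl dirsize.pl /data/web/site1 /home/someone
# (the exclusion patterns here are examples taken from the question)
for my $dir (@ARGV) {
    printf "%12d  %s\n", dir_size($dir, qr/^access_log/, qr/^error_log/), $dir;
}
```

Each directory set can pass its own list of patterns, since they are ordinary arguments to dir_size() rather than globals.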

> and as I say, is running stat on every file in dirs that have hundreds
> of files the right way to go?


Well obviously you have to do this in some way - but on Win32 IIRC the
implementation of stat() is (was?) pathological. If speed is of the
essence on Win32 then spawn a native windows recursive directory lister
and parse the output.

> and how to exclude ones you don't want?


return if ....

Or to exclude whole directories

$File::Find::prune = 1 if ...

> I know the patterns I want to exclude, how do I pass those to the
> File::Find subroutine?


Shared variables (either global or via closures).

--
\\ ( )
. _\\__[oo
.__/ \\ /\@
. l___\\
# ll l\\
###LL LL\\

 
 
Zebee Johnstone
      08-23-2004
In comp.lang.perl.misc on Mon, 23 Aug 2004 07:44:31 GMT
Joe Smith <(E-Mail Removed)> wrote:
> Zebee Johnstone wrote:
>
>
>> and how to exclude ones you don't want?

>
> Use the global $prune variable.


Where is that documented? I saw it in the File::Find perldoc but only
in passing, and it doesn't really explain what it is or does. perldoc -f
and perltoc don't mention it.

Does it exclude files or just directories, or whatever's matched by
the regexp?

Zebee
 
 
Jim Keenan
      08-23-2004
Zebee Johnstone <(E-Mail Removed)> wrote in message news:<(E-Mail Removed)>...
> I want to archive directories to CD. I have many of them in
> various places, I don't care if one from /data/web is on the
> same CD as one from /home as long as the specified directory is
> not split any further.
>
> The important point is that there are things I need to exclude, such as
> log files.
>
> I'm currently getting the size by using du in an open, and munging
> the result, is there a better way?
>


Unless you can demonstrate through benchmarking that this is a faster
approach than another such as using 'stat', I don't see why you need
to open a filehandle connection to read a file when you are simply
interested in the file's name and size.

jimk
 
 
Jürgen Exner
      08-23-2004
Zebee Johnstone wrote:
> Damian James <(E-Mail Removed)> wrote:
>> I'd suggest using File::Find with an appropriate callback sub. It's
>> in the standard distribution, and the docs have a few recipes.

>
> I'm not sure what you mean by "appropriate callback sub".


The "wanted()" function that _you_ need to provide, so that File::Find
knows what to do with each file.

> Do you mean use File::Find recursively


No need. The beauty of File::Find is that it recurses
automatically without _you_ doing all the legwork.

> to run stat on every file?


Try "-s" instead.
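
A small sketch of what "-s" gives you compared with stat, assuming you only want the byte count of a single file. The helper names size_via_s() and size_via_stat() are hypothetical. Both forms stat the file under the hood, so -s is mainly shorter to write rather than cheaper to run.

```perl
use strict;
use warnings;

# Hypothetical helper: size in bytes using the -s file test.
# -s yields a false value for an empty file, so normalise that to 0.
sub size_via_s {
    my ($file) = @_;
    return undef unless -e $file;     # fills the stat buffer "_"
    my $size = -s _;                  # reuse that buffer, no second stat
    return $size || 0;
}

# Hypothetical helper: the same number taken from element 7 of stat().
sub size_via_stat {
    my ($file) = @_;
    my @st = stat $file;
    return @st ? $st[7] : undef;      # empty list means stat failed
}
```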

> and as I say, is running stat on every file in dirs that have
> hundreds of files the right way to go? and how to exclude ones you
> don't want? I know the patterns I want to exclude, how do I pass
> those to the File::Find subroutine?


Did you look at the documentation and examples that come with File::Find?

jue


 
 
Joe Smith
      08-23-2004
Zebee Johnstone wrote:

> In comp.lang.perl.misc on Mon, 23 Aug 2004 07:44:31 GMT
> Joe Smith <(E-Mail Removed)> wrote:


>>Use the global $prune variable.

>
> Where is that documented?


It matches the option by the same name in /usr/bin/find. See the
man page for 'find'. (A bit of history: The perl script find2perl
accepts the same command line arguments as /usr/bin/find, and
outputs a perl script to implement that command.)

/usr/bin/find / -fstype nfs -prune -o -name 'tmp' -prune -o -print

> Does it exclude files or just directories, or whatever's matched by
> the regexp?


File::Find calls the 'wanted' function for everything it comes across.
After your wanted() function returns, if the thing being looked at
is a directory, File::Find will process that directory recursively
unless $prune is set. Setting $prune while looking at a plain file
does nothing. Setting $prune while looking at a directory says to
pretend that the directory is empty.
-Joe
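
The behaviour described above can be sketched as follows, assuming the aim is to list plain files while skipping whole directories by name. The helper files_outside() and the directory name in the usage loop are hypothetical. The package variable is spelled $File::Find::prune in full, and it has no effect when find is run in bydepth mode, because a directory's contents have already been visited by the time the directory itself is seen.

```perl
use strict;
use warnings;
use File::Find;

# files_outside() is a hypothetical helper, not part of File::Find.
# It returns the sorted paths of plain files under $top, never
# descending into a directory whose basename matches $skip_dir.
sub files_outside {
    my ($top, $skip_dir) = @_;
    my @files;

    find(sub {
        if (-d $_ && $_ =~ $skip_dir) {
            $File::Find::prune = 1;   # treat this directory as empty
            return;
        }
        # Setting prune on a plain file would do nothing, so files
        # are filtered with an ordinary test instead.
        push @files, $File::Find::name if -f $_;
    }, $top);

    return sort @files;
}

# Usage sketch: perl listfiles.pl /data/web/site1
for my $top (@ARGV) {
    print "$_\n" for files_outside($top, qr/^logs$/);
}
```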
 
 
Zebee Johnstone
      08-23-2004
In comp.lang.perl.misc on Mon, 23 Aug 2004 14:21:13 GMT
Jürgen Exner <(E-Mail Removed)> wrote:
> Did you look at the documentation and examples that come with File::Find?
>


Yes. And found I couldn't understand it. As in I could read the words
but there were things missing or pre-requisite knowledge I was expected
to have, such as "prune" that I didn't have.

Zebee
 