On 3 Sep 2003 14:04:18 -0700
(Jeffrey Ellin) wrote:
> Hi, I am using the following code to get the directory sizes of
> users outboxes on our appservers. This code snippet works but it is
> dreadfully slow. I have also used File:Find, but it doesn't seem any
> faster. Any ideas on how to speed it up? Everything is running on
> Win2K.
>
> #sql to get all active users and their last sync date, exclude users
> who
> #are enddated in the system
> $sql = " select n.name,n.APP_SERVER_NAME, max(s.LAST_UPD) as
> sync_date " .
> " from siebel.s_node n, " .
> " siebel.s_extdata_node e, " .
> " siebel.s_dock_status s " .
> " where n.ROW_ID = s.node_id and " .
> " e.NODE_ID = n.ROW_ID and " .
> " n.node_type_cd = 'REMOTE' and " .
> " s.type = 'SESSION' and " .
> " local_flg = 'N' and " .
> " e.ACTIVE_FLG = 'Y' and " .
> " (n.EFF_END_DT > sysdate or n.EFF_END_DT is null)" .
> " group by n.name, n.APP_SERVER_NAME " .
> " order by sync_date " ;
You could use a here doc for this part. Won't do wonders for speed,
but will aid in debugging later.
==untested==
$sql = <<SQL;
select n.name,n.APP_SERVER_NAME, max(s.LAST_UPD) as
sync_date
from siebel.s_node n,
siebel.s_extdata_node e,
siebel.s_dock_status s
where n.ROW_ID = s.node_id and
e.NODE_ID = n.ROW_ID and
n.node_type_cd = 'REMOTE' and
s.type = 'SESSION' and
local_flg = 'N' and .
e.ACTIVE_FLG = 'Y' and
(n.EFF_END_DT > sysdate or n.EFF_END_DT is null)
group by n.name, n.APP_SERVER_NAME
order by sync_date
SQL
++end++
>
> #execute sql
> $sth = $dbh->prepare($sql);
> $sth->execute;
>
> #delete old report file
> unlink 'outboxreport.csv';
I'm thinking that you may fair better if you store the results of the
query in a hash, _then_ iterate through the hash doing stuff with the
files/directories. That's just an opinion and it's unproven. My
thinking is that the longer you have the query open, the more
resources you're using. Of course, storing the information from the
query takes up resources as well. So, pick you poison.
> #loop through each user in resultset.
> while (($node,$server,$sync)=$sth->fetchrow_array()){
> #get name of docking directory
> my $dockloc = substr($server,6);
> #assemble path statement
> my $path = "//$server/docking$dockloc/$node/outbox";
> #my $path = "//$server/docking/$node/outbox";
> #get directory size
> my $dirsize = -s $path;
> opendir(my ($dh),$path);
>
> #loop through each file in the directory skip over dat and uaf
> since
> they are part of new database
> while( defined( my $filename = readdir $dh ) ) {
> next if $filename eq "." or $filename eq ".." or
> $filename
> =~ /uaf/ or $filename =~ /dat/;
> $dirsize += -s "$path/$filename";
> }
>
> #re-open file so it writes as we process
> open REP, ">>outboxreport.csv";
> #convert file size to megabytes
> $dirsize = $dirsize/1000000;
> #round file size
> $dirsize = sprintf "%.2f", $dirsize;
> #print out report in csv format
> print REP "$node,$server,$sync,$dirsize\n";
> }
When you say slow, how slow? And how much data are we talking about?
I mean, if your talking terrabytes, it's going to take some time to
get that information. Plus, consider the platform and how it handles
memory, resources, etc. More memory will mean some better
performance, etc.
Just my zero cents - money back if not satisfied

--
Jim
---
Copyright notice: all code written by the author in this post is
released under the GPL.
http://www.gnu.org/licenses/gpl.txt
for more information.
---
a real quote ...
Linus Torvalids: "They are somking crack ...."
(
http://www.eweek.com/article2/0,3959,1227150,00.asp)
---
a fortune quote ...
"I know the answer! The answer lies within the heart of all
mankind! The answer is twelve? I think I'm in the wrong
building." -- Charles Schulz