Velocity Reviews

Velocity Reviews (http://www.velocityreviews.com/forums/index.php)
-   Perl Misc (http://www.velocityreviews.com/forums/f67-perl-misc.html)
-   -   Help me to Improve (http://www.velocityreviews.com/forums/t915251-help-me-to-improve.html)

SSS Develop 10-07-2011 01:33 PM

Help me to Improve
 
Hello,

The intention to ask question in this forum is to improve this piece
of code (for performance and other best practices)

I have couple of Web Applications - need to monitor the HTTP Response
for them. Decided to use the Perl - the script should fetch the HTTP
page from Web App within SLA time (Service Level). If the SLA did not
met - then system should send alert. For sending alert, it interacts
with command line tools like email/nagios.

SLA's are in seconds.

Typical file structure of application is as follow:

/bin/httpresp.pl
/conf/{app1.conf, app2.conf, ..}
/logs/{app1.log, app2.log...}


The script use configuration file for each web app, configuration file
looks as below:

---------------
url = https://www.example.com
sla = 10
logfile = example.com.prod

-----------

The script looks as follow:

-------------------------------

#!/usr/bin/perl

use strict;
use warnings;
use AppConfig;
use Parallel::ForkManager;
use Log::Log4perl qw(:easy);
use HTTP::Request::Common;
use LWP::UserAgent;
use HTTP::Cookies;
use Benchmark ':hireswallclock';
use Time::HiRes qw ( time alarm sleep );
use IPC::Run('run'); # Not shown the use of it in this script, is
useful to interact with nagios or run any other command line commands


our %LOGFILE;
our $APP_HOME = '/usr/local/application';
our $HTTPRESP = 'httpresp';
$LOGFILE{file} = $APP_HOME . "/" . $HTTPRESP . "/". 'logs/
generic.log';




my $MAX_PROC =
100; # restricting to 100 process at this moment, but can be
modified any time

$SIG{ALRM} = \&response_sla;

my $confdir = "$APP_HOME/$HTTPRESP/conf";
my $pm = new Parallel::ForkManager($MAX_PROC);

my $cnfhs = get_configurations();



foreach my $key ( keys %$cnfhs ) {

my $conf = $cnfhs->{$key};

my $pid = $pm->start and next;
my $logger = logger( $conf->get('logfile') );

crawl_sites( $conf, $logger );
$pm->finish;

}
$pm->wait_all_children;


sub get_configurations {

my @confs = glob( $confdir . "/*.conf" );
my $confhash = {};
foreach my $cnf (@confs) {
my $config = AppConfig->new();
$config->define('name=s');
$config->define('url=s');
$config->define('sla=s');
$config->define('logfile=s');
$config->define('appname=s');
$config->file($cnf);
$confhash->{$cnf} = $config;
}

return $confhash;

}

sub logger {
my $logfilename = shift;
$LOGFILE{file} = $logfilename;
my $conf = q(
log4perl.logger = INFO, FileApp
log4perl.appender.FileApp =
Log::Log4perl::Appender::File
log4perl.appender.FileApp.filename = sub {getLogfilename();}
log4perl.appender.FileApp.layout = PatternLayout
log4perl.appender.FileApp.layout.ConversionPattern = %d> %m%n
);

# Initialize logging behaviour
Log::Log4perl->init( \$conf );

# Obtain a logger instance
my $logger = get_logger();

return $logger;

}

sub getLogfilename {

return "$APP_HOME/$HTTPRESP/logs/" . $LOGFILE{file} . ".log";

}

sub crawl_sites {
my ( $conf, $logger ) = @_;
my $sla = $conf->get('sla');
my $url = $conf->get('url');
eval {
alarm($sla);
get_response( $url, $logger );
alarm(0);
};

if ( $@ =~ /SLA: DID NOT MET/ ) {
print "SLA DID NOT MET\n";
//code_to_interact_with_nagios
}else {
//code_to_interact_with_nagios
}

}

sub get_response {
my ( $url, $logger ) = @_;

$logger->info("Started monitoring: $url ");
my $ua = new LWP::UserAgent;
$ua->cookie_jar( HTTP::Cookies->new() );
$ua->agent(
'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.18) Gecko/
20110614 Firefox/3.6.18'
);
$ua->ssl_opts( verify_hostname => 0 ); ##Skip verifying SSL
Certificate

my $t1 = Benchmark->new;

my $response = $ua->request( GET $url);
if ( $response->is_success ) {
$logger->info("SUCCESS: Got the response"); # or whatever
}
else {
$logger->info( "ERROR: the response was, " . $response-
>status_line );

}
my $t2 = Benchmark->new;
my $td = timediff( $t2, $t1 );
print "Total Response time: " . timestr($td) . "\n";
$logger->info( "Total Response time: " . timestr($td) );
}


sub response_sla {
die "SLA: DID NOT MET";
}

--------------------------------

Help me to improve this program - performance, scale to large (say 1k
or 2 k web apps) number of apps..


thank you for your time !

--sss










SSS Develop 10-08-2011 07:07 AM

Re: Help me to Improve
 
Thank you so much Ben - this was the best feedback/review i got. I was
not expecting such a detailed help. I am so happy and thank you so
much !

I have some comments below. I will correct the code and post the final
version here later.


On Oct 8, 4:47*am, Ben Morrow <b...@morrow.me.uk> wrote:
> Quoth SSS Develop <sssdeve...@gmail.com>:
>
> > Hello,

>
> > The intention to ask question in this forum is to improve this piece
> > of code (for performance and other best practices)

>
> I have to say, it's a lot better than a lot of what gets posted here...
>
>
>
> > I have couple of Web Applications - need to monitor the HTTP Response
> > for them. Decided to use the Perl - the script should fetch the HTTP
> > page from Web App within SLA *time (Service Level). If the SLA did not
> > met - then system should send alert. For sending alert, it interacts
> > with command line tools like email/nagios.

> <snip>
>
> > #!/usr/bin/perl

>
> > use strict;
> > use warnings;
> > use AppConfig;
> > use Parallel::ForkManager;

>
> If the performance of this script is an issue, forking may not be the
> best solution. It's pretty much always faster to use a non-blocking
> single-process (or single-process-per-CPU) approach, using something
> like POE. That would require restructuring the whole program, though, so
> if P::FM works for you I should stick with it.
>
> > use Log::Log4perl qw(:easy);
> > use HTTP::Request::Common;
> > use LWP::UserAgent;
> > use HTTP::Cookies;
> > use Benchmark ':hireswallclock';
> > use Time::HiRes qw ( time alarm sleep );
> > use IPC::Run('run'); * # Not shown the use of it in this script, is
> > useful to interact with nagios or run any other command line commands

>
> > our %LOGFILE;
> > our $APP_HOME = '/usr/local/application';
> > our $HTTPRESP = 'httpresp';
> > $LOGFILE{file} = *$APP_HOME . "/" . $HTTPRESP . "/". 'logs/
> > generic.log';

>
> I see Tad's already mentioned you're using 'our' where you could be
> using 'my'. I would add that there's no need for concatentation when you
> can interpolate, and that you can assign directly to a hash too:
>
> * * my $APP_HOME * *= '/usr/local/application';
> * * my $HTTPRESP * *= 'httpresp';
> * * my %LOGFILE * * = (
> * * * * file * *=> "$APP_HOME/$HTTPRESP/logs/generic.log",
> * * );
>
> (Whether to use single or double quotes when nothing will be
> interpolated is a matter of taste. I prefer double, but many people here
> prefer single.)
>
> > my $MAX_PROC =
> > * 100; # restricting to 100 process at this moment, but can be
> > modified any time

>
> > $SIG{ALRM} = \&response_sla;

>
> It's a little confusing to refer to a sub all the way down at the bottom
> of the file, especially when it's so trivial it doesn't really need a
> name:
>
> * * $SIG{ALRM} = sub {
> * * * * die "SLA: DID NOT MET";
> * * };
>
> > my $confdir = "$APP_HOME/$HTTPRESP/conf";
> > my $pm * * *= new Parallel::ForkManager($MAX_PROC);

>
> It's best to avoid dative method syntax (METHOD OBJECT ARGS or METHOD
> CLASS ARGS), at least in part because it encourages the idea that there
> is something special about the 'new' method.
>
> * * my $pm = Parallel::ForkManager->new($MAX_PROC);
>
> (Yes, I know you just copied that from the documentation. Unfortunately
> a lot of CPAN documentation uses older, now-discouraged idioms in the
> examples.)
>
> > my $cnfhs = get_configurations();

>
> > foreach my $key ( keys %$cnfhs ) {

>
> > * * my $conf = $cnfhs->{$key};

>
> You can write that loop as
>
> * * while (my ($key, $conf) = each %$cnfhs) {
>
> and it'll even be more efficient on really large hashes. (Not enough
> people remember about 'each'.)
>
>
>
> > * * my $pid = $pm->start and next;
> > * * my $logger = logger( $conf->get('logfile') );

>
> > * * crawl_sites( $conf, $logger );
> > * * $pm->finish;

>
> > }
> > $pm->wait_all_children;

>
> > sub get_configurations {

>
> > * * my @confs * *= glob( $confdir . "/*.conf" );

>
> Passing a value into a sub in a file-scoped global is always worth
> avoiding if you can. In this case it's easy: pass it in as a real
> parameter instead.
>
> It may seem silly calling get_configurations($confdir) when you know
> perfectly well that get_configurations can see that variable, but it
> rapidly becomes less silly when you come back six months later and take
> it out without remembering some sub half way down the file needed it.
>
> <snip>
>
> > sub logger {
> > * * my $logfilename = shift;
> > * * $LOGFILE{file} = $logfilename;
> > * * my $conf = q(
> > * * * * log4perl.logger * * * * * * * * * *= INFO, FileApp
> > * * * * log4perl.appender.FileApp * * * * *=
> > Log::Log4perl::Appender::File
> > * * * * log4perl.appender.FileApp.filename = sub {getLogfilename();}

>
> <...later...>
>
> > sub getLogfilename {

>
> > * * return "$APP_HOME/$HTTPRESP/logs/" . $LOGFILE{file} . ".log";

>
> > }

>
> It's not really clear to me what you're trying to do here with
> $LOGFILE{file}, but it can almost certainly be done less confusingly.
> Since you don't appear to use it anywhere else, is there any reason not
> to just put the correct filename straight in the log4perl config?
>


[sssdevelop] I am creating log file for each forked process (per web
app one log file). method logger accepts logfile name as arguments,
that's used to change the "file" value of LOGFILE, thats then used
inside getLogfilename().
I really wanted to interpolate logfilename in appender. Now i got the
clue, i will use "qq" instead of single "q" :)



> * * my $logpath = "$APP_HOME/$HTTPRESP/logs/$logfilename.log";
> * * my $conf * *= qq(
> * * * * log4perl.logger * * * * * * = INFO, FileApp
> * * * * log4perl.appender.FileApp * = Log::Log4perl::Appender::File
> * * * * log4perl.appender.FileApp.filename = $logpath
> * * * * ...
> * * );
>
> (Notice I've changed 'q()' to 'qq()'. I said I preferred double quotes :)..)
>
> I'd also want to tidy up this "$APP_HOME/$HTTPRESP" prefix that keeps
> turning up. Probably I'd just chdir to that directory and use relative
> paths.
>
> > * * * * log4perl.appender.FileApp.layout * = PatternLayout
> > * * * * log4perl.appender.FileApp.layout.ConversionPattern = %d> %m%n
> > * * );

>
> > * * # Initialize logging behaviour
> > * * Log::Log4perl->init( \$conf );

>
> > * * # Obtain a logger instance
> > * * my $logger = get_logger();

>
> I see no 'sub get_logger' here...
>
> ...oh, I see, it's from Log4perl. I don't think you should be mixing the
> normal and the 'easy' interfaces like that. Since you've got as far as
> setting up a custom logger config, use the real method so people reading
> the code have some idea what's going on:
>
> * * return Log::Log4perl->get_logger();
>
> > * * return $logger;

>
> > }

>




[sssdevelop] - yup, agree with this. I will change it!


> > sub crawl_sites {
> > * * my ( $conf, $logger ) = @_;
> > * * my $sla = $conf->get('sla');
> > * * my $url = $conf->get('url');
> > * * eval {
> > * * * * alarm($sla);

>
> Once we get as far as this 'alarm' statement, we've entirely forgotten
> about the $SIG{ALRM} assignment at the top of the file that said what
> the alarm will actually *do*. It's also better, where possible, to write
> subs so they don't rely on or change global values. Generally this means
> that if you need to use one of the magic Perl globals, you need to
> 'local' it.
>
> * * eval {
> * * * * local $SIG{ALRM} = sub { die "SLA NOT MET" };
> * * * * alarm($sla);
>
> > * * * * get_response( $url, $logger );
> > * * * * alarm(0);

>


[sssdevelop] Now i see the importance of this, thanks for this
advice.

> You have a race condition here. It's entirely possible the alarm will
> fire after get_response has received a successful response but before it
> has returned and the alarm has been deactivated. In this case, you may
> not care (if it was that close to the deadline you may want to count it
> as 'late' anyway), but it needs considering. If you have, in fact,
> considered it, you should put in a comment so you remember not to
> consider it again :).
>
> > * * };

>
> You need to cancel the alarm here, as well. get_response may die for
> some reason other than an alarm, and if it does the alarm will still go
> off, but nothing will catch the exception. (There is another race here,
> of course, but it's rather hard to avoid. Alarms are tricky.)
>



[sssdevelop] Ah.. i did not put much thought on race happening here.
Need to work on this. this was good advice here :)


> <snip>
>
> > sub get_response {
> > * * my ( $url, $logger ) = @_;

>
> > * * $logger->info("Started monitoring: $url ");
> > * * my $ua = new LWP::UserAgent;
> > * * $ua->cookie_jar( HTTP::Cookies->new() );
> > * * $ua->agent(
> > 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.18) Gecko/
> > 20110614 Firefox/3.6.18'
> > * * );
> > * * $ua->ssl_opts( verify_hostname => 0 ); * *##Skip verifying SSL
> > Certificate

>
> You don't need to re-do all this every time. Create a LWP::UA object as
> part of your script initialization, then pass it in to get_response.
>


[sssdevelop] yup, will add single LWP::UA object ! this is helpful.

> > * * my $t1 = Benchmark->new;

>
> > * * my $response = $ua->request( GET $url);
> > * * if ( $response->is_success ) {
> > * * * * $logger->info("SUCCESS: Got the response"); * *# orwhatever
> > * * }
> > * * else {
> > * * * * $logger->info( "ERROR: the response was, " . $response-
> > >status_line );

> > * * }
> > * * my $t2 = Benchmark->new;
> > * * my $td = timediff( $t2, $t1 );
> > * * print "Total Response time: " . timestr($td) . "\n";
> > * * $logger->info( "Total Response time: " . timestr($td) );

>
> Since you're timing the response anyway, it's probably better to rely on
> that measurement, rather than the alarm, to see whether it's gone over
> its SLA. You might still want to set an alarm for something like twice
> the SLA, just in case the request never returns for some reason (though
> LWP will timeout for you eventually).
>
> You might also want to look at LWPx::TimedHTTP, which gives detailed
> timings of the various stages of the request cycle.
>


[sssdevelop] - thanks, will surely check this LWPx::TimedHTTP.
Ben, appreciate your help and thank you so much for
your time.


----sssdevelop


> Ben



Rainer Weikusat 10-09-2011 08:46 PM

Re: Help me to Improve
 
Ben Morrow <ben@morrow.me.uk> writes:

[...]

> I see Tad's already mentioned you're using 'our' where you could be
> using 'my'. I would add that there's no need for concatentation when you
> can interpolate, and that you can assign directly to a hash too:
>
> my $APP_HOME = '/usr/local/application';
> my $HTTPRESP = 'httpresp';
> my %LOGFILE = (
> file => "$APP_HOME/$HTTPRESP/logs/generic.log",
> );
>
> (Whether to use single or double quotes when nothing will be
> interpolated is a matter of taste.


Using double-quotes means the compiler has to analyze the string order
to determine if and how something needs to be interpolated into
it. Otherwise, this effort can be avoided and also the effort the next
person who looks at the code needs to come to the same result.

[...]

>> my $confdir = "$APP_HOME/$HTTPRESP/conf";
>> my $pm = new Parallel::ForkManager($MAX_PROC);

>
> It's best to avoid dative method syntax (METHOD OBJECT ARGS or METHOD
> CLASS ARGS), at least in part because it encourages the idea that there
> is something special about the 'new' method.
>
> my $pm = Parallel::ForkManager->new($MAX_PROC);
>
> (Yes, I know you just copied that from the documentation. Unfortunately
> a lot of CPAN documentation uses older, now-discouraged idioms in the
> examples.)


In this particular case, the 'indirect object' syntax is actually even
'discouraged' by the corresponding Perl documentation. More generally,
however, the sentence above parses as "a lot of people who wrote
useful code disagreed with some (or all) opinions
$random_people_on_usenet had regarding what they should have done".

>> my $cnfhs = get_configurations();
>>
>> foreach my $key ( keys %$cnfhs ) {
>>
>> my $conf = $cnfhs->{$key};

>
> You can write that loop as
>
> while (my ($key, $conf) = each %$cnfhs) {
>
> and it'll even be more efficient on really large hashes. (Not enough
> people remember about 'each'.)


Except in case of insanely large hashes (> 500,000 keys, according to
some tests I did a while back), getting the list of keys and
traversing the hash 'open coded' will be faster than calling Perl
subroutine per key. OTOH, the latter should need less memory.

[...]


>> sub get_configurations {
>>
>> my @confs = glob( $confdir . "/*.conf" );

>
> Passing a value into a sub in a file-scoped global is always worth
> avoiding if you can. In this case it's easy: pass it in as a real
> parameter instead.


Doing something because it can be done is just about the most awful
reason for doing it which can be imagined, especially in code where
more or less everything can be done.

> It may seem silly calling get_configurations($confdir) when you know
> perfectly well that get_configurations can see that variable, but it
> rapidly becomes less silly when you come back six months later and take
> it out without remembering some sub half way down the file needed
> it.


It is silly to remove something without checking that it isn't needed
and even more silly to remove something and not even run the changed
file through the compiler before assuming that the change was ok. And
with 'strict', such a change won't go through the compiler.

>
> <snip>
>> sub logger {
>> my $logfilename = shift;
>> $LOGFILE{file} = $logfilename;
>> my $conf = q(
>> log4perl.logger = INFO, FileApp
>> log4perl.appender.FileApp =
>> Log::Log4perl::Appender::File
>> log4perl.appender.FileApp.filename = sub {getLogfilename();}

>
> <...later...>
>> sub getLogfilename {
>>
>> return "$APP_HOME/$HTTPRESP/logs/" . $LOGFILE{file} . ".log";
>>
>> }

>
> It's not really clear to me what you're trying to do here with
> $LOGFILE{file}, but it can almost certainly be done less confusingly.
> Since you don't appear to use it anywhere else, is there any reason not
> to just put the correct filename straight in the log4perl config?


The subroutine above is a specific algorithm with a generic name and
the usual reason for functional decomposition (didn't someboday claim
that was acutally used in practice not that long ago) is that it
simplifies the code at each level abstraction by moving details
which are not relevant for the present algorithm elsewhere.

Mart van de Wege 10-09-2011 09:15 PM

Re: Help me to Improve
 
Rainer Weikusat <rweikusat@mssgmbh.com> writes:

> Ben Morrow <ben@morrow.me.uk> writes:
>
>
>>> my $confdir = "$APP_HOME/$HTTPRESP/conf";
>>> my $pm = new Parallel::ForkManager($MAX_PROC);

>>
>> It's best to avoid dative method syntax (METHOD OBJECT ARGS or METHOD
>> CLASS ARGS), at least in part because it encourages the idea that there
>> is something special about the 'new' method.
>>
>> my $pm = Parallel::ForkManager->new($MAX_PROC);
>>
>> (Yes, I know you just copied that from the documentation. Unfortunately
>> a lot of CPAN documentation uses older, now-discouraged idioms in the
>> examples.)

>
> In this particular case, the 'indirect object' syntax is actually even
> 'discouraged' by the corresponding Perl documentation. More generally,
> however, the sentence above parses as "a lot of people who wrote
> useful code disagreed with some (or all) opinions
> $random_people_on_usenet had regarding what they should have done".
>

There are actually good, technical, reasons why the indirect syntax is
being discouraged.

But do not let that stop you from ranting at what you perceive to be
irrational authoritarianism. The rest of us will enjoy our code without
hard to find bugs.

Mart

--
"We will need a longer wall when the revolution comes."
--- AJS, quoting an uncertain source.

Jürgen Exner 10-09-2011 09:18 PM

Re: Help me to Improve
 
Rainer Weikusat <rweikusat@mssgmbh.com> wrote:
>Ben Morrow <ben@morrow.me.uk> writes:
>> Passing a value into a sub in a file-scoped global is always worth
>> avoiding if you can. In this case it's easy: pass it in as a real
>> parameter instead.

>
>Doing something because it can be done is just about the most awful
>reason for doing it which can be imagined, especially in code where
>more or less everything can be done.


Are you arguing in favour of or against global variables?

jue

Jürgen Exner 10-09-2011 11:07 PM

Re: Help me to Improve
 
Ben Morrow <ben@morrow.me.uk> wrote:
>In message <31836.1270260026@chthon>, <tchrist@perl.com> wrote:
>|
>| Also, I see you keep using single quotes for strings that don't need them.
>| Why? Those aren't really a generic string, you know. They loudly scream
>| "DO NOT INTERPOLATE ANYTHING HERE!!!", so I always scrutinize them extra
>| hard to try to figure out what it is that you're trying to suppress. Plus
>| I hate having to deal with backslash escapes that aren't. Perl strings
>| normally interpolate, and it takes something special to suppress that:
>|
>| $string = $normal ; # interpolate values normally
>| $string = "$normal" ; # interpolate values normally


By some definition of normally. This assignment actually stringifies
$normal which most often is not the desired action when dealing with
e.g. numbers. Therefore you could argue if this is the "normal" or
"naively expected" semantic.

>| $string = `$normal` ; # interpolate values normally
>| $string = '$abnormal' ; # DON'T INTERPOLATE VALUES!!!


I like to think of '...' as the literal operator or the literal quotes.
They don't munge their text.

>| $string = \$abnormal ; # DON'T INTERPOLATE VALUES!!!
>| $string = "\$abnormal" ; # DON'T INTERPOLATE VALUES!!!


Actually example does interpolated. It is interpolates \$ as the literal
dollar sign.

etc, etc.

jue

SSS Develop 10-10-2011 06:36 AM

Re: Help me to Improve
 
Very healthy discussions ! thanks everyone for participating !
(will post the final version soon :))
thank you,

sssdevelop

On Oct 10, 5:09*am, Ben Morrow <b...@morrow.me.uk> wrote:
> Quoth Rainer Weikusat <rweiku...@mssgmbh.com>:
>
>
>
>
>
>
>
>
>
> > Ben Morrow <b...@morrow.me.uk> writes:

>
> > > It's best to avoid dative method syntax (METHOD OBJECT ARGS or METHOD
> > > CLASS ARGS), at least in part because it encourages the idea that there
> > > is something special about the 'new' method.

>
> > > * * my $pm = Parallel::ForkManager->new($MAX_PROC);

>
> > > (Yes, I know you just copied that from the documentation. Unfortunately
> > > a lot of CPAN documentation uses older, now-discouraged idioms in the
> > > examples.)

>
> > In this particular case, the 'indirect object' syntax is actually even
> > 'discouraged' by the corresponding Perl documentation. More generally,
> > however, the sentence above parses as "a lot of people who wrote
> > useful code disagreed with some (or all) opinions
> > $random_people_on_usenet had regarding what they should have done".

>
> I know. I was aware when I wrote that paragraph that I was weaselling
> out of properly explaining what's wrong with the dative syntax, and the
> reason I was doing that was that I can never remember an example of
> where it goes wrong :). I can give you *lots* of examples where Perl
> assumes I'm making a dative method call when I didn't want it to, but
> that's a reason for not putting it in the language at all, rather than
> one for not using it if its there.
>
> I believe the canonical example is
>
> * * my $rv = method $obj;
>
> with a 'sub method' in the current package. While that is a problem
> (since it will parse differently depending on the presence of the sub)
> it is easily disambiguated:
>
> * * my $rv = method $obj ();
>
> So, can anyone remind me of when this actually goes wrong?
>
>
>
>
>
>
>
>
>
> > >> my $cnfhs = get_configurations();

>
> > >> foreach my $key ( keys %$cnfhs ) {

>
> > >> * * my $conf = $cnfhs->{$key};

>
> > > You can write that loop as

>
> > > * * while (my ($key, $conf) = each %$cnfhs) {

>
> > > and it'll even be more efficient on really large hashes. (Not enough
> > > people remember about 'each'.)

>
> > Except in case of insanely large hashes (> 500,000 keys, according to
> > some tests I did a while back), getting the list of keys and
> > traversing the hash 'open coded' will be faster than calling Perl
> > subroutine per key. OTOH, the latter should need less memory.

>
> Interesting. The efficiency wasn't the point, of course. The point was
> to get key and value in one go.
>
> When you say 'calling Perl subroutine per key' do you mean just the call
> to 'each'? 'each' isn't a subroutine, of course, but the while loop does
> have to run some Perl every iteration rather than just popping the
> stack.
>
>
>
>
>
>
>
>
>
> > > <snip>
> > >> sub logger {
> > >> * * my $logfilename = shift;
> > >> * * $LOGFILE{file} = $logfilename;
> > >> * * my $conf = q(
> > >> * * * * log4perl.logger * * * * * * * * * *= INFO, FileApp
> > >> * * * * log4perl.appender.FileApp * * * * *=
> > >> Log::Log4perl::Appender::File
> > >> * * * * log4perl.appender.FileApp.filename = sub {getLogfilename();}

>
> > > <...later...>
> > >> sub getLogfilename {

>
> > >> * * return "$APP_HOME/$HTTPRESP/logs/" . $LOGFILE{file} . ".log";

>
> > >> }

>
> > > It's not really clear to me what you're trying to do here with
> > > $LOGFILE{file}, but it can almost certainly be done less confusingly.
> > > Since you don't appear to use it anywhere else, is there any reason not
> > > to just put the correct filename straight in the log4perl config?

>
> > The subroutine above is a specific algorithm with a generic name and
> > the usual reason for functional decomposition (didn't someboday claim
> > that was acutally used in practice not that long ago) is that it
> > simplifies the code at each level abstraction by moving details
> > which are not relevant for the present algorithm elsewhere.

>
> For there to be any merit to that argument, getLogFilename would have to
> be independant of logger. Since logger sets a global read by
> getLogFilename, this obviously isn't the case. If you were to rewrite
> the code so it looked like
>
> * * sub logger {
> * * * * my $logfilename = shift;
> * * * * my $logpath * * = getLogFilename($logfilename);
> * * * * my $conf * * * *= qq(
> * * * * * * ...filename = $logpath
>
> I would consider that a sensible arrangement. You can argue about
> whether or not the mapping from basename to full path should be a
> separate function: in this case, since it's neither particularly
> complicated nor something that can be changed at runtime ISTM better to
> put the expression inline, but it makes little difference.
>
> What I was complaining about was the extremely convoluted data path that
> went like this:
>
> * * - stuff a value into (the sole element of) a global;
> * * - create a string containing the Perl code for a sub call;
> * * - pass that string to a method, which will
> * * * * - parse out the sub call;
> * * * * - call the sub, which will
> * * * * * * - pull the value out of the global;
> * * * * * * - interpolate it into a path;
> * * * * * * - return that path;
> * * * * - use the returned value as a log filename.
>
> I was recommending something more like
>
> * * - interpolate the basename into a path;
> * * - interpolate that path into a config string;
> * * - pass that string to a method, which will
> * * * * - use it as a log filename.
>
> Whether the first step is a sub call or a direct expression is
> irrelevant.
>
> Ben



Eric Pozharski 10-10-2011 07:01 AM

Re: Help me to Improve
 
with <nZOdnQ0guILouA_TnZ2dnUVZ8hCdnZ2d@bt.com> Ben Morrow wrote:
>
> Quoth Tad McClellan <tadmc@seesig.invalid>:
>> Rainer Weikusat <rweikusat@mssgmbh.com> wrote:
>> > Ben Morrow <ben@morrow.me.uk> writes:

*SKIP*
>>> Using double-quotes means the compiler has to analyze the string order
>>> to determine if and how something needs to be interpolated into
>>> it. Otherwise, this effort can be avoided and also the effort the next
>>> person who looks at the code needs to come to the same result.

*SKIP*
>> I, errr, agree with Rainer here, more for his second reason
>> that for his first.
>>
>> I am always looking to optimize for maintenance.
>>
>> Using double quotes when nothing is intended to be interpolated
>> is in _bad_ taste. :-)

>
> Well, at least I'm not alone:


So the double-quote-war has begun? Let me introduce some fuel to that
so far ceased fire

my $aa;
my $ab = 'abc';

$aa = eval q|"$ab"|
$aa = eval q|'$ab'|
$aa = eval q|"\\$ab"|
$aa = eval q|'def'|
#!/usr/bin/perl

use strict;
use warnings;
use Benchmark qw{ cmpthese timethese };

my $aa;
my $ab = 'abc';

cmpthese timethese -5, {
code00 => sub { $aa = eval q|"$ab"| },
code01 => sub { $aa = eval q|'$ab'| },
code02 => sub { $aa = eval q|"\\$ab"| },
code03 => sub { $aa = eval q|'def'| },
};

__END__
Benchmark: running code00, code01, code02, code03 for at least 5 CPU seconds...
code00: 6 wallclock secs ( 5.00 usr + 0.02 sys = 5.02 CPU) @ 64101.99/s (n=321792)
code01: 6 wallclock secs ( 5.18 usr + 0.00 sys = 5.18 CPU) @ 96095.95/s (n=497777)
code02: 6 wallclock secs ( 5.24 usr + 0.00 sys = 5.24 CPU) @ 69054.58/s (n=361846)
code03: 5 wallclock secs ( 5.34 usr + 0.00 sys = 5.34 CPU) @ 94380.15/s (n=503990)
Rate code00 code02 code03 code01
code00 64102/s -- -7% -32% -33%
code02 69055/s 8% -- -27% -28%
code03 94380/s 47% 37% -- -2%
code01 96096/s 50% 39% 2% --

Hopefully, I did leaning sticks correctly in 'code02'.
*CUT*

--
Torvalds' goal for Linux is very simple: World Domination
Stallman's goal for GNU is even simpler: Freedom

Rainer Weikusat 10-10-2011 06:35 PM

Re: Help me to Improve
 
Jürgen Exner <jurgenex@hotmail.com> writes:
> Rainer Weikusat <rweikusat@mssgmbh.com> wrote:
>>Ben Morrow <ben@morrow.me.uk> writes:
>>> Passing a value into a sub in a file-scoped global is always worth
>>> avoiding if you can. In this case it's easy: pass it in as a real
>>> parameter instead.

>>
>>Doing something because it can be done is just about the most awful
>>reason for doing it which can be imagined, especially in code where
>>more or less everything can be done.

>
> Are you arguing in favour of or against global variables?


I'm arguing against 'do this whenever you can' 'design recipes'. As to
'global variables': A (non OO) module composed of some subroutines and
some shared state variables these subroutines work with is
structurally similar/ identical (isomorph) to an object with some
instance variables and some methods working with these. Programs can
often be decomposed into relatively independent functional units
providing services to each other by invoking interface routines. These
independent functional units will usually be stateful, meaning, they
will have a set of internal variables with 'state information'. I see
no ground for an argument 'for' or 'against' something except maybe at
the completely different level of 'should computer programs work like
other, complex real-world objects', IOW, have state which changes over
time, or should fat runtimes be used to provide the illusion that it
wasn't in this way.


xhoster@gmail.com 10-11-2011 12:37 AM

Re: Help me to Improve
 
Tad McClellan <tadmc@seesig.invalid> wrote:
> Rainer Weikusat <rweikusat@mssgmbh.com> wrote:
> > Ben Morrow <ben@morrow.me.uk> writes:
> >
> > [...]
> >
> >> I see Tad's already mentioned you're using 'our' where you could be
> >> using 'my'. I would add that there's no need for concatentation when
> >> you can interpolate, and that you can assign directly to a hash too:
> >>
> >> my $APP_HOME = '/usr/local/application';
> >> my $HTTPRESP = 'httpresp';
> >> my %LOGFILE = (
> >> file => "$APP_HOME/$HTTPRESP/logs/generic.log",
> >> );
> >>
> >> (Whether to use single or double quotes when nothing will be
> >> interpolated is a matter of taste.

> >
> > Using double-quotes means the compiler has to analyze the string order
> > to determine if and how something needs to be interpolated into
> > it.


Sure, but who cares?

> > Otherwise, this effort can be avoided and also the effort the next
> > person who looks at the code needs to come to the same result.


Why would the person need to do that? I find it more common that I need to
interpolate where I (or someone) didn't before, and now I have to inspect
the whole string to find out what it was that motivated the person to make
it a single quote rather than double in the first place, only to find out
there was no reason for it.

>
> The Moon is blue...
>
> I, errr, agree with Rainer here, more for his second reason
> that for his first.
>
> I am always looking to optimize for maintenance.


Using single quotes because the content does not *currently* require
interpolation is not optimizing for maintenance. Optimizing for
maintenance would require you to predict how the usage is likely to change
in the future, and using whatever style of quotes is less likely to need to
change under that theory of the future.


Xho

--
-------------------- http://NewsReader.Com/ --------------------
The costs of publication of this article were defrayed in part by the
payment of page charges. This article must therefore be hereby marked
advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate
this fact.


All times are GMT. The time now is 08:51 PM.

Powered by vBulletin®. Copyright ©2000 - 2013, vBulletin Solutions, Inc.
SEO by vBSEO ©2010, Crawlability, Inc.