Using loop labels and iterating.

 
 
Nene
02-17-2012
Hi,

This is a script that opens up a log file and searches for the word
'started'; if it finds the word, it prints it and iterates to the
next log file. But for the servers that didn't start, I want it to
print "didn't start". Please help, thanks.



#!/usr/bin/perl
use diagnostics;
use warnings;

open(SCRATCHPAD,"LOG.txt") or die "can't open $!";
my @uniq = <SCRATCHPAD>;

NODE:
foreach my $stuff ( @uniq ) {
    chomp($stuff);

    open(STARTUP, "/c\$/$stuff/esp.log") or "die can't open $!";
    my @log_file = <STARTUP>;

    LINE:
    for my $line ( @log_file ) {

        next LINE if $line !~ /ESP server (started)./;
        my $capture = "$1\n";
        print "$stuff => $capture";

        close(STARTUP);
        close(SCRATCHPAD);

    }
}
 
Rainer Weikusat
02-17-2012
Ben Morrow <(E-Mail Removed)> writes:
> Quoth Christian Winter <(E-Mail Removed)>:
>>
>> open( my $startup, "<", "/c\$/$stuff/esp.log" )
>>     or die "can't open $stuff/esp.log: $!";
>>
>> LINE: while( my $line = <$startup> )
>> {
>>     if( $line =~ /ESP server (started)/ )
>>     {
>>         $status = $1;
>>         last LINE;
>>     }
>> }
>>
>> close $startup;
>> print "$stuff => $status$/";

>
> use File::Slurp qw/slurp/;
>
> my $startup = slurp "/c\$/$stuff/esp.log";
> $startup =~ /ESP server started/
> and print "$stuff => started\n";
>
> Unless the logfile is *huge* (by which I mean several GB, these days),
> it's going to be more efficient to read and match it all in one go.


It is probably going to be faster, which might be a good thing if the
assumption that the machine is exclusively dedicated to this
particular task holds, because it will then hog CPU and memory most
effectively. OTOH, unless the logfile is *huge*, is the difference
large enough to matter, especially considering that the computer will
very likely have more important things to do than 'monitoring itself'?

OTOH, the source code of that was 'an interesting read'. I suggest
that 'Author: Uri Guttman' should be considered a sufficient reason to
avoid any code blindly, based on that alone.

 
Rainer Weikusat
02-17-2012
Rainer Weikusat <(E-Mail Removed)> writes:

[...]

> OTOH, the source code of that was 'an interesting read'.


For instance, if PERL_IMPLICIT_SYS is not set, sysread will simply do
a read system call, and that may return less data than was requested
for any number of reasons. As far as I could determine, the module
does a single sysread call for 'small files' and returns the
result. The way 'atomic updates' are implemented is known not to work
with certain filesystems, because it relies on rename having barrier
semantics wrt data writes, and this might not be the case (off the top
of my head, this will break on 'older' versions of ext4, on XFS and on
any filesystem which always performs metadata updates synchronously,
IOW, UFS and FFS). There's some hardcoded support for dealing with
'Windows' textfiles in an atrociously inefficient way (by doing
single-character deletions on the read buffer, which is an O(n*n)
algorithm). There's no support for dealing with any other kind of
textfiles.
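To illustrate the first point, here is a minimal sketch (mine, not the
module's code) of the loop a caller has to wrap around sysread, because
a single call may legitimately return fewer bytes than requested:

sub read_all
{
    my ($fh, $want) = @_;
    my $buf = '';

    while (length($buf) < $want) {
        # the OFFSET argument appends at the current end of the buffer;
        # a short read is not an error, only undef is
        my $n = sysread($fh, $buf, $want - length($buf), length($buf));
        defined($n) or die("sysread: $!");
        last unless $n;    # 0 means EOF
    }
    return $buf;
}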



 
Wolf Behrenhoff
02-17-2012
On 17.02.2012 16:46, Rainer Weikusat wrote:
> Rainer Weikusat <(E-Mail Removed)> writes:
>
> [...]
>
>> OTOH, the source code of that was 'an interesting read'.


Talking about this:

what is actually the advantage of sysread over the "Perl way" of reading
from a file with code like $result = <$fileHandle> (or @result =
<$fileHandle>, depending on wantarray)? To slurp, I simply undef $/ and
everything seems fine... I have never used sysread in Perl - should I
consider using it?

I didn't expect such long code for some "simple" thing like reading in
a file, especially not a comment telling me it is using "DEEP DARK MAGIC".

As I am usually using an "enterprise" Linux distribution (RHEL clone), I
still need to make scripts compatible with 5.8.8, where File::Slurp is
not a core module...

- Wolf
 
Rainer Weikusat
02-17-2012
Ben Morrow <(E-Mail Removed)> writes:

[...]

> No, not unless you have a reason to. I use File::Slurp because (and only
> because) it has a clean, simple interface for getting a file into a
> string, without needing to mess about opening filehandles and setting
> globals.


Provided that's really just what you want, consider using this:

sub slurp
{
    my $fh;

    open($fh, '<', $_[0]) or die("open: $_[0]: $!");
    local $/ unless wantarray();
    return <$fh>;
}

That's less buggy (because it leaves the intricacies of dealing with
system-specific I/O to perl) and possibly even faster (again, because
it uses features perl already has).
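Because of the wantarray test, the same sub slurps the whole file in
scalar context and returns the lines in list context, e.g. (file name
made up):

my $content = slurp('esp.log');   # whole file as one string
my @lines   = slurp('esp.log');   # one element per line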
 
Rainer Weikusat
02-17-2012
Wolf Behrenhoff <(E-Mail Removed)> writes:
>> Rainer Weikusat <(E-Mail Removed)> writes:
>>
>> [...]
>>
>>> OTOH, the source code of that was 'an interesting read'.

>
> Talking about this:
>
> what is actually the advantage of sysread over the "Perl way" of reading
> from a file with code like $result = <$fileHandle> (or @result =
> <$fileHandle>, depending on wantarray)? To slurp, I simply undef $/ and
> everything seems fine... I have never used sysread in Perl - should I
> consider using it?


It bypasses the Perl I/O buffering mechanism, including any
translation layers etc. which might be active as part of that. I
wouldn't use it for reading 'text files'. It is, however, useful when
more control over the actual I/O operations performed by a program is
required than the read-in-advance/write-behind buffering mechanism
offers. This would usually be the case either if the I/O is actually
'real-time' IPC, eg, when the program is acting as a server on an
AF_UNIX datagram socket, or when reliability is an important concern,
eg, when doing 'atomic updates' of files which must (to the degree the
application can guarantee this) remain in a consistent state even when
power suddenly goes away. Since "whining about the evil filesystem"
doesn't really help to solve the problem, this requires a multi-step
procedure where one step must have completed before the next one
commences.
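A minimal sketch of that procedure (the sub name and error handling are
my own choices, not a fixed recipe): write the new contents to a
temporary file in the same directory, force the data to disk with fsync,
and only then rename over the original:

use IO::Handle;

sub atomic_write
{
    my ($path, $data) = @_;
    # temp file next to the target so the final rename cannot cross a
    # filesystem boundary
    my $tmp = "$path.tmp.$$";

    open(my $fh, '>', $tmp) or die("open: $tmp: $!");
    print {$fh} $data       or die("write: $tmp: $!");
    $fh->flush              or die("flush: $tmp: $!");
    $fh->sync               or die("fsync: $tmp: $!");   # data on disk first ...
    close($fh)              or die("close: $tmp: $!");
    rename($tmp, $path)     or die("rename: $tmp: $!");  # ... then the rename
}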
 
Peter J. Holzer
02-17-2012
On 2012-02-17 16:43, Wolf Behrenhoff <(E-Mail Removed)> wrote:
> On 17.02.2012 16:46, Rainer Weikusat wrote:
>> Rainer Weikusat <(E-Mail Removed)> writes:
>>
>> [...]
>>
>>> OTOH, the source code of that was 'an interesting read'.

>
> Talking about this:
>
> what is actually the advantage of sysread over the "Perl way" of reading
> from a file with code like $result = <$fileHandle> (or @result =
> <$fileHandle>, depending on wantarray)?


I'll just quote from a posting I wrote about 2 years ago (incidentally
in reply to Uri Guttman's claim that sysread is faster):

[...]
| So I grabbed the server with the fastest disks I had access to (disk
| array of SSDs), created a file with 400 million lines of 80 characters
| (plus newline) each and ran some benchmarks:
|
| method                  time     speed (MB/s)
| ----------------------------------------------
| perlio $/ = "\n"        2:35.12  209
| perlio $/ = \4096       1:35.36  340
| perlio $/ = \1048576    1:35.25  340
| sysread bs = 4096       1:35.28  340
| sysread bs = 1048576    1:35.18  340
|
| The times are the median of three runs. Times between the runs differed
| by about 1 second, so the difference between reading line by line and
| block by block is significant, but the difference between perlio and
| sysread or between different blocksizes isn't.
|
| I was a bit surprised that reading line by line was so much slower than
| blockwise reading. Was it because of the higher loop overhead (81 bytes
| read per loop instead of 4096 means 50 times more overhead) or because
| splitting a block into lines is so expensive?
|
| So I did another run of benchmarks with different block sizes:
|
| method                      block     user   system  cpu  total
| read_file_by_perlio_block    4096    0.64s   26.87s  31%  1:27.91
| read_file_by_perlio_block    2048    1.48s   28.65s  34%  1:28.56
| read_file_by_perlio_block    1024    5.14s   29.03s  37%  1:30.59
| read_file_by_perlio_block     512   11.98s   31.33s  47%  1:31.22
| read_file_by_perlio_block     256   26.84s   33.13s  61%  1:36.85
| read_file_by_perlio_block     128   43.53s   29.05s  71%  1:41.66
| read_file_by_perlio_block      64   77.26s   28.16s  88%  1:59.70
| read_file_by_line                  104.68s   28.01s  93%  2:22.34
|
| (the times are a bit lower now because here the system was idle while it
| had a (relatively constant) load during the first batch)
|
| As expected elapsed time as well as CPU time increases with shrinking
| block size. However, even at 64 bytes, reading in blocks is still 20%
| faster than reading in lines, even though the loop is now executed 27%
| more often.
|
| Conclusions:
|
| * The difference between sysread and blockwise <> isn't even measurable.
|
| * Above 512 Bytes the block size matters very little (and above 4k, not
| at all).
|
| * Reading line by line is significantly slower than reading by blocks.


> To slurp, I simply undef $/ and
> everything seems fine...


I haven't benchmarked $/ = undef but based on the results above I would
expect it to be as fast as sysread.
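For reference, the blockwise perlio variant benchmarked above amounts to
setting $/ to a reference to the block size (a sketch; the file name is
made up):

open(my $fh, '<', 'big.log') or die("open: $!");
{
    local $/ = \4096;               # <> now returns 4096-byte blocks
    my $bytes = 0;
    while (my $block = <$fh>) {     # block-by-block, not line-by-line
        $bytes += length($block);
    }
    print "read $bytes bytes\n";
}
close($fh);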

hp


--
   _  | Peter J. Holzer    | Deprecating human carelessness and
 |_|_) | Sysadmin WSR       | ignorance has no successful track record.
 | |   | (E-Mail Removed)   |
 __/   | http://www.hjp.at/ | -- Bill Code on (E-Mail Removed)
 
John W. Krahn
02-18-2012
Nene wrote:
> Hi,
>
> This is a script that opens up a log file and searches for the word
> 'started'; if it finds the word, it prints it and iterates to the
> next log file. But for the servers that didn't start, I want it to
> print "didn't start". Please help, thanks.
>
>
>
> #!/usr/bin/perl
> use diagnostics;
> use warnings;
>
> open(SCRATCHPAD,"LOG.txt") or die "can't open $!";
> my @uniq = <SCRATCHPAD>;
>
> NODE:
> foreach my $stuff ( @uniq ) {
>     chomp($stuff);
>
>     open(STARTUP, "/c\$/$stuff/esp.log") or "die can't open $!";
>     my @log_file = <STARTUP>;
>
>     LINE:
>     for my $line ( @log_file ) {
>
>         next LINE if $line !~ /ESP server (started)./;
>         my $capture = "$1\n";
>         print "$stuff => $capture";
>
>         close(STARTUP);
>         close(SCRATCHPAD);
>
>     }
> }


This may work better (UNTESTED):

#!/usr/bin/perl
use warnings;
use strict;

open my $SCRATCHPAD, '<', 'LOG.txt'
    or die "can't open 'LOG.txt' because: $!";

NODE:
while ( my $stuff = <$SCRATCHPAD> ) {
    chomp $stuff;
    open my $STARTUP, '<', "/c\$/$stuff/esp.log"
        or die "can't open '/c\$/$stuff/esp.log' because: $!";
    while ( my $line = <$STARTUP> ) {
        if ( $line =~ /ESP server started\./ ) {
            print "$stuff => started\n";
            next NODE;
        }
    }
    print "$stuff => didn't start\n";
}

__END__



John
--
Any intelligent fool can make things bigger and
more complex... It takes a touch of genius -
and a lot of courage to move in the opposite
direction. -- Albert Einstein
 