Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > Perl > Perl Misc > Pattern Matching and skipping

Reply
Thread Tools

Pattern Matching and skipping

 
 
Tad McClellan
Guest
Posts: n/a
 
      09-08-2006
MattJ83 <(E-Mail Removed)> wrote:
>
>> while (<DATA>) {
>> if (/updates table/) {
>> print "MATCHED: $_";
>> }
>> elsif (/elapsed/) {
>> print "MATCHED: $_";
>> }
>> else {
>> print "NO MATCH: $_";
>> }
>> }

>
> I understand this but, when i am trying to do something similar for my
> specific code im struggling.



If you post a SHORT AND COMPLETE PROGRAM THAT WE CAN RUN, then
we can surely help you.

Try doing that for once.


> while (<$LOG>) {
> if (/updates table/) {
> my @lines = $_;

^^^^^^^^^^^^^^^

Why do you include that last line of code?

What do you think it does for you?

(it does nothing of use to you, so why is it there?)


> foreach my $file (@lines) {



Q: How many elements can there be in the @lines array?

A: One.

There is no need for a loop that is guaranteed to iterate only once.


--
Tad McClellan SGML consulting
http://www.velocityreviews.com/forums/(E-Mail Removed) Perl programming
Fort Worth, Texas
 
Reply With Quote
 
 
 
 
MattJ83
Guest
Posts: n/a
 
      09-08-2006

> If you post a SHORT AND COMPLETE PROGRAM THAT WE CAN RUN, then
> we can surely help you.
>
> Try doing that for once.


ok, sorry.

#!/usr/central/bin/perl
use strict;
#use warnings;
use DBI;

my @filenames= </home/username/logs/*.log>;
foreach my $filename (@filenames) {
open my $LOG, '<', $filename or die "can't open $filename: $!\n";

while (<$LOG>) {
if (/updates table/) {
my $info = $_;
{
while (<$LOG>) {
if (/elapsed/) {
my $elapsed1 = $_;
{
while (<$LOG>) {
if (/conflicting|FASTSEARCH/) {
my $fast = $_;
{
while (<$LOG>) {
if (/elapsed/) {
my $elapsed2 = $_;
{

close($LOG) ;

my $dbh = DBI ->connect("dbi:Oracle:SERVER", "DATABASE", "PASSWORD")
or die "couldn't connect to database: $DBI::errstr\n";

$dbh->do("insert into LOGS values ('$filename', '$info', '$elapsed1',
'$fast', '$elapsed2')")
or die ("inserting data failure: $!\n");

$dbh->disconnect;
}}}}}}}}}}}}}
exit;

..log
Text updates table text
text
text elapsed text
text
FASTSEARCH text text
text elapsed
text

If you make multiple copies of this log file and remove the word
FASTSEARCH from one of the logs - the log will not be placed into the
database.
>
>
> > while (<$LOG>) {
> > if (/updates table/) {
> > my @lines = $_;

> ^^^^^^^^^^^^^^^
>
> Why do you include that last line of code?
>
> What do you think it does for you?
>
> (it does nothing of use to you, so why is it there?)
>
>
> > foreach my $file (@lines) {

>


Yep - its been removed - i saw what you mean't! Thanks.

 
Reply With Quote
 
 
 
 
Eric Schwartz
Guest
Posts: n/a
 
      09-08-2006
"MattJ83" <(E-Mail Removed)> writes:
> > If you post a SHORT AND COMPLETE PROGRAM THAT WE CAN RUN, then
> > we can surely help you.
> >
> > Try doing that for once.

>
> ok, sorry.


Problem is, your code, as posted, relies on a database we don't know
about, and don't really want to set up anyway. So we can't run the
program, even if we wanted to. Also, you can put your data in your
program, after a __DATA__ literal, and read it with the predefined
DATA filehandle. perldoc perldata for more on this (see: "Special
Literals").

> #!/usr/central/bin/perl
> use strict;
> #use warnings;


You should uncomment this line. If you don't know how to get rid of
the warnings it produces, then ask here, and we can help. But with it
off, you're basically asking us to help you find problems that
warnings can uncover for you. In other words, you're communicating to
us (whether you mean to or not) that our time is less important than
your computers'.

> use DBI;


But, we don't have a database set up. So this makes it too hard to
use. Anyway, you just need to replace your database inserts with
print statements to make it easy to use.

> my @filenames= </home/username/logs/*.log>;
> foreach my $filename (@filenames) {
> open my $LOG, '<', $filename or die "can't open $filename: $!\n";
>
> while (<$LOG>) {
> if (/updates table/) {
> my $info = $_;
> {
> while (<$LOG>) {


This right here is, well, not to put too fine a point on it, dumb.
You already have a loop over the data. Logically, it's confusing to
start a new loop inside the old one. The usual problem is that when
the next iteration of the outer while() begins, the default assumption
is that it will read the next line after the previous iteration.
Putting another while loop inside it destroys that assumption, and
confuses debuggers to no end. This is very bad style; if you worked
for me, I wouldn't accept it, because the next person to look at your
code would be horribly confused.

Because of the way you've done it, you won't see that effect, but it's
more of an accident than anything else. You definitely should not use
this anywhere else; most of the people responding to you are trying to
communicate that this is a bad and confusing idea.

You don't need to start a brand-new while loop anyway; what your code
says is that you only look for "elapsed" after you find "updates
table", and you only look for "conflicting" or "FASTSEARCH" after you
find "elapsed". If this is the case, you can just do something like:

#/usr/bin/perl
use warnings;
use strict;

my ($info, $elapsed1, $fast, $elapsed2);
while(<DATA>) {
chomp;

if (/updates table/) {
print "found updates: [$_]\n";
$info = $_;
}
if (/elapsed/) {
print "found elapsed: [$_]... ";
if ($fast) {
print "setting elapsed2\n";
$elapsed2 = $_;
last;
} elsif ($info) {
print "setting elapsed1\n";
$elapsed1 = $_;
}
}
if (/conflicting|FASTSEARCH/) {
print "found fast: [$_]\n";
$fast = $_;
}
}

# do db stuff here for real
print "info: [$info] fast: [$fast] ".
"elapsed1: [$elapsed1] elapsed2: [$elapsed2]\n";

__DATA__
Text updates table text
text
text elapsed text
text
FASTSEARCH text text
text elapsed
text
__END__

Notice how I only loop over it once. Each time I find a line I might
be interested in, I check to see if I really am-- in the case of
finding "elapsed", you may want to set $elapsed1 or $elapsed2, so you
have to check if $fast or $info are set to determine which one to set.

> $dbh->do("insert into LOGS values ('$filename', '$info', '$elapsed1',
> '$fast', '$elapsed2')")
> or die ("inserting data failure: $!\n");


Ugh, don't do that. Let DBI escape all the values for you. What if,
for instance, $filename contained the string

a', 'b', 'c', 'd', 'e'); delete from LOGS;

? Then your statement would look like:

$dbh->do("insert into LOGS values ('a', 'b', 'c', 'd', 'e'); delete from LOGS");

That's only the easiest way to screw you up; there are far nastier and
subtler ways to cause problems. Also, and this is NOT a Perl problem,
but since I'm feeling generous, I'll point it out anyway: you should
always list the columns when you do an insert, because (A) it's easier
to see what's going on when you read the code (you can look at it and
know what goes where) and (B) if the order of the fields ever changes,
you're screwed, and you don't know why.

Instead of trying to put in quote marks yourself, let DBI do it for
you with placeholders:

$dbh->do("INSERT INTO LOG (filename, info, start, fast, end)
VALUES (?, ?, ?, ?, ?)", undef,
$filename, $info, $elapsed1, $fast, $elapsed2);

perldoc DBI, and look for "Placeholders and Bind Values" to learn more.

> $dbh->disconnect;
> }}}}}}}}}}}}}


Ugh. This is butt-ugly. You can't tell by looking at this line which
brace corresponds to which if or while statement. If you wanted to
change the code later, you'll pretty much have to keep running it
until you get rid of all the syntax errors, instead of reading the
code to see what goes where. And even then, you're likely to put
something in the wrong place.

> exit;
>
> .log
> Text updates table text
> text
> text elapsed text
> text
> FASTSEARCH text text
> text elapsed
> text
>
> If you make multiple copies of this log file and remove the word
> FASTSEARCH from one of the logs - the log will not be placed into the
> database.


That's what your code says to do-- if you don't find FASTSEARCH, you
don't get to the part that updates the database. Isn't that what you
wanted? I'm not sure, based on your posts here, that you're entirely
clear on what it is you're trying to do here.

-=Eric
 
Reply With Quote
 
John W. Krahn
Guest
Posts: n/a
 
      09-08-2006
MattJ83 wrote:
>>If you post a SHORT AND COMPLETE PROGRAM THAT WE CAN RUN, then
>>we can surely help you.
>>
>>Try doing that for once.

>
> ok, sorry.
>
> #!/usr/central/bin/perl
> use strict;
> #use warnings;
> use DBI;
>
> my @filenames= </home/username/logs/*.log>;
> foreach my $filename (@filenames) {
> open my $LOG, '<', $filename or die "can't open $filename: $!\n";
>
> while (<$LOG>) {
> if (/updates table/) {
> my $info = $_;
> {
> while (<$LOG>) {
> if (/elapsed/) {
> my $elapsed1 = $_;
> {
> while (<$LOG>) {
> if (/conflicting|FASTSEARCH/) {
> my $fast = $_;
> {
> while (<$LOG>) {
> if (/elapsed/) {
> my $elapsed2 = $_;
> {


You don't need all that nesting:

for my $filename ( </home/username/logs/*.log> ) {
open my $LOG, '<', $filename or die "can't open $filename: $!\n";

my ( $info, $elapsed1, $fast, $elapsed2 );
while ( <$LOG> ) {
chomp;
$info = $_ if /updates table/;
$elapsed1 = $_ if defined $info and /elapsed/;
$fast = $_ if defined $elapsed1 and /conflicting|FASTSEARCH/;
if ( defined $fast and /elapsed/ ) {
$elapsed2 = $_;


> close($LOG) ;
>
> my $dbh = DBI ->connect("dbi:Oracle:SERVER", "DATABASE", "PASSWORD")
> or die "couldn't connect to database: $DBI::errstr\n";
>
> $dbh->do("insert into LOGS values ('$filename', '$info', '$elapsed1',
> '$fast', '$elapsed2')")
> or die ("inserting data failure: $!\n");
>
> $dbh->disconnect;


}
}
}



John
--
use Perl;
program
fulfillment
 
Reply With Quote
 
Tad McClellan
Guest
Posts: n/a
 
      09-08-2006
MattJ83 <(E-Mail Removed)> wrote:
>
>> If you post a SHORT AND COMPLETE PROGRAM THAT WE CAN RUN, then
>> we can surely help you.
>>
>> Try doing that for once.

>
> ok, sorry.
>
> #!/usr/central/bin/perl
> use strict;
> #use warnings;



You lose all of "use warnings" benefits if you comment it out.

> use DBI;
>
> my @filenames= </home/username/logs/*.log>;
> foreach my $filename (@filenames) {
> open my $LOG, '<', $filename or die "can't open $filename: $!\n";
>
> while (<$LOG>) {
> if (/updates table/) {
> my $info = $_;
> {
> while (<$LOG>) {
> if (/elapsed/) {
> my $elapsed1 = $_;
> {
> while (<$LOG>) {
> if (/conflicting|FASTSEARCH/) {
> my $fast = $_;
> {
> while (<$LOG>) {
> if (/elapsed/) {
> my $elapsed2 = $_;
> {



I'm pretty sure I said this already:

Read the line once, test it many times.


So why are you reading the lines 4 times instead of once?


[snip]

> }}}}}}}}}}}}}



I have never seen a line of Perl code like that in my 12 years
of Perl programming. It is a sure sign of some conceptual error
on your part.

You need to understand what you are doing, not just try random stuff...


--
Tad McClellan SGML consulting
(E-Mail Removed) Perl programming
Fort Worth, Texas
 
Reply With Quote
 
Peter J. Holzer
Guest
Posts: n/a
 
      09-09-2006
On 2006-09-08 17:31, Eric Schwartz <(E-Mail Removed)> wrote:
> "MattJ83" <(E-Mail Removed)> writes:

[lots of nested loops]
>> }}}}}}}}}}}}}

>
> Ugh. This is butt-ugly. You can't tell by looking at this line which
> brace corresponds to which if or while statement.


Yup. The only time I've seen somebody seriously proposing closing
multiple blocks on the same line, he also proposed that the closing
braces match vertically with the start of the statement, like this:


if (...) {
while (...) {
for (...) {
if (...) {
if (...) {
...
} } } } }

(or maybe he put the opening braces on a line by their own, I don't
remember)

That way it is easy to see that each closing brace corresponds to the
start of a statement. However, they don't correspond to the statements
they seem to correspond to, which is probably confusing, too (I guess
there's a reason that very few people use this style ).

hp


--
_ | Peter J. Holzer | > Wieso sollte man etwas erfinden was nicht
|_|_) | Sysadmin WSR | > ist?
| | | (E-Mail Removed) | Was sonst wäre der Sinn des Erfindens?
__/ | http://www.hjp.at/ | -- P. Einstein u. V. Gringmuth in desd
 
Reply With Quote
 
Uri Guttman
Guest
Posts: n/a
 
      09-09-2006
>>>>> "PJH" == Peter J Holzer <(E-Mail Removed)> writes:

PJH> if (...) {
PJH> while (...) {
PJH> for (...) {
PJH> if (...) {
PJH> if (...) {
PJH> ...
PJH> } } } } }

PJH> That way it is easy to see that each closing brace corresponds to the
PJH> start of a statement. However, they don't correspond to the statements
PJH> they seem to correspond to, which is probably confusing, too (I guess
PJH> there's a reason that very few people use this style ).

disregarding the actual brace or coding style there i feel there is
fundamentally something wrong when i see deeply nested code like that in
perl. one issue is that many time an if or else is just control flow
like next/last/return. those can usually be reduced to statement
modifiers which eliminate a block and all of its cruft. and many times i
see the else block being the control flow. if the associated condtional
is inverted, then you can lose the else (becomes a statement modifier
again) and the if code loses its indent and becomes straight line code
at the higher indent level. another trick is to wrap the deeper nested
code into a sub of its own just for clarity's sake (yes, it will slow
you down some but clarity is important too).

so the goal is to not need such deeply nested code. maybe 3-4 deep is as
far as i go before i do something (and that include using all those
block elimination techniques i mentioned) like factoring out an inner
sub.

in general, seeing code like that means the coder is not experienced
enough to know how to make code readable as well as correct.

uri

--
Uri Guttman ------ (E-Mail Removed) -------- http://www.stemsystems.com
--Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
Search or Offer Perl Jobs ---------------------------- http://jobs.perl.org
 
Reply With Quote
 
 
 
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
Regex testing and UTF8 awarenes or Regex and numeric pattern matching sln@netherlands.com Perl Misc 2 03-10-2009 03:51 AM
Help with Pattern matching. Matching multiple lines from while reading from a file. Bobby Chamness Perl Misc 2 05-03-2007 06:02 PM
Pattern Matching Given # of Characters and no String Input; use RegularExpressions? Synonymous Python 10 04-22-2005 07:56 AM
[perl-python] text pattern matching, and expressiveness Xah Lee Python 4 02-11-2005 09:11 PM
Pattern matching : not matching problem Marc Bissonnette Perl Misc 9 01-13-2004 05:52 PM



Advertisments