Expressing AND, OR, and NOT in a Single Pattern

 
 
usaims
 
      03-01-2007
I'm having a little problem with this example in the Perl Cookbook.

True if pattern BAD does not match, but pattern GOOD does:
/(?=(?:(?!BAD).)*$)GOOD/s

My objective is to print only lines that have 'suspended' but not
'Data_services'. It is still printing lines with 'suspended' and
'Data_services' in the same line. So, ideally, this script should
not print any lines. Correct me if I am wrong.

##############################
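#!/usr/bin/perl
use strict;
use diagnostics;
use warnings;

# Read every line of the __DATA__ section.
my @stuff = <DATA>;

foreach my $foo (@stuff) {
    # Cookbook pattern: GOOD is 'suspended', BAD is 'Data_services'.
    if ($foo =~ /(?=(?:(?!Data_services).)*$)suspended/s) {
        print $foo;
    }
}
close(DATA);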

__DATA__
<Query id='Data_services.LSSI_Weekly.42' suspended='1' error='Loading
Data Only - cannot run query' wuid='W20070227-140132'
associatedName='libW20070227-140132.so'/>
<Query id='Data_services.SSNMapKeys.14' suspended='1' error='Loading
Data Only - cannot run query' wuid='W20070105-115230'
associatedName='libW20070105-114650.so'/>
<Query id='Data_services.WatercraftKeys.5' suspended='1'
error='Loading Data Only - cannot run query' wuid='W20070123-114242'
associatedName='libW20070123-114242.so'/>

 
Scott Bryce
 
      03-01-2007
usaims wrote:

> My objective is to print only lines that have 'suspended' but not
> 'Data_services'.


I prefer to use index for something like this.

> It is still printing lines with 'suspended' and
> 'Data_services' in the same line. So, ideally, this script should
> not print any lines. Correct me if I am wrong.


There are no lines in your given data that meet your criteria.

Here's my shot at it...

use strict;
use warnings;

while (<DATA>)
{
    next if index($_, 'Data_services') > -1;
    print $_ if index($_, 'suspended') > -1;
}

__DATA__
<Query id='Data_services.LSSI_Weekly.42' suspended='1' error='Loading
Data Only - cannot run query' wuid='W20070227-140132'
associatedName='libW20070227-140132.so'/>
<Query id='Data_services.SSNMapKeys.14' suspended='1' error='Loading
Data Only - cannot run query' wuid='W20070105-115230'
associatedName='libW20070105-114650.so'/>
<Query id='Data_services.WatercraftKeys.5' suspended='1' error='Loading
Data Only - cannot run query' wuid='W20070123-114242'
associatedName='libW20070123-114242.so'/>
<Query id='Other_services.SSNMapKeys.14' suspended='1' error='Loading
Data Only - cannot run query' wuid='W20070105-115230'
associatedName='libW20070105-114650.so'/>



 
xhoster@gmail.com
 
      03-01-2007
"usaims" <(E-Mail Removed)> wrote:
> I'm having a little problem with this example in the Perl Cookbook.
>
> True if pattern BAD does not match, but pattern GOOD does:
> /(?=(?:(?!BAD).)*$)GOOD/s


Every character from the start of the match to the end of the string
has to not (be the start of a) match to BAD. However, if BAD occurs before
GOOD, the regex can still match, simply by not initiating the match until
after the B of BAD.

You want the forced exclusion to start at the beginning of the string
and run to the end:

/^(?=(?:(?!BAD).)*$).*GOOD/;

But I'd just use two different regex.
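
To make the difference concrete, here is a minimal sketch. The sample string and variable names are illustrative assumptions, with 'Data_services' standing in for BAD and 'suspended' for GOOD:

```perl
use strict;
use warnings;

# Hypothetical sample: BAD ('Data_services') occurs before GOOD ('suspended').
my $line = "<Query id='Data_services.SSNMapKeys.14' suspended='1'/>";

# Cookbook form: the lookahead only checks from where the match starts.
my $unanchored = qr/(?=(?:(?!Data_services).)*$)suspended/s;

# Anchored form: the check starts at the beginning of the string.
my $anchored = qr/^(?=(?:(?!Data_services).)*$).*suspended/s;

my $unanchored_hit = $line =~ $unanchored ? 1 : 0;
my $anchored_hit   = $line =~ $anchored   ? 1 : 0;

print "unanchored: $unanchored_hit\n";   # 1: the match starts at 'suspended', after BAD
print "anchored:   $anchored_hit\n";     # 0: BAD is seen when scanning from position 0
```

The unanchored pattern succeeds because nothing after 'suspended' contains 'Data_services'; the anchored one fails because the scan from the start of the string runs into it.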

Xho

h3xx
 
      03-01-2007
I like doing things in one line:

print grep { /suspended/ && ! /Data_services/ } <DATA>;

 
gf
 
      03-02-2007
On Mar 1, 3:18 pm, "h3xx" <(E-Mail Removed)> wrote:
> I like doing things in one line:
>
> print grep { /suspended/ && ! /Data_services/ } <DATA>;



I prefer this method too. For clarity and long-term maintenance it is
much better because the esoterica of regex can make the desired
results hard to figure out and the bugs in the pattern even harder to
find.

Also, speed-wise, this is a lot faster. The regex engine has to do a
lot of work that can be short-circuited by the booleans.

Sometimes it's better to break the search for matching patterns into
single lines too. It's kind of macho programmer-wise to string it all
together into one mondo regex pattern and have it work, but the logic
can get fragile.

The only thing I'd do differently to these patterns is add an anchor
to the 'Data_services' pattern, like so...

/^<Query id='Data_services/

Anchors speed up a regex an incredible amount. I benchmarked index()
against various ways of using a regex, and an anchored qr// that was
compiled outside the loop was the fastest at finding patterns inside
long strings when the pattern was at the end of the string. At the
beginning of a string it should be equal to index(). index() was
faster when finding a fixed string somewhere in the middle of another
string.
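
For anyone who wants to repeat that kind of measurement, a rough sketch using the core Benchmark module might look like the following. The sample record, the sub names, and the iteration count are assumptions for illustration, not the original benchmark, and results will vary with Perl version and input:

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical record modelled on the thread's data.
my $line = "<Query id='Data_services.SSNMapKeys.14' suspended='1' "
         . "error='Loading Data Only - cannot run query'/>";

# Compile the patterns once, outside the timed code.
my $plain    = qr/Data_services/;
my $anchored = qr/^<Query id='Data_services/;

my %subs = (
    index_call => sub { index($line, 'Data_services') > -1 ? 1 : 0 },
    plain_re   => sub { $line =~ $plain    ? 1 : 0 },
    anchored   => sub { $line =~ $anchored ? 1 : 0 },
);

# Run each sub 200_000 times and print a comparison table.
cmpthese(200_000, \%subs);
```

Changing where 'Data_services' sits in $line (start, middle, end) is the simplest way to check the claims above on your own machine.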

 
gf
 
      03-02-2007
The Regexp::Assemble module on CPAN is way cool for building big
patterns with minimal fuss.

http://search.cpan.org/~dland/Regexp...28/Assemble.pm

The resulting patterns are very efficient and pretty good when you
want to learn how to write complex regex.

 
Brian McCauley
 
      03-04-2007
On Mar 1, 10:10 pm, (E-Mail Removed) wrote:
> "usaims" <(E-Mail Removed)> wrote:
> > I'm having a little problem with this example in the Perl Cookbook.

>
> > True if pattern BAD does not match, but pattern GOOD does:
> > /(?=(?:(?!BAD).)*$)GOOD/s

>
> Every character from the start of the match to the end of the string
> has to not (be the start of a) match to BAD. However, if BAD occurs before
> GOOD, the regex can still match, simply by not initiating the match until
> after the B of BAD.
>
> You want the forced exclusion to start at the beginning of the string
> and run to the end:
>
> /^(?=(?:(?!BAD).)*$).*GOOD/;


That's exponentially (er, factorially?) inefficient!

/^(?!.*BAD).*GOOD/;

> But I'd just use two different regex.


Yes, of course, that's still the best way.
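
As a sketch of how the single pattern behaves next to the two-test version, assuming hypothetical one-line records modelled on the thread's data (the array names are illustrative):

```perl
use strict;
use warnings;

# Hypothetical one-line records modelled on the thread's data.
my @lines = (
    "<Query id='Data_services.LSSI_Weekly.42' suspended='1'/>\n",
    "<Query id='Other_services.SSNMapKeys.14' suspended='1'/>\n",
    "<Query id='Other_services.Idle.3' running='1'/>\n",
);

# Single pattern: no 'Data_services' anywhere on the line, then 'suspended'.
my @single = grep { /^(?!.*Data_services).*suspended/ } @lines;

# Two separate tests, combined with boolean operators.
my @double = grep { /suspended/ && !/Data_services/ } @lines;

# Both approaches keep only the 'Other_services' record that is suspended.
print @single;
```

Both lists end up identical here; the two-test form simply states the intent more directly.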

 
xhoster@gmail.com
 
      03-04-2007
"Brian McCauley" <(E-Mail Removed)> wrote:
> On Mar 1, 10:10 pm, (E-Mail Removed) wrote:
> > "usaims" <(E-Mail Removed)> wrote:
> > > I'm having a little problem with this example in the Perl Cookbook.

> >
> > > True if pattern BAD does not match, but pattern GOOD does:
> > > /(?=(?:(?!BAD).)*$)GOOD/s

> >
> > Every character from the start of the match to the end of the string
> > has to not (be the start of a) match to BAD. However, if BAD occurs
> > before GOOD, the regex can still match, simply by not initiating the
> > match until after the B of BAD.
> >
> > You want the forced exclusion to start at the beginning of the
> > string and run to the end:
> >
> > /^(?=(?:(?!BAD).)*$).*GOOD/;

>
> That's exponentially (er, factorially?) inefficient!


Under what conditions is it exponential? With the patterns I've tested,
it seems to be linear, not exponential. (But still quite a lot slower
than yours, for reasons I don't quite understand. It would make more sense
to me if it were exponentially slower, rather than constantly 30 times
slower.)
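
A comparison along these lines can be sketched with the core Benchmark module. The test string, its length, and the iteration count are assumptions chosen for illustration, not the inputs used for the figures above:

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical test string: 2000 filler characters, then GOOD, with no BAD.
my $text = ('x' x 2000) . 'GOOD';

# The per-character lookahead form and the single-lookahead form.
my $tempered = qr/^(?=(?:(?!BAD).)*$).*GOOD/;
my $simple   = qr/^(?!.*BAD).*GOOD/;

my %subs = (
    tempered => sub { $text =~ $tempered ? 1 : 0 },
    simple   => sub { $text =~ $simple   ? 1 : 0 },
);

# Run each sub 5_000 times and print a comparison table.
cmpthese(5_000, \%subs);
```

Repeating the run with different lengths of $text is one way to see whether the gap between the two stays roughly constant or grows.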

Xho

>
> /^(?!.*BAD).*GOOD/;
>
> > But I'd just use two different regex.

>
> Yes, of course, that's still the best way.


Mirco Wahab
 
      03-04-2007
Brian McCauley wrote:
> On Mar 1, 10:10 pm, (E-Mail Removed) wrote:
>> You want the forced exclusion to start at the beginning of the string
>> and run to the end:
>>
>> /^(?=(?:(?!BAD).)*$).*GOOD/;

>
> That's exponentially (er, factorially?) inefficient!
>
> /^(?!.*BAD).*GOOD/;
>
>> But I'd just use two different regex.

>
> Yes, of course, that's still the best way.


This

/^(?!.*BAD).*GOOD/

is, in my opinion, of "Maxwellian beauty".

I spent some time trying to simplify the original
expression, but in the end I threw in the towel.

Thanks,

Mirco
 