regex with nots in it

 
 
Ben Holness
Guest
Posts: n/a
 
      10-06-2003
Hi all,

I would like to know if it is possible to have nots in a single regular
expression and if so, how to do it?

For example if I want a single regular expression that says:

The phrase must contain the string "Perl", and "Perl" must not be followed
by "PHP", so that it would match:

"I like Perl"
"Perl is cool"

But not match

"I like Perl more than PHP"
"Although PHP is OK"

I haven't been able to work out how to do it, but if '!' were the not
operator, then I guess it would be something like

/Perl.*![PHP]/

Searching hasn't been much help - the word "not" is way too common.

Cheers,

Ben
 
 
 
 
 
Ben Holness
Guest
Posts: n/a
 
      10-06-2003

>> The phrase must have the string "Perl" and must not be followed by
>> "PHP" in it, so that it would match:
>>
>> "I like Perl"
>> "Perl is cool"
>>
>> But not match
>>
>> "I like Perl more than PHP"
>> "Although PHP is OK"

>
> Then look for "look-ahead" in perlre.pod.


hmmm. Doesn't seem to work because look-ahead cannot deal with wildcards:

/Perl.*(?!PHP)/

doesn't do what I want. perlre suggests that it's easier to have it as
two regular expressions, which is what I was trying to avoid.

Any other ideas?

Cheers anyway,

Ben
 
 
 
 
 
Anno Siegel
Guest
Posts: n/a
 
      10-06-2003
Ben Holness <> wrote in comp.lang.perl.misc:
>
> >> The phrase must have the string "Perl" and must not be followed by
> >> "PHP" in it, so that it would match:
> >>
> >> "I like Perl"
> >> "Perl is cool"
> >>
> >> But not match
> >>
> >> "I like Perl more than PHP"
> >> "Although PHP is OK"

> >
> > Then look for "look-ahead" in perlre.pod.

>
> hmmm. Doesn't seem to work because look-ahead cannot deal with wildcards:
>
> /Perl.*(?!PHP)/


No, your "wild cards" make short work of the look-ahead. Even if
there is a "PHP" after "Perl", it is always possible for ".*" to match
enough of the following string to make any "PHP" disappear, so the
negative look-ahead succeeds (doesn't see PHP). Take the ".*" into
the lookahead:

/Perl(.*?!PHP)/

Anno
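To see what happens with the original pattern /Perl.*(?!PHP)/, a capture group can show how much text ".*" takes. This is only a minimal sketch: the capture group and the variable names are additions for illustration and are not part of the pattern posted above.

```perl
use strict;
use warnings;

my $text = "I like Perl more than PHP";

# The parentheses around .* are added only to expose what it matched.
my ($swallowed) = $text =~ /Perl(.*)(?!PHP)/;

# Greedy .* runs to the end of the string. Nothing follows it there,
# so the negative look-ahead finds no "PHP" to object to.
print defined $swallowed
    ? "matched, .* took: '$swallowed'\n"
    : "no match\n";
```

The captured text includes the "PHP" that was supposed to be rejected, which is why the look-ahead has to contain the ".*" itself.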
 
 
Anno Siegel
Guest
Posts: n/a
 
      10-06-2003
Bernard El-Hagin <bernard.el-> wrote in comp.lang.perl.misc:
> "Ben Holness" <> wrote in
> news:
>
> >
> >>> The phrase must have the string "Perl" and must not be followed by
> >>> "PHP" in it, so that it would match:
> >>>
> >>> "I like Perl"
> >>> "Perl is cool"
> >>>
> >>> But not match
> >>>
> >>> "I like Perl more than PHP"
> >>> "Although PHP is OK"
> >>
> >> Then look for "look-ahead" in perlre.pod.

> >
> > hmmm. Doesn't seem to work because look-ahead cannot deal with wildcards:
> >
> > /Perl.*(?!PHP)/

>
>
> Try
>
>
> /Perl(?!.*PHP)/
>
>
> > doesn't do what I want perlre suggests that it's easier to have it as
> > two regular expressions, which is what I was trying to avoid.

>
>
> Why? It's a perfectly valid suggestion.


The condition that "PHP" must come after "Perl" makes the two-regex
solution a little less attractive. Some trickery with pos() or
@+ is required, as in

/Perl/g && !/\G.*PHP/

which makes it slightly obscure.

Anno
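As a rough sketch of the pos()-based version, the two matches can be wrapped in a subroutine. The name perl_not_followed_by_php and the sample strings are made up for illustration. It relies on a scalar-context /g match recording pos() just past "Perl", and on \G anchoring the second match at that position.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the first "Perl" has no "PHP" after it.
sub perl_not_followed_by_php {
    my ($text) = @_;    # fresh copy, so pos() starts out undefined

    # Scalar-context /g match: on success, pos($text) is just past "Perl".
    return 0 unless $text =~ /Perl/g;

    # \G anchors this second match at pos($text).
    return $text =~ /\G.*PHP/ ? 0 : 1;
}

for my $s ("I like Perl", "I like Perl more than PHP", "PHP, then Perl") {
    printf "%-28s %s\n", $s, perl_not_followed_by_php($s) ? "ok" : "rejected";
}
```

Working on a copy of the string inside the subroutine keeps a leftover pos() from an earlier match out of the picture.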
 
 
Ben Holness
Guest
Posts: n/a
 
      10-06-2003

>> doesn't do what I want perlre suggests that it's easier to have it as
>> two regular expressions, which is what I was trying to avoid.

>
>
> Why? It's a perfectly valid suggestion.


The system I have built checks messages for particular content. The
content is defined in a database, so if I need more than one regex, I need
to implement some slightly more clever code than just getting the regex
from the db and matching it against the message.

The suggestions from yourself and Anno are what I needed, though:

/Perl(?!.*PHP)/ does exactly what I need.

Thanks,

Ben
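A setup like the one described, with one pattern per rule stored as text, might look roughly like the sketch below. Everything here is an assumption for illustration: the %rules hash stands in for rows fetched from the database, and the rule name and matching_rules are invented.

```perl
use strict;
use warnings;

# Hypothetical stand-in for database rows:
# rule name => pattern source stored as plain text.
my %rules = (
    perl_without_php => 'Perl(?!.*PHP)',
);

# Compile each stored pattern once, up front.
my %compiled = map { $_ => qr/$rules{$_}/ } keys %rules;

# Return the names of all rules whose pattern matches the message.
sub matching_rules {
    my ($message) = @_;
    my @hits = grep { $message =~ $compiled{$_} } sort keys %compiled;
    return @hits;
}

print join(", ", matching_rules("Perl is cool")) || "(none)", "\n";
print join(", ", matching_rules("I like Perl more than PHP")) || "(none)", "\n";
```

Because the "not" lives inside the pattern, the matching code stays a plain loop over stored regexes, which is the point of wanting a single expression.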
 
 
Randal L. Schwartz
Guest
Posts: n/a
 
      10-06-2003
>>>>> "Anno" == Anno Siegel <> writes:

>> /Perl.*(?!PHP)/


Anno> No, your "wild cards" make short work of the look-ahead. Even if
Anno> there is a "PHP" after "Perl", it is always possible for ".*" to match
Anno> enough of the following string to make any "PHP" disappear, so the
Anno> negative look-ahead succeeds (doesn't see PHP). Take the ".*" into
Anno> the lookahead:

Anno> /Perl(.*?!PHP)/

Right, it's the difference between:

Can I find Perl, followed by some number of characters,
followed by something that isn't PHP?

versus

Can I find Perl, followed immediately by something that isn't
"some number of characters followed by PHP"?

Logic can be tough sometimes. Luckily, regexes are precise, and do
exactly what you tell them.

print "Just another Perl hacker,"

--
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<> <URL:http://www.stonehenge.com/merlyn/>
Perl/Unix/security consulting, Technical writing, Comedy, etc. etc.
See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!
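The two readings can be checked against the four strings from the original question. This is a small sketch, not anything posted in the thread: the variable names are invented, the first pattern is the one with the look-ahead after ".*", and the second has ".*" inside the look-ahead.

```perl
use strict;
use warnings;

my @samples = (
    "I like Perl",
    "Perl is cool",
    "I like Perl more than PHP",
    "Although PHP is OK",
);

my $after_wildcard   = qr/Perl.*(?!PHP)/;   # look-ahead tested after .* has matched
my $inside_lookahead = qr/Perl(?!.*PHP)/;   # .* is part of what must not follow

my %result;
for my $s (@samples) {
    $result{$s} = [
        ($s =~ $after_wildcard   ? 1 : 0),
        ($s =~ $inside_lookahead ? 1 : 0),
    ];
    printf "%-28s first: %d  second: %d\n", $s, @{ $result{$s} };
}
```

Only the third string separates the two patterns: the first accepts it and the second rejects it. The fourth fails both simply because it contains no "Perl".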
 
 
Jeff 'japhy' Pinyan
Guest
Posts: n/a
 
      10-06-2003
On Mon, 6 Oct 2003, Randal L. Schwartz wrote:

>>>>>> "Anno" == Anno Siegel <> writes:

>
>>> /Perl.*(?!PHP)/

>
>Anno> /Perl(.*?!PHP)/
>
>Right, it's the difference between:
>
> Can I find Perl, followed by some number of characters,
> followed by something that isn't PHP?
>
>versus
>
> Can I find Perl, followed immediately by something that isn't
> "some number of characters followed by PHP"?


Uhhh, except that Anno misplaced the '?!' in that regex. It should be

/Perl(?!.*PHP)/

--
Jeff Pinyan RPI Acacia Brother #734 2003 Rush Chairman
"And I vos head of Gestapo for ten | Michael Palin (as Heinrich Bimmler)
years. Ah! Five years! Nein! No! | in: The North Minehead Bye-Election
Oh. Was NOT head of Gestapo AT ALL!" | (Monty Python's Flying Circus)

 
 
Roy Johnson
Guest
Posts: n/a
 
      10-06-2003
(Anno Siegel) wrote in message news:<blrk1p$31f$>...

> /Perl(.*?!PHP)/


By which you mean

/Perl(?!.*PHP)/
 
 
Malcolm Dew-Jones
Guest
Posts: n/a
 
      10-06-2003
Anno Siegel () wrote:
: Bernard El-Hagin <bernard.el-> wrote in comp.lang.perl.misc:
: > "Ben Holness" <> wrote in
: > news:
: >
: > >
: > >>> The phrase must have the string "Perl" and must not be followed by
: > >>> "PHP" in it, so that it would match:
: > >>>
: > >>> "I like Perl"
: > >>> "Perl is cool"
: > >>>
: > >>> But not match
: > >>>
: > >>> "I like Perl more than PHP"
: > >>> "Although PHP is OK"
: > >>
: > >> Then look for "look-ahead" in perlre.pod.
: > >
: > > hmmm. Doesn't seem to work because look-ahead cannot deal with wildcards:
: > >
: > > /Perl.*(?!PHP)/
: >
: >
: > Try
: >
: >
: > /Perl(?!.*PHP)/
: >
: >
: > > doesn't do what I want perlre suggests that it's easier to have it as
: > > two regular expressions, which is what I was trying to avoid.
: >
: >
: > Why? It's a perfectly valid suggestion.

: The condition that "PHP" must come after "Perl" makes the two-regex
: solution a little less attractive. Some trickery with pos() or
: @+ is required, as in

: /Perl/g && !/\G.*PHP/

: which makes it slightly obscure.

No, simply look for what you don't want and reject it:

$match = /Perl/            # needs to match this
      && ! /Perl.*PHP/;    # but mustn't match this
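The two-regex test and the single look-ahead pattern agree on the examples in this thread, but they are not quite the same condition. The sketch below compares them; the subroutine names and the third test string are invented for illustration. The difference shows up when "Perl" occurs again after the last "PHP".

```perl
use strict;
use warnings;

# Single pattern: some "Perl" has no "PHP" after it.
sub one_regex {
    my ($s) = @_;
    return $s =~ /Perl(?!.*PHP)/ ? 1 : 0;
}

# Two patterns: there is a "Perl", and no "Perl" is followed by "PHP".
sub two_regex {
    my ($s) = @_;
    return ($s =~ /Perl/ && $s !~ /Perl.*PHP/) ? 1 : 0;
}

for my $s ("I like Perl", "I like Perl more than PHP", "Perl, PHP, and Perl again") {
    printf "%-28s one regex: %d  two regexes: %d\n",
        $s, one_regex($s), two_regex($s);
}
```

On the third string the single pattern matches at the second "Perl", while the two-regex test rejects the message because the first "Perl" is followed by "PHP". Which behaviour is wanted depends on the rule being expressed.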

 
 
 
 