regex module, or don't work as expected

 
 
Fabian Holler
 
      07-04-2006
Howdy,


I have the following regex: "iface lo[\w\t\n\s]+(?=(iface)|$)"

If "iface" doesn't follow after the regex "iface lo[\w\t\n\s]", the rest of
the text should be selected.
But ?=(iface) is ignored; the whole text is always selected.
What is wrong?


many thanks

greetings

Fabian
 
 
 
 
 
Marc 'BlackJack' Rintsch
 
      07-04-2006
In <44aa670d$0$7872$(E-Mail Removed)>, Fabian Holler wrote:

> Howdy,
>
>
> I have the following regex: "iface lo[\w\t\n\s]+(?=(iface)|$)"
>
> If "iface" doesn't follow after the regex "iface lo[\w\t\n\s]", the rest of
> the text should be selected.
> But ?=(iface) is ignored; the whole text is always selected.
> What is wrong?


The ``+`` after the character class means one or more of the characters
in the class. If you have a text like:

iface lox iface

then the class also matches the space and the word ``iface``, because the
space (``\s``) and word characters (``\w``) are part of the character class
and ``+`` is "greedy". It consumes as many characters as possible, and the
engine only gives characters back (backtracks) if the rest of the regex
fails to match at that point. Here the lookahead succeeds at the end of the
text via ``$``, so nothing is given back.

If you want a non-greedy match, put a ``?`` after the ``+``::

iface lo[\w\t\n\s]+?(?=(iface)|$)

Now only "iface lox " is matched in the example above.
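
A minimal sketch of the difference in Python, using the sample text from above. The variable names are only for illustration:

```python
import re

text = "iface lox iface"

# Greedy: the character class swallows " iface" as well, and the
# lookahead is then satisfied by the end of the string ($).
greedy = re.search(r"iface lo[\w\t\n\s]+(?=(iface)|$)", text)

# Non-greedy: the class is extended one character at a time and the
# lookahead is tried after each step, so the match stops before "iface".
lazy = re.search(r"iface lo[\w\t\n\s]+?(?=(iface)|$)", text)

print(repr(greedy.group(0)))  # 'iface lox iface'
print(repr(lazy.group(0)))    # 'iface lox '
```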

Ciao,
Marc 'BlackJack' Rintsch
 
 
 
 
 
Fabian Holler
 
      07-04-2006
Hello Marc,

thank you for your answer.

Marc 'BlackJack' Rintsch wrote:
> In <44aa670d$0$7872$(E-Mail Removed)>, Fabian Holler wrote:


>> I have the following regex: "iface lo[\w\t\n\s]+(?=(iface)|$)"
>>
>> If "iface" doesn't follow after the regex "iface lo[\w\t\n\s]", the rest of
>> the text should be selected.
>> But ?=(iface) is ignored; the whole text is always selected.
>> What is wrong?

>
> The ``+`` after the character class means at least one of the characters
> in the class or more. If you have a text like:


Yes, that's right, but that isn't my problem.
The problem is in the "(?=(iface)|$)" part.

For example, I have the text:

"auto lo eth0
<MATCH START>iface lo inet loopback
bla
blub

<MATCH END>iface eth0 inet dhcp
hostname debian"


My regex should match the marked text.
But it matches the whole text starting from the first "iface".
If there is only one iface entry, the whole text starting from "iface"
should be matched.

greetings

Fabian
 
 
Fredrik Lundh
 
      07-04-2006
Fabian Holler wrote:

> Yes, that's right, but that isn't my problem.
> The problem is in the "(?=(iface)|$)" part.


no, the problem is that you're thinking "procedural string matching from
left to right", but that's not how regular expressions work.

> For example, I have the text:
>
> "auto lo eth0
> <MATCH START>iface lo inet loopback
> bla
> blub
>
> <MATCH END>iface eth0 inet dhcp
> hostname debian"
>
>
> My regex should match the marked text.
> But it matches the whole text starting from the first "iface".


which is perfectly valid, since a plain "+" is greedy, and you've asked
for "iface lo" followed by some text followed by *either* end of string
or another "iface". the rest of the string is a perfectly valid "some
text", so the greedy "+" takes all of it and the lookahead succeeds at
end of string.

if you want a non-greedy match, use "+?" instead.
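
a rough sketch of both variants on the sample text from the post. the pattern here is slightly simplified as an assumption (``\s`` already covers tab and newline, and the capturing group inside the lookahead is dropped), and the variable names are only illustrative:

```python
import re

config = (
    "auto lo eth0\n"
    "iface lo inet loopback\n"
    "bla\n"
    "blub\n"
    "\n"
    "iface eth0 inet dhcp\n"
    "hostname debian"
)

# greedy: runs to the end of the text, where "$" satisfies the lookahead
greedy = re.search(r"iface lo[\w\s]+(?=iface|$)", config).group(0)

# non-greedy: stops right before the next "iface"
lazy = re.search(r"iface lo[\w\s]+?(?=iface|$)", config).group(0)

# with only one iface entry, the non-greedy match still runs to the end,
# because "$" is the first point where the lookahead can succeed
single = "auto lo\niface lo inet loopback\nbla"
lazy_single = re.search(r"iface lo[\w\s]+?(?=iface|$)", single).group(0)

print(repr(lazy))         # 'iface lo inet loopback\nbla\nblub\n\n'
print(repr(lazy_single))  # 'iface lo inet loopback\nbla'
```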

however, if you just want the text between two string literals, it's
often more efficient to just split the string twice:

text = text.split("iface lo", 1)[1].split("iface", 1)[0]
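
spelled out on the sample text, as a sketch with illustrative variable names. note that the result does not include the leading "iface lo", and that the first ``[1]`` raises IndexError if "iface lo" does not occur in the text at all:

```python
config = (
    "auto lo eth0\n"
    "iface lo inet loopback\n"
    "bla\n"
    "blub\n"
    "\n"
    "iface eth0 inet dhcp\n"
    "hostname debian"
)

# everything after the first "iface lo" ...
after_lo = config.split("iface lo", 1)[1]

# ... up to, but not including, the next "iface"
# (or up to the end of the text if there is no further "iface")
body = after_lo.split("iface", 1)[0]

print(repr(body))  # ' inet loopback\nbla\nblub\n\n'
```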

</F>

 