python regex "negative lookahead assertions" problems
Hi List,
I'm trying to match lines in python using the re module.
The end goal is to have a regex which enables me to skip lines which have ok and warning in it.
But for some reason I can't get negative lookaheads working, the way it's explained in "http://docs.python.org/library/re.html".

Consider this example:

Python 2.6.4 (r264:75706, Nov 2 2009, 14:38:03)
[GCC 4.4.1] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import re
>>> line='2009-11-22 12:15:441 lmqkjsfmlqshvquhsudfhqf qlsfh qsduidfhqlsiufh qlsiuf qldsfhqlsifhqlius dfh warning qlsfj lqshf lqsuhf lqksjfhqisudfh qiusdfhq iusfh'
>>> re.match('.*(?!warning)',line)
<_sre.SRE_Match object at 0xb75b1598>

I would expect that this would NOT match as it's a negative lookahead and warning is in the string.

Thanks,

--
Jelle Smet
http://www.smetj.net
Re: python regex "negative lookahead assertions" problems
On 11/22/09 14:58, Jelle Smet wrote:
> Hi List,
>
> I'm trying to match lines in python using the re module.
> The end goal is to have a regex which enables me to skip lines which have ok and warning in it.
> But for some reason I can't get negative lookaheads working, the way it's explained in "http://docs.python.org/library/re.html".
>
> Consider this example:
>
> Python 2.6.4 (r264:75706, Nov 2 2009, 14:38:03)
> [GCC 4.4.1] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import re
> >>> line='2009-11-22 12:15:441 lmqkjsfmlqshvquhsudfhqf qlsfh qsduidfhqlsiufh qlsiuf qldsfhqlsifhqlius dfh warning qlsfj lqshf lqsuhf lqksjfhqisudfh qiusdfhq iusfh'
> >>> re.match('.*(?!warning)',line)
> <_sre.SRE_Match object at 0xb75b1598>
>
> I would expect that this would NOT match as it's a negative lookahead and warning is in the string.
>

'.*' eats all of line. Now, when at end of line, there is no 'warning' anymore, so the lookahead succeeds and the pattern matches.
What are you trying to achieve?

If you just want to single out lines with 'ok' or 'warning' in it, why not just

if re.search('(ok|warning)', line) : call_skip

Helmut.

--
Helmut Jarausch
Lehrstuhl fuer Numerische Mathematik
RWTH - Aachen University
D 52056 Aachen, Germany
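To make the point about '.*' concrete, here is a minimal sketch. It is not from the thread: the shortened `line` value and the anchored pattern `^(?!.*warning)` are assumptions chosen for illustration. It shows that the original pattern matches only because the lookahead is tested at the very end of the line, and that one possible alternative is to put the lookahead first and let it scan ahead itself.

```python
import re

# Shortened stand-in for the log line in the question (assumed for illustration).
line = "2009-11-22 12:15:441 lmqkjsf dfh warning qlsfj lqshf"

# Greedy '.*' consumes the whole line first. At the end of the line nothing
# follows, so the negative lookahead (?!warning) succeeds and the match stands.
greedy = re.match(r".*(?!warning)", line)
print(greedy is not None, greedy.end() == len(line))

# One possible alternative: test the lookahead at the start of the line and let
# it look past any characters for 'warning'. The match fails if it is found.
no_warning = re.compile(r"^(?!.*warning)")
print(no_warning.match(line))                  # no match: line contains 'warning'
print(no_warning.match("all is well") is not None)
```

Whether this is preferable to a plain `re.search` followed by a skip depends on the surrounding code; the lookahead form is mainly useful when the condition has to live inside a single pattern.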
Re: python regex "negative lookahead assertions" problems
On 11/22/09 16:05, Helmut Jarausch wrote:
> On 11/22/09 14:58, Jelle Smet wrote:
>> Hi List,
>>
>> I'm trying to match lines in python using the re module.
>> The end goal is to have a regex which enables me to skip lines which
>> have ok and warning in it.
>> But for some reason I can't get negative lookaheads working, the way
>> it's explained in "http://docs.python.org/library/re.html".
>>
>> Consider this example:
>>
>> Python 2.6.4 (r264:75706, Nov 2 2009, 14:38:03)
>> [GCC 4.4.1] on linux2
>> Type "help", "copyright", "credits" or "license" for more information.
>> >>> import re
>> >>> line='2009-11-22 12:15:441 lmqkjsfmlqshvquhsudfhqf qlsfh qsduidfhqlsiufh qlsiuf qldsfhqlsifhqlius dfh warning qlsfj lqshf lqsuhf lqksjfhqisudfh qiusdfhq iusfh'
>> >>> re.match('.*(?!warning)',line)
>> <_sre.SRE_Match object at 0xb75b1598>
>>
>> I would expect that this would NOT match as it's a negative lookahead
>> and warning is in the string.
>>
>
> '.*' eats all of line. Now, when at end of line, there is no 'warning'
> anymore, so it matches.
> What are you trying to achieve?
>
> If you just want to single out lines with 'ok' or warning in it, why not
> just
> if re.search('(ok|warning)', line) : call_skip
>

Probably you don't want words like 'joke' to match 'ok'. So, a better regex is

if re.search(r'\b(ok|warning)\b', line) : SKIP_ME

Note the raw string: without the r prefix, Python reads '\b' as a backspace character rather than a word boundary.

Helmut.

--
Helmut Jarausch
Lehrstuhl fuer Numerische Mathematik
RWTH - Aachen University
D 52056 Aachen, Germany
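As a rough sketch of how the word-boundary pattern could be used to skip lines, assuming the lines are already available as a list of strings (the `keep_lines` helper and the sample data below are hypothetical and not part of the thread):

```python
import re

# Raw string, so \b is a regex word boundary and not a backspace character.
SKIP = re.compile(r"\b(ok|warning)\b")

def keep_lines(lines):
    """Return the lines that contain neither 'ok' nor 'warning' as whole words."""
    return [ln for ln in lines if not SKIP.search(ln)]

# Hypothetical sample input for illustration.
sample = [
    "disk check ok",
    "warning: low memory",
    "that was a joke",
    "error: link down",
]

kept = keep_lines(sample)
print(kept)

# In a non-raw string literal, '\b' is the single backspace character.
print("\b" == "\x08")
```

Compiling the pattern once is a small convenience when many lines are checked; calling `re.search` with the pattern string on every line would behave the same way.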