# re.findall

Will Stuyvesant

 07-16-2004
A question about the findall function in the re module, and I also
would be happy with pointers to online documentation with which I
could have found a solution myself (if it even exists!).

I have this program and result:

-------------- program ------------------
import re

pat = r'(.*) or (.*)'
r = re.findall(pat, "i or j or k")
print(r)
-------------- result -------------------
[('i or j', 'k')]
-------------------------------------------

But the result I want is this:
[('i', 'j or k'), ('i or j', 'k')]

So my regular expression is wrong, but variations like
r'^(.*|.*?) or (.*|.*?)$' do not work either. I have the feeling I am
overlooking something simple. Maybe I should not try to use
re.findall to get the desired result, but call re.match in a
while-loop instead?

Stated in English, the problem is this: give me all combinations of
strings left and right of r' or ', and return them as a list of
tuples. Sure, I can solve this without the re module, using
while-loops, string.find, etc., but I also have variations of
this problem, and learning re seems useful.
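
Read literally, that statement amounts to cutting the string at each occurrence of the separator and keeping what lies on either side. A minimal sketch of that reading, assuming a hypothetical helper named split_pairs (not part of the re module or of this thread) and assuming the separator occurrences do not overlap each other:

```python
import re

def split_pairs(text, sep=r' or '):
    """Return a (left, right) tuple for each occurrence of the separator pattern."""
    # finditer yields one match object per separator; slicing around it
    # gives everything to its left and everything to its right.
    return [(text[:m.start()], text[m.end():]) for m in re.finditer(sep, text)]

print(split_pairs("i or j or k"))
# prints [('i', 'j or k'), ('i or j', 'k')]
```

Because only the separator is matched, the two halves never compete for the same characters, which is what defeats the single r'(.*) or (.*)' pattern.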

Matthias Huening

 07-16-2004
Will Stuyvesant wrote:

> A question about the findall function in the re module, and I also
> would be happy with pointers to online documentation with which I
> could have found a solution myself (if it even exists!).
>

The problem is that the strings you want to find are overlapping.
This should get you started:

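import re

s = "i or j or k or grr"
pat = re.compile(r'\w+ or \w+')

# Restart the search one character past the start of each match,
# so that overlapping matches are found as well.
startposition = 0
while True:
    res = pat.search(s, startposition)
    if res is None:
        break
    startposition = res.start() + 1
    print(res.group())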

Matthias
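
A possible alternative to the search loop above, offered as a sketch rather than as what either poster used: wrapping the pattern in a lookahead lets re.findall report overlapping matches in a single call. The function name overlapping_matches is made up for illustration, and the \b anchor is an added assumption that matches should start only at word boundaries.

```python
import re

def overlapping_matches(text):
    """Return every 'word or word' match, including ones that share a word."""
    # (?=...) is zero-width, so the scan advances one character at a time
    # and matches that share text are all reported via the inner group.
    # \b keeps a match from starting in the middle of a word.
    return re.findall(r'(?=(\b\w+ or \w+))', text)

print(overlapping_matches("i or j or k or grr"))
# prints ['i or j', 'j or k', 'k or grr']
```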