a simple regex question

 
 
John Salerno
 
      03-31-2006
Ok, I'm stuck on another Python challenge question. Apparently what you
have to do is search through a huge group of characters and find a
single lowercase character that has exactly three uppercase characters
on either side of it. Here's what I have so far:

pattern = '([a-z][A-Z]{3}[a-z][A-Z]{3}[a-z])+'
print re.search(pattern, mess).groups()

Not sure if 'groups' is necessary or not.

Anyway, this returns one matching string, but when I put this letter in
as the solution to the problem, I get a message saying "yes, but there
are more", so assuming this means that there is more than one character
with three caps on either side, is my RE written correctly to find them
all? I didn't have the parentheses or + sign at first, but I added them
to find all the possible matches, but still only one comes up.

Thanks.
 
John Salerno
 
      03-31-2006
John Salerno wrote:
> Ok, I'm stuck on another Python challenge question. Apparently what you
> have to do is search through a huge group of characters and find a
> single lowercase character that has exactly three uppercase characters
> on either side of it. Here's what I have so far:
>
> pattern = '([a-z][A-Z]{3}[a-z][A-Z]{3}[a-z])+'
> print re.search(pattern, mess).groups()
>
> Not sure if 'groups' is necessary or not.
>
> Anyway, this returns one matching string, but when I put this letter in
> as the solution to the problem, I get a message saying "yes, but there
> are more", so assuming this means that there is more than one character
> with three caps on either side, is my RE written correctly to find them
> all? I didn't have the parentheses or + sign at first, but I added them
> to find all the possible matches, but still only one comes up.
>
> Thanks.


A quick note: I found nine more matches by using findall() instead of
search(), but I'm still curious how to write the RE so that it works
with search, especially since findall wouldn't have returned overlapping
matches. I guess I didn't write it to properly check multiple times.
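
A minimal sketch of the difference between the two calls, assuming a made-up sample string (the real challenge text in "mess" is not shown in this thread): search() stops at the first match, while findall() and finditer() scan the whole string for non-overlapping matches.

```python
import re

# Hypothetical sample text; it stands in for the challenge data, which is not in the thread.
mess = "xxaBCDeFGHixxkLMNoPQRsxx"

# The original pattern without the outer parentheses and "+".
pattern = r'[a-z][A-Z]{3}[a-z][A-Z]{3}[a-z]'

# search() returns a single match object for the first hit only.
first = re.search(pattern, mess)
print(first.group())

# findall() returns every non-overlapping hit as a list of strings.
every = re.findall(pattern, mess)
print(every)

# finditer() yields match objects, which also carry the position of each hit.
starts = [m.start() for m in re.finditer(pattern, mess)]
print(starts)
```

If search() has to be used, it can be called repeatedly through a compiled pattern, passing the end of the previous match as the start position, but finditer() does the same bookkeeping for you.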
 
Justin Azoff
 
      04-01-2006

John Salerno wrote:
> Ok, I'm stuck on another Python challenge question. Apparently what you
> have to do is search through a huge group of characters and find a
> single lowercase character that has exactly three uppercase characters
> on either side of it. Here's what I have so far:
>
> pattern = '([a-z][A-Z]{3}[a-z][A-Z]{3}[a-z])+'
> print re.search(pattern, mess).groups()
>
> Not sure if 'groups' is necessary or not.
>
> Anyway, this returns one matching string, but when I put this letter in
> as the solution to the problem, I get a message saying "yes, but there
> are more", so assuming this means that there is more than one character
> with three caps on either side, is my RE written correctly to find them
> all? I didn't have the parentheses or + sign at first, but I added them
> to find all the possible matches, but still only one comes up.
>
> Thanks.


I don't believe you _need_ the parentheses or the + in that usage...

Have a look at http://docs.python.org/lib/node115.html

It should be obvious which method you need to use to "find them all"

--
- Justin

 
John Salerno
 
      04-01-2006
Justin Azoff wrote:
> John Salerno wrote:
>> Ok, I'm stuck on another Python challenge question. Apparently what you
>> have to do is search through a huge group of characters and find a
>> single lowercase character that has exactly three uppercase characters
>> on either side of it. Here's what I have so far:
>>
>> pattern = '([a-z][A-Z]{3}[a-z][A-Z]{3}[a-z])+'
>> print re.search(pattern, mess).groups()
>>
>> Not sure if 'groups' is necessary or not.
>>
>> Anyway, this returns one matching string, but when I put this letter in
>> as the solution to the problem, I get a message saying "yes, but there
>> are more", so assuming this means that there is more than one character
>> with three caps on either side, is my RE written correctly to find them
>> all? I didn't have the parentheses or + sign at first, but I added them
>> to find all the possible matches, but still only one comes up.
>>
>> Thanks.
>
> I don't believe you _need_ the parentheses or the + in that usage...
>
> Have a look at http://docs.python.org/lib/node115.html
>
> It should be obvious which method you need to use to "find them all"
>


But would findall return this match: aMNHiRFLoDLFb ??

There are actually two matches there, but they overlap. So how would
you write an RE that catches them both?
 
Dennis Lee Bieber
 
      04-01-2006
On Fri, 31 Mar 2006 18:39:43 -0500, John Salerno declaimed the
following in comp.lang.python:

> Ok, I'm stuck on another Python challenge question. Apparently what you
> have to do is search through a huge group of characters and find a
> single lowercase character that has exactly three uppercase characters
> on either side of it. Here's what I have so far:
>
> pattern = '([a-z][A-Z]{3}[a-z][A-Z]{3}[a-z])+'
> print re.search(pattern, mess).groups()
>

I don't do REs; but what exactly are you supposed to return? A
count, the index to where such a match occurred, the 7-characters
themselves?

I'd probably do something very simplistic:

>>> c = "A long STRiNGS testing is aVAIlABLe"
>>> for x in range(3, len(c) - 3):
...     # three characters before x, the character at x, three after x
...     if c[x-3:x].isupper() and c[x].islower() and c[x+1:x+4].isupper():
...         print "=>", c[x-3:x+4]
...
=> STRiNGS
=> VAIlABL
>>>


Needs a bit more work since it doesn't exclude having MORE than
three uppercase on a side... Testing -4 and +4 for lowercase would do
most of it... But that ends up making the start and end of data special
cases...
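
A rough sketch of that refinement, written for Python 3 and not taken from the thread: it checks the fourth character on each side and treats the start and end of the data as "not uppercase", which removes the special cases. The function name guarded_lowercase and the ASCII-only helper are assumptions made for illustration.

```python
def is_ascii_upper(ch):
    """True only for a single character in A-Z; the empty string counts as not uppercase."""
    return "A" <= ch <= "Z"


def guarded_lowercase(text):
    """Return the lowercase letters that have exactly three capitals on each side."""
    hits = []
    for x in range(3, len(text) - 3):
        if not ("a" <= text[x] <= "z"):
            continue
        # All six neighbours must be capitals.
        neighbours = text[x - 3:x] + text[x + 1:x + 4]
        if not all(is_ascii_upper(ch) for ch in neighbours):
            continue
        # The fourth character out on each side must not be a capital.
        # Past either end of the data there is no character, so use "".
        before = text[x - 4] if x >= 4 else ""
        after = text[x + 4] if x + 4 < len(text) else ""
        if is_ascii_upper(before) or is_ascii_upper(after):
            continue
        hits.append(text[x])
    return hits


print(guarded_lowercase("A long STRiNGS testing is aVAIlABLe"))
print(guarded_lowercase("aMNHiRFLoDLFb"))
```

Because nothing is consumed between iterations, overlapping hits such as the two in "aMNHiRFLoDLFb" are both reported.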

 
Roel Schroeven
 
      04-01-2006
John Salerno wrote:
>> pattern = '([a-z][A-Z]{3}[a-z][A-Z]{3}[a-z])+'
>> print re.search(pattern, mess).groups()
>>
>> Anyway, this returns one matching string, but when I put this letter in
>> as the solution to the problem, I get a message saying "yes, but there
>> are more", so assuming this means that there is more than one character
>> with three caps on either side, is my RE written correctly to find them
>> all? I didn't have the parentheses or + sign at first, but I added them
>> to find all the possible matches, but still only one comes up.
>>
>> Thanks.
>
> A quick note: I found nine more matches by using findall() instead of
> search(), but I'm still curious how to write the RE so that it works
> with search, especially since findall wouldn't have returned overlapping
> matches. I guess I didn't write it to properly check multiple times.


It seems to me you should be able to find all matches with search(). Not
with the pattern you mention above: that will only find matches if they
come right after each other, as in
xXXXxXXXxyYYYyYYYyzZZZzZZZz

You'll need something more like
pattern = '([a-z][A-Z]{3}[a-z][A-Z]{3}[a-z]+)+'
so that it will also find matches that are separated from each other by
runs of lowercase letters.

That said, I think findall() is a better solution for this problem. I
don't think search() will find overlapping matches either, so that's no
reason not to use findall(), and the pattern is simpler with findall();
I solved this challenge with findall() and this regular expression:

pattern = r'[a-z][A-Z]{3}[a-z][A-Z]{3}[a-z]'
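
A short sketch of why wrapping the pattern in ( ... )+ does not collect every match, using the back-to-back example string from above: a repeated group keeps only its last repetition, so groups() still yields a single string. The variable names are chosen for illustration.

```python
import re

# The back-to-back example from this post.
text = "xXXXxXXXxyYYYyYYYyzZZZzZZZz"

grouped = r'([a-z][A-Z]{3}[a-z][A-Z]{3}[a-z])+'
plain = r'[a-z][A-Z]{3}[a-z][A-Z]{3}[a-z]'

m = re.search(grouped, text)
print(m.group(0))   # the whole run matched by the repeated group
print(m.groups())   # only the last repetition of the group survives

# With one capturing group, findall() returns the group's text, so it
# shows the same single value; the plain pattern returns each piece.
print(re.findall(grouped, text))
print(re.findall(plain, text))
```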


--
If I have been able to see further, it was only because I stood
on the shoulders of giants. -- Isaac Newton

Roel Schroeven
 
Paddy
 
      04-02-2006

John Salerno wrote:
> But would findall return this match: aMNHiRFLoDLFb ??
>
> There are actually two matches there, but they overlap. So how would
> you write an RE that catches them both?


I remembered the 'non-consuming' match (?=...) and a minute of
experimentation gave the following.

>>> import re
>>> s = "aMNHiRFLoDLFb"
>>> re.findall(r'[A-Z]{3}([a-z])(?=[A-Z]{3})', s)
['i', 'o']
>>>
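
The pattern above accepts three or more capitals on each side. A possible variant, offered as a sketch and not as the accepted challenge solution, adds negative lookarounds so that exactly three are required while overlapping hits are still found. The name EXACTLY_THREE is made up for this example.

```python
import re

# Lookarounds test the neighbouring characters without consuming them, so the
# capitals to the right of one hit can serve as the left side of the next hit.
EXACTLY_THREE = re.compile(
    r'(?<![A-Z])[A-Z]{3}'      # three capitals with no capital just before them
    r'([a-z])'                 # the guarded lowercase letter (captured)
    r'(?=[A-Z]{3}(?![A-Z]))'   # three capitals follow, and not a fourth
)

overlapping = EXACTLY_THREE.findall("aMNHiRFLoDLFb")
too_many = EXACTLY_THREE.findall("aMNHHiRFLb")
at_edges = EXACTLY_THREE.findall("MNHiRFL")

print(overlapping)
print(too_many)
print(at_edges)
```

The negative assertions also succeed at the very start and end of the string, so a hit at either edge needs no special handling.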


- Paddy.

 