regex question

 
 
Helmut Jarausch
 
      06-25-2005
Felix Schwarz wrote:
> Hi all,
>
> I'm having problems with a regular expression and I can't figure
> out which words to use when googling. I have read the Python documentation for
> the re module multiple times now but still have no idea what I'm doing wrong.
>
> What I want to do:
> - Extract all digits (\d) in a string.
> - Digits are separated by whitespace (\s)
>
> What my program does:
> - It extracts only the last digit.
>
> Here is my program:
> import re
> line = ' 1 2 3'
> regex = '^' + '(?:\s+(\d))*' + '$'
> match = re.match(regex, line)
> print "lastindex is: ",match.lastindex
> print "matches: ",match.group(1)
>
>
> Obviously I do not understand how (?:\s+(\d))* works in conjunction with
> ^ and $.
>


I am not sure what you would like to do.
What about
regex = re.compile(r'\s+\d')
print regex.findall(line)
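
A minimal sketch of why the original pattern reports only the last digit and how findall avoids that, assuming Python 3 and the sample line from the question. A capture group inside a repeated group keeps only the text of its final iteration. The helper name extract_digits is made up for illustration.

```python
import re

line = ' 1 2 3'

# A capture group inside a repeated group keeps only its last iteration,
# so group(1) is '3' even though the whole line matched.
match = re.match(r'^(?:\s+(\d))*$', line)
print(match.group(0))   # ' 1 2 3'
print(match.group(1))   # '3'

# findall returns every non-overlapping match. Without a capture group it
# returns the whole match, leading whitespace included.
print(re.findall(r'\s+\d', line))     # [' 1', ' 2', ' 3']

# With exactly one capture group, findall returns just that group.
print(re.findall(r'\s+(\d)', line))   # ['1', '2', '3']


def extract_digits(text):
    """Hypothetical helper: validate the whole line, then collect the digits."""
    if re.fullmatch(r'(?:\s+\d)*', text) is None:
        return None  # the line is not purely whitespace-separated digits
    return re.findall(r'\d', text)


print(extract_digits(line))       # ['1', '2', '3']
print(extract_digits(' 1 x 3'))   # None
```

The two-step helper keeps the strictness of the anchored pattern (the line must contain nothing but whitespace-separated single digits) while still returning every digit.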



--
Helmut Jarausch

Lehrstuhl fuer Numerische Mathematik
RWTH - Aachen University
D 52056 Aachen, Germany
 
 
 
 
 
Felix Schwarz
 
      06-25-2005
Hi all,

I'm having problems with a regular expression and I can't figure
out which words to use when googling. I have read the Python documentation for
the re module multiple times now but still have no idea what I'm doing wrong.

What I want to do:
- Extract all digits (\d) in a string.
- Digits are separated by whitespace (\s)

What my program does:
- It extracts only the last digit.

Here is my program:
import re
line = ' 1 2 3'
regex = '^' + '(?:\s+(\d))*' + '$'
match = re.match(regex, line)
print "lastindex is: ",match.lastindex
print "matches: ",match.group(1)


Obviously I do not understand how (?:\s+(\d))* works in conjunction with
^ and $.

Does anybody know how to transform this regex to get the result I want
to have?

fs
 
 
 
 
 
George Sakkis
 
      06-25-2005
"Felix Schwarz" wrote:

> Hi all,
>
> I'm having problems with a regular expression and I can't figure
> out which words to use when googling. I have read the Python documentation for
> the re module multiple times now but still have no idea what I'm doing wrong.
>
> What I want to do:
> - Extract all digits (\d) in a string.
> - Digits are separated by whitespace (\s)
>
> What my program does:
> - It extracts only the last digit.
>
> Here is my program:
> import re
> line = ' 1 2 3'
> regex = '^' + '(?:\s+(\d))*' + '$'
> match = re.match(regex, line)
> print "lastindex is: ",match.lastindex
> print "matches: ",match.group(1)
>
>
> Obviously I do not understand how (?:\s+(\d))* works in conjunction with
> ^ and $.
>
> Does anybody know how to transform this regex to get the result I want
> to have?
>
> fs


Here are three ways:

- If your strings consist of only white space and single digits as
in your example, the simplest way is split():
>>> ' 1 2 3'.split()
['1', '2', '3']

- Otherwise use re.findall:
>>> import re
>>> digit = re.compile(r'\d')
>>> digit.findall('1 ab 34b 6')
['1', '3', '4', '6']

- Finally, for the special case where you are searching for single characters
(such as digits), perhaps the fastest way is to use string.translate:

>>> import string
>>> allchars = string.maketrans('','') # 2 empty strings
>>> nondigits = allchars.translate(allchars, string.digits)
>>> '1 ab 34 6'.translate(allchars, nondigits)
'1346'

Note that the result is a string of the matched characters, not a list;
you can turn it into a list with list('1346').
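
The translate session above relies on Python 2 idioms: string.maketrans('', '') and the two-argument form of translate are not available on Python 3 strings. A rough Python 3 sketch of the same delete-everything-but-digits idea uses bytes.translate(None, delete). The helper name keep_digits and the Latin-1 encoding step are assumptions made for illustration, not part of the original suggestion.

```python
import string

# The ASCII digits as bytes, and every other byte value (the ones to delete).
digits = string.digits.encode("ascii")
nondigits = bytes(b for b in range(256) if b not in digits)


def keep_digits(text):
    """Hypothetical helper: return only the ASCII digits found in text."""
    # Assumption: the text can be encoded as Latin-1 (one byte per character).
    data = text.encode("latin-1")
    # bytes.translate(None, delete) drops every byte listed in `delete`.
    return data.translate(None, nondigits).decode("ascii")


print(keep_digits('1 ab 34 6'))         # 1346
print(list(keep_digits('1 ab 34 6')))   # ['1', '3', '4', '6']
```

For text outside Latin-1, re.findall(r'[0-9]', text) is likely the simpler choice.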

Hope this helps,

George

 
 
Paul McGuire
 
      06-25-2005
Here's a pyparsing version of this that may be easier to maintain in the long
term (although if you have your heart set on learning regexps, they
will certainly do the job). Note that in pyparsing, you don't have to
spell out where the whitespace goes: pyparsing's default logic assumes
that whitespace may be found between any grammar elements, and if
found, it is ignored. (I believe regexp has a special magic symbol
that will do the same thing.)

Download pyparsing at http://pyparsing.sourceforge.net.
-- Paul


import pyparsing as pp

testString = ' 1 2 3'

integer = pp.Word(pp.nums)
lineData = pp.OneOrMore( integer )

results = lineData.parseString( testString )
print results

will print:
['1', '2', '3']
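
For comparison, a regular-expression sketch of roughly the same extraction, assuming Python 3. pp.Word(pp.nums) matches a run of one or more digits, so the closest pattern is \d+ rather than \d. findall simply skips whatever lies between matches, so the whitespace needs no explicit pattern, but this sketch also does no validation of the text between the numbers.

```python
import re

test_string = ' 1 2 3'

# \d+ matches a run of digits, so multi-digit integers stay in one piece.
integer = re.compile(r'\d+')

print(integer.findall(test_string))    # ['1', '2', '3']
print(integer.findall(' 10  20 3'))    # ['10', '20', '3']
```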

 