Velocity Reviews - Computer Hardware Reviews

text processing

 
 
jitenshah78@gmail.com
      09-25-2008
I have a string like the following:
12560/ABC,12567/BC,123,567,890/JK

I want to group it as follows:
(12560,ABC)
(12567,BC)
(123,567,890,JK)

I tried a regular expression and was able to get the first two groups but not the third one.
Can a regular expression split the given data into these groups?


 
Marc 'BlackJack' Rintsch
      09-25-2008
On Thu, 25 Sep 2008 15:51:28 +0100, (E-Mail Removed) wrote:

> I have string like follow
> 12560/ABC,12567/BC,123,567,890/JK
>
> I want above string to group like as follow
> (12560,ABC)
> (12567,BC)
> (123,567,890,JK)
>
> i try regular expression i am able to get first two not the third one.
> can regular expression given data in different groups


Without regular expressions:

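def group(string):
    # Collect comma-separated items until one contains '/', then emit the group.
    result = list()
    for item in string.split(','):
        if '/' in item:
            # A 'number/word' item closes the current group.
            result.extend(item.split('/'))
            yield tuple(result)
            result = list()
        else:
            result.append(item)

def main():
    string = '12560/ABC,12567/BC,123,567,890/JK'
    print(list(group(string)))

if __name__ == '__main__':
    main()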

Ciao,
Marc 'BlackJack' Rintsch
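
A possible variation on the split-and-accumulate generator above, as a minimal sketch rather than the poster's own code. The name `group_items` is hypothetical, and the final `if current:` branch is an assumption about the desired behaviour: the version above silently drops trailing items that are never followed by a slash, while this one yields them as a last group.

```python
def group_items(text):
    """Yield tuples of fields; an item containing '/' closes the current group."""
    current = []
    for item in text.split(','):
        if '/' in item:
            current.extend(item.split('/'))
            yield tuple(current)
            current = []
        else:
            current.append(item)
    if current:
        # Assumption: keep leftover items that never reached a '/'.
        yield tuple(current)


groups = list(group_items('12560/ABC,12567/BC,123,567,890/JK'))
print(groups)
```

For the sample string this gives the same three groups as the generator above; the two only differ on input such as '1/A,2,3', where the trailing '2' and '3' are kept.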
 
kib2
      09-25-2008
You can do it with regexps too:

>------------------------------------------------------------------

import re
to_watch = re.compile(r"(?P<number>\d+)[/](?P<letter>[A-Z]+)")

final_list = to_watch.findall("12560/ABC,12567/BC,123,567,890/JK")

for number, word in final_list:
    print("number:%s -- word: %s" % (number, word))
>------------------------------------------------------------------


The output is:

number:12560 -- word: ABC
number:12567 -- word: BC
number:890 -- word: JK

See you,

Kib².
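
Note that the output above contains only 890 for the third group, because `\d+` matches just the digits immediately before the slash. One way to keep the named-group style and still capture the whole run of numbers might look like the sketch below. The group name `numbers` and the helper `group_with_regex` are made up for illustration and are not part of the post above.

```python
import re

# 'numbers' captures one or more comma-separated integers before the slash.
pattern = re.compile(r"(?P<numbers>\d+(?:,\d+)*)/(?P<word>[A-Z]+)")


def group_with_regex(text):
    """Return one tuple per 'n1,n2,.../WORD' item found in text."""
    return [
        tuple(m.group("numbers").split(",")) + (m.group("word"),)
        for m in pattern.finditer(text)
    ]


result = group_with_regex("12560/ABC,12567/BC,123,567,890/JK")
print(result)
# [('12560', 'ABC'), ('12567', 'BC'), ('123', '567', '890', 'JK')]
```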
 
MRAB
      09-25-2008
On Sep 25, 6:34 pm, Marc 'BlackJack' Rintsch <(E-Mail Removed)> wrote:
> On Thu, 25 Sep 2008 15:51:28 +0100, (E-Mail Removed) wrote:
> > I have string like follow
> > 12560/ABC,12567/BC,123,567,890/JK
> >
> > I want above string to group like as follow
> > (12560,ABC)
> > (12567,BC)
> > (123,567,890,JK)
> >
> > i try regular expression i am able to get first two not the third one.
> > can regular expression given data in different groups
>
> Without regular expressions:
>
> def group(string):
>     result = list()
>     for item in string.split(','):
>         if '/' in item:
>             result.extend(item.split('/'))
>             yield tuple(result)
>             result = list()
>         else:
>             result.append(item)
>
> def main():
>     string = '12560/ABC,12567/BC,123,567,890/JK'
>     print list(group(string))

How about:

>>> import re
>>> string = "12560/ABC,12567/BC,123,567,890/JK"
>>> r = re.findall(r"(\d+(?:,\d+)*/\w+)", string)
>>> r
['12560/ABC', '12567/BC', '123,567,890/JK']
>>> [tuple(x.replace(",", "/").split("/")) for x in r]
[('12560', 'ABC'), ('12567', 'BC'), ('123', '567', '890', 'JK')]
 
Paul McGuire
      09-26-2008
On Sep 25, 9:51 am, "(E-Mail Removed)" <(E-Mail Removed)> wrote:
> I have string like follow
> 12560/ABC,12567/BC,123,567,890/JK
>
> I want above string to group like as follow
> (12560,ABC)
> (12567,BC)
> (123,567,890,JK)
>
> i try regular expression i am able to get first two not the third one.
> can regular expression given data in different groups


Looks like each item is:
- a comma-delimited list of 1 or more integers
- a slash
- a word composed of alpha characters

And the whole thing is a comma-delimited list of such items.

Now to implement that in pyparsing:

>>> data = "12560/ABC,12567/BC,123,567,890/JK"
>>> from pyparsing import Suppress, delimitedList, Word, alphas, nums, Group
>>> SLASH = Suppress('/')
>>> dataitem = delimitedList(Word(nums)) + SLASH + Word(alphas)
>>> dataformat = delimitedList(Group(dataitem))
>>> list(map(tuple, dataformat.parseString(data)))
[('12560', 'ABC'), ('12567', 'BC'), ('123', '567', '890', 'JK')]

Wah-lah! (as one of my wife's 1st graders announced in one of his
school papers)

-- Paul
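
The item description in this post (integers, a slash, an alphabetic word, all comma-delimited) can also serve as a quick validity check on the whole string before any grouping is attempted. The sketch below uses only the standard `re` module; `ITEM`, `LINE` and `is_well_formed` are illustrative names, and treating "alpha characters" as ASCII letters is an assumption.

```python
import re

# One item: comma-delimited integers, a slash, then an alphabetic word.
ITEM = r"\d+(?:,\d+)*/[A-Za-z]+"
# The whole string: one or more items separated by commas.
LINE = re.compile(r"(?:%s)(?:,%s)*" % (ITEM, ITEM))


def is_well_formed(text):
    """Return True if the entire string matches the described format."""
    return LINE.fullmatch(text) is not None


print(is_well_formed("12560/ABC,12567/BC,123,567,890/JK"))  # True
print(is_well_formed("12560/ABC,123"))                      # False
```

The second call fails because the trailing 123 is not followed by a slash and a word, so the string as a whole does not fit the format.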


 