Question on Python Split

 
 
subhabangalore@gmail.com
 
      10-07-2012
Dear Group,

Suppose I have a string as,

"Project Gutenberg has 36000 free ebooks for Kindle Android iPad iPhone."

I am terming it as,

str1= "Project Gutenberg has 36000 free ebooks for Kindle Android iPad iPhone."

I am working now with a split function,

str_words=str1.split()
so, I would get the result as,
['Project', 'Gutenberg', 'has', '36000', 'free', 'ebooks', 'for', 'Kindle', 'Android', 'iPad', 'iPhone.']

But I am looking for,

['Project Gutenberg', 'has 36000', 'free ebooks', 'for Kindle', 'Android iPad', 'iPhone']

This can be done if we assign the string as,

str1= "Project Gutenberg, has 36000, free ebooks, for Kindle, Android iPad, iPhone,"

and then assign the split statement as,

str1_word=str1.split(",")

would produce,

['Project Gutenberg', ' has 36000', ' free ebooks', ' for Kindle', ' Android iPad', ' iPhone', '']

My objective generally is achieved, but I want to convert each group here in tuple so that it can be embedded, like,

[(Project Gutenberg), (has 36000), (free ebooks), (for Kindle), ( Android iPad), (iPhone), '']

as I see if I assign it as

for i in str1_word:
    print i
    ti=tuple(i)
    print ti

I am not getting the desired result.

If I work again from tuple point, I get it as,
>>> tup1=('Project Gutenberg')
>>> tup2=('has 36000')
>>> tup3=('free ebooks')
>>> tup4=('for Kindle')
>>> tup5=('Android iPad')
>>> tup6=tup1+tup2+tup3+tup4+tup5
>>> print tup6

Project Gutenberghas 36000free ebooksfor KindleAndroid iPad

Then how may I achieve it? If any one of the learned members can kindly guide me.
Thanks in Advance,
Regards,
Subhabrata.

NB: Apology for some minor errors.







 
MRAB
 
      10-07-2012
On 2012-10-07 20:30, (E-Mail Removed) wrote:
> Dear Group,
>
> Suppose I have a string as,
>
> "Project Gutenberg has 36000 free ebooks for Kindle Android iPad iPhone."
>
> I am terming it as,
>
> str1= "Project Gutenberg has 36000 free ebooks for Kindle Android iPad iPhone."
>
> I am working now with a split function,
>
> str_words=str1.split()
> so, I would get the result as,
> ['Project', 'Gutenberg', 'has', '36000', 'free', 'ebooks', 'for', 'Kindle', 'Android', 'iPad', 'iPhone.']
>
> But I am looking for,
>
> ['Project Gutenberg', 'has 36000', 'free ebooks', 'for Kindle', 'Android iPad', 'iPhone']
>
> This can be done if we assign the string as,
>
> str1= "Project Gutenberg, has 36000, free ebooks, for Kindle, Android iPad, iPhone,"
>
> and then assign the split statement as,
>
> str1_word=str1.split(",")
>
> would produce,
>
> ['Project Gutenberg', ' has 36000', ' free ebooks', ' for Kindle', ' Android iPad', ' iPhone', '']
>

It can also be done like this:

>>> str1 = "Project Gutenberg has 36000 free ebooks for Kindle Android iPad iPhone."
>>> # Splitting into words:
>>> s = str1.split()
>>> s
['Project', 'Gutenberg', 'has', '36000', 'free', 'ebooks', 'for', 'Kindle', 'Android', 'iPad', 'iPhone.']
>>> # Using slicing with a stride of 2 gives:
>>> s[0::2]
['Project', 'has', 'free', 'for', 'Android', 'iPhone.']
>>> # Similarly for the other words gives:
>>> s[1::2]
['Gutenberg', '36000', 'ebooks', 'Kindle', 'iPad']
>>> # Combining them in pairs, and adding an extra empty string in case there's an odd number of words:
>>> [(x + ' ' + y).rstrip() for x, y in zip(s[0::2], s[1::2] + [''])]
['Project Gutenberg', 'has 36000', 'free ebooks', 'for Kindle', 'Android iPad', 'iPhone.']
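
The pairing above can also be wrapped in a small reusable function. What follows is only a sketch, not MRAB's code: the name `pair_words` is made up for illustration, and it assumes Python 3, using `itertools.zip_longest` in place of the `+ ['']` padding.

```python
from itertools import zip_longest


def pair_words(text):
    """Join consecutive words of text two at a time (hypothetical helper)."""
    words = text.split()
    # zip_longest pads the shorter slice with '' when the word count is odd.
    pairs = zip_longest(words[0::2], words[1::2], fillvalue='')
    # rstrip() removes the trailing space left by an unpaired last word.
    return [(first + ' ' + second).rstrip() for first, second in pairs]


str1 = "Project Gutenberg has 36000 free ebooks for Kindle Android iPad iPhone."
print(pair_words(str1))
```

With the sample sentence this should print the same six groups as the list comprehension above, with 'iPhone.' left on its own at the end.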

> My objective generally is achieved, but I want to convert each group here in tuple so that it can be embedded, like,
>
> [(Project Gutenberg), (has 36000), (free ebooks), (for Kindle), ( Android iPad), (iPhone), '']
>
> as I see if I assign it as
>
> for i in str1_word:
>     print i
>     ti=tuple(i)
>     print ti
>
> I am not getting the desired result.
>
> If I work again from tuple point, I get it as,
> >>> tup1=('Project Gutenberg')
> >>> tup2=('has 36000')
> >>> tup3=('free ebooks')
> >>> tup4=('for Kindle')
> >>> tup5=('Android iPad')
> >>> tup6=tup1+tup2+tup3+tup4+tup5
> >>> print tup6
> Project Gutenberghas 36000free ebooksfor KindleAndroid iPad
>

It's the comma that makes the tuple, not the parentheses, except for the
empty tuple which is just empty parentheses, i.e. ().
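
A short illustration of that point, written as a Python 3 sketch (the variable names are only for demonstration). It also shows why `tuple(i)` in the quoted loop does not give the desired result: `tuple()` iterates over a string character by character.

```python
single = ('Project Gutenberg')      # parentheses only: still a str
one_tuple = ('Project Gutenberg',)  # trailing comma: a 1-element tuple
empty = ()                          # the one case that needs no comma

print(type(single).__name__)     # str
print(type(one_tuple).__name__)  # tuple
print(len(empty))                # 0

# tuple() iterates over its argument, so a string is split into characters.
print(tuple('ab'))               # ('a', 'b')

# Adding 1-tuples builds a longer tuple instead of gluing the text together.
print(('has 36000',) + ('free ebooks',))  # ('has 36000', 'free ebooks')
```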

> Then how may I achieve it? If any one of the learned members can kindly guide me.


>>> [((x + ' ' + y).rstrip(),) for x, y in zip(s[0::2], s[1::2] + [''])]
[('Project Gutenberg',), ('has 36000',), ('free ebooks',), ('for Kindle',), ('Android iPad',), ('iPhone.',)]

Is this what you want?

If you want it to be a list of pairs of words, then:

>>> [(x, y) for x, y in zip(s[0::2], s[1::2] + [''])]
[('Project', 'Gutenberg'), ('has', '36000'), ('free', 'ebooks'), ('for', 'Kindle'), ('Android', 'iPad'), ('iPhone.', '')]
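
If the group size might change later, the same idea can be expressed with plain slicing. This is a sketch under assumptions: `chunk_words` and its `size` parameter are hypothetical names, and unlike the padded version above it leaves the final tuple short instead of adding an empty string.

```python
def chunk_words(text, size=2):
    """Return tuples of `size` consecutive words; the last one may be shorter."""
    words = text.split()
    # Step through the word list `size` items at a time and slice out each group.
    return [tuple(words[i:i + size]) for i in range(0, len(words), size)]


print(chunk_words("Android iPad iPhone."))  # [('Android', 'iPad'), ('iPhone.',)]
```

Whether the odd word out should be padded or left as a 1-tuple depends on how the result is consumed later, so treat the choice here as one option among several.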

 
Terry Reedy
 
      10-07-2012
On 10/7/2012 3:30 PM, (E-Mail Removed) wrote:

> If I work again from tuple point, I get it as,
> >>> tup1=('Project Gutenberg')
> >>> tup2=('has 36000')
> >>> tup3=('free ebooks')
> >>> tup4=('for Kindle')
> >>> tup5=('Android iPad')


These are strings, not tuples. Numbered names like this are a bad idea.

> >>> tup6=tup1+tup2+tup3+tup4+tup5
> >>> print tup6
> Project Gutenberghas 36000free ebooksfor KindleAndroid iPad


tup1=('Project Gutenberg')
tup2=('has 36000')
tup3=('free ebooks')
tup4=('for Kindle')
tup5=('Android iPad')
print(' '.join((tup1,tup2,tup3,tup4,tup5)))

>>>

Project Gutenberg has 36000 free ebooks for Kindle Android iPad
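
One possible way to avoid the numbered names is to keep the strings in a single list. This is only a sketch: the names `groups`, `as_tuples`, and `sentence` are illustrative and do not come from the thread.

```python
# A single list replaces tup1 ... tup5.
groups = ['Project Gutenberg', 'has 36000', 'free ebooks',
          'for Kindle', 'Android iPad']

# Wrap each string in a 1-tuple (note the trailing comma).
as_tuples = [(group,) for group in groups]
print(as_tuples)

# Join the strings back into one sentence, separated by single spaces.
sentence = ' '.join(groups)
print(sentence)
```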

--
Terry Jan Reedy

 
Dennis Lee Bieber
 
      10-08-2012
On Sun, 7 Oct 2012 12:30:52 -0700 (PDT), (E-Mail Removed)
declaimed the following in gmane.comp.python.general:

>
> But I am looking for,
>
> ['Project Gutenberg', 'has 36000', 'free ebooks', 'for Kindle', 'Android iPad', 'iPhone']
>


Is splitting a sentence at every other word really what you want? Or
are you intending, at some point, to have the splitting take place on
syntactic/semantic features (subject, verb, object...)?

If the latter, you may be in need of some Natural Language
Processing (NLP) libraries/algorithms. (First Google hit:
http://nltk.org/ )
--
Wulfraed Dennis Lee Bieber AF6VN
(E-Mail Removed) HTTP://wlfraed.home.netcom.com/

 
subhabangalore@gmail.com
 
      10-08-2012

Thank you for the nice answer. Your code and discussions always inspire me.

Regards,
Subhabrata.
 