Velocity Reviews


elca 10-31-2009 02:48 AM

Regular expression question
 

Hello,
I have a text document to parse.
A sample of the text is shown at the end of this message.
From this document, I would like to extract the following:
SUBJECT = 'NETHERLANDS MUSIC EPA'
CONTENT = 'Michael Buble performs in Amsterdam Canadian singer Michael Buble
performs during a concert in Amsterdam, The Netherlands, 30 October 2009.
Buble released his new album entitled 'Crazy Love'. EPA/OLAF KRAAK '

If anyone can help me, I would much appreciate it.

"
NETHERLANDS MUSIC EPA | 36 before
Michael Buble performs in Amsterdam Canadian singer Michael Buble performs
during a concert in Amsterdam, The Netherlands, 30 October 2009. Buble
released his new album entitled 'Crazy Love'. EPA/OLAF KRAAK
"


alex23 11-02-2009 03:23 AM

Re: Regular expression question
 
On Oct 31, 12:48 pm, elca <high...@gmail.com> wrote:
> Hello,
> I have a text document to parse.
> A sample of the text is shown at the end of this message.
> From this document, I would like to extract the following:
> SUBJECT = 'NETHERLANDS MUSIC EPA'
> CONTENT = 'Michael Buble performs in Amsterdam Canadian singer Michael Buble
> performs during a concert in Amsterdam, The Netherlands, 30 October 2009.
> Buble released his new album entitled 'Crazy Love'. EPA/OLAF KRAAK '
>
> If anyone can help me, I would much appreciate it.
>
> "
> NETHERLANDS MUSIC EPA | 36 before
> Michael Buble performs in Amsterdam Canadian singer Michael Buble performs
> during a concert in Amsterdam, The Netherlands, 30 October 2009. Buble
> released his new album entitled 'Crazy Love'. EPA/OLAF KRAAK
> "


You really don't need regular expressions for this:

>>> text = '''
... NETHERLANDS MUSIC EPA | 36 before
... Michael Buble performs in Amsterdam Canadian singer Michael Buble performs
... during a concert in Amsterdam, The Netherlands, 30 October 2009. Buble
... released his new album entitled 'Crazy Love'. EPA/OLAF KRAAK
... '''
>>> text = text.strip()  # remove leading and trailing whitespace, including newlines
>>> lines = text.splitlines()  # a string literal uses '\n' on every platform, so don't split on os.linesep
>>> subject = lines[0].split(' | ')[0]
>>> content = ' '.join(lines[1:])
>>> subject
'NETHERLANDS MUSIC EPA'
>>> content
"Michael Buble performs in Amsterdam Canadian singer Michael Buble performs during a concert in Amsterdam, The Netherlands, 30 October 2009. Buble released his new album entitled 'Crazy Love'. EPA/OLAF KRAAK"
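
If you need to do this for more than one item, the same idea can be wrapped in a function. This is only a minimal sketch: parse_item is a made-up helper name, and it assumes the first non-blank line always has the form "SUBJECT | something" and that every line after it is body text. Real feed data may need more checks.

```python
def parse_item(text):
    """Split one news item into (subject, content).

    Assumption: the first non-blank line looks like 'SUBJECT | <anything>'
    and all following lines are wrapped body text.
    parse_item is a hypothetical helper, not part of any library.
    """
    lines = text.strip().splitlines()
    if not lines:
        raise ValueError("empty item")
    # partition() never raises: if ' | ' is missing, the whole line is the subject
    subject = lines[0].partition(' | ')[0].strip()
    # Rejoin the wrapped body lines with single spaces, skipping blank lines
    content = ' '.join(line.strip() for line in lines[1:] if line.strip())
    return subject, content


sample = """
NETHERLANDS MUSIC EPA | 36 before
Michael Buble performs in Amsterdam Canadian singer Michael Buble performs
during a concert in Amsterdam, The Netherlands, 30 October 2009. Buble
released his new album entitled 'Crazy Love'. EPA/OLAF KRAAK
"""

subject, content = parse_item(sample)
print(subject)
print(content)
```

Using partition() instead of split(...)[0] keeps the function from misbehaving on a header line that has no ' | ' separator; in that case the whole line is taken as the subject.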

