Can python read up to where a certain pattern is matched?

 
 
Anthony Liu
 
      03-06-2004
I am kinda new to Python, but not new to programming.
I am a certified Java programmer.

I don't want to read line after line, neither do I
want to read the whole file all at once. Thus none of
read(), readline(), readlines() is what I want. I want
to read a text file sentence by sentence.

A sentence by definition is roughly the part between a
full stop and another full stop or !, ?

So, for example, for the following text:

"Some words here, and some other words. Then another
segment follows, and more. This is a question, a junk
question, followed by a question mark?"

It has 3 sentences (2 full stops and 1 question mark),
and therefore I want to read it in 3 lumps and each
lump gives me one complete sentence as follows:

lump 1: Some words here, and some other words.

lump 2: Then another segment follows, and more.

lump 3: This is a question, a junk question, followed
by a question mark?

How can I achieve this? Do we have a readsentence()
function?

Please give a hint. Thank you!


William Park
 
      03-07-2004
Anthony Liu wrote:
> I am kinda new to Python, but not new to programming.
> I am a certified Java programmer.
>
> I don't want to read line after line, neither do I
> want to read the whole file all at once. Thus none of
> read(), readline(), readlines() is what I want. I want
> to read a text file sentence by sentence.


Question: How do I read sentence by sentence?
Answer: Read input stream char by char.

--
William Park, Open Geometry Consulting
Linux solution for data processing and document management.
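
A minimal sketch of the char-by-char approach suggested above, assuming sentences end at ".", "!" or "?" as in the original question. The name read_sentences_by_char and the whitespace normalisation are illustrative choices, not something from the thread:

```python
import io


def read_sentences_by_char(stream, ends=".!?"):
    """Yield sentences from a text stream, reading one character at a time."""
    chars = []
    while True:
        c = stream.read(1)          # returns "" at end of file
        if not c:
            break
        chars.append(c)
        if c in ends:
            # Collapse line breaks and runs of spaces inside the sentence.
            yield " ".join("".join(chars).split())
            chars = []
    # Anything left over has no terminator; yield it if it is not blank.
    tail = " ".join("".join(chars).split())
    if tail:
        yield tail


sample = io.StringIO(
    "Some words here, and some other words. Then another\n"
    "segment follows, and more. Is this a question?"
)
sentences = list(read_sentences_by_char(sample))
```

Any object with a read(n) method, such as an open text file, should work in place of the StringIO. Reading one character per call is simple but slow on large files.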
 
Dennis Lee Bieber
 
      03-07-2004
On 7 Mar 2004 03:00:14 GMT, William Park
declaimed the following in comp.lang.python:

> Question: How do I read sentence by sentence?
> Answer: Read input stream char by char.


Ugh... Even my jaded neophyte self (as of Intro to FORTRAN,
1976) wouldn't consider that... Of course, since FORTRAN basically was
line-oriented, one would be biased to other methods.

I.e., write a wrapper subroutine that reads whole lines, looks for
".", and returns what lies before it (including it); then shift the
remainder and append the next line for the subsequent call.

--
> ================================================== ============ <
> | Wulfraed Dennis Lee Bieber KD6MOG <
> | Bestiaria Support Staff <
> ================================================== ============ <
> Home Page: <http://www.dm.net/~wulfraed/> <
> Overflow Page: <http://wlfraed.home.netcom.com/> <
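
The line-buffering wrapper described above could be sketched roughly as follows. This is one possible reading of it, extended to treat "!" and "?" as terminators as the original question asked; the names read_sentences_by_line and SENTENCE_END are made up for the example:

```python
import io
import re

# Assumed terminators, taken from the original question.
SENTENCE_END = re.compile(r"[.!?]")


def read_sentences_by_line(stream):
    """Yield sentences while reading the stream one line at a time."""
    buffer = ""
    for line in stream:
        buffer += line                      # append the next line
        while True:
            match = SENTENCE_END.search(buffer)
            if match is None:
                break                       # need more input
            # Return everything up to and including the terminator,
            # then shift the remainder to the front of the buffer.
            sentence = buffer[:match.end()]
            buffer = buffer[match.end():]
            yield " ".join(sentence.split())
    tail = " ".join(buffer.split())
    if tail:                                # unterminated trailing text
        yield tail


text = io.StringIO(
    "Some words here, and some other words. Then another\n"
    "segment follows, and more. This is a question, a junk\n"
    "question, followed by a question mark?"
)
lumps = list(read_sentences_by_line(text))
```

Only one line plus any unfinished sentence is held in memory at a time, which avoids both read() on the whole file and a read(1) call per character.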

 
F. Petitjean
 
      03-07-2004
On Fri, 5 Mar 2004, Anthony Liu wrote:
> I am kinda new to Python, but not new to programming.
>
> I don't want to read line after line, neither do I
> want to read the whole file all at once. Thus none of
> read(), readline(), readlines() is what I want. I want
> to read a text file sentence by sentence.
>
> A sentence by definition is roughly the part between a
> full stop and another full stop or !, ?
>
> So, for example, for the following text:
>
> "Some words here, and some other words. Then another
> segment follows, and more. This is a question, a junk
> question, followed by a question mark?"
>
> It has 3 sentences (2 full stops and 1 question mark),
> snip
> How can I achieve this? Do we have a readsentence()
> function?
>
> Please give a hint. Thank you!
>

the hint:
import itertools
help(itertools.takewhile)

# not tested (no python 2.3 on Debian gateway at home)

def readsentence(iterable, ends=(".", "!", "?"), yield_fn=''.join):
    """generator function which yields sentences terminated by ends"""
    end_pred = ends
    if not callable(ends):
        end_pred = lambda c: c not in ends
    sentence = []
    for c in iterable:
        sentence.append(c)
        # takewhile() would swallow the terminator, so test each item here
        if not end_pred(c):
            yield yield_fn(sentence) if callable(yield_fn) else tuple(sentence)
            sentence = []
    if sentence:
        # trailing items with no terminator
        yield yield_fn(sentence) if callable(yield_fn) else tuple(sentence)

text = """\
Some words here, and some other words. Then another
segment follows, and more. This is a question, a junk
question, followed by a question mark?"""

for sentence in readsentence(text):
    print(sentence)
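
A note on the itertools.takewhile hint: takewhile has to pull the first failing item from the underlying iterator in order to test it, and that item is then discarded. A small demonstration (the sample string is made up for illustration):

```python
import itertools

chars = iter("One. Two!")

# Collect characters until the predicate fails on ".".
first = "".join(itertools.takewhile(lambda c: c not in ".!?", chars))

# The "." was consumed by takewhile and is not in `first`,
# nor is it still waiting in `chars`.
rest = "".join(chars)
```

Here first holds "One" and rest holds " Two!", so the full stop is lost. Calling takewhile again on an exhausted iterator simply yields nothing, which is why a bare "while True" loop around it needs its own stop condition.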