Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > Python > generators shared among threads

Reply
Thread Tools

generators shared among threads

 
 
jess.austin@gmail.com
Guest
Posts: n/a
 
      03-04-2006
hi,

This seems like a difficult question to answer through testing, so I'm
hoping that someone will just know... Suppose I have the following
generator, g:

def f()
i = 0
while True:
yield i
i += 1
g=f()

If I pass g around to various threads and I want them to always be
yielded a unique value, will I have a race condition? That is, is it
possible that the cpython interpreter would interrupt one thread after
the increment and before the yield, and then resume another thread to
yield the first thread's value, or increment the stored i, or both,
before resuming the first thread? If so, would I get different
behavior if I just set g like:

g=itertools.count()

If both of these idioms will give me a race condition, how might I go
about preventing such? I thought about using threading.Lock, but I'm
sure that I don't want to put a lock around the yield statement.

thanks,
Jess

 
Reply With Quote
 
 
 
 
Alex Martelli
Guest
Posts: n/a
 
      03-04-2006
<> wrote:

> hi,
>
> This seems like a difficult question to answer through testing, so I'm
> hoping that someone will just know... Suppose I have the following
> generator, g:
>
> def f()
> i = 0
> while True:
> yield i
> i += 1
> g=f()
>
> If I pass g around to various threads and I want them to always be
> yielded a unique value, will I have a race condition? That is, is it


Yes, you will.

> before resuming the first thread? If so, would I get different
> behavior if I just set g like:
>
> g=itertools.count()


I believe that in the current implementation you'd get "lucky", but
there is no guarantee that such luck would persist across even a minor
bugfix in the implementation. Don't do it.

> If both of these idioms will give me a race condition, how might I go
> about preventing such? I thought about using threading.Lock, but I'm
> sure that I don't want to put a lock around the yield statement.


Queue.Queue is often the best way to organize cooperation among threads.
Make a Queue.Queue with a reasonably small maximum size, a single
dedicated thread that puts successive items of itertools.count onto it
(implicitly blocking and waiting when the queue gets full), and any
other thread can call get on the queue and obtain a unique item
(implicitly waiting a little bit if the queue ever gets empty, until the
dedicated thread waits and fills the queue again). [[Alternatively you
could subclass Queue and override the hook-method _get, which always
gets called in a properly locked and thus serialized condition; but that
may be considered a reasonably advanced task, since such subclassing
isn't documented in the reference library, only in Queue's sources]].


Alex
 
Reply With Quote
 
 
 
 
Alan Kennedy
Guest
Posts: n/a
 
      03-05-2006
[]
> def f()
> i = 0
> while True:
> yield i
> i += 1
> g=f()
>
> If I pass g around to various threads and I want them to always be
> yielded a unique value, will I have a race condition?


Yes.

Generators can be shared between threads, but they cannot be resumed
from two threads at the same time.

You should wrap it in a lock to ensure that only one thread at a time
can resume the generator.

Read this thread from a couple of years back about the same topic.

Suggested generator to add to threading module.
http://groups.google.com/group/comp....ede21f7dd78f34

Also contained in that thread is an implementation of Queue.Queue which
supplies values from a generator, and which does not require a separate
thread to generate values.

HTH,

--
alan kennedy
------------------------------------------------------
email alan: http://xhaus.com/contact/alan

 
Reply With Quote
 
jess.austin@gmail.com
Guest
Posts: n/a
 
      03-07-2006
Thanks for the great advice, Alex. Here is a subclass that seems to
work:

from Queue import Queue
from itertools import count

class reentrantQueue(Queue):
def _init(self, maxsize):
self.maxsize = 0
self.queue = [] # so we don't have to override put()
self.counter = count()
def _empty(self):
return False
def _get(self):
return self.counter.next()
def next(self):
return self.get()
def __iter__(self):
return self

 
Reply With Quote
 
Alex Martelli
Guest
Posts: n/a
 
      03-07-2006
<> wrote:

> Thanks for the great advice, Alex. Here is a subclass that seems to
> work:


You're welcome!

> from Queue import Queue
> from itertools import count
>
> class reentrantQueue(Queue):
> def _init(self, maxsize):
> self.maxsize = 0
> self.queue = [] # so we don't have to override put()
> self.counter = count()
> def _empty(self):
> return False
> def _get(self):
> return self.counter.next()
> def next(self):
> return self.get()
> def __iter__(self):
> return self


You may also want to override _put to raise an exception, just to avoid
accidental misuse, though I agree it's marginal. Also, I'd use maxsize
(if provided and >0) as the upperbound for the counting; not sure that's
necessary but it seems pretty natural to raise StopIteration (rather
than returning the counter's value) if the counter reaches that
"maxsize". Last, I'm not sure I'd think of this as a reentrantQueue, so
much as a ReentrantCounter.


Alex
 
Reply With Quote
 
jess.austin@gmail.com
Guest
Posts: n/a
 
      03-07-2006
Alex wrote:
> Last, I'm not sure I'd think of this as a reentrantQueue, so
> much as a ReentrantCounter.


Of course! It must have been late when I named this class... I think
I'll go change the name in my code right now.

 
Reply With Quote
 
Paul Rubin
Guest
Posts: n/a
 
      03-07-2006
(Alex Martelli) writes:
> > g=itertools.count()

>
> I believe that in the current implementation you'd get "lucky", but
> there is no guarantee that such luck would persist across even a minor
> bugfix in the implementation. Don't do it.


I remember being told that xrange(sys.maxint) was thread-safe, but of
course I wouldn't want to depend on that across Python versions either.

> Queue.Queue is often the best way to organize cooperation among threads.
> Make a Queue.Queue with a reasonably small maximum size, a single
> dedicated thread that puts successive items of itertools.count onto it
> (implicitly blocking and waiting when the queue gets full),


This should be pretty simple to implement and not much can go wrong
with it, but it means a separate thread for each such generator, and
waiting for thread switches every time the queue goes empty. A more
traditional approach would be to use a lock in the generator,

def f():
lock = threading.Lock()
i = 0
while True:
lock.acquire()
yield i
i += 1
lock.release()

but it's easy to make mistakes when implementing things like that
(I'm not even totally confident that the above is correct).

Hmm (untested, like above):

class Synchronized:
def __init__(self, generator):
self.gen = generator
self.lock = threading.Lock()
def next(self):
self.lock.acquire()
try:
yield self.gen.next()
finally:
self.lock.release()

synchronized_counter = Synchronized(itertools.count())

That isn't a general solution but can be convenient (if I didn't mess
it up). Maybe there's a more general recipe somewhere.
 
Reply With Quote
 
jess.austin@gmail.com
Guest
Posts: n/a
 
      03-08-2006
Paul wrote:
> def f():
> lock = threading.Lock()
> i = 0
> while True:
> lock.acquire()
> yield i
> i += 1
> lock.release()
>
> but it's easy to make mistakes when implementing things like that
> (I'm not even totally confident that the above is correct).


The main problem with this is that the yield leaves the lock locked.
If any other thread wants to read the generator it will block. Your
class Synchronized fixes this with the "finally" hack (please note that
from me this is NOT a pejorative). I wonder... is that future-proof?
It seems that something related to this might change with 2.5? My
notes from GvR's keynote don't seem to include this. Someone that
knows more than I do about the intersection between "yield" and
"finally" would have to speak to that.

 
Reply With Quote
 
Paul Rubin
Guest
Posts: n/a
 
      03-08-2006
writes:
> The main problem with this is that the yield leaves the lock locked.
> If any other thread wants to read the generator it will block.


Ouch, good catch. Do you see a good fix other than try/finally?
Is there a classical way to do it with coroutines and semaphores?
 
Reply With Quote
 
Alex Martelli
Guest
Posts: n/a
 
      03-08-2006
Paul Rubin <http://> wrote:

> writes:
> > The main problem with this is that the yield leaves the lock locked.
> > If any other thread wants to read the generator it will block.

>
> Ouch, good catch. Do you see a good fix other than try/finally?
> Is there a classical way to do it with coroutines and semaphores?


Jesse's solution from the other half of this thread, generalized:

import Queue

class ReentrantIterator(Queue.Queue):
def _init(self, iterator):
self.iterator = iterator
def _empty(self):
return False
def _get(self):
return self.iterator.next()
def _put(*ignore):
raise TypeError, "Can't put to a ReentrantIterator"
def next(self):
return self.get()
def __iter__(self):
return self

Now, x=ReentrantIterator(itertools.count()) should have all the
properties we want, I think. The locking is thanks of Queue.Queue and
its sweet implementation of the Template Method design pattern.


Alex
 
Reply With Quote
 
 
 
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
Is it safe to shre a reg exp object among threads? overly.crazy.steve@gmail.com Python 1 06-29-2006 09:21 PM
Embedded Python - Sharing memory among scripts, threads adsheehan@eircom.net Python 1 10-24-2005 10:31 AM
JNDI object not shared among TC instances Albretch Java 0 12-12-2004 03:16 PM
How to stop a thread among several threads created with a singleRunnable ? Richard Java 11 05-04-2004 01:54 PM
how to limit access to common objects shared among multiple threads K Gibbs C++ 1 07-03-2003 10:32 PM



Advertisments