Is this a bug in multiprocessing or in my script?

 
 
erikcw
08-05-2009
Hi,

I'm trying to get multiprocessing to work consistently with my
script. I keep getting random tracebacks with no helpful
information. Sometimes it works, sometimes it doesn't.

Traceback (most recent call last):
  File "scraper.py", line 144, in <module>
    print pool.map(scrape, range(10))
  File "/usr/lib/python2.6/multiprocessing/pool.py", line 148, in map
    return self.map_async(func, iterable, chunksize).get()
  File "/usr/lib/python2.6/multiprocessing/pool.py", line 422, in get
    raise self._value
TypeError: expected string or buffer

It's not always the same traceback, but they are always short like
this. I'm running Python 2.6.2 on Ubuntu 9.04.

Any idea how I can debug this?

Thanks!
Erik
 
sturlamolden
08-05-2009
On Aug 5, 4:37 am, erikcw <(E-Mail Removed)> wrote:

> It's not always the same traceback, but they are always short like
> this. I'm running Python 2.6.2 on Ubuntu 9.04.
>
> Any idea how I can debug this?


In my experience, multiprocessing is fragile. Scripts tend to fail for
no obvious reason, cause processes to be orphaned and linger, leak
system-wide resources, etc. For example, multiprocessing uses os._exit
to stop a spawned process, even though it inevitably results in
resource leaks on Linux (it should use sys.exit). Gaël Varoquaux and I
noticed this when we implemented shared memory ndarrays for numpy; we
consistently got memory leaks with System V IPC for no obvious reason.
Even after Jesse Noller was informed of the problem (about half a year
ago), the bug still lingers. It is easy to edit multiprocessing's
forking.py file on your own, but bugs like this are a pain in the ass,
and I suspect multiprocessing has many of them. Of course, unless you
show us your whole script, identifying the source of your bug will be
impossible. But it may very likely be in multiprocessing as well. The
quality of this module is not impressive. I am beginning to think that
multiprocessing should never have made it into the Python standard
library. The GIL cannot be that bad! If you can't stand the GIL, get a
Unix (or Mac, Linux, Cygwin) and use os.fork. Or simply switch to a
non-GIL Python: IronPython or Jython.

Allow me to show you something better. With os.fork we can write code
like this:

class parallel(object):

    def __enter__(self):
        # call os.fork

    def __exit__(self, exc_type, exc_value, traceback):
        # call sys.exit in the child processes and
        # os.waitpid in the parent

    def __call__(self, iterable):
        # return different subsequences depending on
        # child or parent status


with parallel() as p:
    # parallel block starts here

    for item in p(iterable):
        # whatever

    # parallel block ends here
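
To make the idea concrete, here is a rough, untested sketch of one way
to fill this in. The fixed process count and the round-robin splitting
of the iterable are my own choices, just assumptions for illustration:

import os
import sys

class parallel(object):

    def __init__(self, nprocs=2):
        self.nprocs = nprocs
        self.rank = 0            # 0 means parent
        self.pids = []

    def __enter__(self):
        # fork nprocs - 1 children; each child remembers its rank
        for i in range(1, self.nprocs):
            pid = os.fork()
            if pid == 0:
                self.rank = i
                self.pids = []
                break
            self.pids.append(pid)
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        if self.rank:
            sys.exit(0)          # child terminates at the end of the block
        for pid in self.pids:
            os.waitpid(pid, 0)   # parent reaps its children

    def __call__(self, iterable):
        # each process takes every nprocs-th item, offset by its rank
        for i, item in enumerate(iterable):
            if i % self.nprocs == self.rank:
                yield item

With nprocs=1 it forks nothing and yields every item.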

This makes parallel code a lot cleaner than anything you can do with
multiprocessing, allowing you to use constructs similar to OpenMP.
Further, if you make 'parallel' a dummy context manager, you can
develop and test the algorithms serially. The only drawback is that
you have to use Cygwin to get os.fork on Windows, and forking will be
less efficient (no copy-on-write optimization). Well, this is just one
example of why Windows sucks from the perspective of the programmer.
But it also shows that you can do much better by not using
multiprocessing at all.

The only case I can think of where multiprocessing would be useful
is I/O-bound code on Windows. But here you will almost always resort
to C extension modules. For I/O-bound code, Python tends to give you a
200x speed penalty over C. If you are resorting to C anyway, you can
just use OpenMP in C for your parallel processing. We can thus forget
about multiprocessing here as well, given that we have access to the C
code. If we don't, it is still very likely that the C code releases
the GIL, and we can get away with using Python threads instead of
multiprocessing.

IMHO, if you are using multiprocessing, you are very likely to have a
design problem.

Regards,
Sturla


 
Jesse Noller
08-05-2009
On Aug 5, 1:21 am, sturlamolden <(E-Mail Removed)> wrote:
> [snip]
>
> IMHO, if you are using multiprocessing, you are very likely to have a
> design problem.


Sturla;

That bug was fixed unless I'm missing something. Also, patches and
continued bug reports are welcome.

jesse
 
 
sturlamolden
08-05-2009
On 5 Aug, 15:40, Jesse Noller <(E-Mail Removed)> wrote:

> Sturla;
>
> That bug was fixed unless I'm missing something.


It is still in SVN. Change every call to os._exit to sys.exit
please.

http://svn.python.org/view/python/br...17&view=markup

http://svn.python.org/view/python/br...79&view=markup







 
 
ryles
08-05-2009
On Aug 4, 10:37 pm, erikcw <(E-Mail Removed)> wrote:
> Traceback (most recent call last):
>   File "scraper.py", line 144, in <module>
>     print pool.map(scrape, range(10))
>   File "/usr/lib/python2.6/multiprocessing/pool.py", line 148, in map
>     return self.map_async(func, iterable, chunksize).get()
>   File "/usr/lib/python2.6/multiprocessing/pool.py", line 422, in get
>     raise self._value
> TypeError: expected string or buffer


This is almost certainly due to your scrape call raising an exception.
In the parent process, multiprocessing will detect when one of its
workers has terminated with an exception and then re-raise it.
However, only the exception and not the original traceback is made
available, which makes debugging more difficult for you. Here's a
simple example which demonstrates this behavior:

>>> from multiprocessing import Pool
>>> def evil_on_8(x):
...     if x == 8: raise ValueError("I DONT LIKE THE NUMBER 8")
...     return x + 1
...
>>> pool = Pool(processes=4)
>>> pool.map(evil_on_8, range(5))
[1, 2, 3, 4, 5]
>>> pool.map(evil_on_8, range(10)) # 8 will cause evilness.
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/bb/real/3ps/lib/python2.6/multiprocessing/pool.py", line 148, in map
    return self.map_async(func, iterable, chunksize).get()
  File "/bb/real/3ps/lib/python2.6/multiprocessing/pool.py", line 422, in get
    raise self._value
ValueError: I DONT LIKE THE NUMBER 8
>>>

My recommendation is that you wrap your scrape code inside a
try/except and log any exception. I usually do this with
logging.exception(), or if logging is not in use, the traceback
module. After that you can simply re-raise it.
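
For instance, something along these lines (do_scrape here just stands
in for whatever your real scrape body is):

import logging

def scrape(n):
    try:
        return do_scrape(n)   # your existing scraping logic
    except Exception:
        # logging.exception records the full traceback from inside
        # the worker process, before multiprocessing discards it
        logging.exception("scrape(%r) failed in worker", n)
        raise                 # re-raise so pool.map still sees the error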
 
 
Piet van Oostrum
08-05-2009
sturlamolden <(E-Mail Removed)> wrote:
> On 5 Aug, 15:40, Jesse Noller <(E-Mail Removed)> wrote:
>> Sturla;
>>
>> That bug was fixed unless I'm missing something.
>
> It is still in SVN. Change every call to os._exit to sys.exit
> please.


Calling os.exit in a child process may be dangerous. It can cause
unflushed buffers to be flushed twice: once in the parent and once in
the child.
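
A minimal demonstration of the double flush: output written before the
fork sits in the stdio buffer, the fork copies that buffer into the
child, and a clean exit in both processes flushes it twice.

import os
import sys

sys.stdout.write("hello")   # no newline: stays in the stdio buffer
pid = os.fork()             # the unflushed buffer is copied to the child
if pid == 0:
    sys.exit(0)             # child flushes its copy of the buffer
os.waitpid(pid, 0)
sys.exit(0)                 # parent flushes too: "hello" appears twice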
--
Piet van Oostrum <(E-Mail Removed)>
URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
 
 
Jesse Noller
08-05-2009
On Aug 5, 3:40 pm, sturlamolden <(E-Mail Removed)> wrote:
> On 5 Aug, 21:36, sturlamolden <(E-Mail Removed)> wrote:
>
>> http://svn.python.org/view/python/br...int/Lib/multip...
>
>> http://svn.python.org/view/python/br...int/Lib/multip...
>
> http://svn.python.org/view/python/tr...sing/forking.p...


Since the bug was never filed in the tracker (it was sent to my
personal mailbox, and I dropped it - sorry), I've filed a new one:

http://bugs.python.org/issue6653

In the future please use the bug tracker to file and track bugs, so
things are not as lossy.

jesse
 
 
sturlamolden
08-05-2009
On 5 Aug, 22:07, Piet van Oostrum <(E-Mail Removed)> wrote:

> Calling os.exit in a child process may be dangerous. It can cause
> unflushed buffers to be flushed twice: once in the parent and once in
> the child.


I assume you mean sys.exit. If this is the case, multiprocessing needs
a mechanism to choose between os._exit and sys.exit for child
processes. Calling os._exit might also be dangerous because it could
prevent necessary clean-up code from executing (e.g. in C
extensions). I had a case where shared memory on Linux (System V IPC)
leaked due to os._exit. The deallocator for my extension type never
got to execute in child processes. The deallocator was needed to
release the shared segment when its reference count dropped to 0.
Changing to sys.exit solved the problem. On Windows there was no leak,
because the kernel did the reference counting.
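
A pure-Python analogue of what happened (Resource here merely mimics
the C deallocator; the real one released a System V shared segment):

import os
import sys

class Resource(object):
    def __del__(self):
        # stands in for the clean-up that releases the shared segment
        print "releasing shared segment"

r = Resource()
pid = os.fork()
if pid == 0:
    os._exit(0)    # child dies instantly: __del__ never runs, so it leaks
os.waitpid(pid, 0)
sys.exit(0)        # parent shuts down normally: __del__ runs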


 
 
sturlamolden
08-05-2009
On 5 Aug, 22:28, Jesse Noller <(E-Mail Removed)> wrote:

> http://bugs.python.org/issue6653
>
> In the future please use the bug tracker to file and track bugs with,
> so things are not as lossy.


Ok, sorry

Also see Piet's comment here. He has a valid case against sys.exit.
Thus it appears that both ways of shutting down child processes might
be dangerous: if we don't want buffers to flush twice, we have to use
os._exit. If we want clean-up code to execute, we have to use
sys.exit. If we want both, we are screwed.
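
To see why only the application can resolve this, the child would need
something like the following, which multiprocessing cannot do for
arbitrary user code (exit_child and _child_cleanup are hypothetical
names, not multiprocessing API):

import os

_child_cleanup = []   # hypothetical registry of app-level clean-up hooks

def exit_child(status=0):
    # run only the clean-up the application knows it needs...
    for callback in _child_cleanup:
        callback()
    # ...then skip the double flush of inherited stdio buffers
    os._exit(status)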




 


