multiprocessing vs thread performance

 
 
mk
12-29-2008
Hello everyone,

After reading http://www.python.org/dev/peps/pep-0371/ I was under the
impression that the performance of the multiprocessing package is similar
to that of the thread / threading modules. However, to familiarize myself
with both packages I wrote my own test that spawns and finishes 100,000
empty threads or processes respectively (while maintaining at most 100
threads / processes active at any one time).

The results I got are very different from the benchmark quoted in PEP
371. On a twin-Xeon machine the threaded version executed in 5.54 secs,
while the multiprocessing version took over 222 secs to complete!

Am I doing something wrong in the code below? Or do I have to use
multiprocessing.Pool to get any decent results?

# multithreaded version


#!/usr/local/python2.6/bin/python

import thread
import time

class TCalc(object):

    def __init__(self):
        self.tactivnum = 0
        self.reslist = []
        self.tid = 0
        self.tlock = thread.allocate_lock()

    def testth(self, tid):
        if tid % 1000 == 0:
            print "== Thread %d working ==" % tid
        self.tlock.acquire()
        self.reslist.append(tid)
        self.tactivnum -= 1
        self.tlock.release()

    def calc_100thousand(self):
        tid = 1
        while tid <= 100000:
            while self.tactivnum > 99:
                time.sleep(0.01)
            self.tlock.acquire()
            self.tactivnum += 1
            self.tlock.release()
            t = thread.start_new_thread(self.testth, (tid,))
            tid += 1
        while self.tactivnum > 0:
            time.sleep(0.01)


if __name__ == "__main__":
    tc = TCalc()
    tstart = time.time()
    tc.calc_100thousand()
    tend = time.time()
    print "Total time: ", tend-tstart



# multiprocessing version

#!/usr/local/python2.6/bin/python

import multiprocessing
import time


def testp(pid):
    if pid % 1000 == 0:
        print "== Process %d working ==" % pid

def palivelistlen(plist):
    pll = 0
    for p in plist:
        if p.is_alive():
            pll += 1
        else:
            plist.remove(p)
            p.join()
    return pll

def testp_100thousand():
    pid = 1
    proclist = []
    while pid <= 100000:
        while palivelistlen(proclist) > 99:
            time.sleep(0.01)
        p = multiprocessing.Process(target=testp, args=(pid,))
        p.start()
        proclist.append(p)
        pid += 1
    print "=== Main thread waiting for all processes to finish ==="
    for p in proclist:
        p.join()

if __name__ == "__main__":
    tstart = time.time()
    testp_100thousand()
    tend = time.time()
    print "Total time:", tend - tstart


 
janislaw
12-29-2008
On 29 Dec, 15:52, mk <(E-Mail Removed)> wrote:
> Hello everyone,
>
> After reading http://www.python.org/dev/peps/pep-0371/ I was under the
> impression that the performance of the multiprocessing package is similar
> to that of the thread / threading modules. However, to familiarize myself
> with both packages I wrote my own test that spawns and finishes 100,000
> empty threads or processes respectively (while maintaining at most 100
> threads / processes active at any one time).
>
> The results I got are very different from the benchmark quoted in PEP
> 371. On a twin-Xeon machine the threaded version executed in 5.54 secs,
> while the multiprocessing version took over 222 secs to complete!
>
> Am I doing something wrong in the code below? Or do I have to use
> multiprocessing.Pool to get any decent results?


Oooh, 100,000 processes! You're fortunate that your OS handled them in
finite time.

[quick browsing through the code]

Ah, so there are 100 processes at a time. 200 secs still doesn't sound
strange.
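
A back-of-envelope check supports this: assuming ~2 ms to create one
process (roughly the per-process cost mk measures further down in this
thread), the expected total lands in the same ballpark as the 222 secs:

# 100,000 spawns at an assumed ~2 ms of creation cost apiece
print 100000 * 0.002   # -> 200.0 (seconds)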

JW
 
mk
12-29-2008
janislaw wrote:

> Ah, so there are 100 processes at a time. 200 secs still doesn't sound
> strange.


I ran the PEP 371 code on my system (Linux) on Python 2.6.1:

Linux SLES (9.156.44.174) [15:18] root ~/tmp/src # ./run_benchmarks.py empty_func.py

Importing empty_func
Starting tests ...
non_threaded (1 iters) 0.000005 seconds
threaded (1 threads) 0.000235 seconds
processes (1 procs) 0.002607 seconds

non_threaded (2 iters) 0.000006 seconds
threaded (2 threads) 0.000461 seconds
processes (2 procs) 0.004514 seconds

non_threaded (4 iters) 0.000008 seconds
threaded (4 threads) 0.000897 seconds
processes (4 procs) 0.008557 seconds

non_threaded (8 iters) 0.000010 seconds
threaded (8 threads) 0.001821 seconds
processes (8 procs) 0.016950 seconds

This is very different from PEP 371. It appears that the PEP 371 code
was written on Mac OS X. The conclusion I draw from comparing the costs
above is that OS X must have a very low cost of creating processes, at
least compared to Linux, and not that multiprocessing is a viable
alternative to the thread / threading module.
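
A quick way to eyeball the raw process-creation cost on a given box is
something like this rough sketch (not part of the PEP's benchmark suite):

#!/usr/local/python2.6/bin/python

import time
import multiprocessing

def noop():
    pass

if __name__ == "__main__":
    t0 = time.time()
    for _ in range(100):            # 100 sequential create/join cycles
        p = multiprocessing.Process(target=noop)
        p.start()
        p.join()
    print "per-process cost: %.6f s" % ((time.time() - t0) / 100)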

 
Aaron Brady
12-29-2008
On Dec 29, 8:52 am, mk <(E-Mail Removed)> wrote:
> Hello everyone,
>
> After reading http://www.python.org/dev/peps/pep-0371/ I was under the
> impression that the performance of the multiprocessing package is similar
> to that of the thread / threading modules. However, to familiarize myself
> with both packages I wrote my own test that spawns and finishes 100,000
> empty threads or processes respectively (while maintaining at most 100
> threads / processes active at any one time).
>
> The results I got are very different from the benchmark quoted in PEP
> 371. On a twin-Xeon machine the threaded version executed in 5.54 secs,
> while the multiprocessing version took over 222 secs to complete!
>
> Am I doing something wrong in the code below? Or do I have to use
> multiprocessing.Pool to get any decent results?


I'm running a 1.6 GHz machine. I only ran 10,000 empty threads and 10,000
empty processes. The threads were the ones you wrote. The processes were
empty executables written in a lower-level language, also run 100 at a
time, but started with 'subprocess', not 'multiprocessing'. The threads
took 1.2 seconds. The processes took 24 seconds.
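
The gist of it was something like this sketch, with /bin/true standing in
for my empty executable (names here are illustrative, not my exact test):

import subprocess

procs = []
for i in range(10000):
    procs.append(subprocess.Popen(["/bin/true"]))
    if len(procs) >= 100:          # keep at most 100 running
        procs.pop(0).wait()        # wait on the oldest one first
for p in procs:
    p.wait()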

The processes you wrote had only finished 3000 after several minutes.
 
Jarkko Torppa
12-29-2008
On 2008-12-29, mk <(E-Mail Removed)> wrote:
> janislaw wrote:
>
>> Ah, so there are 100 processes at a time. 200 secs still doesn't sound
>> strange.

>
> I ran the PEP 371 code on my system (Linux) on Python 2.6.1:
>
> Linux SLES (9.156.44.174) [15:18] root ~/tmp/src # ./run_benchmarks.py empty_func.py
>
> Importing empty_func
> Starting tests ...
> non_threaded (1 iters) 0.000005 seconds
> threaded (1 threads) 0.000235 seconds
> processes (1 procs) 0.002607 seconds
>
> non_threaded (2 iters) 0.000006 seconds
> threaded (2 threads) 0.000461 seconds
> processes (2 procs) 0.004514 seconds
>
> non_threaded (4 iters) 0.000008 seconds
> threaded (4 threads) 0.000897 seconds
> processes (4 procs) 0.008557 seconds
>
> non_threaded (8 iters) 0.000010 seconds
> threaded (8 threads) 0.001821 seconds
> processes (8 procs) 0.016950 seconds
>
> This is very different from PEP 371. It appears that the PEP 371 code
> was written on Mac OS X.


PEP 371 itself says: "All benchmarks were run using the following:
Python 2.5.2 compiled on Gentoo Linux (kernel 2.6.18.6)".

On my 2.3 GHz dual-core iMac, Python 2.6:

iTaulu:src torppa$ python run_benchmarks.py empty_func.py
Importing empty_func
Starting tests ...
non_threaded (1 iters) 0.000002 seconds
threaded (1 threads) 0.000227 seconds
processes (1 procs) 0.002367 seconds

non_threaded (2 iters) 0.000003 seconds
threaded (2 threads) 0.000406 seconds
processes (2 procs) 0.003465 seconds

non_threaded (4 iters) 0.000004 seconds
threaded (4 threads) 0.000786 seconds
processes (4 procs) 0.006430 seconds

non_threaded (8 iters) 0.000006 seconds
threaded (8 threads) 0.001618 seconds
processes (8 procs) 0.012841 seconds

With Python 2.5 and pyProcessing-0.52:

iTaulu:src torppa$ python2.5 run_benchmarks.py empty_func.py
Importing empty_func
Starting tests ...
non_threaded (1 iters) 0.000003 seconds
threaded (1 threads) 0.000143 seconds
processes (1 procs) 0.002794 seconds

non_threaded (2 iters) 0.000004 seconds
threaded (2 threads) 0.000277 seconds
processes (2 procs) 0.004046 seconds

non_threaded (4 iters) 0.000005 seconds
threaded (4 threads) 0.000598 seconds
processes (4 procs) 0.007816 seconds

non_threaded (8 iters) 0.000008 seconds
threaded (8 threads) 0.001173 seconds
processes (8 procs) 0.015504 seconds

--
Jarkko Torppa, Elisa
 
mk
12-29-2008
Jarkko Torppa wrote:

> PEP 371 itself says: "All benchmarks were run using the following:
> Python 2.5.2 compiled on Gentoo Linux (kernel 2.6.18.6)".


Right... I overlooked that. The tests I quoted above were done on SLES
10, kernel 2.6.5.

> With python2.5 and pyProcessing-0.52
>
> iTaulu:src torppa$ python2.5 run_benchmarks.py empty_func.py
> Importing empty_func
> Starting tests ...
> non_threaded (1 iters) 0.000003 seconds
> threaded (1 threads) 0.000143 seconds
> processes (1 procs) 0.002794 seconds
>
> non_threaded (2 iters) 0.000004 seconds
> threaded (2 threads) 0.000277 seconds
> processes (2 procs) 0.004046 seconds
>
> non_threaded (4 iters) 0.000005 seconds
> threaded (4 threads) 0.000598 seconds
> processes (4 procs) 0.007816 seconds
>
> non_threaded (8 iters) 0.000008 seconds
> threaded (8 threads) 0.001173 seconds
> processes (8 procs) 0.015504 seconds


There's something wrong with the numbers posted in the PEP. This is what
I got on a 4-socket Xeon (+ HT) with Python 2.6.1 on Debian (Etch), with
the kernel upgraded to 2.6.22.14:


non_threaded (1 iters) 0.000004 seconds
threaded (1 threads) 0.000159 seconds
processes (1 procs) 0.001067 seconds

non_threaded (2 iters) 0.000005 seconds
threaded (2 threads) 0.000301 seconds
processes (2 procs) 0.001754 seconds

non_threaded (4 iters) 0.000006 seconds
threaded (4 threads) 0.000581 seconds
processes (4 procs) 0.003906 seconds

non_threaded (8 iters) 0.000009 seconds
threaded (8 threads) 0.001148 seconds
processes (8 procs) 0.008178 seconds


 
Gabriel Genellina
01-06-2009
On Sat, 03 Jan 2009 11:31:12 -0200, Nick Craig-Wood <(E-Mail Removed)>
wrote:
> mk <(E-Mail Removed)> wrote:


>> The results I got are very different from the benchmark quoted in PEP
>> 371. On a twin-Xeon machine the threaded version executed in 5.54 secs,
>> while the multiprocessing version took over 222 secs to complete!
>>
>> Am I doing something wrong in the code below?

>
> Yes!
>
> The problem with your code is that you never start more than one
> process at once in the multiprocessing example. Just check ps when it
> is running and you will see.


Oh, very good analysis! Those results were worrying me a little.
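
For anyone repeating Nick's ps check while the benchmark runs, something
along these lines works on a Unix-like system (the grep pattern naming
the benchmark script is just a placeholder, substitute your own):

import os
import time

for _ in range(30):                # sample once a second for ~30 s
    # the [m] bracket trick keeps grep from matching itself
    n = os.popen("ps ax | grep -c '[m]y_bench.py'").read().strip()
    print n, "matching processes alive"
    time.sleep(1)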

--
Gabriel Genellina

 
James Mills
01-08-2009
On Thu, Jan 8, 2009 at 10:55 AM, Arash Arfaee <(E-Mail Removed)> wrote:
> Hi All,

Hi,

> Does anybody know of any tutorial for Python 2.6 multiprocessing, or a
> bunch of good examples for it? I am trying to break up a loop to run it
> over multiple cores in a system, and I need to return an integer value as
> the result of each process and accumulate all of them. In the examples
> that I found there is no return value from the process.


You communicate with the process in one of several ways:
* Semaphores
* Locks
* Pipes

I prefer to use pipes, which act much like sockets (in fact they are);
a minimal sketch follows below.
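
Something like this toy sketch (the square() worker and all the names
are made up for illustration):

import multiprocessing

def square(conn, n):
    # the child sends its integer result back through its end of the pipe
    conn.send(n * n)
    conn.close()

if __name__ == "__main__":
    total = 0
    for n in range(4):
        parent_conn, child_conn = multiprocessing.Pipe()
        p = multiprocessing.Process(target=square, args=(child_conn, n))
        p.start()
        total += parent_conn.recv()    # blocks until the child sends
        p.join()
    print "Accumulated:", total        # 0 + 1 + 4 + 9 = 14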

Read the docs and let us know how you go.
I'm actually implementing multiprocessing
support into circuits (1) right now...

cheers
James

1. http://trac.softcircuit.com.au/circuits/
 
Gabriel Genellina
01-09-2009
On Wed, 07 Jan 2009 23:05:53 -0200, James Mills
<(E-Mail Removed)> wrote:

>> Does anybody know of any tutorial for Python 2.6 multiprocessing, or a
>> bunch of good examples for it? I am trying to break up a loop to run it
>> over multiple cores in a system, and I need to return an integer value
>> as the result of each process and accumulate all of them. In the
>> examples that I found there is no return value from the process.

>
> You communicate with the process in one of several
> ways:
> * Semaphores
> * Locks
> * Pipes


The Pool class provides a more abstract view that may be better suited in
this case. Just create a pool, and use map_async to collect and summarize
the results.

import string
import multiprocessing

def count(args):
    (lineno, line) = args
    print "This is %s, processing line %d\n" % (
        multiprocessing.current_process().name, lineno),
    result = dict(letters=0, digits=0, other=0)
    for c in line:
        if c in string.letters: result['letters'] += 1
        elif c in string.digits: result['digits'] += 1
        else: result['other'] += 1
    # just to make some "random" delay
    import time; time.sleep(len(line)/100.0)
    return result

if __name__ == '__main__':

    summary = dict(letters=0, digits=0, other=0)

    def summary_add(results):
        # this is called with a list of results
        for result in results:
            summary['letters'] += result['letters']
            summary['digits'] += result['digits']
            summary['other'] += result['other']

    # count letters on this same script
    f = open(__file__, 'r')

    pool = multiprocessing.Pool(processes=6)
    # invoke count((lineno, line)) for each line in the file
    pool.map_async(count, enumerate(f), 10, summary_add)
    pool.close()   # no more jobs
    pool.join()    # wait until done
    print summary

--
Gabriel Genellina

 